Robust circuitry-based scores of structural importance of human brain areas

We consider the 1015-vertex human consensus connectome computed from the diffusion MRI data of 1064 subjects. We define seven different orders on these 1015 graph vertices, where the orders depend on parameters derived from the brain circuitry, that is, from the properties of the edges (or connections) incident to the vertices ordered. We order the vertices according to their degree; the sum, the maximum, and the average of the fiber counts on the incident edges; and the sum, the maximum, and the average length of the fibers in the incident edges. We analyze the similarities of these seven orders by the Spearman correlation coefficient and by their inversion numbers, and find that all seven orders are highly similar. In other words, if we interpret the orders as scorings of the importance of the vertices in the consensus connectome, then the scores of the vertices are similar in all seven orderings. That is, important vertices of the human connectome typically have many neighbors connected with long and thick axonal fibers (where thickness is measured by fiber numbers), and their incident edges have high maximum and average values of length and fiber-number parameters, too. Therefore, these parameters may yield robust ways of deciding which vertices are more important than others in the anatomy of our brain circuitry.


Introduction
Identifying the most important nodes in large networks solely from their graph-theoretical properties was an important problem in the late 1990s, with applications in scoring web search engine hits. The most well-known solutions, the PageRank of Google [1] and the HITS algorithm of Kleinberg [2], fundamentally influenced the related areas.
Both the PageRank and the HITS algorithms score the nodes of a directed, unweighted graph, which originally corresponded to the graph of the World Wide Web; later, these algorithms were successfully applied to directed and undirected biological, social, and chemical graphs, among other applications [3,4,5,6].
Since all human activities are governed by the cooperation of the cells in our brain, the study of the connections of these cells is of particular interest. Unfortunately, the connections of the 80 billion neurons of the human brain are not mapped yet and will not be mapped in the foreseeable future: to date, the only adult organism with completely mapped connections between its neurons (also called the connectome or braingraph) is the nematode C. elegans, which has only 302 neurons [7]. Recently, after many years of concentrated efforts, the neuronal-level connectome of a part of the brain of the adult fruit fly Drosophila melanogaster, its central brain, has been mapped and published [8]. Out of the 100,000 neurons of the fruit fly, the central brain contains around 25,000 neurons. The whole Drosophila melanogaster connectome is not published yet.
Instead of neuronal-level connections, today's imaging methods are capable of mapping the human connectome on a much coarser scale. Owing to the technical developments of magnetic resonance imaging (MRI) in the last fifteen years [9,10], we can now map the macroscopic connections between 1000 anatomically identified brain areas. These developments have opened up a new area of brain science called "connectomics", which examines the connections between the brain areas: instead of comparing the volumes of brain areas between healthy and diseased, old and young, or male and female subjects, as in hundreds of previous cerebral volumetric studies, it concentrates on a more central question, the connections between those areas.
Our research group has studied the mathematical properties of human connectomes by applying strict graph-theoretical methods, terms, and approaches. We have used the public-release imaging data sets of the Human Connectome Project [11] and prepared publicly available braingraphs from the imaging data, downloadable at https://braingraph.org in five different resolutions [12,13,14,15]. The vertices of the braingraphs correspond to the anatomically identified areas of the cortical and sub-cortical gray matter, and two vertices are connected by an edge if the tractography phase [16,17] of the processing identified axonal fibers between the areas mapped to these vertices.
In the present contribution, we consider an averaged consensus connectome, computed from the imaging data of 1064 subjects. We order the vertices of the consensus braingraph in several ways, each intended to capture their order of "importance", and we compare the order of the vertices in those lists. Our main result is that the orders generated from
• the degree of the nodes,
• the sum of the numbers of fibers in the incident edges,
• the maximum number of fibers in the incident edges,
• the average number of fibers in the incident edges,
• the sum of the fiber lengths in the incident edges,
• the maximum fiber length in the incident edges,
• the average fiber length in the incident edges
are similar to one another: their Spearman's rank correlation is high, and their inversion numbers are low.
This result means that ordering by any of the seven parameters above produces similar orders of the nodes, where the word "similar" is made precise later in this work.
In other words, the result can be interpreted as follows: the most important nodes in our braingraph statistically have numerous and long incident axonal fibers, with high maximum and average values for both the length and the fiber numbers. That is, if a node is ahead of others in one of the seven parameters above, then, typically, it will have high values in the remaining six parameters, too. Therefore, each of these seven orders is robust with respect to the other six.
In what follows, we describe precisely our methods and results.

Graph construction
The data source of the present work is the 1200-subject public release of the Human Connectome Project [11]. The 3 Tesla diffusion magnetic resonance imaging data were processed with the help of the Connectome Mapper Tool Kit [16].
We have computed five different graphs for each subject, with 83, 129, 234, 463, and 1015 nodes, where each node corresponds to an anatomical area of the cortical and sub-cortical gray matter. The parcellation tool FreeSurfer was applied here [33,34,17].
The details of the workflow we followed are described in [14]. Concisely, the axonal fibers were mapped by the MRtrix 0.2 tractography software, and the tractography was repeated ten times for each subject. We connected two graph vertices, which corresponded to two gray matter areas, by an edge if axonal fibers were found running between the two areas in all 10 runs. In this case, the maximum and the minimum number of fibers were discarded, and the remaining eight integer values were averaged and assigned to the edge as the fiber number weight. The length of the edge was also determined as the average length of the defining fibers. Consequently, all graph edges carry a positive weight (the average of 8 fiber numbers) and a positive length (in millimeters).
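The per-edge weighting rule described above (an edge exists only if all runs find fibers; the largest and smallest counts are dropped and the rest averaged) can be illustrated with a minimal Python sketch. The function name `edge_weight_from_runs` and the sample counts are hypothetical and are not taken from the actual workflow of [14].

```python
from typing import Optional, Sequence


def edge_weight_from_runs(fiber_counts: Sequence[int]) -> Optional[float]:
    """Trimmed-mean fiber count for one vertex pair, or None if there is no edge.

    fiber_counts holds the fiber count found in each tractography run
    (10 runs in the text); a value of 0 means no fiber was found in that run.
    """
    if len(fiber_counts) < 3:
        raise ValueError("need at least three runs to drop the max and the min")
    # The edge is kept only if fibers were found in every run.
    if any(count <= 0 for count in fiber_counts):
        return None
    ordered = sorted(fiber_counts)
    trimmed = ordered[1:-1]  # drop one minimum and one maximum
    return sum(trimmed) / len(trimmed)


# Hypothetical fiber counts from 10 runs for a single vertex pair.
runs = [12, 15, 11, 14, 40, 13, 12, 16, 3, 14]
weight = edge_weight_from_runs(runs)
print(weight)
```

Dropping the two extreme values makes the weight less sensitive to a single unusual tractography run, which is presumably the motivation for the rule.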
Next, we constructed one single consensus graph on 1015 vertices from the 1064 individual graphs as follows. We averaged the weight and the length for each edge, but followed a different strategy for each. For averaging the weight, we added up the edge weights in all subjects' graphs and divided the sum by 1064; if an edge was not present in a subject, then we counted it as an edge with an (artificial) weight of 0. For the average length, we counted the subjects in which the edge exists and divided the length sum by this integer (vertex pairs that never appear as an edge were assigned length 0).
Consequently, if #{i, j} denotes the number of subjects in which the edge {i, j} appears, and s_{i,j,k} and h_{i,j,k} denote the weight and the length of edge {i, j} in subject k, respectively (both taken as 0 if the edge is absent in subject k), then the consensus weight w_{i,j} and the consensus length ℓ_{i,j} of the edge are

$$w_{i,j} = \frac{1}{1064}\sum_{k=1}^{1064} s_{i,j,k}, \qquad \ell_{i,j} = \frac{1}{\#\{i,j\}}\sum_{k=1}^{1064} h_{i,j,k},$$

with ℓ_{i,j} = 0 if #{i, j} = 0.
We note that our earlier works [35,36] also describe consensus graphs, parameterizable by user-selectable parameters at the website of the Budapest Reference Connectome, https://pitgroup.org/connectome. In contrast, the dataset of the present contribution is a static graph.
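The two averaging rules of this section (weights averaged over all subjects with missing edges counted as 0, lengths averaged only over the subjects in which the edge exists) can be sketched as follows. This is a minimal illustration under assumptions: the data layout, the name `consensus_graph`, and the two-subject toy input are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

Edge = FrozenSet[int]
# One subject's graph: edge -> (fiber-count weight, length in mm).
SubjectGraph = Dict[Edge, Tuple[float, float]]


def consensus_graph(subjects: List[SubjectGraph]) -> Dict[Edge, Tuple[float, float]]:
    """Average edge weights over all subjects and lengths over existing edges."""
    weight_sum: Dict[Edge, float] = defaultdict(float)
    length_sum: Dict[Edge, float] = defaultdict(float)
    appearances: Dict[Edge, int] = defaultdict(int)
    for graph in subjects:
        for edge, (weight, length) in graph.items():
            weight_sum[edge] += weight
            length_sum[edge] += length
            appearances[edge] += 1
    n_subjects = len(subjects)
    return {
        # Weight: a missing edge contributes 0, so divide by all subjects.
        # Length: divide only by the number of subjects that have the edge.
        edge: (weight_sum[edge] / n_subjects, length_sum[edge] / appearances[edge])
        for edge in appearances
    }


# Hypothetical toy input with two subjects instead of 1064.
toy_subjects = [
    {frozenset({1, 2}): (10.0, 30.0), frozenset({2, 3}): (4.0, 50.0)},
    {frozenset({1, 2}): (20.0, 40.0)},
]
consensus = consensus_graph(toy_subjects)
print(consensus[frozenset({2, 3})])
```

In the toy input, the edge {2, 3} appears in only one of the two subjects, so its weight is halved while its length is left unchanged.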

Ordering the nodes
Here we consider seven different orders on the set of the 1015 vertices of our consensus graph, with the abbreviations Degree, SUM-weight, MAX-weight, AVG-weight, SUM-length, MAX-length, and AVG-length.
Degree is the number of edges incident to the vertex.
SUM-weight is the weighted version of the Degree parameter. Here the fiber numbers of the incident edges are added up. Edges with a higher fiber number (weight) may connect more small gray matter areas than those with less weight. Consequently, we think that the SUM-weight parameter is more relevant in deciding the importance of a node than the Degree alone.
MAX-weight is, in a certain sense, a simplification of the SUM-weight parameter. It describes only the weight of the largest-weight incident edge instead of adding up all the weights of the incident edges. In theory, the order according to MAX-weight may differ a lot from the order defined by SUM-weight if many vertices have either a small number of large-weight incident edges or a large number of small-weight ones. As we show later, this is not the case in our graph; the orders are similar.
AVG-weight: Clearly, for each vertex, the Degree times the AVG-weight is the SUM-weight. Therefore, the AVG-weight-based vertex order may, in principle, differ strongly from both the Degree order and the SUM-weight order. As we show, the AVG-weight-based order is nevertheless similar to both the Degree order and the SUM-weight order.
SUM-length is the length-weighted version of the Degree. The Degree and the SUM-length values may differ a lot if a node is adjacent to many other vertices by short edges, or to few other nodes by very long edges. If an important node usually has numerous and long incident edges, then the orders by Degree and SUM-length would not differ a lot. We show that this is the situation in our consensus braingraph.
MAX-length can be large while SUM-length is small, so the orders according to these parameters could differ in numerous positions. We show that this is not the case in our graph.
AVG-length times the Degree is the SUM-length for each vertex. Therefore, the AVG-length-based order could strongly differ from both the Degree-based and the SUM-length-based order. We show the opposite for the consensus braingraph.
The seven orders are explicitly given in the Appendix.
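To make the seven parameters concrete, the following minimal Python sketch computes them for every vertex of a small weighted graph and derives an order from each. The edge data, the function names `vertex_parameters` and `order_by`, and the tie-breaking by vertex id are assumptions made for illustration only; vertices without incident edges are simply not listed.

```python
from typing import Dict, List, Tuple

# edge (u, v) -> (fiber-count weight, fiber length in mm); hypothetical toy data
Edges = Dict[Tuple[int, int], Tuple[float, float]]


def vertex_parameters(edges: Edges) -> Dict[int, Dict[str, float]]:
    """Compute the seven circuitry-based parameters for each vertex."""
    incident: Dict[int, List[Tuple[float, float]]] = {}
    for (u, v), attrs in edges.items():
        incident.setdefault(u, []).append(attrs)
        incident.setdefault(v, []).append(attrs)
    result: Dict[int, Dict[str, float]] = {}
    for vertex, attrs in incident.items():
        weights = [w for w, _ in attrs]
        lengths = [length for _, length in attrs]
        degree = len(attrs)
        result[vertex] = {
            "Degree": float(degree),
            "SUM-weight": sum(weights),
            "MAX-weight": max(weights),
            "AVG-weight": sum(weights) / degree,
            "SUM-length": sum(lengths),
            "MAX-length": max(lengths),
            "AVG-length": sum(lengths) / degree,
        }
    return result


def order_by(params: Dict[int, Dict[str, float]], name: str) -> List[int]:
    """Vertices sorted from most to least important; ties broken by vertex id."""
    return sorted(params, key=lambda v: (-params[v][name], v))


toy_edges: Edges = {(1, 2): (10.0, 30.0), (1, 3): (4.0, 50.0), (2, 3): (6.0, 20.0)}
params = vertex_parameters(toy_edges)
print(order_by(params, "SUM-weight"))
print(order_by(params, "SUM-length"))
```

In this toy triangle the SUM-weight and SUM-length orders disagree, which illustrates that the similarity of the orders reported for the consensus braingraph is not automatic.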
In the following subsections, we introduce two tools for the analysis of the similarity of these orders: the Spearman correlation and the inversion numbers.

Spearman's rank correlation coefficient
The Spearman coefficient [37] is an ideal tool for comparing different orders on the same base set. In our case, the base set is the set of vertices, and the seven different orders are defined by the seven parameters Degree, SUM-weight, MAX-weight, AVG-weight, SUM-length, MAX-length, and AVG-length.
The coefficient gives information about the correlation of two attributes, using the two orderings of the elements by the two attributes. Two indices are associated with every element, giving its position in each of the two orderings. There is a simple formula for the coefficient if the indices are unique, meaning that no two elements have the same attribute value. (Luckily, this is true for the consensus braingraph.) If we calculate two attributes of n elements and d_i is the difference of the i-th element's two indices, then

$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)}.$$

The value of Spearman's correlation coefficient satisfies −1 ≤ ρ ≤ 1, where ρ = 1 means perfect correlation and ρ = −1 means perfect opposition.
Remark 1. The bounds for ρ can be proven by using the cubic formula for the sum of the first n square numbers, $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$. In the case of perfect opposition, we should calculate the sum of the first n/2 odd square numbers (for even n, the index differences are ±1, ±3, ..., ±(n−1)). This can easily be obtained by subtracting the sum of the (not counted) even square numbers, which is exactly four times the sum of the first that many square numbers.
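As a worked check of the lower bound, the following derivation evaluates the standard rank-difference form of the coefficient, ρ = 1 − 6Σd_i²/(n(n²−1)), for two exactly reversed orders. It expands the squares directly instead of using the odd-squares route of the remark; the indexing (element i has position i in one order and n+1−i in the other) is an assumption of the sketch.

```latex
\begin{align*}
d_i &= i - (n+1-i) = 2i - (n+1), \\
\sum_{i=1}^{n} d_i^2
  &= 4\sum_{i=1}^{n} i^2 \;-\; 4(n+1)\sum_{i=1}^{n} i \;+\; n(n+1)^2 \\
  &= \frac{2n(n+1)(2n+1)}{3} - 2n(n+1)^2 + n(n+1)^2
   = \frac{n(n+1)(n-1)}{3} = \frac{n(n^2-1)}{3}, \\
\rho &= 1 - \frac{6}{n(n^2-1)} \cdot \frac{n(n^2-1)}{3} = 1 - 2 = -1 .
\end{align*}
```

The upper bound ρ ≤ 1 is immediate, since the sum of the squared differences is non-negative and vanishes exactly when the two orders coincide.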
Remark 2. The closer the coefficient is to 0, the less we can say about the predicted correlation. As n grows, coefficients with smaller absolute values can also be significant. So the p-value is determined not only by ρ but also by n. The p-value is the probability of obtaining a correlation at least that extreme under the assumption that the null hypothesis is true.
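The rank-difference computation of this section can be sketched in a few lines of Python. This is a minimal illustration, assuming that both orders are given as lists of the same distinct elements with the most important element first; the function name `spearman_rho` is hypothetical, and the sketch computes only the coefficient, not the p-value.

```python
from typing import Hashable, Sequence


def spearman_rho(order_a: Sequence[Hashable], order_b: Sequence[Hashable]) -> float:
    """Spearman coefficient of two tie-free orders of the same elements."""
    n = len(order_a)
    if (n < 2 or len(order_b) != n or len(set(order_a)) != n
            or set(order_a) != set(order_b)):
        raise ValueError("orders must be permutations of the same n >= 2 elements")
    # Position (index) of every element in the second order.
    position_b = {element: index for index, element in enumerate(order_b)}
    # Sum of squared index differences d_i over all elements.
    squared_diffs = sum(
        (index_a - position_b[element]) ** 2
        for index_a, element in enumerate(order_a)
    )
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


print(spearman_rho([1, 2, 3, 4], [1, 2, 3, 4]))
print(spearman_rho([1, 2, 3, 4], [4, 3, 2, 1]))
print(spearman_rho([1, 2, 3, 4], [2, 1, 3, 4]))
```

If SciPy is available, `scipy.stats.spearmanr` applied to the two index vectors should return the same coefficient together with a p-value, which this sketch does not compute.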

Inversion numbers
Definition 1. In two given permutations of n elements, two different elements are in inversion with each other if their order is opposite in the two permutations.
Definition 2. The inversion number of two permutations is the number of unordered pairs of elements that are in inversion. The inversion of an element is the number of elements with which it is in inversion.
Lemma 3. The expected value of the inversion number of two permutations of length n is $\frac{n(n-1)}{4}$.
Proof. Consider an arbitrary unordered pair {i, j} of elements. By symmetry, the expected value of the contribution of this pair to the inversion number is $\frac{1}{2}$. (Every permutation has a bijective pair that differs from it only in the (i, j) transposition, where a transposition is the function that swaps exactly two elements of a permutation.) Since $\binom{n}{2} = \frac{n(n-1)}{2}$ unordered pairs of elements exist in a set of n elements, the linearity of expectation gives that the expected value of the inversion number is the desired $\frac{n(n-1)}{4}$.
Corollary 4. The expected value of the inversion of any element is $\frac{n-1}{2}$.
Proof. The linearity of the expected value can be used again, now applied to the result of Lemma 3. As there are n elements and each inversion is counted twice (once for each element of the pair), the expected value per element is $\frac{2}{n} \cdot \frac{n(n-1)}{4} = \frac{n-1}{2}$.
Alternative proof. Make a bijection between the permutations: the pair of a permutation is the opposite permutation. In every pair of permutations, every unordered pair of elements {i, j} has each of its two orderings in exactly one of the members of the permutation pair. So for every element i, there are n − 1 different elements j, and the expected value of the inversion of i with each of them is $\frac{1}{2}$. So the expected value of the inversion of an arbitrary element i is $\frac{n-1}{2}$.
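The definitions and the expected value above can be checked numerically on small examples. The following minimal Python sketch counts inversions by brute force over all pairs, which takes quadratic time but is sufficient for about a thousand vertices. The function names are hypothetical, and the exhaustive average over all permutations is feasible only for small n.

```python
from itertools import combinations, permutations
from typing import Dict, Hashable, Sequence


def element_inversions(order_a: Sequence[Hashable],
                       order_b: Sequence[Hashable]) -> Dict[Hashable, int]:
    """For every element, the number of elements it is in inversion with."""
    pos_a = {element: index for index, element in enumerate(order_a)}
    pos_b = {element: index for index, element in enumerate(order_b)}
    counts = {element: 0 for element in order_a}
    for x, y in combinations(order_a, 2):
        # The pair is in inversion if its relative order differs in the two orders.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            counts[x] += 1
            counts[y] += 1
    return counts


def inversion_number(order_a: Sequence[Hashable], order_b: Sequence[Hashable]) -> int:
    """Number of unordered pairs in inversion; each pair is counted at both ends."""
    return sum(element_inversions(order_a, order_b).values()) // 2


def mean_inversion_number(n: int) -> float:
    """Average inversion number against the identity over all n! permutations."""
    identity = list(range(n))
    totals = [inversion_number(identity, list(p)) for p in permutations(identity)]
    return sum(totals) / len(totals)


print(inversion_number([1, 2, 3, 4], [4, 3, 2, 1]))
print(mean_inversion_number(4))
```

For n = 4, the exhaustive average agrees with the value n(n−1)/4 = 3 of Lemma 3.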

Spearman-correlations of different orderings
Correlations with the degree
Table 1. The first row contains the Spearman coefficients between the Degree order and each of the six other orders, and the second row the significance-characterizing p-values. We note that the weakest correlation, in the case of MAX-length, is still very far from 0, and its p-value is very small.
Table 2. Spearman correlations between the SUM-weight and MAX-weight orders and the three length-based orders. The first row contains the coefficients, the second the significance-characterizing p-values. We note that the weakest correlations, in the case of MAX-length, are still very far from 0, and their p-values are very small.
Table 3. Spearman correlations between the AVG-weight order and the three length-based orders. The first row contains the coefficients, the second the significance-characterizing p-values. We note that the weakest correlation, in the case of MAX-length, is still very far from 0, and its p-value is very small.

A simple control
For a simple control, we have computed the Spearman correlation between two obviously unrelated orders: the ordinal numbers of the vertices assigned by the parcellation software, and the AVG-weight-defined order. The ordinal numbers are assigned so that roughly the first 500 numbered vertices are situated in the left hemisphere and the second 500 in the right hemisphere of the brain, in the same order. For these two orders, ρ = 0.01 and p = 0.65, that is, no significant correlation is detected. This suggests that the high correlations in Tables 1-4 are not an artifact of the method but reflect a biological rule.

Analysis of the order-similarity by inversion numbers
In the Methods section, we have defined the inversion numbers and listed some of their fundamental properties. Here we present a graphical evaluation of the inversion numbers between the pairs of the seven orders studied by the Spearman correlation in the previous section (Figure 1). On all panels, higher-than-expectation inversion numbers appear in very few pairs, almost independently of the examined pair of orderings.

Conclusions
We have analyzed the order of vertex importance in the anatomically labeled consensus graph of the human brain, defined by circuitry-based parameters of the vertices: the degree (Degree), and the following parameters, computed on the edges incident to the vertex: the sum of fiber counts (SUM-weight), the maximum of fiber counts (MAX-weight), the average of fiber counts (AVG-weight), the sum of fiber lengths (SUM-length), the maximum of fiber lengths (MAX-length), and the average of fiber lengths (AVG-length). For the analysis, we have used the Spearman correlation coefficient and the inversion numbers between the orders. We have found that the seven orders of vertex importance, defined by these seven circuitry-based parameters of the vertices, are highly similar: statistically, the most important vertices have many neighbors, connected with long and numerous fibers. We have also shown that the orders defined by the maximum weight or length of the incident edges, and the orders defined by the average weight or length of the incident edges, do not differ very much from the orders defined by the sum of these parameters.
The results show the robustness of the orders by these seven parameters, and also show that vertex importance in the human brain can be characterized by numerous parameters without much change in the list of the important vertices (or anatomical brain areas).

Figure 1: The number of vertices with high inversion numbers for distinct parameter pairs. Point n on the x axis corresponds to the n most important vertices in one of the two orders of the pair, while the height (i.e., the y coordinate) of the point corresponds to the number of those vertices with a higher-than-expectation inversion number between the pair of orderings. On each panel, the black line shows the n/2 expectation. It is easy to see on all panels that the higher-than-expectation inversion numbers appear in very few pairs, almost independently of the examined pair of orderings.

Table 1: Spearman correlations between the Degree-based and the six other orders.

Table 4: Spearman correlations and p-values between pairs of weight-based orders (weight-weight) and between pairs of length-based orders (length-length). All the correlations are high, and the lowest value belongs to the SUM-length vs. MAX-length correlation.