A Curve Shaped Description of Large Networks, with an Application to the Evaluation of Network Models

Background

Understanding the structure of complex networks is a continuing challenge, which calls for novel approaches and models to capture their structure and reveal the mechanisms that shape the networks. Although various topological measures, such as degree distributions or clustering coefficients, have been proposed to characterize network structure from many different angles, a comprehensive and intuitive representation of large networks that allows quantitative analysis is still difficult to achieve.

Methodology/Principal Findings

Here we propose a mesoscopic description of large networks which associates networks of different structures with a set of particular curves, using breadth-first search. After deriving the expressions of the curves for random graphs and a small-world-like network, we found that the curves capture several network properties at once, including the size of the giant component and the local clustering. In addition, the curve can be used to evaluate the fit of network models to real-world networks. We describe a simple evaluation method based on the curve and apply it to the Drosophila melanogaster protein interaction network. The evaluation method effectively identifies which of the given models better reproduces the topology of the real network and helps infer the underlying growth mechanisms of the Drosophila network.

Conclusions/Significance

This curve-shaped description of large networks offers a wealth of possibilities for developing new approaches and applications, including network characterization, comparison, classification, modeling and model evaluation, as an alternative to using a large bag of topological measures.


RANDOM NETWORKS WITH ARBITRARY DEGREE DISTRIBUTIONS
A. Explanation of Eq. 6 based on exploring the network vertex by vertex

Suppose a network has N vertices and M edges. In Fig. S1, at time t′, vertex A is first touched by its parent C and is pushed onto the end of QueueT at position y. Exploration then continues until time t, by which point all the vertices between C and A in QueueT have been explored and xN vertices have been touched before we explore A. During this period, as illustrated in Fig. S1, one part of the stubs belonging to the vertices between C and A meet vertices touched before A, while the other part meet vertices touched after A, including A itself; that is, this second part of the stubs has a chance to meet the stubs of A. We denote the number of stubs in this second part by $E$.
When vertex A is being explored at time t, it also has a chance to meet vertices that are untouched or were touched after A (the gray part of QueueT at time t in Fig. S1, including A itself). Use $E_{x1}$ and $E_{yx}$ to represent the number of stubs belonging to the vertices untouched and touched after A, respectively. Since at time t the $E_{yx}$ stubs of the vertices touched after A are partially occupied by the $E$ stubs that come from the vertices touched before A, there are $E_{yx} - E$ stubs left that are able to meet the stubs of A.

FIG. S1: Pictorial explanation of Eq. 6 in the main text based on exploring the network vertex by vertex. At a given time, the black part of QueueT represents the vertices that have been explored, the gray part the pending vertices that have been touched but not explored, and the white part the untouched vertices, although they are not in QueueT yet. One copy corresponds to one stub sticking out of the vertex to meet another stub uniformly at random, and two connected stubs form an edge.
Therefore, vertex A has $G(yN) - 1$ stubs (excluding the one stub connecting it to its parent C) to meet the stubs belonging to three parts of vertices: touched before A ($E$ stubs), touched after A ($E_{yx} - E$ stubs) and untouched ($E_{x1}$ stubs). Since the stubs of vertices are coupled uniformly at random and the probability of self-loops and multi-edges at A goes as $N^{-1}$, in the limit of large N the expected number of untouched vertices that A will meet is

$$(G(yN) - 1)\,\frac{E_{x1}}{E + (E_{yx} - E) + E_{x1}} = (G(yN) - 1)\,\frac{E_{x1}}{E_{yx} + E_{x1}}. \qquad \text{(S1)}$$

The explanation of Eq. 6 based on exploring the network vertex by vertex works well if N is much larger than the maximal degree of the graph. But it is not suitable for a random network with extremely dense edges, where the average degree k ∼ N. Since there are too many self-loops and multi-edges in such a network, the edges of the same vertex are highly correlated, and this correlation cannot be neglected. To overcome this limitation, here we explore the network edge by edge, where ∆y means searching one edge of the vertex being explored at one step (except the edge that brought this vertex into QueueT), and ∆x is the probability that the vertex on the other end of the edge is untouched, in the limit of large M.

This ∆x/∆y agrees with Eq. 6. Thus Eq. 6 is also valid for random networks with extremely dense edges. Fig. S2 shows an instance of a random regular graph with extremely dense edges.

FIG. S2: The dots in black are the simulated results from one run on a random regular graph of size N = 10,000 and r = 5,000.
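The exploration described above can be illustrated with a minimal Python sketch. It assumes the network is given as an adjacency dictionary, and it records, after each vertex is explored, the fraction y of explored vertices and the fraction x of touched vertices. The function name `bfs_tree_curve`, the argument `adj` and this particular sampling convention are illustrative choices, not taken from the main text.

```python
from collections import deque

def bfs_tree_curve(adj, root):
    """Return (y, x) points recorded during a BFS from `root`.

    adj: dict mapping each vertex to an iterable of its neighbours.
    y is the fraction of vertices explored so far and x the fraction
    touched (already placed in the queue), both relative to N = len(adj).
    """
    n = len(adj)
    touched = {root}
    queue = deque([root])          # plays the role of QueueT
    explored = 0
    curve = [(0.0, 1.0 / n)]       # at the start only the root is touched
    while queue:
        v = queue.popleft()        # v is the vertex being explored
        for w in adj[v]:
            if w not in touched:   # w is touched for the first time, by v
                touched.add(w)
                queue.append(w)
        explored += 1
        curve.append((explored / n, len(touched) / n))
    return curve
```

On a connected graph the curve ends at (1, 1); on a disconnected one it stops at the fraction of vertices in the root's component, and y never exceeds x.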
C. Derivation of Eq. 7 for random graphs with arbitrary degree distributions

Eq. 7 is derived from Eqs. 2, 3, 4, 5 and 6 of the main text; Eqs. 2, 4 and 5 are restated in this supplement. In the limit of large N, we use a mean-field approximation where G(tN) and H(tN) are represented by their expectations g(t) and h(t), respectively. Eq. 6 then becomes an integral equation, in which $S_1(0) = k = 2M/N$ is the average degree of the graph. Substituting Eqs. S3 and S6 into this equation gives an integro-differential equation. From Eqs. S4 and S5 we get $S_1(z)\,dz = dt$ and $dS_1(z) = -S_2(z)\,dz$. Since the initial values of x, y and z are all zero, this, along with Eq. S5, gives Eq. 7, where $0 \le y \le x \le t_{end} \le 1$ and $t_{end} = 1 - S_0(z(t_{end}))$. Here $z(t_{end})$ is the smallest positive root of $2z = \ln S_1(0) - \ln S_1(z)$.
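The end point $t_{end}$ can be evaluated numerically once the functions $S_m(z)$ are known. The sketch below is written under an explicit assumption: it takes $S_m(z) = \sum_k k^m p_k e^{-kz}$ for a degree distribution $p_k$, a form that is consistent with $S_1(0)$ being the average degree and with $dS_1(z) = -S_2(z)\,dz$ but is not quoted from the text. The names `s_moment`, `t_end` and `poisson_pmf` are hypothetical. Under this assumption, for a Poisson distribution with mean c the root condition reduces to $z = c(1 - e^{-z})$ and $t_{end}$ coincides with the well-known giant component fraction solving $S = 1 - e^{-cS}$.

```python
import math

def s_moment(p, m, z):
    # Assumed form (not quoted from the text): S_m(z) = sum_k k^m p_k exp(-k z),
    # where p[k] is the probability of degree k.
    return sum((k ** m) * pk * math.exp(-k * z) for k, pk in enumerate(p))

def t_end(p, iterations=200):
    """Solve 2z = ln S_1(0) - ln S_1(z) for its positive root, return 1 - S_0(z).

    Returns 0.0 when the difference is not positive just above z = 0,
    which this sketch takes to mean that there is no positive root.
    """
    s1_0 = s_moment(p, 1, 0.0)

    def f(z):
        return math.log(s1_0) - math.log(s_moment(p, 1, z)) - 2.0 * z

    lo = 1e-6
    if f(lo) <= 0.0:
        return 0.0
    hi = 2.0 * lo
    # Under the assumed form f is concave with f(0) = 0, so it changes sign
    # at most once for z > 0; the cap keeps the search finite.
    while f(hi) > 0.0 and hi < 64.0:
        hi *= 2.0
    for _ in range(iterations):    # bisection on the sign change
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 1.0 - s_moment(p, 0, 0.5 * (lo + hi))

def poisson_pmf(c, kmax=100):
    # Poisson degree distribution with mean c, truncated at kmax.
    return [math.exp(-c) * c ** k / math.factorial(k) for k in range(kmax + 1)]
```

For example, `t_end(poisson_pmf(2.0))` can be compared with the fixed point of `s = 1 - math.exp(-2.0 * s)`.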

D. Derivation of Eq. 10 for Poisson-distributed random graphs
The degree distribution of this graph is Poisson. With Eq. S5 and the initial values of t and z both zero, and then with Eq. S6, substituting into Eq. S3 and Eq. S8 gives the curve functions, where $0 \le y \le x \le t_{end} < 1$, and $t_{end}$ is the smallest positive root of t = 1 +

E. Power-law distributed random graphs
In this text, we consider a power-law degree distribution given by $C k^{-\alpha}$ for $k_{min} \le k \le k_{max}$, where α is a constant and $C = 1/\sum_{k=k_{min}}^{k_{max}} k^{-\alpha}$. Here $k_{min}$ and $k_{max}$ are the minimal and maximal degree of the graph, respectively. By the means used in the main text for random graphs with arbitrary degree distributions, we introduce the quantities of Eq. S16, where z is a variable with initial value $z_0 = 0$, and suppose they are all finite. The curve functions of the BFS-tree and BFS-graph of this power-law distributed random graph follow the same form as Eq. 7 and Eq. 8 in the main text (Eq. S17).

2. DERIVATION OF EQ. 11 FOR LATTICE EMBEDDED RANDOM REGULAR GRAPHS (LERRGS)
The LERRG is an uncorrelated combination of a random r-regular graph and a d-dimensional finite lattice with periodic boundary conditions, i.e., each vertex has 2d nearest lattice neighbors and r long-range neighbors chosen uniformly at random from the lattice.
The natural numbers d and r are assumed to satisfy $N \gg d \ge 1$ and $N \gg r \ge 1$, with rN even so that the copies of vertices can be coupled randomly.
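As an illustration of this definition, the following Python sketch builds a LERRG as two adjacency maps, one for the periodic lattice and one for the random r-regular part. It assumes a cubic lattice with `side` vertices per dimension (so N = side**d, with side at least 3 so that the 2d lattice neighbors are distinct) and couples stubs by a uniform random pairing, which may leave self-loops and multi-edges in place. The name `build_lerrg` and its parameters are illustrative.

```python
import itertools
import random

def build_lerrg(side, d, r, seed=None):
    """Return (lattice_adj, random_adj) for a LERRG with N = side**d vertices.

    lattice_adj[v]: the 2d nearest neighbours of v on a periodic lattice.
    random_adj[v]: r long-range neighbours from a uniform stub pairing;
    self-loops and multi-edges produced by the pairing are kept as they are.
    """
    rng = random.Random(seed)
    vertices = list(itertools.product(range(side), repeat=d))
    n = len(vertices)
    if (r * n) % 2 != 0:
        raise ValueError("r * N must be even so that all stubs can be paired")

    lattice_adj = {v: [] for v in vertices}
    for v in vertices:
        for axis in range(d):
            for step in (-1, 1):
                w = list(v)
                w[axis] = (w[axis] + step) % side     # periodic boundary
                lattice_adj[v].append(tuple(w))

    stubs = [v for v in vertices for _ in range(r)]   # r copies of every vertex
    rng.shuffle(stubs)
    random_adj = {v: [] for v in vertices}
    for a, b in zip(stubs[0::2], stubs[1::2]):        # pair consecutive stubs
        random_adj[a].append(b)
        random_adj[b].append(a)
    return lattice_adj, random_adj
```

Keeping the two edge sets separate makes it easy to tell, during a BFS, whether a vertex was brought into the queue by a random edge or by a lattice edge.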
To get the curve functions of the BFS-tree and BFS-graph, we need to know how many untouched neighbors a vertex being explored will visit (see Eq. 2 or Eq. S3). There are two contributions to this number: long-range neighbors connected by random edges and local neighbors on the lattice.

A. The contributions of long-range neighbors
The situation is similar to that of random graphs with arbitrary degree distributions.
Suppose xN vertices have been touched before we explore a vertex A at position y. Vertex A is in one of two cases:

(i) Vertex A was brought into QueueT by a random edge from its parent. Then, similar to Eq. S1, for large N the expected number of untouched vertices that A will meet through its random edges is given by Eq. S18, where G(tN) is the graph degree of the vertex at position t in QueueT. In this graph, $2M = (2d + r)N$ and $G(tN) \equiv 2d + r$ for $t \in (0, 1]$, so Eq. S18 can be written as

$$(r - 1)\,\frac{1 - x}{1 - y}. \qquad \text{(S19)}$$

(ii) Vertex A was brought into QueueT by a lattice edge. It then has r random edges free, and the expected number of untouched vertices that A will meet through its random edges is

$$r\,\frac{1 - x}{1 - y}. \qquad \text{(S20)}$$

B. The contributions of local lattice neighbors
To make this easier to follow, we first consider the effect of lattice edges on a single d-dimensional lattice, and then turn to the LERRG.

Lattice neighbors on a d-dimensional lattice

The BFS process starts from the origin and forms a "crystal" (see Fig. S3(a)). Since the search order is random, the touching order of all vertices of type $L_l$ is random. Suppose a vertex A of type $L_{l+1}$ has j neighbors of type $L_l$; then each of these neighbors has an equal chance 1/j of being the first to touch vertex A. For example, if A has four neighbors of type $L_{10}$, one of which is vertex C, then C has a probability of 1/4 of being the first of these four vertices to touch A. Now, we consider how many untouched vertices of type $L_{l+1}$ will be visited by a vertex of type $L_l$ that is being explored.
Lemma S1 During the process of BFS on a d-dimensional lattice, a vertex C of type $N_j$ (j = 0, 1, . . . , d) will visit $\frac{2(d-j)}{j+1}$ untouched vertices of type $N_{j+1}$, and $\frac{j}{j}$ untouched vertices of type $N_j$ on average. The total expected number of new vertices that C will visit is $\frac{2(d-j)}{j+1} + 1$.

Proof.
There are two cases for vertex C.

(i) j < d: C has $2(d-j)$ lattice neighbors of type $L_{l+1}$ and $N_{j+1}$. Since a vertex of type $L_{l+1}$ and $N_{j+1}$ is first touched by each of its neighbors of type $L_l$ with equal probability $\frac{1}{j+1}$, C will touch $2\binom{d-j}{1}\frac{1}{j+1}$ vertices of type $L_{l+1}$ and $N_{j+1}$ on average. On the other hand, C has j lattice neighbors of type $L_{l+1}$ and $N_j$, each of which is first touched by C with probability $\frac{1}{j}$. The expected number of untouched vertices that C will visit is then

$$\frac{2(d-j)}{j+1} + j \cdot \frac{1}{j} = \frac{2(d-j)}{j+1} + 1.$$

(ii) j = d: Since C only has d lattice neighbors of type $L_{l+1}$, and all of them are of type $N_d$, C will visit $\frac{d}{d} = 1$ untouched vertices on average. In this case, $\frac{2(d-j)}{j+1} + 1$ is also valid.
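The neighbor counts used in this proof can be checked by direct enumeration. The Python sketch below rests on an assumption about notation that is not spelled out in this section: type $L_l$ is taken to mean L1 distance l from the crystal origin, and type $N_j$ is taken to mean j non-zero coordinates relative to that origin. Boundary wrap-around is ignored. The function names are illustrative.

```python
def classify_outward_neighbours(v):
    """Count the lattice neighbours of v that lie one step farther from the origin.

    Assumption: type L_l means L1 distance l from the origin, and type N_j
    means j non-zero coordinates. Returns {j of the neighbour: count}.
    """
    dist = sum(abs(c) for c in v)
    counts = {}
    for axis in range(len(v)):
        for step in (-1, 1):
            w = list(v)
            w[axis] += step
            if sum(abs(c) for c in w) == dist + 1:    # neighbour of type L_{l+1}
                j = sum(1 for c in w if c != 0)
                counts[j] = counts.get(j, 0) + 1
    return counts

def expected_new_vertices(d, j):
    """Total from Lemma S1, 2(d - j)/(j + 1) + 1, for 1 <= j <= d."""
    return 2 * (d - j) / (j + 1) + 1
```

For the point (3, 2, 0) in d = 3, which has j = 2 under this convention, the enumeration finds 2(d − j) = 2 outward neighbors with three non-zero coordinates and j = 2 outward neighbors with two, matching the counts used in the proof.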

Lattice neighbors on a LERRG
Because of the long-range random edges, small crystals grow randomly all over the lattice during the process of BFS (see Fig. S3). For a vertex A that is being explored, the question is how many untouched vertices will be visited by A through its lattice edges. Here we call a lattice edge connecting vertex A to a vertex of type $L_{l+1}$ ($l + 1$ lattice steps away from A's crystal origin) a free lattice edge of A (edges in red in Fig. S3), and denote this type of edge by $FLE_j$ if the vertex on the other end is of type $N_j$. By Lemma S1, A has $2(d - j)$ and $j$ free lattice edges of type $FLE_{j+1}$ and $FLE_j$, respectively. These free lattice edges may lead us to untouched neighbors on the lattice.
Lemma S2 For a BFS on a d-dimensional LERRG in which xN vertices have been touched before we explore a vertex A at position y in QueueT, this lemma gives the expected number of untouched vertices that A of type $L_l$ and $N_j$ will visit through lattice edges.

Proof.
We examine every free lattice edge of A and calculate the probability that the vertex B on the other end, of type $N_j$, has not yet been touched before we explore A. Then, by Lemma S1, we obtain the expected number of untouched vertices that A of type $L_l$ and $N_j$ will visit through lattice edges, of which a share proportional to $\frac{1-x}{1-y}$ is of type $N_{j+1}$ and a share proportional to $\frac{1-x}{1-y}$ is of type $N_j$.

C. The combined contributions of the random and lattice edges
Suppose a vertex A at position y has a child at position x in QueueT. Let $\rho_j$ and $\rho'_j$ be the probabilities that A and its child, respectively, are of type $N_j$ (j = 0, 1, . . . , d), where $\sum_{j=0}^{d} \rho_j = 1$ and $\sum_{j=0}^{d} \rho'_j = 1$. Although many crystals are interlaced all over the lattice during the process of BFS, as illustrated in Fig. S3(b), the type ($N_j$) of a vertex is determined only once, by the parent that touches it first. Each vertex has only one type. For example, if a vertex was first brought into QueueT by a random edge, then the type of this vertex is $N_0$.
Considering a parent vertex of all possible types ($N_j$), and in association with Eqs. S19 and S20 and Lemma S2, the relationship between a vertex and its children on the search tree can be expressed in terms of $\rho_j$ and $\rho'_j$, the probabilities of A and its child being of type $N_j$, respectively.
Therefore, in the limit of large network size, the expected number of untouched vertices that A will visit is given by Eq. S27. It is difficult to calculate $\sum_{j=1}^{d} (\rho_j + c_{j-1}\rho_{j-1}) + r - \rho_0$ directly; we will show that it approaches the dominant eigenvalue of the matrix $M_{dr}$ and that $(\rho_d, \cdots, \rho_1, \rho_0)$ approaches the associated eigenvector.
The characteristic equation of the matrix $M_{dr}$ is Eq. S29, where r ≥ 1 and d ≥ 1. Note that Eq. S29 has a largest real root. Denote the roots of Eq. S29 by $\lambda_1, \lambda_2, \ldots, \lambda_{d+1}$, where $\lambda_1$ is the largest real root.
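The largest real root can be obtained numerically. The sketch below uses the form of Eq. S29 quoted later in this section, $(\lambda - 1)^d = r(\lambda + 1)^{d-1}$, and a plain bisection; the function name `largest_root` is illustrative. For λ > 1 the equation is equivalent to $(\lambda - 1)\left(\frac{\lambda - 1}{\lambda + 1}\right)^{d-1} = r$, whose left-hand side increases from 0 without bound, so there is exactly one root above 1 and it is the largest real root.

```python
def largest_root(d, r, iterations=200):
    """Largest real root of (lam - 1)**d = r * (lam + 1)**(d - 1), d >= 1, r >= 1."""
    def h(lam):
        # For lam > 1 the equation is equivalent to h(lam) = r, and h is increasing.
        return (lam - 1.0) * ((lam - 1.0) / (lam + 1.0)) ** (d - 1)

    lo, hi = 1.0, 2.0
    while h(hi) < r:               # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):    # bisection
        mid = 0.5 * (lo + hi)
        if h(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For d = 1 this gives λ = r + 1, and for d = 2 with r = 1 it gives λ = 3, the two special cases discussed in the text.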
Following the power method used in numerical analysis to find the dominant eigenvalue of a matrix [1,2], let $v^s_\rho = (\rho^s_d, \cdots, \rho^s_1, \rho^s_0)$ be the vector after s iteration steps, initialized with $v^0_\rho = (0, \cdots, 0, 1)$ because all the vertices are of type $N_0$ at the beginning. The rate of convergence depends on $|\lambda_2/\lambda_1|$. Table S1 shows that $\sum_{j=1}^{d} (\rho^s_j + c_{j-1}\rho^s_{j-1}) + r - \rho^s_0$ approaches $\lambda_1$ with high accuracy after a few iterations. Each component ($\rho_j$) of the eigenvector associated with $\lambda_1$ represents the probability that a vertex is of the corresponding type ($N_j$).
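The iteration can be sketched in Python as follows. The matrix used here is a hypothetical reconstruction, not the authors' $M_{dr}$: it is inferred from the sum $\sum_{j=1}^{d} (\rho_j + c_{j-1}\rho_{j-1}) + r - \rho_0$ under the assumption that $c_j = 2(d-j)/(j+1)$, and it is ordered as $(\rho_0, \ldots, \rho_d)$ for convenience. The function names are illustrative.

```python
def type_transition_matrix(d, r):
    """Hypothetical (d+1)x(d+1) matrix acting on (rho_0, ..., rho_d).

    Assumed structure: a parent of type N_j with j >= 1 yields one child of
    type N_j, every parent of type N_{j-1} yields c_{j-1} = 2(d - j + 1)/j
    children of type N_j, and random edges yield r children of type N_0
    (r - 1 when the parent is itself of type N_0).
    """
    m = [[0.0] * (d + 1) for _ in range(d + 1)]
    for j in range(d + 1):
        m[0][j] = r - 1.0 if j == 0 else float(r)
        if j >= 1:
            m[j][j] += 1.0
            m[j][j - 1] += 2.0 * (d - j + 1) / j      # c_{j-1}
    return m

def power_iteration(m, steps=60):
    """Return (estimate of the dominant eigenvalue, normalised type vector)."""
    n = len(m)
    v = [1.0] + [0.0] * (n - 1)    # all vertices are of type N_0 at the start
    lam = 0.0
    for _ in range(steps):
        w = [sum(m[i][k] * v[k] for k in range(n)) for i in range(n)]
        lam = sum(w)               # equals the sum quoted in the text when sum(v) == 1
        v = [x / lam for x in w]
    return lam, v
```

With this assumed matrix the iteration returns 3 for d = 2, r = 1 and r + 1 for d = 1, in line with the special cases reported for Eq. S29.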
Therefore, in the limit of large network size, Eq. S27 can be rewritten and, with Eq. S3, yields the curve functions of the BFS-tree and BFS-graph, where $\lambda_1$ is the largest real root of $(\lambda - 1)^d = r(\lambda + 1)^{d-1}$ (Eq. S29). Tables S2 and S3 show that the analytic results agree well with the simulated results. Note that these functions are consistent with those of random regular graphs (Eq. 9 in the main text) when d = 0. Comparing LERRGs with random regular graphs, there are two interesting cases. One is the LERRG with d = 1 (a ring combined with a random regular graph), where the curve functions of both the BFS-tree and the BFS-graph are the same as those of random (2d + r)-regular graphs. This is due to the fact that such LERRGs are equivalent to the corresponding random regular graph possessing a Hamiltonian circuit; a random r-regular graph with r ≥ 3 possesses a Hamiltonian circuit asymptotically almost surely [3,4]. The other is the LERRG with d = 2 and r = 1, whose BFS-tree curve function is the same as that of a random 4-regular graph, i.e., $1 - x = (1 - y)^3$. In other words, when we explore two vertices (except a few vertices close to the root) of the same position x

ROOT SELECTION EFFECTS
We here examine the variance of the graph curves under four different root selection schemes for all the networks studied in this manuscript, including random graphs, LERRGs, the PPI network and the related network models:
(1) Pick one end of a randomly chosen edge.
(3) Pick a vertex with the maximal degree.
(4) Pick a vertex with the minimal degree (with at least one edge).
For random graphs and LERRGs, the above four schemes have negligible differences, since the edges reaching out from a root touch child vertices in proportion to their degree, which has the same effect as picking one end of a randomly chosen edge. The expressions of the tree and graph curves for random graphs are exact in the limit of large network size. In simulations, the curve difference gradually diminishes as the number of vertices increases, as shown in Figure S5 and Table 3. However, the curve variance of real networks or networks built by models is relatively larger, as Figure S6 shows. To quantify the variance of a set of graph curves derived from the same network, we calculate the graph distance of each graph curve to a center curve, which is simply the average of these curves. Denoting the curve difference by $\Delta D_G$, we use its average over all graph curves in the set to represent the curve variance caused by the choice of root and the BFS process.
Numerical results show that the $\Delta D_G$ of random graphs is small and decreases with increasing network size (see Table 3) under the four different root selection schemes. The $\Delta D_G$ of the Drosophila protein network, DMC, DMR and LPA is relatively large and cannot be ignored, even when the same root selection scheme is used (see Table 3). In these situations, graph curves derived from the same network differ noticeably. Fortunately, the curve averaged over many curves derived from the same network is stable, with small $\Delta D_G$, as shown in Figure S6b and Table 3. This result suggests that the average curve is a stable description of networks and is statistically suitable for network comparison.
Note that the curve difference $\Delta D_G$ of the first root selection scheme is very small; thus the comparisons between the PPI data and the three network models in the main text are based on these average curves. In detail, we averaged over 100 curves derived from the same network, where the root is selected by picking one end of a randomly chosen edge.
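The averaging procedure described above can be sketched as follows. The sketch assumes that every curve has been sampled on the same grid of positions, and it uses the mean absolute difference as a stand-in for the graph distance, which is defined in the main text and not reproduced here. The function names are illustrative.

```python
def mean_curve(curves):
    """Pointwise average of curves that are sampled on a common grid."""
    n = len(curves)
    return [sum(c[i] for c in curves) / n for i in range(len(curves[0]))]

def curve_distance(a, b):
    """Placeholder distance: mean absolute difference between two curves.

    This is an assumption; the graph distance of the main text may differ.
    """
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def curve_variance(curves):
    """Average distance of each curve in the set to the center (mean) curve."""
    center = mean_curve(curves)
    return sum(curve_distance(c, center) for c in curves) / len(curves)
```

A set of identical curves gives a variance of zero, and the value grows as the curves obtained from different roots or BFS runs spread out around their average.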
Among the four root selection schemes, the graph curve rooted at a maximal-degree vertex differs most from the other three. This introduces a variability of the graph curves that could be useful in a statistical test to compare two different graphs. We are eager to study this variability for more real-world networks and network models in the future, especially when they have a special underlying structure and are sensitive to the root selection scheme or the graph search algorithm.

Each $\Delta D_G$ is calculated by comparing 100 curves derived from the same network to a center curve under different root selection schemes, where the center curve is averaged over the 100 curves obtained with the first scheme. There are no obvious differences between the four root selection schemes. The $\Delta D_G$ of a LERRG is relatively larger than that of the random graphs, since it contains a lattice structure that brings more variability during the process of BFS. The $\Delta D_G$ is small and decreases with increasing network size.

Each $\Delta D_G$ is calculated by comparing 100 curves derived from the same network to a center curve under different root selection schemes, where the center curve is averaged over the 100 curves obtained with the first scheme. The $\Delta D_G$ is large and cannot be ignored. The average curves are stable, with smaller differences. The graph curve rooted at the maximal-degree vertex differs most from the other three.

TEST THE ROBUSTNESS OF NETWORK CLASSIFICATION METHOD
To test the robustness of our classification method against noise, we artificially introduce noise into the original network using two kinds of edge rewiring mechanisms [7,8]. The first replaces some percentage of the original edges in the network with random ones (noise1), and the second randomly rewires some percentage of the edges while maintaining the degree distribution of the original network (noise2).
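The two mechanisms can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the procedure of [7,8]: vertices are labeled 0 to n − 1, the graph is sparse and simple, noise1 draws replacement edges uniformly among absent vertex pairs, and noise2 is implemented with double-edge swaps, a common degree-preserving choice. All function names are hypothetical.

```python
import random

def degrees(edges, n):
    """Degree of every vertex 0..n-1 in an undirected edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def noise1(edges, n, fraction, rng):
    """Replace a fraction of the edges by uniformly random new ones."""
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    for i in rng.sample(range(len(edges)), int(fraction * len(edges))):
        while True:                # assumes the graph is far from complete
            u, v = rng.randrange(n), rng.randrange(n)
            if u != v and frozenset((u, v)) not in present:
                break
        present.discard(frozenset(edges[i]))
        present.add(frozenset((u, v)))
        edges[i] = (u, v)
    return edges

def noise2(edges, fraction, rng, max_tries=100000):
    """Rewire about a fraction of the edges by double-edge swaps (degrees kept)."""
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    target = int(fraction * len(edges)) // 2          # each swap rewires two edges
    done = tries = 0
    while done < target and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:                     # would create a self-loop
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                                  # would create a multi-edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges
```

Comparing `degrees` before and after makes the difference visible: noise2 leaves every vertex degree unchanged, whereas noise1 generally does not.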
Given a network G built by one of the three network models, i.e., DMC, DMR and LPA, we perturb the structure of G by introducing different percentages of noise, and then classify the resulting networks to recover the model that built it. Figure S7 shows the median graph distances of the three network classes as functions of the rewiring percentage for the two kinds of noise, where each network instance is built by the models based on the size of the Drosophila protein network with a confidence threshold of $P^*_c = 0.5$, and each data point is averaged over 100 different realizations of the randomization procedure.
As validation, the networks are confidently classified as Poisson-distributed random graphs for high levels of noise1. The results show that the classification performs well for small and intermediate amounts of noise. Meanwhile, the robustness against noise2 is better than that against noise1, since noise2 maintains the degree distribution of the original network.
Among the three network classes, the DMC networks are the most sensitive to both kinds of noise, even though noise2 keeps their degree distributions unchanged. The reason may be that the underlying structure of the DMC networks changes sharply when a small fraction of noise is introduced. As shown in Figure S8, a small fraction of noise greatly increases the global connectivity of the DMC networks. That is also why the classification is sensitive for the noised DMC networks. Note that some of the noised DMC networks are classified as DMR. The reason could be that the growth mechanism of DMR, which randomly creates links between a new vertex and any old vertices, is similar in effect to noise2, which rewires edges randomly according to the degrees of the vertices.

NETWORK COMPARISON RESULTS
We generate 1,000 examples for each of the three different network models, with their vertex number N and average degree k close to (within ±5% of) those of the Drosophila network for different confidence thresholds $P^*_c$. The parameters $q_{del}$, $q_{con}$ and $q_{new}$ are sampled uniformly in [0, 1]. In each network, multiple edges and self-loops are removed and isolated vertices are eliminated. Each curve of the BFS-graph is averaged over 100 runs of BFS on the corresponding network. The results for $P^*_c = 0.5$ have been presented in the main text. The results for $P^*_c = 0.65$ and $P^*_c = 0.0$ are shown in Figures S9 and S10, respectively.
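The clean-up and acceptance steps described above could look like the following Python sketch. It assumes the generated network is available as an undirected edge list; the model generators themselves (DMC, DMR, LPA) are not shown, and the function names are illustrative.

```python
def simplify(edges):
    """Drop self-loops and duplicate edges, then relabel the remaining vertices.

    Vertices that end up with no edge are eliminated by the relabeling.
    Returns (number of vertices, cleaned edge list).
    """
    seen = set()
    kept = []
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            continue
        seen.add(key)
        kept.append(key)
    vertices = sorted({x for e in kept for x in e})
    index = {v: i for i, v in enumerate(vertices)}
    return len(vertices), [(index[u], index[v]) for u, v in kept]

def within_tolerance(value, target, rel=0.05):
    """True if value lies within +/- rel of target (the 5% interval)."""
    return abs(value - target) <= rel * target
```

A generated example would then be accepted only if both its vertex number and its average degree, computed after `simplify`, pass `within_tolerance` against the corresponding values of the Drosophila network.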