Mean Field Approximation for Biased Diffusion on Japanese Inter-Firm Trading Network

By analysing the financial data of firms across Japan, a nonlinear power law with an exponent of 1.3 was observed between the number of business partners (i.e. the degree of the inter-firm trading network) and sales. In a previous study using numerical simulations, we found that this scaling can be explained by both the money-transport model, where a firm (i.e. customer) distributes money to its out-edges (suppliers) in proportion to the in-degree of destinations, and by the correlations among the Japanese inter-firm trading network. However, in this previous study, we could not specifically identify what types of structure properties (or correlations) of the network determine the 1.3 exponent. In the present study, we more clearly elucidate the relationship between this nonlinear scaling and the network structure by applying mean-field approximation of the diffusion in a complex network to this money-transport model. Using theoretical analysis, we obtained the mean-field solution of the model and found that, in the case of the Japanese firms, the scaling exponent of 1.3 can be determined from the power law of the average degree of the nearest neighbours of the network with an exponent of −0.7.


Introduction
Complex networks have been extensively studied over the last decade [1][2][3]. Problems of transport on complex networks, which are some of the most fundamental problems concerning the physics of complex networks, have also been intensively studied. For example, random walks on complex networks have been investigated from various viewpoints [4][5][6]. One application of transport on complex networks is PageRank, which corresponds to the steady-state density of transport caused by random walks on the Internet. PageRank is one of the most successful indices evaluating the importance of web pages and has been utilized by internet search engines.
Most studies regarding transport on complex networks have been based on theoretical approaches; however, recently the problems of actual transport on complex networks have also been studied by using massive data analysis. Examples of such problems include airport traffic on the worldwide airport network [7], the number of trains on the Indian railway network [8], the flow of viewers on portal sites [9] and control flows on stock-ownership networks [10][11][12].
In a previous study, we analysed the empirical data from an inter-firm trading network, which consisted of approximately one million Japanese firms, and the sales of these firms (a sale corresponds to the total in-flow into a node) to investigate the actual transport phenomenon in a complex network [13]. This inter-firm trading network is known to be a typical complex network with a power-law degree distribution [13,14], a negative degree-degree correlation [13,15], a small-world property [14], community structures [16], power laws of money flows [15,17] and an asymmetric behaviour of authorities and hubs explained by a network-evolution model based on the preferential attachment rule [18,19]. To be more precise, we obtained the following results in Ref. [13]: (i) by analysing the data, we found a non-trivial empirical power law with an exponent of 1.3 between the number of business partners of a firm and its sales; (ii) we introduced a simple money-transport model in which a firm (i.e. a customer) distributes money to its out-edges (i.e. suppliers) in proportion to the in-degree of the destinations; (iii) using numerical simulations, we found that the steady flow on the abovementioned inter-firm trading network derived from this model can approximately reproduce a power-law scaling with an exponent of 1.3 between the number of business partners (i.e. degrees of the network) and sales. Furthermore, the sales distribution of the actual firms obeys a power-law distribution with an exponent of approximately 1. Moreover, the sales of individual firms derived from the money-transport model are shown to be proportional to the real sales on average. However, we also showed that the simple random walk model (i.e. PageRank), in which a firm is assumed to distribute its sales evenly to all its outgoing neighbours, does not reproduce the empirical scaling with an exponent of 1.3. This result implies that PageRank does not correspond to the sales.
Note that scaling with an exponent of 1.3 was observed in the abovementioned Japanese financial data not only between sales and degrees but also between other important firm-size indicators, for instance, between sales and number of employees, between profits and number of employees, and between profits and degrees [20].
In the previous study [13], we could not specifically identify what types of structural properties (or correlations) of the Japanese inter-firm trading network determine the 1.3 exponent, although we found, using numerical simulations, that a particular network structure is needed to reproduce it in our framework. Therefore, the current study is devoted to a theoretical analysis of the results of Ref. [13]. In particular, we apply a mean-field approximation to the models to clarify how the exponents of the power-law relationships, specifically between the degree of the inter-firm trading network (i.e. the number of business partners) and sales, depend on the network structure.
In this paper, we start in Section 2 by reviewing the models and the numerical results of the previous study. Next, we introduce the mean-field approximation for these models and reveal the relationships between the power-law exponent, models, and networks in Section 3. Finally, we summarize and present our conclusions in Section 4.

Power Laws for Diffusion on Inter-Firm Trading Network
In this section, we review the results of Ref. [13]. To understand the empirical scaling with exponent 1.3 between degrees and sales, we introduced the following money-transport models [13].
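Model-1 (Equi-partition model):

$$x_m(t+1) = \sum_{i=1}^{N} \frac{A_{im}}{k_i^{(\mathrm{out})}}\, x_i(t) \qquad (1)$$

Model-2 (Weighted partition model):

$$x_m(t+1) = \sum_{i=1}^{N} \frac{A_{im}\, k_m^{(\mathrm{in})}}{\sum_{j=1}^{N} A_{ij}\, k_j^{(\mathrm{in})}}\, x_i(t) \qquad (2)$$

where $x_m(t)$ denotes the sales of node $m$ at time step $t$, $A_{ij}$ is the adjacency matrix (here $A_{ij}=1$ if money flows from node $i$ to node $j$, and 0 otherwise), $k_m^{(\mathrm{in})}$ is the in-degree of node $m$, $k_m^{(\mathrm{out})}$ is the out-degree of node $m$ and $N$ is the number of nodes in the network. Note that for $k_i^{(\mathrm{out})}=0$ in Eq. (1) or (2), we omit the contributions of node $i$.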
In Model-1, a node representing a business firm is assumed to evenly distribute its total in-flow (i.e. sales) at time t to all its outgoing neighbours in the next time step. This transport is equivalent to the simple probability diffusion of the PageRank model in the case of no random spontaneous jumping [4]. However, for Model-2, a node distributes its sales to its outgoing neighbours in proportion to the destinations' in-degrees. This model is one of the so-called biased random walk models [21]. We used two types of networks for numerical experiments to clarify the dependencies on the firm network [13]. The data set used in the generation of networks and the data analysis in Refs. [13,20] was provided by Tokyo Shoko Research, Ltd. and contains approximately one million firms, which practically covers all active firms in Japan. For each firm, the data set contains the annual sales and a list of business partners in 2005, categorized into suppliers and customers [13,14,20].

Figure 3. The mean of sales as a function of degree, $x(k)$, for Model-1 [Eq. (1)]. The black triangles indicate the network for $\gamma=-0.7$, the red nablas for $\gamma=0.0$ and the green squares for $\gamma=0.7$. From this figure, we can confirm that $x(k)$ is proportional to $k$ regardless of $k_{nn}(k)$. These results agree with the approximation given by Eq. 11, namely, $x(k) \propto k$. doi:10.1371/journal.pone.0091704.g003
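The two update rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code of Ref. [13]: the four-node toy network `TOY_OUT_EDGES`, the function names `step` and `steady_state`, the uniform initial condition and the fixed number of iterations are all hypothetical choices made for the example.

```python
# Toy directed network (hypothetical): TOY_OUT_EDGES[i] lists the suppliers j
# of firm i, i.e. money flows from i to j. It is strongly connected.
TOY_OUT_EDGES = [[1, 2], [2], [0, 3], [0]]


def step(x, out_edges, in_deg, weighted):
    """Advance the sales vector x by one time step.

    weighted=False: Model-1, equal split among outgoing neighbours.
    weighted=True:  Model-2, split in proportion to the destinations' in-degrees.
    """
    new_x = [0.0] * len(x)
    for i, targets in enumerate(out_edges):
        if not targets:
            continue  # out-degree 0: the contribution of node i is omitted
        weights = [in_deg[j] if weighted else 1.0 for j in targets]
        total = sum(weights)
        for j, w in zip(targets, weights):
            new_x[j] += x[i] * w / total
    return new_x


def steady_state(out_edges, weighted, steps=200):
    """Iterate from a uniform initial condition (an assumption of this sketch)."""
    n = len(out_edges)
    in_deg = [0] * n
    for targets in out_edges:
        for j in targets:
            in_deg[j] += 1
    x = [1.0 / n] * n
    for _ in range(steps):
        x = step(x, out_edges, in_deg, weighted)
    return x


model1 = steady_state(TOY_OUT_EDGES, weighted=False)
model2 = steady_state(TOY_OUT_EDGES, weighted=True)
```

On this toy network the total money stays equal to 1 under both rules, and Model-2 moves sales toward the two nodes with the larger in-degree compared with Model-1.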
The first network is the real firm network whose nodes are firms and whose edges are defined by the following rule: if firm i purchases goods and/or services from firm j, or, equivalently, if money flows from firm i to firm j, we connect node i to node j with a directed link (there are 961,318 nodes and 3,783,711 edges) [13,14]. This network was generated by using the business partners data in the abovementioned data set. The firm network is a typical complex network whose main properties are as follows: (i) a power-law degree distribution with exponent 1.3, (ii) a negative degree-degree correlation and (iii) a small-world property (the mean distance between nodes is 5.62 and the maximum distance is 21) [13,14,16].
The second network is the shuffled firm network, which is an almost-uncorrelated network with the same degree distribution as the firm network, generated by the Maslov-Sneppen algorithm [22,23]. Note that we simulated the time evolution of the models on the largest strongly connected component (LSCC) of each network to neglect boundary effects. Then, the total money is conserved, that is, $\sum_{i=1}^{N} x_i(t)$ does not depend on $t$. For all combinations of models and networks, we obtained by numerical simulations the following power law for large $k^{(\mathrm{in})}$ ($t \gg 1$) [13]:

$$x(k^{(\mathrm{in})}) \propto \left(k^{(\mathrm{in})}\right)^{\beta}.$$

Here, $x(k^{(\mathrm{in})})$ is the conditional mean of $x$ for a given $k^{(\mathrm{in})}$ as a function of $k^{(\mathrm{in})}$. It is defined as the mean of $x_j$ over the nodes $j$ whose in-degree lies in box $i$, between the separators $d(i)$ and $d(i+1)$, where $d(i)$ is the separator of box $i$, taken evenly in logarithmic space, e.g. $d(i)=2^{i}$ $(i=0,1,2,\dots)$, and the in-degree of the box is represented by the geometric mean $\sqrt{d(i)\cdot d(i+1)}$. The exponent $\beta$ depends on both the network and the model. These dependencies are summarized in Table 1. Note that Model-2 for the firm network reproduces the empirical scaling with exponent 1.3.
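The logarithmic binning used for the conditional mean can be illustrated as follows. This is a hedged sketch rather than the authors' analysis code: the function names `log_binned_mean` and `fit_exponent` are hypothetical, the boxes are assumed to be the half-open intervals $[2^{i}, 2^{i+1})$, and the exponent is estimated by an ordinary least-squares fit in log-log space, which is one common choice among several.

```python
import math


def log_binned_mean(degrees, sales):
    """Conditional mean of sales in boxes [2^i, 2^(i+1)) of integer degree.

    Returns a list of (representative degree, mean sales) pairs, where the
    representative degree is the geometric mean of the two box separators.
    """
    sums, counts = {}, {}
    for k, x in zip(degrees, sales):
        if k < 1:
            continue  # degree-0 nodes fall outside every box
        i = int(k).bit_length() - 1  # box index with 2^i <= k < 2^(i+1)
        sums[i] = sums.get(i, 0.0) + x
        counts[i] = counts.get(i, 0) + 1
    result = []
    for i in sorted(sums):
        k_rep = math.sqrt(2.0 ** i * 2.0 ** (i + 1))
        result.append((k_rep, sums[i] / counts[i]))
    return result


def fit_exponent(points):
    """Least-squares slope of log(x) against log(k); requires k > 0 and x > 0."""
    log_k = [math.log(k) for k, _ in points]
    log_x = [math.log(x) for _, x in points]
    n = len(points)
    mean_k = sum(log_k) / n
    mean_x = sum(log_x) / n
    num = sum((a - mean_k) * (b - mean_x) for a, b in zip(log_k, log_x))
    den = sum((a - mean_k) ** 2 for a in log_k)
    return num / den
```

Applying `fit_exponent` to the output of `log_binned_mean` restricted to the large-degree boxes gives an estimate of $\beta$; the choice of the fitting range is left to the user.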

Mean-Field Approximation for Models
In this section, we apply the mean-field approximation to the models to understand the relationships between the exponents, models and network structures shown in Table 1. Here we neglect the directions of the edges for simplicity (scalings nearly identical to the results presented in the previous section are obtained by numerical simulation, regardless of whether or not the edge directions are neglected). By assuming $\sum_{i=1}^{N} x_i(1)=1$, we can regard the models as Markov processes with an existence probability $x_m(t)$,

$$x_m(t+1) = \sum_{i=1}^{N} Q_{mi}(t)\, x_i(t), \qquad (5)$$

where $Q_{mi}(t)$ is the transition probability from node $i$ to node $m$ [13]. Note that although we neglect the direction of edges in this paper, the following discussion can be extended to the case of a directed network by adding some assumptions and calculations. In general, one of the mean-field solutions of the Markov process defined by the transition matrix

$$Q_{mi} = \frac{A_{im}\, g(k_m)}{\sum_{j=1}^{N} A_{ij}\, g(k_j)} \qquad (6)$$

is given in Ref. [24] by

$$R(k) \propto g(k)\, k\, P_k(k) \sum_{k'} g(k')\, P_{k'|k}(k'|k), \qquad (7)$$

where $g(k)$ is the weight of the transition probability to a node with degree $k$, defined as a function of the degree $k$, $R(k)$ is the probability that the walker is at any node of degree $k$ in the steady state, $P_k(k)$ is the probability density function of the degree $k$ and $P_{k'|k}(k'|k)$ is the conditional probability that a node of degree $k$ is connected to a node of degree $k'$. To obtain this solution, the authors of Ref. [24] employed mainly two approximations. First, they used the annealed network approximation, in which the sets of firms with common degrees in the original network are regarded as nodes of the approximated network, and the weighted edges of the approximated network are associated with the probability that two nodes of degrees $k$ and $k'$ are connected (Eq. 8), where $\sum_{i \in k}$ denotes a sum over the set of nodes of degree $k$ [24]. Second, they replaced $Q_{mi}$ with the average probability of an interaction between nodes of degree $k$ and $k'$ (Eq. 9). Then, we can replace the Markov process given by Eq. 5 with Eq. 10 for $R(k;t)$, the probability that the walker is at any node of degree $k$ at time $t$. By substituting Eq. 7 into this equation and by using the degree detailed balance condition, $k\, P_k(k)\, P_{k'|k}(k'|k) = k'\, P_k(k')\, P_{k'|k}(k|k')$, we confirm that Eq. 7 is a steady state of Eq. 10. More details on these approximations are available in Ref. [24].

Next, we apply Eq. 7 to our models given by Eqs. 1 and 2. In Model-1, the transition probability is uniform; that is, $g(k)=1$ in Eq. 6. Therefore, the conditional mean of sales $x(k)$ for the given degree $k$ (i.e. $R(k)$ per node) is written as

$$x(k) \propto k. \qquad (11)$$

This scaling agrees with the simulation results for Model-1 for the firm network and the shuffled network.
Similarly, by applying Eq. 7 to the case $g(k)=k$ in Eq. 6 (corresponding to Model-2), we obtain

$$x(k) \propto k^{2} \sum_{k'} k'\, P_{k'|k}(k'|k), \qquad (12)$$

where we denote the average degree of nearest neighbours as $k_{nn}(k) \equiv \sum_{q} q\, P_{k'|k}(q|k)$ [25]. Thus, we find

$$x(k) \propto k^{2}\, k_{nn}(k). \qquad (13)$$

From Fig. 1(a) in Ref. [13], we see that $k_{nn}(k) \propto k^{-0.7}$ for large degree $k$ in the case of the firm network. We substitute this empirical result into Eq. 13 to obtain

$$x(k) \propto k^{2-0.7} = k^{1.3}. \qquad (14)$$

This scaling corresponds to the simulation results for Model-2 for the firm network and to the empirical observation. Note that Eq. 13 implies that, for Model-2, the exponent $\beta$ depends on the average degree of nearest neighbours, $k_{nn}(k)$.
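A node-level counterpart of the relation $x(k) \propto k^{2} k_{nn}(k)$ can be checked numerically on a small undirected graph. The sketch below is an illustration, not part of the original analysis: for a biased walk with $g(k)=k$ on a connected, non-bipartite undirected graph, the steady state of node $i$ is proportional to $k_i \sum_j A_{ij} k_j$, that is, $k_i^{2}$ times the average degree of the neighbours of $i$. The toy graph `TOY_ADJ` and the function names are hypothetical.

```python
# Hypothetical undirected toy graph as an adjacency list: a triangle (0, 1, 2)
# with a two-node tail (2-3-4). It is connected and non-bipartite.
TOY_ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4], [3]]


def biased_walk_steady_state(adj, steps=2000):
    """Iterate Model-2 with edge directions neglected (g(k) = k)."""
    n = len(adj)
    deg = [len(neighbours) for neighbours in adj]
    x = [1.0 / n] * n
    for _ in range(steps):
        new_x = [0.0] * n
        for i in range(n):
            norm = sum(deg[j] for j in adj[i])
            for j in adj[i]:
                # node i sends a share proportional to the degree of j
                new_x[j] += x[i] * deg[j] / norm
        x = new_x
    return x


def predicted_steady_state(adj):
    """Normalised k_i^2 * (mean neighbour degree of i) = k_i * sum_j A_ij k_j."""
    deg = [len(neighbours) for neighbours in adj]
    strength = [deg[i] * sum(deg[j] for j in adj[i]) for i in range(len(adj))]
    total = sum(strength)
    return [s / total for s in strength]


simulated = biased_walk_steady_state(TOY_ADJ)
predicted = predicted_steady_state(TOY_ADJ)
```

Averaging the per-node prediction over all nodes of a given degree gives the degree-level form of Eq. 13, with the mean-field step being the replacement of each node's neighbour average by $k_{nn}(k)$.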
For the shuffled network for Model-2, which is the uncorrelated network shown in Fig. 1(a) of Ref. [13] (i.e. $k_{nn}(k)=\mathrm{const.}$), we obtain the following from Eq. 13:

$$x(k) \propto k^{2}. \qquad (15)$$

This equation agrees with the numerical result.

Finally, we numerically check the approximation by using different artificial networks that, like the firm network, have power-law degree distributions with exponent 1.3 and power-law average degrees of nearest neighbours. The artificial networks satisfy the following conditions:
(i) The degree obeys the power-law distribution $P_k(k) \propto k^{-\alpha-1}$.
(ii) The average degree of the nearest neighbours is expressed as a power function of the degree, $k_{nn}(k) \propto k^{\gamma}$.
To generate networks that satisfy the above conditions, we modify the configuration model as follows:
1. Assign the degree sequence $\{k_i\}$, obeying the power law with exponent $-\alpha$, to nodes $\{i\}$. The assigned degrees are integers defined in terms of $M$, the number of nodes, and $\mathrm{floor}[x]$, the largest integer not greater than $x$.
2. Randomly sample a node, denoted by $v$, from the set of nodes that have the largest $k_i$.
3. Update $k_v$: $k_v \leftarrow k_v - 1$.
4. Randomly sample a node, denoted by $w$, from all nodes except node $v$ with probability proportional to $q_i$, where $q_i = k_i^{\beta+1}$.
5. Update $k_w$: $k_w \leftarrow k_w - 1$.
6. Connect nodes $v$ and $w$ with an (undirected) edge.
7. Repeat steps 2 through 6 until $k_i = 0$ $(i = 1, 2, \dots, M)$.

Figures 1 and 2 show the cumulative distribution function of the degree, which corresponds to $P_k(k)$, and the average degree of nearest neighbours as a function of degree, $k_{nn}(k)$, for the artificial networks with parameters $\alpha = 1.3$ (empirical parameter) and $\gamma = -0.7$ (empirical parameter), $0.0$ and $0.7$. In addition, Figures 3 and 4 show the mean of the sales as a function of degree, $x(k)$, which is numerically derived from Eqs. 1 and 2 for the corresponding artificial networks. From these figures, we can confirm that all scaling exponents obtained numerically accord with the results of the approximations given by Eq. 11 for Model-1 and by Eq. 13 for Model-2. Moreover, from Figure 4, we confirm that only the case $\gamma = -0.7$, which is the empirical observation of $k_{nn}(k)$ for the firm network, reproduces the empirical scaling exponent of 1.3.
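Steps 2 through 7 of the generation procedure can be sketched as follows. This is a rough illustration under several assumptions that the text does not settle: the degree sequence is supplied by the caller (step 1 is not reproduced), the weight $q_i$ is computed from the remaining degree of node $i$, nodes with no remaining degree are never chosen as $w$, multiple edges between the same pair are allowed, and the loop stops early if only one node still has unused degree. The function name `modified_configuration_model` and its `exponent` parameter (standing for the exponent of $q_i$) are hypothetical.

```python
import random


def modified_configuration_model(degrees, exponent, rng=None):
    """Return a list of undirected edges (v, w) built from a degree sequence.

    degrees:  target degree of each node (step 1 is assumed done by the caller).
    exponent: exponent of the sampling weight q_i = (remaining k_i) ** exponent.
    """
    rng = rng or random.Random(0)
    remaining = list(degrees)
    edges = []
    while max(remaining) > 0:
        # Step 2: pick v uniformly among the nodes with the largest remaining degree.
        k_max = max(remaining)
        v = rng.choice([i for i, k in enumerate(remaining) if k == k_max])
        # Assumption: only nodes with unused degree are eligible as partners.
        candidates = [i for i, k in enumerate(remaining) if i != v and k > 0]
        if not candidates:
            break  # assumption: leftover degree on a single node is discarded
        # Step 3: use up one unit of degree of v.
        remaining[v] -= 1
        # Step 4: pick w with probability proportional to q_i.
        weights = [remaining[i] ** exponent for i in candidates]
        w = rng.choices(candidates, weights=weights, k=1)[0]
        # Step 5: use up one unit of degree of w.
        remaining[w] -= 1
        # Step 6: connect v and w.
        edges.append((v, w))
    return edges
```

The sketch rescans all nodes in every iteration, so it is only suitable for small examples; a production version would maintain the candidate set incrementally.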

Conclusion and Discussion
In this study, by applying the mean-field approximation to the money-transport models given by Eqs. 1 and 2, we were able to consistently understand the relationships between the power-law exponents, models and network structures summarised in Table 1. In particular, we connected the non-trivial power-law scaling with an exponent of 1.3 (in Model-2 for the firm network) to the average degree of the nearest neighbours, $k_{nn}(k)$, which is one of the cardinal features of a network. This result is one explanation for the empirical scaling with exponent 1.3 between the number of business partners and sales. Moreover, the non-trivial empirical scaling with exponent 1.3 between sales $s$ and number of employees $l$, $s(l) \propto l^{1.3}$, which is observed in Ref. [20], might be connected to the network structure. Roughly speaking, by combining the scaling between sales $s$ and degrees $k$, $s(k) \propto k^{1.3}$, described in this study, with the trivial empirical scaling between number of employees $l$ and degrees $k$, $l(k) \propto k^{1.0}$, reported in Ref. [20], we obtain the scaling between employees and sales, $s(l) \propto l^{1.3}$.
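The combination of the two scalings can be written out explicitly. This is only a heuristic sketch: it treats the conditional means as deterministic power laws, an assumption that ignores the fluctuations around the means.

```latex
s(k) \propto k^{1.3}, \qquad l(k) \propto k^{1.0}
\;\Longrightarrow\; k \propto l^{1/1.0}
\;\Longrightarrow\; s(l) \propto \left(l^{1/1.0}\right)^{1.3} = l^{1.3}.
```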