International Trade Modelling Using Open Flow Networks: A Flow-Distance Based Analysis

This paper models and analyzes international trade flows using open flow networks (OFNs) together with flow distance approaches, which provide a novel perspective and effective tools for the study of international trade. We discuss the establishment of OFNs of international trade from two coupled viewpoints: the viewpoint of trading commodity flow and that of money flow. Based on this model and the flow distance approaches, several insights are gained. First, by introducing the concepts of trade trophic levels and niches, countries’ roles and positions in the global supply chains (or value-added chains) can be evaluated quantitatively. We find that the distributions of trading “trophic levels” show a similar clustering pattern for different types of commodities, and we summarize some regularities between the money flow and commodity flow viewpoints. Second, we find that active and competitive countries trade a wide spectrum of products, while inactive and underdeveloped countries trade a limited variety of products. In addition, some abnormal countries import many types of goods that the vast majority of countries do not need to import. Third, we propose a harmonic node centrality and find the phenomenon of centrality stratification. All the results illustrate the usefulness of the OFN model and its network approaches for investigating international trade flows.


I. INTRODUCTION
Flow networks are an important tool for describing and analyzing open flow systems and have been studied extensively in a wide range of settings, such as ecological flow networks [6,9,10], world trade flow networks [11], and traffic and transportation flow networks [12]. Flow networks are commonly modelled as directed weighted networks, where the directions and weights of edges represent the directions and volume fluxes of flows, respectively. Because open flow systems always exchange energy, matter and information with their surroundings, flow networks normally have two special nodes (i.e., the source and the sink) representing the environment, and all flows are assumed to run from the source, through the system, and finally to the sink [13].
Based on the flow network model, many useful methods have been developed for exploring the structures and dynamics of flow systems. For example, Rosvall and Bergstrom [7] proposed a method based on the probability flow of random walks for revealing community structure in weighted and directed networks. Vézina and Platt [8] described an inverse method for estimating network fluxes in undersampled environments. In a recent work [9], we addressed the problem of how to measure the distances between nodes in flow systems. Several flow distances, i.e., the first-passage flow distance, the total flow distance and the symmetric flow distance, were put forward, and that work gave a preliminary demonstration of their usefulness for calculating "trophic levels" and for clustering nodes.
Using these methods, some significant laws and important knowledge have already been revealed in such complex flow systems. For instance, in living things such as animals [1], plants [2] and microbes [3], allometric scaling laws (e.g., the power-law relationship between an animal's metabolic rate and its body mass [1]) exist extensively due to energy transportation on the energy flow networks of living organisms [4,5]. Recent works have shown that the universal allometric scaling law also exists in a much broader range of flow networks, e.g., weighted food webs [9] and trade flow networks [11].
However, complex systems containing multiple flows cannot be well represented by monolayer flow networks. For example, in a transportation system, aircraft traffic, train traffic and highway automobile traffic may coexist; in a city's infrastructure, water flow, power flow and gas transmission flow exist at the same time; in a human body, the circulations of blood, lymph and interstitial fluid coexist. If we do not distinguish between different types of flows and simply adopt a monolayer flow network to describe these systems, much implicit information about the individual flows will be lost, and much important knowledge about multi-flow systems may go unrevealed. Besides flow systems with multiple flows of different types of substances, complex flow systems containing flows of the same substance with different labels (e.g., groups) also need a practical tool for system modelling and analytics. In this paper, we use worldwide trade as a specific instance of a multi-layer flow network.
In previous work, international trade has been studied from the viewpoint of networks. Different types of trade networks (binary or weighted, directed or undirected) have been built to model real trade interactions between countries. For example, Serrano and Boguñá [16] found that the binary and undirected world trade web also presents the typical properties of complex networks, e.g., scale-free inhomogeneities and a high clustering coefficient. Fagiolo et al. [17] studied the topological properties of a weighted world trade network and their evolution over time. Fan et al. [18] explored countries' roles and positions in a single-layer international trade network using improved bootstrap percolation and other methods. Unlike these previous studies [11,16-18], here we use the framework of multi-layer flow networks to explore novel characteristics of international trade networks based on the methods of flow distances, which were recently proposed in [13].
The rest of the paper is structured as follows. In Section II, a formal description of multi-layer flow networks is presented, and flow distances are introduced. In Section III, we take the multi-layer flow network of international trade as a case and discuss its establishment from the viewpoints of trading commodity flow and money flow, respectively. Then, using the approach of flow distances, results on countries' trophic levels, node centralities and mean first-passage flow distances from the source to the sink are discussed in Section IV. Finally, we give the conclusions of this study in Section V.

A. Modeling multi-layer flow networks
Multi-layer flow networks (MFNs) can be regarded as a special type of multi-layer network [14,15], one that is able to depict multiplex flows in different layers. We therefore adopt the following definition.
A multi-layer flow network is a pair $\mathrm{MFN} = (\mathcal{G}, \mathcal{E})$, where $\mathcal{G} = \{G_{\alpha};\ \alpha \in \{0, \ldots, M-1\}\}$ is a family of directed (binary or weighted) graphs $G_{\alpha} = \{V, E_{\alpha}\}$ (called the layers of the MFN), and $\mathcal{E}$ is the set of interconnections between nodes and their counterparts in the other layers.
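The node set $V$ is the same in every layer and contains $N$ common nodes and two special nodes, "source" and "sink": "source" is the start of all flows and is denoted as node 0, and "sink" is the end of all flows and is labeled as node $N+1$. $E_{\alpha}$, the adjacency matrix of layer $\alpha$, can be written as follows.
$$E_{\alpha} = \begin{pmatrix} 0 & e_{0,1} & \cdots & e_{0,N} & e_{0,N+1} \\ 0 & e_{1,1} & \cdots & e_{1,N} & e_{1,N+1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & e_{N,1} & \cdots & e_{N,N} & e_{N,N+1} \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$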
where $e_{i,j}$ is the flow from node $i$ to node $j$. In particular, the first column and the last row of this matrix are all 0, because there is no inflow to the "source" node and no outflow from the "sink" node. Normally $e_{0,N+1}$ is 0, because ordinarily there is no direct flow from "source" to "sink". The total inflow to node $j$ (denoted as $e_{\cdot j}$) is calculated as $\sum_{i=0}^{N+1} e_{ij}$, and the total outflow from node $i$ (denoted as $e_{i\cdot}$) is $\sum_{j=0}^{N+1} e_{ij}$. Because the flow system is assumed to be in equilibrium, all nodes except "source" and "sink" have balanced inflow and outflow, that is, $e_{\cdot i} = e_{i\cdot}$ for $i = 1, \ldots, N$. The flows to "sink" (i.e., $e_{i,N+1}$) are regarded as dissipation.
Let us first consider a single layer $\alpha$ of a given MFN. Suppose a large enough number (say $\lambda$) of particles flow randomly along the directed edges. This random walk process can be treated as a Markov chain. The mean first-passage flow distance (MFPFD) from a node $i$ to another node $j$ (denoted as $l_{ij}$) is defined as the expected number of steps needed to reach $j$ for the first time, given that the particles are initially at $i$. The mean total flow distance (MTFD) (denoted as $t_{ij}$) is the average number of steps needed to arrive at $j$, regardless of whether it is the first arrival, again given that the particles are initially at $i$.
To illustrate the above concepts, consider the following thought experiment. Suppose all particles are initially colorless. There are two situations. 1) If particles go through node $i$, their color turns blue; then, when blue particles arrive at node $j$, they become red. In this situation, the MFPFD from $i$ to $j$ is the average number of steps that blue particles take to turn red. 2) If particles go through node $i$, their color turns blue, and blue particles remain blue when arriving at node $j$. Then, the MTFD from $i$ to $j$ is the average number of steps over all accumulated arrivals of blue particles at $j$ (including arrivals after the first one). Here, a blue particle may be counted repeatedly if it arrives at $j$ multiple times.
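Based on the above thought experiment, the flow distances are computed as follows. Formally, the transitions of the Markov chain are described by a stochastic matrix $M^{\alpha} = \{m^{\alpha}_{ij}\}$, where
$$m^{\alpha}_{ij} = \frac{e^{\alpha}_{ij}}{e^{\alpha}_{i\cdot}} \quad \text{for } e^{\alpha}_{i\cdot} > 0, \qquad m^{\alpha}_{ij} = 0 \text{ otherwise}.$$
Then, for this absorbing Markov chain, the fundamental matrix is defined as
$$U^{\alpha} = I + M^{\alpha} + (M^{\alpha})^{2} + \cdots = (I - M^{\alpha})^{-1},$$
where $I$ is the identity matrix of the same size as $M^{\alpha}$. Thus, we have the matrices of MTFD and MFPFD, $T^{\alpha} = \{t^{\alpha}_{ij}\}$ and $L^{\alpha} = \{l^{\alpha}_{ij}\}$, where
$$t^{\alpha}_{ij} = \frac{\left(M^{\alpha} (U^{\alpha})^{2}\right)_{ij}}{u^{\alpha}_{ij}}, \qquad l^{\alpha}_{ij} = \frac{\left(M^{\alpha} (U^{\alpha})^{2}\right)_{ij}}{u^{\alpha}_{ij}} - \frac{\left(M^{\alpha} (U^{\alpha})^{2}\right)_{jj}}{u^{\alpha}_{jj}}.$$
The detailed derivations of the above formulas can be found in [13]. Because $t^{\alpha}_{ij}$ and $l^{\alpha}_{ij}$ are asymmetric, they do not suit applications that need symmetric distance metrics (e.g., clustering and generating a minimum spanning tree); for this reason a symmetric flow distance $c^{\alpha}_{ij}$ was also introduced in [13], where its expression is given.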
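As a minimal sketch of the single-layer computation, the following Python code (NumPy assumed) builds the transition matrix $m_{ij} = e_{ij}/e_{i\cdot}$ and the fundamental matrix $U = (I - M)^{-1}$, and then obtains the distances through the identities $t_{ij} = (MU^{2})_{ij}/u_{ij}$ and $l_{ij} = t_{ij} - t_{jj}$, which follow from the step-counting definitions of MTFD and MFPFD. The function name `flow_distances`, the reachability tolerance and the four-node toy network are illustrative assumptions, not part of the original study.

```python
import numpy as np

def flow_distances(E):
    """Return (T, L): mean total and mean first-passage flow distances.

    E is the (N+2) x (N+2) flow matrix of one layer, with node 0 the
    source and node N+1 the sink. Unreachable pairs get distance inf.
    """
    E = np.asarray(E, dtype=float)
    n = E.shape[0]
    out = E.sum(axis=1)  # total outflow e_i. of every node
    # Row-normalise flows into transition probabilities; the sink row stays 0.
    M = np.divide(E, out[:, None], out=np.zeros_like(E), where=out[:, None] > 0)
    U = np.linalg.inv(np.eye(n) - M)  # fundamental matrix I + M + M^2 + ...
    K = M @ U @ U                     # K_ij = sum over k of k * (M^k)_ij
    T = np.full((n, n), np.inf)
    # Entries of U below the (assumed) tolerance are treated as unreachable.
    np.divide(K, U, out=T, where=U > 1e-12)
    L = T - np.diag(T)[None, :]       # l_ij = t_ij - t_jj
    return T, L

# Toy layer: source -> A -> B -> sink, with a return flow B -> A.
E = [
    [0, 10, 0, 0],   # source
    [0, 0, 15, 0],   # A
    [0, 5, 0, 10],   # B
    [0, 0, 0, 0],    # sink
]
T, L = flow_distances(E)
```

Row 0 of `L` holds the MFPFDs from the source. In this toy layer they are 1 for A, 2 for B and 4 for the sink, the last value being longer than the shortest path of 3 steps because a third of the particles leaving B loop back through A.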
Here, we present an alternative symmetric flow distance called symmetric minimum flow distance (SMFD), which is simply calculated as below.
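The sketch below illustrates one possible reading of the SMFD. It assumes that $f_{ij}$ is the smaller of the two directed MFPFDs, $f_{ij} = \min(l_{ij}, l_{ji})$, which is suggested by the name "symmetric minimum flow distance" but should be treated as an assumption rather than the definitive formula. The function name and the example matrix are hypothetical.

```python
import math

def symmetric_min_distance(L):
    """Assumed SMFD: f_ij = min(l_ij, l_ji), taken entry by entry."""
    n = len(L)
    return [[min(L[i][j], L[j][i]) for j in range(n)] for i in range(n)]

# Made-up MFPFD matrix; inf marks a node that cannot be reached.
L_example = [
    [0.0, 1.0, math.inf],
    [3.0, 0.0, 2.0],
    [math.inf, math.inf, 0.0],
]
F_example = symmetric_min_distance(L_example)
```

Under this reading a pair of nodes has a finite distance as soon as one of them can reach the other, so the symmetric matrix has fewer infinite entries than either directed matrix.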
These elements $f^{\alpha}_{ij}$ form the SMFD matrix $F^{\alpha} = \{f^{\alpha}_{ij}\}$. Thus, the matrices $T^{\alpha}$, $L^{\alpha}$ and $F^{\alpha}$ of all layers can be assembled into the corresponding vectors, i.e., $V_{T} = (T^{0}, \ldots, T^{M-1})$, and likewise for $L^{\alpha}$ and $F^{\alpha}$. Based on the matrices of MTFD, MFPFD and SMFD, much interesting knowledge about open flow systems can be obtained, such as the "trophic levels" and centralities of nodes.
The SITC4 (4-digit Standard International Trade Classification, Revision 4) standard [21] is used to organize the hundreds of product types in the dataset. A fragment of the dataset is shown in Table I, where "ICode" and "ECode" are the codes of the importer and the exporter, respectively, and "Value" is the value of the commodity in thousands of US dollars. "DOT" (direction of trade) has two options: 1 (data from the importer) and 2 (data from the exporter). The "Quantity" of the commodity is measured in the "Unit" given by a code such as "W" (weight in metric tons) or "V" (volume in cubic meters). Readers can refer to [19] for a detailed description of this dataset.
In this study, we use the value of commodity as the volume flux of the trade.
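As a rough sketch of how such records could be grouped into layers, the following Python code collects the trade value of every exporter-importer pair per product. The keys "ICode", "ECode" and "Value" follow Table I; the product key "sitc4", the function name and the sample records are hypothetical and only serve the illustration.

```python
from collections import defaultdict

def build_layers(records):
    """Group trade records into one weighted edge map per product.

    Each record is a dict with the keys "ICode" (importer), "ECode"
    (exporter) and "Value" (thousands of US dollars); "sitc4" is a
    hypothetical key holding the product code.
    Returns {product: {(exporter, importer): value}}.
    """
    layers = defaultdict(lambda: defaultdict(float))
    for rec in records:
        edge = (rec["ECode"], rec["ICode"])  # commodities flow exporter -> importer
        layers[rec["sitc4"]][edge] += float(rec["Value"])
    return {product: dict(edges) for product, edges in layers.items()}

# Made-up records for illustration only (not taken from the dataset).
sample_records = [
    {"sitc4": "0011", "ECode": "A", "ICode": "B", "Value": 120},
    {"sitc4": "0011", "ECode": "A", "ICode": "B", "Value": 30},
    {"sitc4": "0011", "ECode": "B", "ICode": "C", "Value": 50},
    {"sitc4": "7810", "ECode": "C", "ICode": "A", "Value": 900},
]
layers = build_layers(sample_records)
```

Summing repeated exporter-importer pairs is a design choice of this sketch; a real pipeline would first decide how to treat records that report the same trade from both the importer and the exporter side ("DOT" 1 and 2).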

B. Establishment of the multi-layer flow network of international trade
We use the data for the year 2000 to build a multi-layer flow network of international trade. In this multi-layer flow network, there are 192 nodes in total (including the two special nodes "source" and "sink") and 1288 different layers, where each common node (and its counter-

Viewpoint of trading commodity flow
In each layer, if a trade relationship for the product of that layer exists between two countries, a directed edge is built from the exporter to the importer, and the value of the trade is set as the volume flux of the edge. The edge from "source" to a country node $j$ represents the production of this commodity in country $j$, and the edge from node $j$ to "sink" represents the consumption of this commodity in country $j$. The volume fluxes of these edges are the corresponding values of the production and the consumption.
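For each country node $j$ (except "source" and "sink") in a given layer $\alpha$, the constraint of balanced inflow and outflow (Fig. 2) gives the following equation.
$$e^{\alpha}_{0j} + \sum_{i=1}^{N} e^{\alpha}_{ij} = \sum_{k=1}^{N} e^{\alpha}_{jk} + e^{\alpha}_{j,N+1} \tag{9}$$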
where $e^{\alpha}_{0j}$ and $e^{\alpha}_{j,N+1}$ are the domestic production and consumption of country $j$ for the product in layer $\alpha$, respectively. Because $e^{\alpha}_{0j}$ (the flow from the source to $j$) and $e^{\alpha}_{j,N+1}$ (the flow from $j$ to the sink) are not available, for simplicity, they are estimated as below.
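One simple way to close the balance of every country node can be sketched as follows. It assumes that a net exporter receives the difference between its exports and imports from the source, and that a net importer sends the difference to the sink; this minimal balancing rule is an assumption of the sketch and may differ from the estimate actually used in the study. The function name and the example edges are hypothetical.

```python
def balance_layer(edges, countries):
    """Return (source_flow, sink_flow) dicts that balance every country.

    edges maps (exporter, importer) -> trade value for one product.
    Assumption: net exporters draw the shortfall from the source and
    net importers pass the excess on to the sink.
    """
    inflow = {c: 0.0 for c in countries}
    outflow = {c: 0.0 for c in countries}
    for (exporter, importer), value in edges.items():
        outflow[exporter] += value
        inflow[importer] += value
    source_flow = {c: max(outflow[c] - inflow[c], 0.0) for c in countries}
    sink_flow = {c: max(inflow[c] - outflow[c], 0.0) for c in countries}
    return source_flow, sink_flow

# Made-up trade values for three hypothetical countries.
edges = {("A", "B"): 150.0, ("B", "C"): 50.0}
source_flow, sink_flow = balance_layer(edges, ["A", "B", "C"])
```

With these values country A draws 150 from the source, B sends 100 to the sink and C sends 50, so that inflow equals outflow at every country node.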
FIG. 3: Illustration of a balanced country node from the viewpoint of money flow

Viewpoint of money flow for trading
Besides the viewpoint of trading commodity flow, the flow network of international trade can also be established from the viewpoint of money flow. Because trade in a commodity is always paid for, money flows in the opposite direction to the commodity, that is, from the importer to the exporter. The edge from "source" to a country node $j$ represents country $j$'s trade deficit driven by the consumption of the commodity, and the edge from country node $j$ to "sink" represents the surplus of its exports over its imports.
For each balanced country node $j$ (except the source and the sink), Equation 9 still holds (Fig. 3), where $e^{\alpha}_{ij}$ and $e^{\alpha}_{jk}$ are now the money inflow from country $i$ to country $j$ and the money outflow from country $j$ to country $k$, respectively. $e^{\alpha}_{0j}$ and $e^{\alpha}_{j,N+1}$ are the deficit and the surplus, respectively, which can also be calculated using Equation 10.
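The coupling between the two viewpoints can be sketched in a few lines. The code assumes that the deficit and the surplus of a country equal its net imports and net exports, so that the money-flow matrix is the commodity-flow matrix with every edge reversed and with the source and the sink exchanging roles. The function name and the example matrix are hypothetical.

```python
def money_flow_matrix(E):
    """Reverse a commodity-flow matrix into the matching money-flow matrix.

    E is a square nested list with node 0 the source and the last node
    the sink. Every edge is reversed, and source and sink swap roles.
    """
    n = len(E)

    def swap(k):
        # Exchange the indices of the source (0) and the sink (n - 1).
        if k == 0:
            return n - 1
        if k == n - 1:
            return 0
        return k

    money = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            money[swap(j)][swap(i)] = float(E[i][j])
    return money

# Made-up commodity layer: nodes are source, A, B, C, sink.
commodity = [
    [0, 150, 0, 0, 0],
    [0, 0, 150, 0, 0],
    [0, 0, 0, 50, 100],
    [0, 0, 0, 0, 50],
    [0, 0, 0, 0, 0],
]
money = money_flow_matrix(commodity)
```

In the resulting matrix country B pays 150 to A, B and C receive their deficits of 100 and 50 from the source, and A sends its surplus of 150 to the sink.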

IV. RESULTS
A. Trading trophic levels
1. Countries' trophic levels from the viewpoints of commodity flow and money flow
"Trophic level" is a term borrowed from ecology, meaning the position a species occupies in a food chain. In flow network systems, we use this variable to indicate the position that a node occupies in the whole system, and its value is the node's distance from "source". Since there may be multiple paths from the source to the node, adopting the length of the shortest path from the source may underestimate the trophic level of a node, and using the mean first-passage flow distance (MFPFD) from the source is more reasonable [13]. We use the layer of the commodity of live bovine animals as an example. Based on the MFPFD discussed in subsection II B, we use the MFPFD from the source to each country node as its trading trophic level. The trophic levels of each country from the viewpoints of commodity flow and money flow are given in Fig. 4. In the figure, country nodes are plotted in a circle; their distances to the center of the circle are proportional to their trophic levels, and their angles and colors are selected randomly. The nodes' sizes are proportional to the natural logarithm of the nodes' total outflow (or inflow).
From the viewpoint of commodity flow, the source represents the production of the commodity and the sink represents its consumption. So, for a certain commodity, the trophic level of a country node represents the position that the country occupies in the global supply chain (i.e., the commodity flow). The smaller the trophic level, the more the country is inclined to be a producer (i.e., an exporter) of this commodity. Conversely, the bigger the trophic level, the greater the distance from the source to the country node, and the more the country is inclined to be a consumer (i.e., an importer) of this commodity. From Fig. 4a, we find that some countries (such as Germany, China, Australia, Canada and South Africa), whose trading trophic levels are equal to or slightly greater than 1, are inclined to be exporters of live bovine animals. Some other nodes, such
• The distributions of trade trophic levels of countries from the two viewpoints exhibit similar patterns: a large percentage of trade trophic levels are equal to or slightly larger than 1, and another group of trophic levels is between 2 and 3.

Countries' trophic levels for different products
We compare countries' trophic levels in different layers with different products. We portray them using the countries-products matrix shown in Fig. 6. In the figure, each point represents the trophic level of the corresponding country and product, and its value is depicted by its color: white indicates that the value does not exist, cyan means that the value is between 1 and 2 (not including 2), indicating that the country tends to be an importer, and magenta represents a relatively high value, indicating an exporter. Three variables, i.e., the number of cyan points (say $n_{cyan}$), the number of magenta points (say $n_{magenta}$) and the mean of the MFPFD from the source to the country represented by the row (say $l^{\alpha}_{0,i}$), are extracted to characterize each row (or column). The rows and columns of the matrix displayed in the figure are sorted by the sum of $n_{cyan}$ and $n_{magenta}$ in descending order, and then by $l^{\alpha}_{0,i}$.
In Fig. 6, the cyan and magenta points form an approximately right-angled trapezoidal shape. We can find that magenta points are concentrated in the upper portion of the trapezoid, while the lower part consists mainly of cyan points. This can be interpreted as follows: the exporters of a wide spectrum of products are the countries active in international trade, which are located in the upper part of the figure and can be regarded as competitive and successful countries according to [25]. In contrast, the bottom rows are countries that are not active in international trade, which can only export a limited variety of products and tend to be underdeveloped countries [25]. Economies, which export few types of goods but import a

Countries' node centralities in a certain layer of product
We calculate country centrality based on flow distances in a certain layer of product. The centrality of a country node can be computed as the average of the distances from the node to all the other countries, as given in [13]. However, if one of those distances is infinite, the node centrality becomes infinite. Thus, this computation method is infeasible for a sparse distance matrix in which most of the elements are infinite. Therefore, we propose a new definition of node centrality, called harmonic centrality and denoted $f_{i}$, in which $f_{i,j}$ ($j = 1, \ldots, N$ and $j \neq i$) is the SMFD between nodes $i$ and $j$, and $N$ is the number of common nodes in the flow network. The two special nodes, i.e., source node 0 and sink node $N+1$, are excluded, because here we are concerned with the distances from the node to the other common nodes, and the distances to the source and the sink are already captured by the trophic level.
Thus, the harmonic node centrality metric avoids the infinite distance problem and measures the centralities of nodes well. A smaller $f_{i}$ implies a more central position of the node in the flow network, because the node has a smaller average distance to the other nodes.
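A minimal sketch of such a centrality is given below. It assumes that $f_{i}$ is the harmonic mean of the SMFDs from node $i$ to the other common nodes, $f_{i} = (N-1) / \sum_{j \neq i} 1/f_{i,j}$; the normalisation by $N-1$ and the function name are assumptions of this sketch and not necessarily the exact definition used in the study.

```python
import math

def harmonic_centrality(F, i):
    """Harmonic mean of the distances from common node i to the others.

    F is a symmetric distance matrix over the N common nodes only (the
    source and the sink are already removed). Infinite distances add
    1/inf = 0 to the sum, so the result stays finite unless node i is
    disconnected from every other node. Distances between distinct
    nodes are assumed to be positive.
    """
    n = len(F)
    inverse_sum = sum(1.0 / F[i][j] for j in range(n) if j != i)
    return (n - 1) / inverse_sum if inverse_sum > 0 else math.inf

# Made-up SMFD matrix for three hypothetical countries.
F_example = [
    [0.0, 1.0, math.inf],
    [1.0, 0.0, 2.0],
    [math.inf, 2.0, 0.0],
]
centralities = [harmonic_centrality(F_example, i) for i in range(3)]
```

In this example the middle node, which is connected to both other nodes, obtains the smallest value and is therefore the most central one, while the infinite entries do not make any centrality infinite.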
In Table II, we list the top 5 and the bottom 5 countries with their f i in the decreasing order of f i in the money flow network of trading live bovine animals.The top 5 countries are all important hub nodes from the viewpoint of topology.

Countries' node centralities for different products
We compute and compare node centralities of countries in different layers of products.
We establish a countries-products matrix containing the corresponding harmonic centralities. For example, the row for USA is
We illustrate the ranking matrix in Fig. 7, where rows are sorted by the mean of the rankings, and columns are rearranged in descending order of the number of countries participating
and bottom 8 economies with their corresponding average harmonic centrality rankings (say $\bar{f}_{i}$) and numbers of types of trading products (say $n$). It implies that $\bar{f}_{i}$ may be a good alternative indicator of countries' competitiveness from the perspective of node centrality, and that countries need to improve their harmonic centrality rankings in the trading of various products in order to enhance their competitiveness.
We calculate the mean first-passage flow distance (MFPFD) from the source to the sink (say $l^{\alpha}_{0,N+1}$) for the money flow network in each layer. It indicates the average number of steps that a random particle takes to travel from the source to the sink, and can be regarded as the flow length of the network in that layer. In Fig. 8, the distribution and cumulative distribution of $l^{\alpha}_{0,N+1}$ for all 1288 layers are shown. We find that over 27% of the values of $l^{\alpha}_{0,N+1}$ are equal to 3, and that the density of $l^{\alpha}_{0,N+1}$ decreases as its value grows. The maximum is 6.24. We further compare $l^{\alpha}_{0,N+1}$ among groups of layers containing different categories of products. Following the classification of products given by SITC4 [21], the result is shown in Table IV, where category 9 (commodities and transactions not classified) is ignored. From the table, we find that the group formed by categories 5 (chemicals and related products, n.e.s.), 6 (manufactured goods classified chiefly by material), 7 (machinery and transport equipment) and 8 (miscellaneous manufactured articles) has a significantly higher average $l^{\alpha}_{0,N+1}$ than the group of categories 0 (food and live animals), 1 (beverages and tobacco), 2 (crude materials, inedible, except fuels), 3 (mineral fuels, lubricants and related materials) and 4 (animal and vegetable oils, fats and waxes). In short, manufactured products (categories 5, 6, 7 and 8) have a significantly larger $l^{\alpha}_{0,N+1}$ than primary products (categories 0, 1, 2, 3 and 4). This phenomenon is also confirmed by the stacked distribution chart of $l^{\alpha}_{0,N+1}$ for different categories of products (Fig. 9), where blue bars are stacked on the left side, in contrast with most yellow and red bars, which are located on the right.
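The comparison between product categories can be sketched as a simple group average over the first SITC digit. The function name and the numbers below are made up for the illustration and are not results from this study.

```python
from collections import defaultdict
from statistics import mean

def mean_flow_length_by_category(flow_lengths):
    """Average the source-to-sink MFPFD over layers sharing a first digit.

    flow_lengths maps a 4-digit SITC product code (string) to the flow
    length of that layer; category 9 is skipped.
    """
    groups = defaultdict(list)
    for code, length in flow_lengths.items():
        if code[0] != "9":
            groups[code[0]].append(length)
    return {category: mean(values) for category, values in sorted(groups.items())}

# Made-up flow lengths for illustration only.
example = {"0011": 3.0, "0012": 3.4, "7810": 4.1, "7821": 4.5, "9310": 3.0}
by_category = mean_flow_length_by_category(example)
```

The returned dictionary has one entry per one-digit category, which can then be merged into the two groups of primary products (0 to 4) and manufactured products (5 to 8).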

V. CONCLUSIONS AND DISCUSSIONS
In this paper, we use the approach of flow distances to explore the international trade flow system from the perspective of multi-layer flow networks. The multi-layer flow network can model an open flow system with multiple types of flows, where each layer contains one type of flow. We introduce the formal description of multi-layer flow networks (MFNs) and of flow distances on MFNs (e.g., the mean first-passage flow distance and the mean total flow distance), and propose a new flow distance called the symmetric minimum flow distance. Then, we build the multi-layer flow network of international trade from two coupled viewpoints, i.e., the viewpoint of commodity flow and that of money flow. Based on the approach of flow distances, the following findings are obtained.
Firstly, countries' trading "trophic levels" are used to depict the positions that countries occupy in the international supply chain. From the distribution of trading "trophic levels", we find that countries can be divided into three groups: countries whose trophic levels are equal to or slightly bigger than 1, between 2 and 3, and larger than 3. Besides, since the viewpoints of commodity flow and money flow are coupled, some regularities can be found. For example, the money flow network and the commodity flow network for the same commodity have the same mean first-passage flow distance from the source to the sink.
Secondly, by comparing countries' trophic levels in different layers with different products, we find that exporters of a wide spectrum of products are active and competitive countries in international trade, while countries that export a limited variety of commodities are inactive and tend to be underdeveloped. Besides, we find that some countries import many kinds of goods that the vast majority of countries do not need to import. This phenomenon may indicate that these economies are at high economic risk.
Thirdly, we propose a new node centrality called harmonic centrality to solve the problem of infinite distances. A smaller harmonic centrality indicates a more central position of the node. We then compare the harmonic centralities of countries in different layers of products and find the phenomenon of centrality stratification: competitive countries tend to be in the center position in the trading of a large variety of products, while underdeveloped countries are likely to rank low in their limited varieties of trading products.
Fourthly, we compute the mean first-passage flow distances from the source to the sink for different types of commodities, which can be regarded as the flow length of the network in each layer. We find that manufactured products have significantly larger mean first-passage flow distances from the source to the sink than primary products.
Our findings demonstrate the effectiveness of the proposed model of multi-layer flow networks and the approach of flow distances.
For instance, in a worldwide trade system, the money flows for trading different types of commodities may run among different countries; in the logistics industry, different transportation flow networks are adopted by competing logistics enterprises. In such cases, the individual flows cannot be distinguished from each other if we simply use a monolayer flow network to model the system. Because these co-existing flows are hard to model finely with a monolayer flow network, it is necessary to introduce a new concept called Multi-layer Flow Networks (MFNs). Examples of MFNs include multi-layer trade flow networks, where different types of commodities flow in different layers, and multi-layer transportation flow networks, where passenger traffic flows on the layer of the air transportation network and on that of the rail network. To the best of our knowledge, a generalized theoretical framework of multi-layer flow networks is still lacking, so it is necessary to study multi-layer flow networks and their applications in depth.
B. Distance on multi-layer flow networks
In [13], L. Guo et al. proposed flow distances on open flow networks. In this subsection, we extend them to multi-layer flow networks.
FIG. 2: Illustration of a balanced country node from the viewpoint of commodity flow
FIG. 4: Trading trophic levels of countries for the commodity of live bovine animals from different viewpoints.
FIG. 6: Countries-products matrix reporting countries' trophic levels in different layers with different products. Countries and products are arranged in descending order of the sum of $n_{cyan}$ and $n_{magenta}$, and then of $l^{\alpha}_{0,i}$, where $n_{cyan}$, $n_{magenta}$ and $l^{\alpha}_{0,i}$ are the number of cyan points, the number of magenta points and the mean of the MFPFD from the source to the country represented by the row, respectively.

FIG. 7: Countries-products matrix reporting countries' harmonic centrality rankings in columns. Countries are sorted in ascending order by the average ranking, and products are arranged in descending order of the number of trading countries.
FIG. 8: The distribution and cumulative distribution of MFPFDs from the source to the sink

FIG. 9: The distribution of MFPFDs from the source to the sink for different categories of products

TABLE I: A fragment of the NBER-United Nations trade dataset
We use the NBER-United Nations trade data (http://cid.econ.ucdavis.edu/nberus.html) to explore new features of the multi-layer flow network of international trade. The dataset covers the details of world trade flows from 1962 to 2000.

TABLE II: List of countries sorted by $f_{i}$

TABLE III: List of economies sorted by the average harmonic centrality ranking. $n$ is the number of types of trading products and $\bar{f}_{i}$ is the country's average harmonic centrality ranking.

TABLE IV: MFPFDs from the source to the sink for different categories of products