How the network properties of shareholders vary with investor type and country

We construct two examples of shareholder networks in which shareholders are connected if they hold shares in the same company. We do this for the shareholders in Turkish companies and compare the result against the network formed from the shareholdings in Dutch companies. We analyse the properties of these two networks in terms of the different types of shareholder. We create a suitable randomised version of these networks to enable us to find significant features in our networks. From this we find the roles played by different types of shareholder in these networks, and also show how these roles differ in the two countries we study.

$$S_\alpha = \{ s \mid s \in S,\ \tau(s) = \alpha \}. \tag{A1}$$

We can use our data to define a Corporation-Shareholder network, B, in which the set of nodes, $V_B$, is the union of the set of shareholders and the set of companies, $V_B = S \cup C$. An edge is present in this network between a shareholder and a company if the shareholder has shares in that company.
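In practice our work focusses on a projection of the corporation-shareholder network onto just the shareholder nodes. That is, we define the Shareholder network P to have a set of nodes S, the set of shareholders. An edge between two shareholders, say $s_i$ and $s_j$, exists in this network if both $s_i$ and $s_j$ have invested in the same company (at a level above our threshold). In terms of the adjacency matrix $P_{ij}$ of this network, and writing $B_{ic} = 1$ if shareholder $s_i$ holds shares in company $c$ and $B_{ic} = 0$ otherwise, we have that
$$P_{ij} = \begin{cases} 1 & \text{if } i \neq j \text{ and } \sum_{c \in C} B_{ic} B_{jc} > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{A2}$$
This ensures the shareholder network P is a simple network.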
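As an illustration, the projection onto shareholders can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used for the analysis: the function name `project_shareholders`, the toy holdings and the threshold value are all hypothetical.

```python
from itertools import combinations

def project_shareholders(holdings, threshold=0.0):
    """Project (shareholder, company, stake) records onto shareholders.

    Two shareholders are linked if both hold a stake above `threshold`
    in at least one common company. Using sets of neighbours gives a
    simple network: no self-loops and no repeated edges.
    """
    holders_of = {}
    shareholders = set()
    for shareholder, company, stake in holdings:
        shareholders.add(shareholder)
        if stake > threshold:
            holders_of.setdefault(company, set()).add(shareholder)
    neighbours = {s: set() for s in shareholders}
    for holders in holders_of.values():
        for s_i, s_j in combinations(holders, 2):
            neighbours[s_i].add(s_j)
            neighbours[s_j].add(s_i)
    return neighbours

# Hypothetical toy data: (shareholder, company, fraction of shares held).
example_holdings = [
    ("s1", "c1", 0.30), ("s2", "c1", 0.20),
    ("s1", "c2", 0.10), ("s3", "c2", 0.40),
    ("s3", "c3", 0.50), ("s4", "c3", 0.01),  # s4 is below the threshold
]
P = project_shareholders(example_holdings, threshold=0.05)
```

In this toy example s1 is linked to s2 (through c1) and to s3 (through c2), while s4 remains isolated because its only holding falls below the threshold.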

B Betweenness Centrality
A walk is a sequence of vertices in which each node is connected by an edge to the next node in the sequence. A path is a walk in which no node appears twice. The length of a path is the number of vertices minus one, i.e. the number of edges traversed as one moves through the sequence of vertices. For many centrality measures we consider the shortest paths starting at a source node $s$ and ending at a target node $t$. The number of shortest paths from $s$ to $t$ is denoted by $\sigma_{st}$, as there can be more than one path of the same length between any pair of vertices. Given these shortest paths, we define $\sigma_{st}(v)$ to be the number of these shortest paths which pass through some vertex $v$ other than $s$ or $t$. Then the betweenness [1,3] of a vertex $v$ is
$$b(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}. \tag{A3}$$
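The path-counting definitions above can be made concrete with a short Python sketch. It assumes an unweighted, undirected graph stored as a dictionary of neighbour sets, counts each unordered pair $(s, t)$ once, and sums $\sigma_{st}(v)/\sigma_{st}$ without any further normalisation; the function names are hypothetical and the cubic-time loop is chosen for clarity rather than speed.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): shortest-path lengths from `source` and the
    number of distinct shortest paths to every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time we reach w
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Sum over unordered pairs {s, t} of sigma_st(v) / sigma_st."""
    info = {s: bfs_counts(adj, s) for s in adj}
    b = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:              # s and t are not connected
            continue
        dist_t, sigma_t = info[t]
        for v in adj:
            if v == s or v == t or v not in dist_s:
                continue
            # v is on a shortest s-t path iff the distances add up;
            # the number of such paths is sigma_sv * sigma_vt.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                b[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return b

# A 4-cycle: opposite corners are joined by two shortest paths.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
b_cycle = betweenness(cycle)
```

In the 4-cycle every node carries one of the two shortest paths between its two neighbours, so each node picks up a contribution of one half.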

C Closeness centrality
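We define the closeness centrality $c(v)$ [2,3] of a vertex $v$ to be
$$c(v) = \frac{n-1}{\sum_{u \neq v} d(u,v)}, \tag{A4}$$
where $d(u,v)$ is the shortest path distance between $u$ and $v$, the sum runs over the other nodes $u$ in the same component as $v$, and $n$ is the number of nodes in the component containing node $v$.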
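A minimal Python sketch of this measure is given below. It assumes the normalisation in which closeness is the inverse of the average shortest-path distance to the other $n-1$ nodes of the same component, on an unweighted graph stored as a dictionary of neighbour sets; the function names are hypothetical, and returning zero for an isolated node is a convention chosen for this sketch.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search distances from `source` within its component."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(adj, v):
    """(n - 1) / sum of distances, with n the size of v's component."""
    dist = distances_from(adj, v)
    n = len(dist)
    if n == 1:          # isolated node: no other node to be close to
        return 0.0
    return (n - 1) / sum(dist.values())

# A three-node path plus an isolated node "x".
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "x": set()}
c_middle = closeness(toy, "b")
c_end = closeness(toy, "a")
```

Because only nodes in the same component are counted, the isolated node "x" does not affect the values for "a", "b" and "c".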

C.1 Estimating Closeness
Consider first a general random graph, that is, one with a specific degree distribution but otherwise unconstrained, working in the large sparse graph regime, $N \to \infty$, $\langle k \rangle \sim O(1)$. This type of configuration model graph can be constructed using edge rewiring. Suppose we start at a node of degree $k$. Then we might estimate that the number of nodes $n_\ell$ which are $\ell$ steps away from our starting node is
$$n_\ell = k \bar{z}^{\,\ell-1}, \qquad \ell \geq 1, \tag{A5}$$
where $\bar{z}$ is some effective branching ratio. That is, we expect each node we arrive at $\ell$ steps away from our starting node, in some breadth first search out from the initial node, to be connected to an average of $\bar{z}$ new vertices which are then $(\ell+1)$ steps away. The approximation here is that all nodes look the same, as they must in a true random graph. The exception is the first node, which we know has $k$ neighbours if it has degree $k$. However, we note that statistically all we are really saying in this approximation is that, for most networks, taking a few steps is sufficient to allow us to sample any part of the network, so statistically many networks will appear to be homogeneous on larger scales. If we are being more precise, for a random graph near its phase transition, where we can assume a tree-like structure, we know that $\bar{z}$ will be the average degree of a neighbouring node minus one: we arrive on one edge going into a neighbour and leave on the remaining edges. Because the degree of a neighbour is distributed as $k p(k)/\langle k \rangle$, where $p(k)$ is the degree distribution, we have
$$\bar{z} = \sum_k (k-1) \frac{k p(k)}{\langle k \rangle} = \frac{\langle k^2 \rangle}{\langle k \rangle} - 1. \tag{A6}$$
However, for any given large network, we do not need to assume (A6) is true, merely that there is some effective branching ratio such that (A5) still works well.
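The tree-like estimate of the branching ratio can be evaluated directly from a degree sequence. The sketch below assumes the standard configuration-model result that the mean excess degree of a neighbour is $\langle k^2 \rangle / \langle k \rangle - 1$, and that the shell at distance $\ell$ from a node of degree $k$ holds about $k \bar{z}^{\ell-1}$ nodes; the function names and the example degree sequence are hypothetical.

```python
def branching_ratio(degrees):
    """Mean excess degree of a neighbour: <k^2>/<k> - 1."""
    mean_k = sum(degrees) / len(degrees)
    mean_k2 = sum(k * k for k in degrees) / len(degrees)
    return mean_k2 / mean_k - 1.0

def expected_shell_size(k, z, ell):
    """Estimated number of nodes exactly `ell` >= 1 steps from a node
    of degree k when every later node branches into z new nodes."""
    return k * z ** (ell - 1)

# Hypothetical degree sequence, used only to illustrate the formula.
z_example = branching_ratio([1, 2, 3, 2])
```

For a regular graph in which every node has degree 3, the same function gives a branching ratio of 2, as expected for a tree where each node is entered on one edge and left on the other two.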
To estimate closeness, we first estimate the maximum distance $\ell_{\max}$ by demanding that the total number of nodes connected to our starting node is the number in the Largest Connected Component, $N_{LCC}$, as we assume we are studying nodes in this component. This may be estimated as
$$N_{LCC} = 1 + \sum_{\ell=1}^{\ell_{\max}} k \bar{z}^{\,\ell-1} = 1 + k \, \frac{\bar{z}^{\,\ell_{\max}} - 1}{\bar{z} - 1}. \tag{A7}$$
Rearranging for $N_{LCC} \gg 1$, we find that
$$\ell_{\max}(k) \approx \frac{\ln\left[ (\bar{z}-1) N_{LCC}/k + 1 \right]}{\ln \bar{z}}. \tag{A8}$$
Not surprisingly, if you start from a high degree node, a high $k$, your first step will reveal far more of the network and so take you closer to the remaining parts. Thus the maximum distance in a random graph drops as the degree $k$ of the node increases.
Now we can use this to find the closeness $c(v)$ of a node $v$, since this is defined to be the inverse of the farness $f(v)$, the average distance from a node to all other nodes. For random graphs, or graphs which appear homogeneous on larger scales, we can estimate the farness using (A5) as
$$f(v) = \frac{1}{N_{LCC}-1} \sum_{\ell=1}^{\ell_v} \ell \, k_v \bar{z}^{\,\ell-1} = \ell_v \left[ 1 + \frac{k_v}{(\bar{z}-1)(N_{LCC}-1)} \right] - \frac{1}{\bar{z}-1},$$
where we have used (A7) and we write $\ell_v = \ell_{\max}(k_v)$ for the largest of the shortest path lengths from vertex $v$, which has degree $k_v$. Not surprisingly this is dominated by the distance to the furthest nodes, as in a tree they are the dominant contribution. We see that if $(\bar{z}-1) \gg k/N$, i.e. if we are not close to the transition and we have a large $N$, then this result for farness gives
$$f(v) \approx \frac{\ln\left[ (\bar{z}-1) N_{LCC}/k_v \right]}{\ln \bar{z}} - \frac{1}{\bar{z}-1}.$$
While in this limit a random graph, let alone a real graph, is not a tree, it shows that we should expect the closeness centrality measure to be correlated with the degree of a node. Indeed the prediction is that the inverse closeness (farness) should show a linear dependence on the logarithm of the degree of a node, $\ln(k)$, with a slope that is minus the inverse of the log of the branching ratio, $-1/\ln \bar{z}$, that is
$$\frac{1}{c(v)} = f(v) \approx -\frac{1}{\ln \bar{z}} \ln(k_v) + \beta,$$
where $\beta$ does not depend on $k_v$. Since we expect this expression to hold even where we do not have a tree, we do not expect the slope to match the value of $\bar{z}$ in a random tree (A6). Rather, if we do find a linear relationship between the farness and the logarithm of degree, then the slope is a way of defining an effective branching ratio.
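The farness estimate can be checked numerically in the idealised tree picture. The sketch below is an illustration under stated assumptions, not the analysis code: it takes shells of size $k \bar{z}^{\ell-1}$, averages the distance over them, and compares the result with the large-$N$ form $\ln[(\bar{z}-1)N/k]/\ln\bar{z} - 1/(\bar{z}-1)$. All function names and parameter values are hypothetical.

```python
import math

def farness_from_shells(k, z, ell_max):
    """Average distance to all other nodes when shell ell (1..ell_max)
    contains k * z**(ell - 1) nodes."""
    shells = [k * z ** (ell - 1) for ell in range(1, ell_max + 1)]
    reached = sum(shells)                      # nodes other than the start
    total = sum(ell * n for ell, n in enumerate(shells, start=1))
    return total / reached

def farness_estimate(k, z, n_lcc):
    """Large-N estimate: ln[(z - 1) N / k] / ln z - 1 / (z - 1)."""
    return math.log((z - 1.0) * n_lcc / k) / math.log(z) - 1.0 / (z - 1.0)

def slope_in_log_degree(k_low, k_high, z, n_lcc):
    """Slope of the estimated farness against ln(k) between two degrees."""
    df = farness_estimate(k_high, z, n_lcc) - farness_estimate(k_low, z, n_lcc)
    return df / (math.log(k_high) - math.log(k_low))

# Hypothetical parameters: degree 3, branching ratio 2, ten shells,
# which corresponds to 1 + 3 * (2**10 - 1) = 3070 nodes in total.
exact = farness_from_shells(3, 2.0, 10)
approx = farness_estimate(3, 2.0, 3070)
slope = slope_in_log_degree(2, 8, 2.0, 3070)
```

With these values the shell average and the large-$N$ estimate differ by roughly 0.01, and the slope against $\ln(k)$ equals $-1/\ln 2$, which is how a fitted slope can be turned into an effective branching ratio.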

D Community detection algorithms
The Louvain algorithm [4] aims to produce a community structure which has a large value of modularity $Q$, where
$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j).$$
Here $A_{ij}$ is the entry of the adjacency matrix for nodes $i$ and $j$; $k_i$ and $k_j$ are the sums of the weights of the edges attached to nodes $i$ and $j$, respectively; $m$ is the total number of edges in the graph; $c_i$ and $c_j$ are the communities of the nodes; and $\delta(c_i, c_j)$ is one if the two nodes are in the same community and zero otherwise. The Louvain algorithm [4] starts with each node in an individual community and tries to increase modularity by moving a node into the community of a neighbour. When a local maximum is reached, the communities are used to define a new graph in which each node represents a single community in the previous network, and the process is repeated.

The Infomap community detection method is based on the movements of a random walker. The aim is to choose communities which minimise the amount of information needed to record the movement of random walkers between communities. This is done by minimising the map equation for the description length $L(M)$ [5], where $M$ is the set of modules, or partition, of the network, with each node assigned to a module $i$, and $L(M)$ is the description length of the trajectory of a random walker walking along the links of the network. The quantities $q_{i\curvearrowleft}$ and $q_{i\curvearrowright}$ are the rates at which the random walker enters and exits module $i$, respectively. For details see Rosvall and Bergstrom [5].
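To make the modularity being optimised concrete, the following sketch evaluates the usual Newman-Girvan modularity of a given partition. It assumes an unweighted, undirected simple graph stored as a dictionary of neighbour sets, so that $k_i$ is the degree of node $i$ and $2m$ is the sum of all degrees. It only scores a partition and does not implement the Louvain moves themselves; the function name and toy graph are hypothetical.

```python
def modularity(adj, community):
    """Q = (1/2m) * sum_ij [A_ij - k_i k_j / 2m] * delta(c_i, c_j)."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    two_m = sum(degree.values())
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue                      # delta(c_i, c_j) = 0
            a_ij = 1.0 if j in adj[i] else 0.0
            q += a_ij - degree[i] * degree[j] / two_m
    return q / two_m

# Two triangles joined by the single edge 2-3 (hypothetical toy graph).
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
two_triangles = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_block = {v: "A" for v in toy}
q_split = modularity(toy, two_triangles)
q_single = modularity(toy, one_block)
```

Splitting the toy graph into its two triangles gives $Q = 5/14$, whereas placing every node in a single community gives $Q = 0$, which is why the optimisation favours the split.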

E Comparison of community detection results for largest component of Turkey
If the community structure in the data is well established, then two different detection methods should give similar results [6]. After detecting the communities of the graphs for the two countries using the two algorithms, Louvain [4] and Infomap [5], we found that the percentage of nodes whose two communities contain the same nodes is about 75% in Turkey. However, the Louvain method [4] produces one very large community, while Infomap does not.
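One way to quantify this kind of agreement is sketched below. It assumes one possible reading of the statistic, namely the fraction of nodes whose community under the first method has exactly the same members as their community under the second method; the exact procedure behind the quoted percentage is not specified here, and the function name and toy labels are hypothetical.

```python
def matching_fraction(part_a, part_b):
    """Fraction of nodes whose community in `part_a` contains exactly the
    same nodes as their community in `part_b`.

    Both arguments map node -> community label; labels need not agree
    between the two partitions, only the membership is compared.
    """
    def members(partition):
        groups = {}
        for node, label in partition.items():
            groups.setdefault(label, set()).add(node)
        return groups

    groups_a = members(part_a)
    groups_b = members(part_b)
    same = sum(
        1 for node in part_a
        if groups_a[part_a[node]] == groups_b[part_b[node]]
    )
    return same / len(part_a)

# Hypothetical labels from two methods on the same four nodes.
method_one = {"a": 0, "b": 0, "c": 1, "d": 1}
method_two = {"a": "x", "b": "x", "c": "y", "d": "z"}
agreement = matching_fraction(method_one, method_two)
```

Here nodes "a" and "b" are grouped identically by both methods, while the second method splits the community containing "c" and "d", so half of the nodes match. A stricter or looser overlap criterion would give a different number.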
If we look at the largest component, the two methods separate this component in different ways, see Figure A1. In Figure A1, most nodes in the outer parts of the circles are drawn with the same shape and the same colour in both panels, which means that these nodes are in one community in both methods. In the centre of the graphs, the square nodes are coloured differently in the two panels, which shows that these nodes are in the same community in the Louvain method but in different communities in the Infomap method. In Table A1 we give the statistics of the comparison of communities.

Figure A1. Comparison between the two detection methods: (a) Louvain method (left) and (b) Infomap method (right). The layout is based on force-directed graph drawing. The number of unique communities is 9 for the Louvain method and 124 for Infomap. Each colour represents a community and the colour schemes of the two methods are the same.

G Update of Database
The data for the two countries is retrieved from BvD [7], which is updated every year. The total number of known companies changes from year to year; for example, a 4% difference is observed from 2017 to 2018. However, the authors downloaded the data and repeated the analysis in different years (2016, 2017 and 2018), and the results described in the main text show no noticeable differences.