How the network properties of shareholders vary with investor type and country

  • Qing Yao ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Centre for Complexity Science, Imperial College London, London, United Kingdom, Blackett Laboratory, Imperial College London, London, United Kingdom

  • Tim S. Evans,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Centre for Complexity Science, Imperial College London, London, United Kingdom, Blackett Laboratory, Imperial College London, London, United Kingdom

  • Kim Christensen

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Centre for Complexity Science, Imperial College London, London, United Kingdom, Blackett Laboratory, Imperial College London, London, United Kingdom


We construct two examples of shareholder networks in which shareholders are connected if they have shares in the same company. We do this for the shareholders in Turkish companies and compare the result against the network formed from the shareholdings in Dutch companies. We analyse the properties of these two networks in terms of the different types of shareholder. We create a suitable randomised version of these networks to enable us to find significant features in our networks. From this we find the roles played by different types of shareholder in these networks, and also show how these roles differ in the two countries we study.


Complex networks capture information about the bilateral relations between pairs of objects [1]. As pairwise relationships are so fundamental to many processes, the network approach has proved to be a powerful tool in many different areas; see for instance Newman [2] for an overview.

This paper looks at some networks in an economics context, which is one area where networks have proved useful [3–5]. In our work we focus on the networks representing the interactions between companies, a topic that has already received some attention. Vitali, Glattfelder and Battiston used network science to show that global corporate control is in the hands of a few important shareholders [6, 7]. Takayasu and her collaborators have studied the flow of money from suppliers to consumers over long time periods [8–10]. Viegas et al. applied complex systems theory to Mergers and Acquisitions (M&A) markets, studying the scaling relationship between a company’s ancestry and its number of M&A in order to predict mergers [11]. Huajiao Li, Pengli An, Haizhong An et al. have studied common shareholdings and their implications, particularly for Chinese listed energy companies [12–16].

In our work, we use complex network methods to study the investment characteristics of different types of shareholders. To do this we build a network of shareholders who are linked if they have invested in the same company. We quantify and analyse the topological structure of this network. Furthermore, to provide some insight into organisational choices, we compare these network measures with some empirical analysis.

Before that, we will first summarize some relevant concepts from finance and economics in order to place our work in this context.

The ownership and control of companies

The work of Berle and Means [17] provided an early and influential view of how ownership and control of companies need to be linked, based on their perceived failures in 1930’s US corporate governance to give shareholders effective control of companies. The approach to company ownership has evolved since the 30’s. For instance, La Porta et al. [18] used information on large corporations in 27 wealthy economies to identify the shareholders with ultimate control of these firms. They came to the conclusion that large shareholders do now typically have power over firms.

However, in 1974 Zeitlin’s “Corporate ownership and Control” [19] proved to be highly influential, undermining previous widely held views that large corporations were not influenced by the wider economic and social environment [20]. A network approach is an ideal way to look at ownership and control within the context of the links between companies and their shareholders. A network is made up of two parts: the nodes (vertices or actors), and the bilateral relationships between pairs of nodes, which are represented as pairs of nodes known as edges (links or ties). Nodes can be any appropriate unit while edges can represent any type of relationship between the units; for example, kinship, material transactions, flow of resources or support. In our context, a natural way to capture the information about shareholdings is to use the nodes to represent the companies and shareholders. The edges are directed, with an edge from a shareholder to each company invested in.

Several studies of shareholders and companies start from this network perspective. For instance, Vitali, Glattfelder and Battiston [6] use a version of this shareholder-company weighted directed network to capture both the influence of a shareholder through direct shareholdings and the influence implied by chains of ownership and shareholdings. This investigation first confirms that ownership tends to be spread among numerous shareholders, while control is found to be in the hands of a few important shareholders globally. They discovered a bow-tie structure, revealing that control flows to a small, tightly connected group of financial institutions.

Another use of this network approach is to study how power may be concentrated in a few financial institutions. Gai and Kapadia [21] used networks to see if such concentrations led to intrinsic weaknesses in the network of financial institutions which leads to the spread of failure in times of financial stress, using this as a possible explanation of the financial crisis.

Outline of paper

In this paper, we will use network methods to look at the relationships between the shareholders of companies. Our focus will be on companies of all sizes, working with two examples: two countries with very different economic environments. The economic ecosystem depends on the many smaller companies as much as the few large companies and we include all of these in our data. In the following section, we will discuss the source and nature of our data, and how we capture this wider economic environment using a network representation. We will then investigate how network analysis can throw light on the structure of the connections between shareholders.

Materials and methods

Data sources

The data used in this research is extracted from Amadeus, a product of Bureau Van Dijk (BvD) [22]. This provides data on around 21 million companies across Europe, including the names of shareholders and the percentage of a company’s shares held by each shareholder. We have focused on the data for one year, 2014, and two exemplary countries within this data: Turkey and the Netherlands. We found fifty thousand Turkish companies and over a million Dutch companies.

The data we use for macroeconomic statistics is retrieved from the World Bank [23] and CEIC Data [24].

Network representation

Our data on the shareholders in companies has a natural representation as a network and this is available at [25]. In our “shareholder-company network”, each distinct shareholder and each distinct company is represented by a node. A directed edge is placed from each shareholder to each company in which they have invested, see Fig 1. Note that some companies can be shareholders of other companies, that is, the intersection of sets {1,2,…,12} and {A,B,…,F} is not necessarily empty. For example in Fig 1, the node 11 would be node A. Also, we do not have the value of each investment, nor can we be sure that all shareholders are present in our databases. For those reasons we chose not to try to represent the size of investments, e.g. through a weight added to the edges.

Fig 1. Shareholder-company network example.

An example of our shareholder-company network, which illustrates the relationships between shareholders, the upper blue circles numbered 1 to 12, and companies, the lower red squares labelled A to G. An edge indicates that there is an investment from the shareholder represented by the source node in the company represented by the target node. For example, shareholders 3 and 5 have both invested in company E, and an arrow represents each of these investments. In addition, shareholder 3 has invested in company A, but no other shareholder has.

The boundary of each network is defined by the nationality of the companies; here we study two examples: Turkish companies and Dutch companies. We consider all the shareholders of each company which means that we consider both domestic and overseas shareholders. We will highlight this when analysing the data for companies in the Netherlands.

We are particularly interested in the relationships between shareholders implied by their investments. That is, if two shareholders have invested in the same company they have a common interest and are likely to have similar wider commercial interests. So we will focus most of our work on the analysis of these investor-investor relationships and we do this through a representation of our data in terms of a projection onto just the shareholder nodes. Our “shareholder network” has one node for each shareholder, and two different shareholders are connected by an undirected edge if they have both invested in the same company. An example of the shareholder network is shown in Fig 2, which is a representation of the same data shown in the shareholder-company network of Fig 1.

Fig 2. Shareholder network.

The “shareholder network” for the data displayed in Fig 1. This is the projection of the shareholder-company network onto just the shareholder nodes. The nodes here are just the shareholder nodes 1 to 12. An edge indicates that two shareholders have assets in common. For example, shareholders 3 and 5 have both invested in E, therefore, an edge between nodes 3 and 5 exists in this projected graph. Note that a simple network is used, edges have no directions and no weights, and there are no self-edges.
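The projection described above can be sketched in a few lines. This is a minimal illustration, not the code used for the paper: the function name `project_onto_shareholders` and the toy holdings are hypothetical, and only the two facts stated in the Fig 1 caption (shareholders 3 and 5 both hold E, shareholder 3 alone holds A) are taken from the text.

```python
from collections import defaultdict
from itertools import combinations

def project_onto_shareholders(holdings):
    """Project (shareholder, company) pairs onto shareholders only.

    Two shareholders are joined by one undirected, unweighted edge if they
    hold shares in at least one common company. Self-edges never arise.
    """
    investors = defaultdict(set)
    for shareholder, company in holdings:
        investors[company].add(shareholder)
    edges = set()
    for members in investors.values():
        # every pair of co-investors in this company becomes one edge
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges

# Hypothetical toy holdings; company B and its investors are invented.
holdings = [(3, "A"), (3, "E"), (5, "E"), (1, "B"), (2, "B"), (4, "B")]
shareholder_edges = project_onto_shareholders(holdings)
```

Using a set of sorted pairs means that two shareholders who share several companies still get a single edge, matching the simple-graph convention stated in the caption of Fig 2.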

An important aspect of our work is that our data also classifies the shareholders to be one of 13 different types of owner, as listed in Table 1. We will use this classification to study how the structure of the shareholder networks depends on the type of owner. It is immediately clear from the numbers of each type that companies in different countries can have very different types of shareholder which already suggests that other aspects of corporate structure will be different.

Table 1. Summary statistics for different types of shareholder.

The different types of shareholder recorded in our data as retrieved from the BvD database. The numbers found in our different data sets are in the righthand columns.

Finally, in many situations we measure values but we need to see if these are large or small by comparing the results against those in an appropriate null model. Our null model is obtained by swapping pairs of edges in our shareholder graph which maintains the degree of each node [26] as in the configuration model (for example see [27]). However, we only make swaps which maintain the constraint that our edges are always between a shareholder node and a company node as illustrated in Fig 1.
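A degree-preserving swap of this kind might look as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function `swap_edges`, its parameters and the toy edge list are all hypothetical. It swaps the company ends of two shareholder-to-company edges, which keeps every edge between a shareholder and a company and leaves every node's degree unchanged.

```python
import random
from collections import Counter

def swap_edges(holdings, n_swaps, seed=0):
    """Randomise (shareholder, company) edges by pairwise swaps.

    (s1, c1), (s2, c2) -> (s1, c2), (s2, c1). Swaps that would create a
    duplicate edge or a self-loop are rejected, so the number of successful
    swaps can be smaller than n_swaps.
    """
    rng = random.Random(seed)
    edges = list(holdings)
    present = set(edges)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (s1, c1), (s2, c2) = edges[i], edges[j]
        if s1 == s2 or c1 == c2:
            continue  # swap would change nothing
        if s1 == c2 or s2 == c1:
            continue  # a company that is also a shareholder: avoid self-loops
        new1, new2 = (s1, c2), (s2, c1)
        if new1 in present or new2 in present:
            continue  # avoid multi-edges
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges

# Hypothetical toy data: degrees before and after must agree.
toy = [(1, "A"), (2, "A"), (2, "B"), (3, "B"), (3, "C"), (4, "C")]
randomised = swap_edges(toy, n_swaps=10, seed=1)
same_out_degree = Counter(s for s, _ in toy) == Counter(s for s, _ in randomised)
same_in_degree = Counter(c for _, c in toy) == Counter(c for _, c in randomised)
```

The shareholder network of the null model is then obtained by projecting the randomised edge list in the same way as the real data.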


In this section we will look at the results of our analysis of the shareholder network. We will start with some general characteristics of the network before moving on to focus on how the different types of shareholder play different roles in the network as revealed by various measurements.

General network analysis

Some key facts for our two data sets and for the shareholder networks derived from them are summarised in Table 2.

Table 2. Basic information on the two data sets.

Each data set looks at the companies registered in one country and their shareholders from any country for the year 2014. As the numbers above indicate, there is no information on many of the companies. The number of edges in our shareholder network is based on the shareholding information that is available. The exponent γ of a power-law degree distribution, P(k) ∼ k^{−γ}, is a rough characterisation to illustrate the broad distributions. ‘LCC’ is the largest connected component.

An important characteristic of any network is the degree distribution, P(k). This may be defined as the probability that a node selected uniformly at random has degree k, where the degree of a node refers to the number of edges connected to that node. The degree distributions for the shareholder networks are shown in Fig 3. The distributions are generally fat-tailed, as illustrated by the power-law forms shown in Fig 3.

Fig 3. Plots of degree distributions.

The degree distributions P(k) (the frequency of nodes with degree k) against degree k on a log-log scale for shareholder networks where the holdings are in (a) Turkish companies, (b) Dutch companies. The red dots are the raw data, the green crosses represent the same data in logarithmic bins, and the blue lines are the best linear fits (P(k) ∼ k^{−γ}) to ranges of k values where we see approximately linear behaviour. The slope of the blue lines is −γ, with γ equal to 2.6 and 2.7 for Turkey and the Netherlands respectively. A summary of the general statistics of these shareholder networks can be found in Table 2.

It can be seen from Fig 3, that the distribution of the degrees of the network roughly follows a power-law. The large k tail implies that typically there are a small number of shareholders who have investments in common with large numbers of other shareholders. On the other hand, the small degree part of the distribution indicates that most shareholders have investment in common with only very few others.
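As a rough illustration of how an exponent γ can be read off a log-log plot, the sketch below computes P(k) from a list of degrees and performs an ordinary least-squares fit of log P(k) against log k over a chosen range. The function names, the fitting range and the synthetic degree list are assumptions made for illustration; logarithmic binning, which Fig 3 uses, is left out to keep the example short.

```python
import math
from collections import Counter

def degree_distribution(degrees):
    """Return {k: P(k)}, the fraction of nodes with degree k."""
    n = len(degrees)
    return {k: c / n for k, c in sorted(Counter(degrees).items())}

def fit_exponent(pk, k_min, k_max):
    """Least-squares slope of log P(k) vs log k on [k_min, k_max].

    Returns gamma, defined through P(k) ~ k^(-gamma); k_min must be >= 1.
    """
    pts = [(math.log(k), math.log(p)) for k, p in pk.items()
           if k_min <= k <= k_max]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx

# Synthetic degrees whose counts fall off exactly as k^(-2).
degrees = [k for k in (1, 2, 4, 8, 16) for _ in range(4096 // k ** 2)]
gamma = fit_exponent(degree_distribution(degrees), k_min=1, k_max=16)
```

A straight-line fit of this kind is only a rough characterisation, in the same spirit as the fits in Fig 3; it is not a statistical test of power-law behaviour.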

The distributions show other interesting features. For the Netherlands there is a distinctive ‘bump’ in shareholders who are related to between twenty and a hundred other shareholders. These appear to be far more common than the trend shown for small degree would suggest. One explanation for this is that some companies have many overseas shareholding relationships, for example the French shareholder of the Turkish Tobam Holding Co. With foreign shareholders, it seems likely that most would be large shareholders who are looking to diversify their holdings by looking outside their home country. Including these shareholders has two effects. First, including them increases the total number of nodes in our shareholder network, which lowers the distribution P(k) for shareholders investing in companies based in their home country. Secondly, it seems likely that if a foreign shareholder has gone to the trouble of making one investment across borders, they have made several, so that these shareholders form the bump. Put another way, this is a boundary effect. Our large foreign investors will also be linked through foreign firms to small foreign shareholders who only invest in foreign firms and who would form part of the low degree part of the distribution. Those firms are excluded by definition from our data. The fact that the bump is much less pronounced in the Turkish shareholder network suggests that foreign investment does not play such a big role in this case. This is consistent with the macroeconomic data, which suggest that the Netherlands has high FDI (international trade and foreign direct investment) indices, both outward and inward, with net inflows of 13.37% of GDP. The Netherlands ranks 9th in the world (10th when including Hong Kong, China, which ranks first), while Turkey ranks 178th.

Analysis by shareholder type

One of the major features of our data set is that we can distinguish between 13 different types of shareholder, as shown in Table 1. So we will look at how various network measures reveal the different roles of different types of shareholder and how these depend on the two countries we are studying.


The degree of a node is one of the most important and simplest centrality measures. In our shareholder networks, a high degree of a shareholder may indicate that that shareholder has better contacts within the business community. To evaluate the different roles of different types of shareholders, violin plots of the degree for different types of shareholders are shown in Fig 4. Violin plots are similar to box plots in indicating ranges, but additionally show the estimated probability densities of the data at different values and include a marker for the median of the data.

Fig 4. Violin plots of the degree.

Violin plots of the degree of the most common types of shareholders for the largest connected component of the shareholder network of (a) Turkish and (b) Dutch companies. There are too few shareholders for other types of investor. This figure breaks down the degree distributions into different types of shareholders. We note that in Turkey, the large degrees are contributed by the banks and insurance companies, while in the Netherlands the banks’ average degree is higher than that of the other types of shareholders. This means that banks in the Netherlands co-invested heavily with other shareholders.

This degree centrality is straightforward to find from the data, but does the full complex network provide more information? We consider several other topological measurements but we start with one of the simplest. We look at the effect on the largest connected component of removing nodes one at a time, choosing each node to be removed uniformly at random from the remaining nodes of just one type. The effect of removing different types of shareholder on the largest connected component of the Turkish and Dutch shareholder networks is shown in Fig 5.

Fig 5. The number of components increases as the number of nodes removed increases.

(a) Turkish and (b) Dutch companies. Nodes of one shareholder type are chosen at random and removed one by one from the largest connected component of the shareholder network. Results shown here are averaged over 100 realisations. Blue represents Bank shareholders being removed, green represents Corporate and red represents Families. The larger scale plots display the regions in the dashed boxes of the smaller scale plots to more clearly reveal the behaviour for small numbers of node removals. Note in particular the different role of banks (blue) and corporates (green) in Turkey and Netherlands. The small and big plots share the same axis labels.


In our percolation analysis, we focus on one type of shareholder. Starting from a given shareholder network, G(r), obtained after r nodes have been removed, we choose one node uniformly at random from the set of shareholders of the given type, and we remove that node and any edges attached to it. This leaves us with the next network, G(r + 1), with one fewer node. We then repeat the process. In our case we will start from the largest connected component of one of our shareholder networks, and then we will remove the nodes corresponding to one of the more common shareholder types. We will look at the number of components of the sequence of networks G(r) as a function of the number of nodes removed, r.
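The removal procedure can be sketched as below for a single realisation. This is an illustrative toy, assuming the network is stored as a dictionary of neighbour sets; the names `count_components` and `removal_curve` and the star-shaped example are hypothetical. Recounting components after every removal is simple but slow, so a large network would need a more efficient scheme.

```python
import random

def count_components(adj):
    """Number of connected components of an undirected graph."""
    seen, n_components = set(), 0
    for start in adj:
        if start in seen:
            continue
        n_components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return n_components

def removal_curve(adj, node_type, target_type, seed=0):
    """Components after removing r = 0, 1, 2, ... nodes of target_type."""
    rng = random.Random(seed)
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    candidates = [u for u in adj if node_type[u] == target_type]
    rng.shuffle(candidates)  # uniformly random removal order
    curve = [count_components(adj)]
    for u in candidates:
        for v in adj.pop(u):
            adj[v].discard(u)
        curve.append(count_components(adj))
    return curve

# Hypothetical toy network: one bank linked to four family shareholders.
adj = {"bank": {"f1", "f2", "f3", "f4"},
       "f1": {"bank"}, "f2": {"bank"}, "f3": {"bank"}, "f4": {"bank"}}
node_type = {"bank": "Bank", "f1": "Family", "f2": "Family",
             "f3": "Family", "f4": "Family"}
bank_curve = removal_curve(adj, node_type, "Bank")
family_curve = removal_curve(adj, node_type, "Family")
```

In this toy case removing the single bank splits the network into four pieces, while removing family shareholders never increases the number of components. Averaging such curves over many seeds gives the kind of result plotted in Fig 5.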

Results averaged over 100 realisations are shown in Fig 5. For both countries, we see that the change in the number of components is roughly linear, at least for a relatively large range of r values, but the slopes are very different. These differences in the percolation analysis give us an insight into the roles of different types of shareholders within the network.

Individual and Family shareholders, which are 92% of the nodes in Turkish shareholder network while 0.34% in Dutch shareholder network, seem to have a limited effect on the connectivity of the largest component. This can be understood by the nature of Individual and Family shareholders whom we would expect to have investments in a few closely related companies and so would only be linked to a few closely linked shareholders. That is, it is not surprising if Individual and Family shareholders are poorly connected to other shareholders and are somewhat peripheral to the network.

There are also a large number of nodes corresponding to Corporate shareholders, just under 7% in the Turkish shareholder network and 80% in the Dutch shareholder network, and their average degree is again not high. Unlike the case of Family shareholders, removing this type of shareholder breaks up the giant component much more quickly. So Corporate shareholders seem to be important in bringing together smaller components. For instance, in the real world, companies involved in mergers and acquisitions are likely to bring together different bodies of interest.

Since Bank shareholders often invest in a large number of different assets, in terms of the shareholder network they are going to be responsible for providing a path between many different types of shareholder. This central role is reflected in the fast rate at which their removal breaks up the largest connected component.

Investor assortativity.

Different types of investors mix with other types of investor to different extents. This can be measured by looking at the assortativity in the types of investor at the two ends of each edge. The covariance of the investor labels associated with the two ends of each edge is defined as

\[ \mathrm{cov}(\tau, \tau') = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta_{\tau_i,\tau} \, \delta_{\tau_j,\tau'} \tag{1} \]

where A_{ij} is the adjacency matrix, k_i is the degree of vertex i, and \delta_{\tau_i,\tau} is one if vertex i is an investor of type τ and zero otherwise. Here m is the number of edges. To measure assortativity, we look at the diagonal elements, cov(τ, τ), to see if edges have the same type of investor at both ends. If we sum these and normalise by the variance in the labels we arrive at the investor type assortativity coefficient r

\[ r = \frac{\sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta_{\tau_i,\tau_j}}{2m - \sum_{i,j} \frac{k_i k_j}{2m} \, \delta_{\tau_i,\tau_j}} \tag{2} \]

where vertex i is an investor of type τ_i. This takes a maximum value of r = 1 for a perfectly assortative network, in which each investor is only linked to investors of the same type, while the minimum value of r = −1 indicates a perfectly disassortative network. A value of r = 0 implies that the types of investor at the ends of edges are uncorrelated.
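For a concrete feel for the coefficient r, the sketch below evaluates the standard assortativity coefficient for categorical labels from an undirected edge list. It is assumed here, not confirmed by the source, that this standard form is the one intended; the function name and the two toy edge lists are hypothetical.

```python
from collections import Counter

def type_assortativity(edges, node_type):
    """Assortativity coefficient for categorical node labels.

    edges: list of undirected (u, v) pairs; node_type: {node: label}.
    Computes (sum_t e_tt - sum_t a_t^2) / (1 - sum_t a_t^2), where e_tt is
    the fraction of edges joining two nodes of type t and a_t is the
    fraction of edge ends attached to nodes of type t.
    Undefined (division by zero) if every edge end has the same type.
    """
    m = len(edges)
    same = sum(1 for u, v in edges if node_type[u] == node_type[v]) / m
    ends = Counter()
    for u, v in edges:
        ends[node_type[u]] += 1
        ends[node_type[v]] += 1
    expected = sum((c / (2 * m)) ** 2 for c in ends.values())
    return (same - expected) / (1 - expected)

# Hypothetical toy labels: two banks and two family shareholders.
node_type = {"b1": "Bank", "b2": "Bank", "f1": "Family", "f2": "Family"}
r_assortative = type_assortativity([("b1", "b2"), ("f1", "f2")], node_type)
r_disassortative = type_assortativity([("b1", "f1"), ("b2", "f2")], node_type)
```

The first toy network links only like with like and the second only unlike types, so the two values sit at the extremes of the range described above.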

Our results are shown in Fig 6. In the case of Turkey, the Corporate shareholders have a high assortativity, indicating that they prefer to invest with each other. Likewise, Family investors in Turkey prefer to invest with each other. Despite this similarity in terms of assortativity, other network measures will show that Family and Corporate investors in Turkey play very different roles. For the Netherlands, the only standout feature is that Mutuals and Banks prefer to invest with each other, suggesting that in the Netherlands these investors are tightly linked.

Fig 6. Heatmaps of degree assortativity.

The degree assortativity in the LCC for different investor types. It has been normalised for the counts of pair of each edge in LCC. The assortativity coefficient r is 0.17 for Turkey and 0.0081 for Netherlands. The colour bar is set at the same scale.

Diversity of neighbours.

Interestingly, while Banks and Corporate shareholder nodes are important in maintaining the connectivity of the shareholder network, there is an important difference in their shareholding patterns. To see this we turn to a measure of the diversity of the neighbours, d(i), of a node i in terms of the different types of shareholders. Our measure of diversity of a node i is defined as:

\[ d(i) = - \sum_{\tau} \frac{k_i(\tau)}{k_i} \ln \left( \frac{k_i(\tau)}{k_i} \right) \tag{3} \]

Here k_i is the degree of node i and k_i(τ) is the number of neighbours of node i which are of type τ. If the neighbours of a node i are all of the same type, then d(i) = 0. However, if the neighbours of node i are all of different types, k_i(τ) = 1, then the diversity would be ln(k_i). To make a suitable comparison, we find the expected measure of diversity d_null given the distribution of labels in each data set, that is

\[ d_{\mathrm{null}} = - \sum_{\tau} \frac{N(\tau)}{N} \ln \left( \frac{N(\tau)}{N} \right) \tag{4} \]

where N(τ) is the total number of nodes of type τ and we have N = ∑τ N(τ). The null model diversity measurement indicates the global diversification of different types of shareholders within one country. In Fig 7, we see that in terms of the classification scheme used in our data, the Netherlands has a much more diverse set of shareholders than Turkey. If a node’s diversity is lower than this expected diversity value, this indicates attraction of certain types of shareholder to the same investments. On the other hand, if a node’s diversity is higher than might be expected at random, this indicates that some types of shareholders repel each other, as the probability of them co-investing is lower than expected.
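The two limiting cases stated above (all neighbours of one type, and all neighbours of different types) can be checked with a short sketch. It assumes the entropy form implied by those limits; the function names and the toy label lists are hypothetical.

```python
import math
from collections import Counter

def label_entropy(labels):
    """-sum_t p_t ln(p_t), where p_t is the fraction of labels equal to t."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def node_diversity(neighbour_types):
    """d(i): entropy of the types of the neighbours of one node."""
    return label_entropy(neighbour_types)

def null_diversity(all_node_types):
    """d_null: entropy of the types of all nodes in the data set."""
    return label_entropy(all_node_types)

# Hypothetical toy cases.
d_same = node_diversity(["Bank", "Bank", "Bank"])
d_all_different = node_diversity(["Bank", "Family", "Corporate", "State"])
d_null = null_diversity(["Family"] * 9 + ["Bank"])
```

Here `d_same` is zero and `d_all_different` equals ln(4), the logarithm of the degree, which are the two limits given in the text. A node's value can then be compared directly with `d_null` for its country.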

Fig 7. Violin plot of diversity index of selected types of shareholders.

(a) Turkey and (b) Netherlands. Other types are not shown because of the limited amount of data available. The blue region is the estimated density of the diversity index, which is compared with a null index (indicated by a green line) defined as d_null in Eq (4).

Diversity indices for Turkey correlate roughly with degree for different types of shareholders. The exception is the Family shareholder type, whose diversity index is the lowest and below the global null value, indicating that these shareholders tend to invest with a very limited range of co-investor types. One explanation is that many Family type shareholders only invest with the same type of shareholders and perhaps, in many cases, these connections reflect real social and family ties. We will see further evidence for this view in other measures. The Netherlands’ diversity index is interesting as most of the shareholder types have mean diversity measures below the global diversity measure, showing some tendency for Dutch shareholders to invest with a limited set of shareholder types. Overall, though, the diversity values for the same type in the two countries are similar, implying that in terms of diversity, the behaviour of different types of shareholder is similar in different countries, except for the Family type.

Betweenness centrality.

Another way to study how the roles of different types of shareholders vary in our network is to look at how betweenness centrality values vary. Betweenness centrality measures the number of shortest paths passing through a node. In the context of our shareholder network, the shortest path can be interpreted as the minimum number of common assets needed to connect two other shareholders, as each edge represents a shared asset between a pair of shareholders. The interpretation is that the higher the betweenness of a shareholder, the more likely they are to be central to the process of connecting other shareholders, making them more important to other shareholders.

In order to see if these betweenness values are significantly high or low, and so to see if this measure gives different information from the degree, we compare our values against those in our null model in which the edges are swapped but the degree of each node is unchanged. We create 100 different null models and use these values to create the boxes in Fig 8 alongside the results obtained from our data.
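The quantity being compared, the mean betweenness per shareholder type, can be sketched as follows using Brandes' algorithm for unweighted graphs. This is an illustration only: whether the authors used this algorithm or any normalisation is not stated in the text, and the function names and the toy star network are hypothetical. The same function would be applied to each randomised network to build the null distribution.

```python
from collections import defaultdict, deque

def betweenness(adj):
    """Unnormalised betweenness of every node (Brandes' algorithm).

    adj: {node: set of neighbours} for an undirected, unweighted graph.
    Each unordered pair of end points is counted once.
    """
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}

def mean_by_type(values, node_type):
    """Average a per-node quantity over the nodes of each type."""
    groups = defaultdict(list)
    for node, value in values.items():
        groups[node_type[node]].append(value)
    return {t: sum(vs) / len(vs) for t, vs in groups.items()}

# Hypothetical toy network: one bank linked to four family shareholders.
adj = {"bank": {"f1", "f2", "f3", "f4"},
       "f1": {"bank"}, "f2": {"bank"}, "f3": {"bank"}, "f4": {"bank"}}
node_type = {"bank": "Bank", "f1": "Family", "f2": "Family",
             "f3": "Family", "f4": "Family"}
mean_bc = mean_by_type(betweenness(adj), node_type)
```

In the toy star all six shortest paths between pairs of family shareholders pass through the bank, so the bank has betweenness 6 and every family shareholder has betweenness 0.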

Fig 8. Plots of betweenness.

The mean betweenness values for different types of shareholders in the largest connected component of shareholder networks, (a) for Turkey and (b) for the Netherlands. The red dots are the real data and the box plots for the results obtained from 100 degree preserving null models. We note that most betweenness values for Turkey and Netherlands are significantly different from the randomised networks, some types are lower and some types are higher. That means that there is significant network structure on larger scales and the properties are not just controlled by the degree.

In Fig 8(a) we show the betweenness centrality values for each type of shareholder in the Turkish shareholder network, as well as the results from our null model. In this case, Banks always have the highest average betweenness and highest maximum betweenness. This implies that a high percentage of shortest paths go through Banks which in turn means that Banks can play a pivotal role in linking other shareholders. As these are key instruments for providing investments in firms, this is not surprising. However, this network measure confirms our intuition and hence we see these companies are fulfilling their role in the economy.

However, nodes representing Banks, Financial and Insurance companies are much less central than one would expect from the null model. These companies have a high degree yet their betweenness is lower than one might expect. So Banks, Insurance and Financial companies are still very central in the network but they are much less effective in brokering connections than we would expect from their degree, as indicated by their betweenness values in the null model. What this suggests is that in Turkey, Banks, Insurance and Financial companies are investing in a narrower range of companies than they could.

The State organisations in Turkey are also less central than expected, suggesting their involvement is constrained by some issues, e.g. political or legal constraints limiting involvement to certain key sectors or to just a few larger firms.

On the other hand, while the largest number of shareholders in Turkish companies are the Family type shareholders, this type of shareholder has the lowest average betweenness. This is consistent with what we found both from the low degree of most Family shareholders and from the diversity measures, namely that the focus of many Family type shareholders may be framed within a social and family setting. Another explanation may be that the size of these investments is smaller, again biasing their involvement to smaller firms. The picture is that the investments made by Family type shareholders are peripheral to the large scale shareholding structure in Turkey. Our results on community detection will support this view.

When we compare the results for Turkey against those for the Netherlands, we see two big differences: the Banks and the Insurance companies investing in Dutch companies are much more central than found in the null model, the opposite of our result for Turkey.

Closeness centrality.

While betweenness centrality indicates how a node may control the important communication pathways between shareholders, closeness centrality indicates how easy it is for each shareholder to reach any other shareholder. The closeness of a node is the inverse of the sum of the shortest path distances from that node to all other nodes. The larger the closeness of a node, the shorter the distances to other nodes and so in general there are fewer message transmissions, less information is lost, communications will be faster and generally will cost less.
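The definition above translates directly into a breadth-first search. The sketch below takes closeness to be one over the sum of shortest path distances, exactly as worded in the text, with no further normalisation; any rescaling by the number of nodes is left out as it is not specified. The function names and the toy star network are hypothetical, and the graph is assumed to be connected, as it is for a largest connected component.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, node):
    """Inverse of the sum of distances from node to all reachable nodes."""
    total = sum(bfs_distances(adj, node).values())
    return 1.0 / total if total > 0 else 0.0

# Hypothetical toy network: one bank linked to four family shareholders.
adj = {"bank": {"f1", "f2", "f3", "f4"},
       "f1": {"bank"}, "f2": {"bank"}, "f3": {"bank"}, "f4": {"bank"}}
closeness_bank = closeness(adj, "bank")    # distances 1, 1, 1, 1
closeness_family = closeness(adj, "f1")    # distances 1, 2, 2, 2
```

The bank, one step from everyone, has a larger closeness than any family shareholder, which must pass through the bank to reach the others.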

In the context of our shareholder network, information can be related to opportunities to buy new assets or to sell existing ones. Since the network is highly interconnected, a failure in one sector can have repercussions in another so the earlier a shareholder hears about potential problems, the more successful they are likely to be.

To see how the closeness varies for the different types of shareholders, we use the edge swapping techniques of our null model to make comparisons. In Figs 9 and 10, we see for both shareholder networks that the average closeness of each shareholder type is lower than in the randomised model and that this shift is similar for all types of shareholder. This tells us two things.

Fig 9. The average closeness indices for different types of shareholders in the LCC of Turkish shareholder network.

In Fig (a) we show the results for each shareholder with the red dots for the original data while the box plots are for the randomised data. In (b) each point shows the average ‘Farness’ (the inverse of closeness) of one shareholder type against log(N/k), where k is the average degree of nodes of that type. The higher blue points are for the original data, the lower orange points are for the randomised network. The lines in (b) are for a linear fit to the points. The slope of this fit to the original data is 0.71, 0.26 for the randomised network and the theoretical value in a random branching model is 0.24.

Fig 10. The average closeness indices for different types of shareholders in the LCC of Dutch shareholder network.

In Fig (a) we show the results for each shareholder with the red dots for the original data while the box plots are for the randomised data. In (b) each point shows the average ‘Farness’ (the inverse of closeness) of one shareholder type against log(N/k), where k is the average degree of nodes of that type. The higher blue points are for the original data, the lower orange points are for the randomised network. The lines in (b) are for a linear fit to the points. The slope of this fit to the original data is 0.34, 0.16 for the randomised network and the theoretical value in a random branching model is 0.17.

First, there is additional structure in the real world over the null model. That is revealed by the change in closeness between the real data and our null model. The fact that the null model always has higher closeness can be explained if there are many peripheral nodes, that is, many nodes which are at a large distance from most other nodes. Our null model will bring in lots of ‘short cuts’ to and from these peripheral nodes, so the distances to these peripheral nodes drop and the null model closeness values are higher. Put another way, connections in the real shareholder network mean that communication within the network is not as efficient as it could be.
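To make the idea of edge swapping concrete, here is a minimal sketch of a generic degree-preserving edge swap on a simple undirected graph. It is an assumption about the general technique, not a reproduction of the null model used in this paper, whose details (for example, which network the swaps are applied to) are defined in an earlier section. The names `edge_swap` and `degree_count` and the ring graph are hypothetical.

```python
import random
from collections import Counter

def edge_swap(edges, n_swaps, seed=0):
    """Rewire edge pairs (a,b),(c,d) -> (a,d),(c,b), keeping every node's degree."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:          # pick a random orientation for the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:       # would create a self-loop
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:  # would create a duplicate edge
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def degree_count(edges):
    """Degree of every node, as a Counter."""
    return Counter(node for edge in edges for node in edge)

# Hypothetical toy network: a ring of 8 nodes.
ring = [(i, (i + 1) % 8) for i in range(8)]
swapped = edge_swap(ring, n_swaps=20, seed=1)
```

Each accepted swap leaves all degrees unchanged, so any difference in closeness between the original and the rewired graph comes from how the edges are arranged rather than from the degree sequence.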

For Turkey, most nodes represent Family type nodes and these have the lowest closeness, suggesting they form the bulk of the peripheral nodes. For the same reason, it is the corporate shareholders who are peripheral for The Netherlands. Conversely, in both cases, though there are few banks, these have high closeness indices suggesting they are not part of the periphery.

Overall, the average closeness values for all shareholder types appear to be dominated by a large set of peripheral nodes which are less well connected to the global network than they could be, as the randomised networks show. The only significant difference in closeness values between the different shareholder types is a reflection of their average degree, making this centrality measure somewhat redundant when looking at averages across groups of nodes. That is not to say that closeness is not useful. For individual shareholders, a comparison with the typical behaviour, in terms of degree and closeness, can lead us to find interesting outliers (say low degree, high closeness) worthy of investigation for a given context.

Community detection.

The shareholder network can also show us if the common shareholdings reveal large scale ‘communities’ within the shareholders, more than the labels in the data record. Such communities in the projected network show us groups with common interests. An illustration of communities is shown in Fig 11. Looking at these groups can tell us something about the diversity of shareholders for each corporate or the centrality of shareholders of the whole economy, which has been discussed in Section.

Fig 11. Illustration of communities in a network.

This is the same projected graph as in Fig 2, derived from the network shown in Fig 1. The different colours label the different communities, which are structural features in the network science sense. Here shareholders 1 to 12 are categorised into four communities: 12 belongs to one community; 2, 3, 4, 5 and 6 belong to a second; 8, 9, 10 and 11 are in a third; and 1 and 7 are in the fourth.

To do this we use community detection methods to look for groups of nodes which typically have more connections between themselves than one might expect, and/or fewer connections to nodes outside a community. Two popular algorithms have been used here to detect the communities: the Louvain method [28] and Infomap [29]; see S1 Appendix for more details on these methods. We construct a distribution for the size of the communities we find. Using two approaches gives us a handle on the uncertainties in this process, and we look to see to which community each node belongs for the two methodologies separately. Some statistics are listed in Table 3.
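The notion of "more connections than one might expect" is commonly quantified by modularity, the quantity that the Louvain method seeks to maximise; Infomap optimises a different objective based on random walks, which is not shown here. The sketch below only evaluates the modularity of a given partition of an unweighted graph and is not a community detection algorithm. The function name and the toy graph of two triangles joined by one edge are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q = sum over communities of [L_c / m - (d_c / 2m)^2]."""
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Hypothetical toy network: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives a positive modularity, while placing every node in one community gives zero, which is why a modularity-based method would prefer the split.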

Table 3. Statistics of the communities.

Communities are found in the shareholder networks derived from the two data sets using the two different methodologies, Louvain (L) and Infomap (I). ‘Avg. community size (CS)’ is the average number of shareholders in one community. The average community size is defined as the number of shareholders divided by the number of communities, where the number of communities includes the communities whose community size is 1. The average community sizes excluding single nodes are: 3.03 and 3.01 for Turkey and 3.03 and 2.98 for the Netherlands.

In Fig 12, we show the community size distribution on a log-log plot for the Louvain and Infomap methods for each country.
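The two averages described in this caption differ only in whether communities of size 1 are counted. A minimal sketch, using a hypothetical list of community sizes rather than the values behind Table 3:

```python
def average_community_size(sizes, min_size=1):
    """Number of shareholders divided by number of communities of at least min_size."""
    kept = [s for s in sizes if s >= min_size]
    return sum(kept) / len(kept) if kept else 0.0

# Hypothetical community sizes, for illustration only.
sizes = [1, 1, 1, 3, 4, 5]
avg_with_singletons = average_community_size(sizes)
avg_without_singletons = average_community_size(sizes, min_size=2)
```

With many single-node communities, the average including them is pulled well below the average over communities of two or more shareholders.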

Fig 12. Community size frequency distribution.

Community size frequency distribution for (from top to bottom): (a) Turkey and (b) the Netherlands. The figures are plotted on log-log scale. For each country we show the results from two methods; Louvain on the left and Infomap on the right. The blue cross represents the data, the green dot represents the data binned using a logarithmic binning, and the black line is a linear fit to the binned data.

The community size distributions are clearly fat-tailed and power-laws, indicated by the straight lines on the plots, capture most of the behaviour. These distributions show that the vast majority of communities are small, typically three or four shareholders. These are simply disconnected components of the shareholder graph created when a small number of shareholders invest in the same one or two companies. Their shared connections mean these shareholders form a strong community.
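As a rough illustration of the logarithmic binning and straight-line fit mentioned for Fig 12, the sketch below bins integer community sizes into bins that double in width and fits a line to the binned densities on log-log axes. The choice of base-2 bins, the geometric bin centres and the toy sizes are assumptions made for this example; the binning actually used for the figures is not specified here.

```python
import math
from collections import Counter

def log_binned_density(sizes):
    """Bin b holds sizes in [2^b, 2^(b+1)); returns (bin centre, density) pairs."""
    counts = Counter(s.bit_length() - 1 for s in sizes)  # exact floor(log2(s))
    total = len(sizes)
    points = []
    for b in sorted(counts):
        lo, hi = 2 ** b, 2 ** (b + 1)
        centre = math.sqrt(lo * hi)               # geometric centre of the bin
        density = counts[b] / ((hi - lo) * total)  # normalise by bin width
        points.append((centre, density))
    return points

def fit_loglog(points):
    """Least-squares slope and intercept of log10(density) against log10(size)."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical sizes built so that the density falls as size^-2.
sizes = [1] * 64 + [2, 3] * 16 + [4, 5, 6, 7] * 4 + list(range(8, 16))
slope, intercept = fit_loglog(log_binned_density(sizes))
```

Dividing each count by the bin width matters: without it the wider bins at large sizes would look artificially populated and the fitted slope would be too shallow.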

The tail of these community size distributions in Fig 12 shows that there are a small number of large communities, representing shareholders who have cross-invested in each other’s investment portfolios in a way that is highly correlated. Comparing against our null model, we find that such correlated cross investment, the fat tail of the community size distribution, disappears after our edge swapping. This again shows that the shareholder network is not like a random graph: it has significant structure which reflects a non-trivial way in which these connections are made.

To see what we can learn from these community structures we look at the kind of shareholder we find in each community using the classification of our 13 types of shareholder shown in Table 1. We will take Turkey as an example. In this case two types of shareholder dominate: the Industrial investor (Industrial companies) and the Family investor (‘one or more named individuals or families’). For each community, we look at the fraction of its shareholders that belong to each of these two common types. The distribution for each of the two common types of shareholder is shown in Fig 13.
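The per-community fraction described above can be sketched as follows. The input format (one dictionary mapping each shareholder to a community label and another mapping each shareholder to a type), the function name and the toy data are all assumptions for illustration; `min_size` plays the role of the community size cut used when small communities are excluded.

```python
from collections import defaultdict

def type_fraction_by_community(membership, node_type, target, min_size=1):
    """Fraction of shareholders of type `target` in each community of at least min_size."""
    members = defaultdict(list)
    for node, community in membership.items():
        members[community].append(node)
    fractions = {}
    for community, nodes in members.items():
        if len(nodes) < min_size:
            continue  # skip communities below the size cut
        hits = sum(1 for n in nodes if node_type[n] == target)
        fractions[community] = hits / len(nodes)
    return fractions

# Hypothetical toy data: one small pure community and one larger mixed one.
membership = {"s1": 0, "s2": 0, "s3": 0, "s4": 1, "s5": 1, "s6": 1, "s7": 1}
node_type = {"s1": "Family", "s2": "Family", "s3": "Family",
             "s4": "Industrial", "s5": "Industrial", "s6": "Family", "s7": "Bank"}

family_all = type_fraction_by_community(membership, node_type, "Family")
family_large = type_fraction_by_community(membership, node_type, "Family", min_size=4)
```

A histogram of the returned fractions, over all communities or only over those above the size cut, gives the kind of distribution plotted in each panel.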

Fig 13. The number of communities found with a given fraction of one type of shareholder.

The communities are found with Infomap in the shareholder networks for Turkey and the Netherlands. On the left we have the fraction of Family shareholders in different communities while on the right we have the fraction of Corporate shareholders in each community. The figures in the first row include communities of all sizes. The fat-tailed distribution means this is dominated by the large number of small communities, and these are almost always of a single type of shareholder, hence the peak at 1.0. The second row shows the same analysis done when we exclude small communities which have three or fewer nodes (CS = community size). A similar analysis for the Louvain community detection method is given in the S1 Appendix.

It can be seen from Fig 13 that Individual or Family shareholders behave quite differently from the Corporate shareholders in these two countries. Individual and Family shareholders prefer overwhelmingly to invest in companies with the same type of shareholder. One explanation is that this preference for other Family type owners reflects genuine family ties in the social sense. In general though, individual or family shareholders tend to bond together and exclude other types of shareholders. By way of contrast, we can see that Corporate shareholders are much more willing to share control with other types of shareholders.

However, if we exclude the large number of small communities, those of one, two or three shareholders, we do not see much change in the pattern of ownership for communities with Family type shareholders in Turkey: Family type shareholders still prefer to share control with other Family type shareholders. On the other hand, we do see that larger communities containing industrial companies are far more likely to have mixed types of investor. This can be explained by the fact that individual or family shareholders are mainly in small isolated communities, which are created by a few common investments. In contrast, the corporate shareholders in Turkey and the Netherlands appear in both large and small communities. In small communities, they do not invest with other types of shareholders, while in large communities they make up a relatively low fraction of the members. Further detailed comparison of the largest connected component is provided in S1 Appendix.

The co-investment structures of individual or family shareholders are small, simple and pure, and this supports a picture of the controlling power of the family social unit as discussed in the work of Villalonga and Amit [30] and that of Yurtoglu [31].


A core strength of network science is its ability to model relationships between individuals while allowing us to capture the structure of the network on bigger scales and to find the impact of this structure on the individuals within the group. Here we have used this approach to build a network of shareholders and their relationships, as defined by common share holdings. The key to this is to be able to construct the network from real data sources which are difficult and expensive to obtain and require extensive cleaning. We have shown this can be done by producing networks for two different countries. An important aspect of our networks is that we retain information on the types of shareholder involved so that we provide a new perspective on the roles of these different types of shareholder.

One network feature of note is that closeness centrality is found to be of little use as it is highly correlated with degree; in fact there is a linear correlation between the inverse of the closeness of a node and the logarithm of the degree of that node. This stems from the fact that much of its behaviour is dominated by the network at large distances from any node where the network, at least statistically and in terms of the shortest path routes, acts like a random graph and so like a random tree.

Our network analysis has highlighted several features in the data. In particular, the role of the individual or family investor in Turkey is far more peripheral than that found in the data for the Netherlands. We have seen this in percolation, diversity, and betweenness measurements and in the makeup of the network communities. Likewise, the properties we have found for the Corporate shareholder in Turkey suggest they form a central core for that network. This observation suggests that the core-periphery paradigm [32] could be useful here, perhaps using one of the many ways to quantify the concept such as [33, 34]. We leave this for later work. Another observation has been the way that what are termed Bank shareholders seem to have a different role in the two countries, being more important relative to other types of shareholder in the Netherlands than in Turkey.

Looking ahead, one application of our methods in the context of finance would be to evaluate the risk in such networks. Our percolation measurement illustrates the principle. By removing nodes at random we see how the network has different vulnerabilities to random failures in different types of shareholder. We could also use our network to see how the loss of confidence of one shareholder might spread through the network, affecting the price of different companies in different ways. This would illustrate how negative (or positive) effects travel through the networks, which will give rise to the systematic risk.

Another future direction is to look at similar data sets from different time periods and to see how the network changes over time. Can we find a model of the behaviour of shareholders at the microscopic scale which shows the macroscopic evolution of the network such as the phenomenon of takeovers?

Supporting information


The authors would like to thank Eduardo Viegas, Henrik Jensen, Tarun Ramadorai, Yangshen Yang and Nanxin Wei for useful comments.


  1. Brandes U, Robins G, McCranie A, Wasserman S. What is network science? Net. Sci. 2013;1(01):1–15.
  2. Newman M. Networks: an introduction. OUP Oxford; 2010.
  3. Arthur WB. Complexity and the economy. Science. 1999;284(5411):107–109. pmid:10103172
  4. Farmer JD, Gallegati M, Hommes C, Kirman A, Ormerod P, Cincotti S, et al. A complex systems approach to constructing better models for managing financial markets and the economy. Eur. Phys. J. Spec. Top. 2012;214:295–324.
  5. Acemoglu D, Akcigit U, Kerr W. Networks and the macroeconomy: An empirical exploration. NBER Macroeconomics Annual. 2016;30(1):273–335.
  6. Vitali S, Glattfelder JB, Battiston S. The network of global corporate control. PLOS One. 2011;6(10):e25995. pmid:22046252
  7. Glattfelder JB, Battiston S. Backbone of complex networks of corporations: The flow of control. Physical Review E. 2009;80(3):036104.
  8. Ohnishi T, Takayasu H, Takayasu M. Network motifs in an inter-firm network. Journal of Economic Interaction and Coordination. 2010;5(2):171–180.
  9. Ohnishi T, Takayasu H, Takayasu M. Hubs and authorities on Japanese inter-firm network: Characterization of nodes in very large directed networks. Progress of Theoretical Physics Supplement. 2009;179:157–166.
  10. Iinoa T, Kamehamaa K, Iyetomia H, Ikedab Y, Ohnishic T, Takayasud H, et al. Community Structure in a Large-Scale Transaction Network and Visualization. In: Journal of Physics: Conference Series. vol. 221; 2010. p. 012012.
  11. Viegas E, Cockburn SP, Jensen HJ, West GB. The dynamics of mergers and acquisitions: ancestry as the seminal determinant. In: Proc. R. Soc. A. vol. 470. The Royal Society; 2014. p. 20140370.
  12. Li H, An H, Huang J, Huang X, Mou S, Shi Y. The evolutionary stability of shareholders’ co-holding behavior for China’s listed energy companies based on associated maximal connected sub-graphs of derivative holding-based networks. Applied Energy. 2016;162:1601–1607.
  13. Li H, Fang W, An H, Gao X, Yan L. Holding-based network of nations based on listed energy companies: An empirical study on two-mode affiliation network of two sets of actors. Physica A. 2016;449:224–232.
  14. An P, Li H, Zhou J, Chen F. The evolution analysis of listed companies co-holding non-listed financial companies based on two-mode heterogeneous networks. Physica A. 2017;484:558–568.
  15. Li H, Fang W, An H, Yan L. The shareholding similarity of the shareholders of the worldwide listed energy companies based on a two-mode primitive network and a one-mode derivative holding-based network. Physica A. 2014;415:525–532.
  16. Guan Q, An H, Liu N, An F, Jiang M. Information Connections among Multiple Investors: Evolutionary Local Patterns Revealed by Motifs. Scientific Reports. 2017;7(1):14034. pmid:29070827
  17. Berle AA, Means GGC. The modern corporation and private property. Transaction publishers; 1991.
  18. Zeitlin M. Corporate ownership and control: The large corporation and the capitalist class. In: Classes, Power, and Conflict. Springer; 1982. p. 196–223. 10.1111/0022-1082.00115
  19. Mizruchi MS, Schwartz M. Intercorporate relations: The structural analysis of business. Vol. 1, Cambridge University Press; 1992.
  20. Gai P, Kapadia S. Contagion in financial networks. 2010.
  21. Dijk BV; 2017. Available from:
  22. Bank TW; 2018. Available from:
  23. CEIC, D (2018) Global Economic Data, Indicators, Charts and Forecasts
  24. Yao Q. Shareholder Networks. 2019;
  25. Maslov S, Sneppen K. Specificity and stability in topology of protein networks Science 2002; 296:910. pmid:11988575
  26. Molloy M, Reed B. A critical point for random graphs with a given degree sequence, Random Structures and Algorithms 1995, 6, 161–180.
  27. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of community hierarchies in large networks. J.Stat.Mech 2008; p. P10008.
  28. Rosvall M, Bergstrom CT. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences. 2008;105(4):1118–1123.
  29. Villalonga B, Amit R. Family control of firms and industries. Financial Management. 2010;39(3):863–904.
  30. Yurtoğlu BB. Corporate Governance and Implications For Minority Shareholders In Turkey. 2003. 10.1111/j.1755-053X.2010.01098.x
  31. Snyder D, Kick EL. Structural position in the world system and economic growth, 1955-1970: A multiple-network analysis of transnational interactions. American journal of Sociology, 84(5):1096–1126, 1979.
  32. Borgatti SP, Everett MG. Models of core/periphery structures. Social networks, 21(4):375–395, 2000.
  33. Rossa FD, Dercole F, Piccardi C. Profiling core-periphery network structure by random walkers. Scientific reports, 3:1467, 2013. pmid:23507984
  34. Porta R, Lopez-de Silanes F, Shleifer A. Corporate ownership around the world. The Journal of Finance. 1999;54(2):471–517.