On the Relation between the Small World Structure and Scientific Activities

  • Ashkan Ebadi,

    a_ebad@encs.concordia.ca

    Affiliation Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, Quebec, Canada

  • Andrea Schiffauerova

    Affiliations Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, Quebec, Canada, Department of Engineering Systems and Management, Masdar Institute, Masdar City, Abu Dhabi, United Arab Emirates


Abstract

Modern science has become more complex and interdisciplinary, which may encourage researchers to collaborate more and to engage in larger collaboration networks. Various aspects of collaboration networks have been examined to identify the most important determinants of knowledge creation and scientific production. One network structure that has recently attracted much theoretical attention is the small world. It has been suggested that a small world structure can improve information transmission among network actors. In this paper, co-authorship networks are constructed from data on 12 periods of journal publications of Canadian researchers in natural sciences and engineering. By measuring small world indicators, we assess the small worldiness of this network and its relation with researchers’ productivity, the quality of their publications, and scientific team size. Our results show that the examined co-authorship network strictly exhibits the small world properties. In addition, the results suggest that in a small world network researchers expand their team size by connecting to other experts in the field. This team size expansion may result in higher productivity of the whole team through access to new resources, internal referring, and the exchange of ideas among team members. Moreover, although the small world structure is positively correlated with the quality of the articles in terms of both citation count and journal impact factor, it is negatively related to the average productivity of researchers in terms of the number of their publications.

Introduction

The world is really small! This comes to our minds when a mutual acquaintance is found with someone who we do not know at all. The idea of the small world network is traced back to the work of Milgram in 1967. Through a series of field experiments he found that even in a very large network on average only six intermediates are needed to reach a person who is completely unknown. This property is also called “six degrees of separation” in the literature [1]. Later, Travers and Milgram [2] tried to formulate the small world property by calculating the probability of any two randomly chosen people knowing each other in a large population. In other words, in small world networks the average path length (average distance between two given nodes in the network) is relatively short in spite of the existence of high clustering (tendency of the nodes in a network to cluster together). Therefore, short path lengths among network actors facilitate the spread of various ideas that are generated in separate clusters, which results in producing novel knowledge [3,4].

The level and the efficiency of knowledge diffusion are affected by small world property. Cowan and Jonard [5] developed a model to study the efficiency of the small world networks and claimed that the level of knowledge is at its maximum when the network structure has small world properties. Therefore, it is good to have small world property in the network but how persistent are such networks? Kogut and Walker [6] analyzed the cross-ownership among German firms during 1990s and the robustness of the small world property. They found that the small world network tends to preserve its properties of high clustering and short path lengths even if it experiences a considerable number of shocks and re-structuring of the links of the network. Therefore, once the small world network is established it retains the property unless the network perceives a considerable amount of re-structuring forcing it to transform into another structure.

Several researchers analyzed the effect of small world property in the network of firms. Sullivan and Tang [7] constructed the inter-firm network of the United States venture capital industry to evaluate its effects on the firms’ performance. They observed a positive impact of the small world structure on productivity of the firms. In another study, Baum et al. [8] investigated the Canadian network of investment bank syndicate from 1952 to 1990 to see how small world network emerges and evolves over time. They confirmed that the networks formed among firms usually resemble small world characteristics. Schilling and Phelps [9] focused on the impact of small world property on firms’ performance through analyzing the number of patents. Their results show that there is a positive effect of the small world since high clustering and short path length enable companies to get access to new knowledge that is required for innovation.

In addition, several empirical studies focused on individuals’ activity and analyzed the effect of the small world property on the performance of individuals in the network. Fleming and Marx [4] studied the collaboration of the inventors in Silicon Valley and Route 128 in Boston and found that the network of the examined inventors resembled the small world structure. However, no positive relation was observed between the existing small world property in the network and the inventive productivity of the researchers in the region. Fleming et al. [10] have also shed some light on the impact of the small world in the network of inventors and their innovative and managerial approaches within a small world network to remain competitive. Although they found a positive effect of short average path length on the technological productivity, no significant positive influence of the small world property was observed.

Other studies analyzed the impact of the small world structure in co-authorship networks. Co-authorship analysis has been particularly recognized by some studies (e.g. [11,12]) as being the most common tool in investigating the relations and patterns in scientific collaboration. Newman [13] investigated the co-authorship networks in physics, biology and mathematics and found the small world structure in all the aforementioned networks. Goyal et al. [14] focused on a single scientific discipline. Using the co-authorship network of economists during 1980 to 1999, they found small world properties in the examined collaboration network. Moreover, they found an increasing trend in the average degree of the network over time and realized that the number of brokers is also augmenting. In another study, despite considering several fields for the study Moody [15] also focused on the subspecialties (e.g. economic sociology, criminology, etc.) in a single discipline and analyzed the network of sociologists during the period of 1963 to 1999. He surprisingly found that the network did not resemble the small world properties likely due to the considerable overlap among the subfields and the authors.

Hence, co-authorship networks tend toward the small world structure. The role of the best connected actors in joining the other individuals and clusters in the network is very important. Moreover, the co-authorship pattern in a scientific field is also crucial for a network to obtain a small world structure. The more team oriented a scientific discipline is and the larger its teams are, the higher the probability of finding small world properties in the structure [16,17]. Therefore, small world properties are more often analyzed in disciplines in which teamwork is common [18].

Studies that have assessed the impact of network structure variables in co-authorship networks have generally found correlations between the centrality measures and some performance variables [19–22]. Yan and Ding [19] focused on 16 journals in the field of library and information science (LIS) and constructed the co-authorship network at the micro level over the time span of 1988 to 2007. They calculated four centrality measures for the authors in the network, i.e. betweenness centrality, degree centrality, closeness centrality and PageRank, and found a positive relation between these measures and the citation counts of articles. Abbasi et al. [20] focused on scholars in the field of information systems and statistically analyzed the impact of the network structure variables on the performance of the researchers using citation based indicators. They found a positive relation between all the network structure variables and the performance of the scholars except for the betweenness and closeness centralities. In another study, Kumar and Jan [23] assessed and compared the impact of the network variables on research performance in the field of energy fuels in Turkey and Malaysia. According to their results, the popularity, position and prestige of the researchers, as measured by the network centrality indicators, have a positive impact on their research performance. In addition, they found PageRank to be the most influential centrality measure. Eslami et al. [22] focused on the field of biotechnology in Canada and statistically investigated the impact of the network structural variables on the quantity and quality of the technological performance of the researchers within the period of 1966 to 2005. Their results suggest a significant impact of the structure of the examined co-authorship network on knowledge and technology production; however, no impact was observed on the quality of the patents.

Nevertheless, the results about the impact of the small world structure on performance are inconsistent. For example, Fowler [24] found a non-linear relation between small world properties and voting participation rate, and Uzzi and Spiro [3] found a similar relation between the financial and artistic performance of the artists and the small world properties. However, Schilling and Phelps [9] observed a linear relation whereas Fleming et al. [10] found no relation between small world properties and performance. Hence, no consensus is found in the literature about the impact of the small world structure on the performance [25]. One reason could be the use of different datasets and performance measures in the studies that makes it hard to come into a general agreement about the impact of the small worldiness on researchers’ performance. Hence, the assessment of the impact is suggested to be done in different fields and scientific environments. In addition, although there are very few studies that particularly analyzed the impact of the small world variables on productivity of the inventors and firms, to our knowledge no study has analyzed its relation with the quality of publications and researchers’ team size. This paper is designed to fill these research gaps.

Our main objective is first to analyze if the examined network resembles the small world property and then to study its relation with the scientific output, the quality of the produced papers and the team size. It is assumed that analyzing the relation between the small world property and the quality of the publications will help to highlight the benefits of a systematic collaboration network rather than a random one in producing higher quality research. In addition, it will identify the importance of a well-established collaboration network in which researchers are well connected by short distances. Moreover, analyzing the relation between the small world property and the average team size of the researchers will determine if researchers in a small world network have larger team sizes due to the shorter distance among researchers in such a network. As being a member of a larger team size may result in higher rate of publication for each of the team members, if a positive relation is observed between the small world property and the team size then one may expect higher overall rate of publications in such collaboration networks.

In order to achieve this objective a comprehensive dataset of publications of Canadian researchers in natural sciences and engineering was used. First the existence of the small world properties in the co-authorship network of these researchers was examined and then the inter-relations between the small world variables and quantity of the scientific output (measured by the number of publications), quality of the articles (measured by the normalized citation rate and by the average impact factor of the journals) and size of the research teams (represented by the average number of authors per paper) were statistically investigated. The rest of the paper is organized as follows: Section “Data and Methodology” describes methodology and data used in this study. The empirical results and interpretations are provided in section “Results”. Section “Conclusion” presents the findings of this research and the limitations of this study are discussed in the last section “Limitations”.

Data and Methodology

The study has three phases. In the first phase, a database of all the research publications produced by the Canadian researchers in natural sciences and engineering was created. It was decided to focus only on engineering and natural sciences and to exclude social and medical sciences, because collaboration patterns in different disciplines vary (as an example, please see [26]). In order to do so only the researchers funded by Natural Sciences and Engineering Research Council (NSERC), which is the main Canadian federal funding agency for the researchers working in all the areas of engineering and natural sciences, were included. Since almost all the Canadian researchers in these research fields are currently receiving or received in the past a research grant from NSERC [27], it was assumed that this approach allows us to identify them quite effectively. This procedure was more straightforward than collecting all the Canadian papers and trying to distinguish between the ones that are written by the researchers in natural sciences and engineering and other scientific fields through employing some keywords or journal categories. Eligibility for NSERC funding makes our target researchers clearly defined. The funding data was collected from NSERC website. Then the articles written by these researchers were collected from SCOPUS (i.e. a commercial database of scientific articles that has been launched by Elsevier in 2004) within the period of 1996 to 2010 since the data quality of SCOPUS was low before 1996 (e.g. lack of the citation data before 1996). Moreover, to have a proxy of the quality of the papers SCImago was used to collect the impact factor information of the journals in which the articles were published. SCImago was chosen for two main reasons. 
Firstly, it provides annual data of the journal impact factors that enabled us to perform a more accurate analysis since the impact factor of the journals are considered in the year that an article was published not its impact in the current year. Secondly, SCImago is powered by SCOPUS that makes it more compatible with our articles database. In total, the final database contained 130,510 articles and 177,449 authors together with all the related information (e.g. article title, co-authors, their affiliations, year of publication).

In the second phase, Pajek software was used to construct the collaboration networks of the researchers and to measure the structural network and small world variables. Co-authoring an article was taken as a sign of collaboration among the researchers, but there was no information on the duration of this relationship. Some similar studies (e.g. [10,28]) assume a 5-year life for each collaboration link, while others assume a 3-year time window (e.g. [29]). The indicators were calculated for both time windows, and the results were more robust for the 3-year window. Specifically, several independent variables were first considered as candidates for the control variable, and correlations among various combinations of dependent and independent variables were tested to select a model with no significant correlation among the variables. With the 5-year window the number of observations dropped, and no model without significant correlation among the variables could be fitted. Hence, a 3-year time window was adopted, and the 3-year moving window was shifted forward from 1996 to 2010 to extract the publications for each network. This procedure resulted in 12 undirected networks, each of which was then analyzed separately in Pajek to measure the small world variables.
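The windowing step can be illustrated with a short Python sketch. It is not the Pajek workflow used in the study; it assumes that each 3-year window is paired with the following year as the outcome year, which is how the 1996 to 2010 span yields 12 networks. The function names and the toy `papers` list are hypothetical.

```python
from itertools import combinations


def moving_windows(first_year, last_year, width=3):
    """Return (window_start, window_end, outcome_year) tuples.

    Assumption: every window must be followed by one outcome year
    that still lies inside the studied span.
    """
    return [
        (outcome - width, outcome - 1, outcome)
        for outcome in range(first_year + width, last_year + 1)
    ]


def coauthorship_edges(papers, start, end):
    """Build the undirected edge set for papers published in [start, end].

    `papers` is a list of (year, [author, ...]) tuples (hypothetical format).
    Each pair of co-authors on a paper contributes one edge.
    """
    edges = set()
    for year, authors in papers:
        if start <= year <= end:
            for a, b in combinations(sorted(set(authors)), 2):
                edges.add((a, b))
    return edges


windows = moving_windows(1996, 2010)

# Toy data, not taken from the study.
papers = [(1996, ["A", "B", "C"]), (1999, ["A", "D"])]
first_network = coauthorship_edges(papers, *windows[0][:2])
```

Sorting the author names before pairing keeps each undirected edge in a single canonical orientation, so repeated collaborations do not create duplicate links.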

In the last phase, the measures calculated in the previous phase were used as inputs to statistically analyze the inter-relations between small world properties, the productivity and scientific collaboration of the scientists. For this purpose, five regression models were defined and estimated in STATA. The first dependent variable accounts for the research productivity of the researchers within each of the 12 periods (no_art). The number of publications has been widely used in the literature as a proxy of scientific productivity (e.g. [30,31]). A single year was used to represent the productivity of the researchers, since it is assumed that the results of researchers’ collaboration come to light soon after the respective collaboration period is finished (as was done in [10,28]). In other words, it is assumed that the 3-year collaborative activity among the researchers will be reflected in the number of their publications in the following year. Hence, for the total number of articles in year i (no_art_i), the small world variables were calculated for the networks constructed on the 3-year snapshot from year i-3 to i-1. For the second regression model, the number of publications was normalized by dividing it by the number of authors (art_per_aut_i). This allowed a better analysis of the relation between the small world variables and productivity, since a higher number of authors may result in a higher number of publications; averaging the number of publications over the number of authors accounts for the rise in the number of authors. In order to assess the quality of the publications, the normalized number of citations was used in the third model. Citation count based indicators are among the most widely used approaches for determining research quality [32]. However, like all methods they have some drawbacks, e.g. negative citations, self citations [31], and limitations of the citation data source [33]. Nevertheless, it is generally accepted in bibliometrics that the real or expected number of citations received by publications can be used as a good index of the mean impact at the aggregate level [34,35]. Hence, the citation counts were normalized based on the following definition and were used for the analysis at the aggregate level:

$$\text{normalized citations}_i = \frac{\text{number of citations}_i}{2010 - \text{year}_i + 1}$$

where (2010 - year_i + 1) represents the gap between the current year and the final year of the study and is used for normalizing the citation counts. The reason for normalizing the number of citations is that older articles have more chance to be cited. Hence, in general, the total number of citations decreases as we move toward the recent periods. The average impact factor of the journals in which the articles were published was also used as another proxy for the quality of the papers and defined the fourth dependent variable (avgif). The last dependent variable represents the number of authors per article in year i (aut_per_art_i) as a measure of the team size of the researchers.

The independent variables that were considered in all the five aforementioned models are as follows:

  • Small World (sw)
  • Network Connectivity (netcon)

In order to calculate the small world variable, it was needed to calculate clustering coefficient and average path length. In the following, the definitions of the clustering coefficient and path length along with the independent variables’ definitions are presented.

Clustering Coefficient (CC)

This index counts the number of triangles in the given undirected graph to measure the level of clustering in the network. In other words, it is the likelihood that two neighbors of a node in a graph are connected to each other; hence it measures the tendency of the nodes to cluster together [36]. According to Watts and Strogatz [37], the clustering coefficient can be defined based on a Local Clustering Coefficient (LCC) for each node within a network. LCC is defined as follows:

$$LCC_i = \frac{\text{number of triangles connected to node } i}{\text{number of triples centered on node } i}$$

The denominator of the above formula counts the number of sets of two edges that are connected to node i. The overall clustering coefficient is calculated by taking the average of the local clustering coefficients of all the nodes within the network. Hence,

$$CC = \frac{1}{n} \sum_{i=1}^{n} LCC_i$$

in which n denotes the number of vertices in the network. This measure returns a value between 0 and 1 and gets closer to 1 as the network interconnectivity increases.
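A minimal Python sketch of this definition is given below for illustration. It stores the network as a dictionary of neighbor sets and assigns a local coefficient of 0 to nodes with fewer than two neighbors; that convention is an assumption made here, and Pajek may treat such nodes differently. The function names and the toy graph are hypothetical.

```python
def local_clustering(adj, node):
    """Share of neighbor pairs of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: no triples are centered on this node
    # Count each connected neighbor pair once (u < v).
    triangles = sum(
        1 for u in neighbors for v in neighbors if u < v and v in adj[u]
    )
    triples = k * (k - 1) / 2
    return triangles / triples


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Toy graph: a triangle A-B-C with one pendant node D attached to A.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
cc = average_clustering(toy)
```

In the toy graph, node A has three neighbors and only one connected pair among them, so its local coefficient is 1/3, while B and C each score 1 and D scores 0.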

Shortest Path Length (PL)

This index represents the degree of separation in the network and is the lowest number of vertices that need to be traversed to reach one vertex from another [38]. The shorter the distance, the more easily information may flow among the researchers. The path length was calculated for the largest component of each of the 12 co-authorship networks. A component of a network is a sub-network in which there is no isolated vertex and all the vertices are interconnected. Because the shortest path can be calculated only in a connected network, the small world variable is by definition measured for the largest connected component of each of the 12 generated networks. This assumption has been widely employed in the literature (e.g. [3,10,28,39–41]) and is justifiable, since the core research activities mainly occur in the largest component, in which the most influential authors are present [42]. Moreover, the proportions of the largest component in our networks are not only large in comparison with similar studies (e.g. [21,41,43,44]), but are also gradually increasing. After 2002 the largest component covered more than 75% of the whole network, reaching almost 90% in the last period (Fig. 1). Therefore, the largest component can be used for the calculation of the path length.
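The restriction to the largest component can be sketched in Python as follows. This is an illustration using breadth-first search rather than the Pajek routine used in the study, and it measures distance in edges (hops), the usual convention, which is an assumption here. The function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def largest_component(adj):
    """Vertex set of the largest connected component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        component = set(bfs_distances(adj, start))
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def average_path_length(adj, component):
    """Mean shortest-path length over all ordered vertex pairs."""
    total, pairs = 0, 0
    for s in component:
        dist = bfs_distances(adj, s)
        for t in component:
            if t != s:
                total += dist[t]
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy graph: a path A-B-C and a separate pair D-E.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": {"E"}, "E": {"D"}}
core = largest_component(toy)
share = len(core) / len(toy)  # proportion of the network in the core
pl = average_path_length(toy, core)
```

Running one breadth-first search per vertex is quadratic in the component size, so for networks with tens of thousands of vertices a sampled or library-based computation would be a more practical choice.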

Small World (SW)

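The small world variable is calculated based on the clustering coefficient and the path length, each taken relative to a random network of the same size:

$$SW = \frac{CC_{actual} / CC_{random}}{PL_{actual} / PL_{random}}$$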
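As a worked illustration, the ratio of the clustering coefficient ratio to the path length ratio can be computed as in the Python sketch below. The input values are illustrative only: they were chosen to fall within the ranges reported in the text (a clustering coefficient of about 0.8, a random-network clustering coefficient between 0.0003 and 0.0006, and a path length ratio of 1.47) and are not the Table 1 figures. The function name and the two path length values are hypothetical.

```python
def small_world_ratio(cc_actual, cc_random, pl_actual, pl_random):
    """Clustering coefficient ratio divided by path length ratio."""
    cc_ratio = cc_actual / cc_random
    pl_ratio = pl_actual / pl_random
    return cc_ratio / pl_ratio


# Illustrative inputs, not values from the study's tables.
sw = small_world_ratio(
    cc_actual=0.8,
    cc_random=0.0005,
    pl_actual=5.88,  # hypothetical; gives a path length ratio of 1.47
    pl_random=4.0,   # hypothetical
)
```

With these inputs the clustering coefficient ratio is 1600, so the indicator is driven almost entirely by the clustering term once the path length ratio approaches 1.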

Network Connectivity (netcon)

It is a measure of the connections between pairs of vertices and is related to the average degree of the network. In other words, in the co-authorship network of the researchers it indicates the average number of collaborators for each researcher who had at least one article co-authorship during the given period of time. This is an important measure since higher number of co-authors in a network results in a tighter network that facilitates the knowledge exchange [45]. The network connectivity (netcon) was used as the control variable. The reason is that higher number of researchers in a network can increase the chance of higher network connectivity and consequently the chance of higher collaboration among the researchers that may have an effect on our dependent variables.
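Assuming that netcon corresponds to the average degree of the network, it can be sketched in Python as below. The exact Pajek measure behind netcon is not specified in the text, so this is only one plausible reading; the function name and the toy graph are hypothetical.

```python
def average_degree(adj):
    """Average number of collaborators per researcher in the network.

    `adj` maps each researcher to the set of their co-authors; only
    researchers with at least one co-authorship are expected as keys.
    """
    return sum(len(neighbors) for neighbors in adj.values()) / len(adj)


# Toy graph: a triangle A-B-C with one pendant node D attached to A.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
netcon = average_degree(toy)
```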

Results

Pre-analysis

The number of researchers in each of the examined periods reflects the size of the network in the corresponding year. As the first step, the trend of the network size was analyzed. According to Fig. 2, the network size did not change much until 2000; since then it has been increasing steadily with an almost constant positive slope. Since an annual increase was expected in the number of researchers, the flat line between 1996 and 2000 might be due to the SCOPUS data, which seems to be more integrated and complete for the recent years. Another reason for the flat trend during the first five years could be the immaturity of the examined collaboration network, such that new researchers started joining the network at a faster pace only after a couple of years. This issue will be further investigated in the rest of the paper.

In line with the increase in the number of authors an increase is seen in the number of articles, having almost the same trend. According to Fig. 3, the number of articles remained constant during the first and the last 5-year periods. However, a positive jump is observed during the second 5-year period (from 2001 to 2005).

Fig 3. Historical trend of the researchers’ articles from 1996 to 2010.

https://doi.org/10.1371/journal.pone.0121129.g003

Small world analysis

According to Kogut and Walker [6], a network has a small world structure if its average clustering coefficient is significantly higher than a random network of the same number of vertices while having approximately the same path length. Hence, in order to investigate the small world structure in the co-authorship network of the researchers, an Erdős–Rényi random network [46] of the same size as the actual network was constructed for each of the examined periods. The respective path lengths and clustering coefficients were then calculated for the generated random networks and compared to the corresponding amounts of the actual networks. The results are depicted in Fig. 4 and Fig. 5. The X-axis in Figs. 4 and 5 represents the starting year of each of the 3-year time intervals that were considered to calculate the collaboration network variables. For example, 1996 represents the period of [1996–1998].
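The random baseline can be illustrated with the following Python sketch, which samples a random graph with a given number of vertices and edges. Matching the edge count as well as the vertex count is an assumption made here; the text only states that the random network has the same size as the actual one. The function names and the seed parameter are hypothetical.

```python
import random


def random_graph(n, m, seed=0):
    """Sample a simple undirected graph with n vertices and m edges."""
    if m > n * (n - 1) // 2:
        raise ValueError("too many edges for a simple graph")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    edges = 0
    while edges < m:
        u, v = rng.sample(range(n), 2)  # two distinct vertices
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def edge_density(n, m):
    """Fraction of possible vertex pairs that are connected."""
    return 2 * m / (n * (n - 1))


baseline = random_graph(50, 100, seed=1)
density = edge_density(50, 100)
```

In a random graph of this kind the expected clustering coefficient is approximately equal to the edge density, which is why very small values are typical for large, sparse networks.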

Although small world networks are often large in size, they exhibit a relatively short path length and a high clustering coefficient [47]. The clustering coefficient in a co-authorship network indicates whether a researcher’s collaborators also collaborate with each other by jointly writing a paper [48]. As can be seen in Fig. 4, the clustering coefficient for the actual network is almost constant at about 0.8 and is significantly higher than the clustering coefficient for the respective random networks (between 0.0003 and 0.0006) in all the examined periods. This result is completely in line with previous studies that investigated the small world structure (e.g. [43,48]). This is a primary sign of the small world structure in the examined network of researchers. In addition, the clustering coefficient of the examined network is very high in comparison with other similar studies, e.g. all four co-authorship networks studied by Newman [49], and the SIGMOD co-authorship networks of Nascimento et al. [44]. This indicates that in the examined network it is more likely for two co-authors to have a common collaborator with whom they have also published an article.

The path length for the actual and the generated random networks were compared. According to Fig. 5, although the path length of the examined co-authorship network remains relatively constant during the initial 5-year period, it starts dropping significantly and continuously after 2000, while getting very close to the path length of the random network. The value of the path length of our examined network is almost similar to the one of Nascimento et al. [44] who found a path length of 5.65 in the SIGMOD co-authorship network, and is lower than some other studies (e.g. [41]). In general, in other similar studies that contain more than 10,000 vertices and analyzed the small world property in co-authorship networks, the average path length is not more than 10 (e.g. [50,51]). According to Fig. 4 and Fig. 5 and based on the definition of Watts and Strogatz [37] the examined co-authorship network of researchers strictly resembles the small world structure.

As the next step, the SW indicator was defined and used to analyze the small world characteristics of the collaboration network of researchers. To calculate the value of the small world indicator, the method employed in several similar studies (e.g. [6,28,52]) was followed, which uses the following formula for calculating the small world ratio:

$$SW = \frac{CC_{actual} / CC_{random}}{PL_{actual} / PL_{random}}$$

Table 1 shows the results for the small world variables calculated for all the examined periods. According to Baum et al. [28], as the size of the network increases the value of the small world indicator should increase. As it can be seen in Table 1, there is an increase in the amount of SW indicator during the first three periods. After a sudden drop, it continues to increase steadily after 1999 reaching to the maximum value of the SW indicator in the latest periods. The drop could be due to two reasons. First, the SCOPUS data was probably less complete during the first intervals as the number of articles found in SCOPUS is almost constant in the first three periods. Second reason could be the nature of the collaboration network that may have been less mature during the initial periods. As more researchers join the network, more links are established and the network evolves dynamically. This enables the network to reflect more small world properties as the time passes. This proposition is also confirmed by the trend of the clustering coefficient.

However, it is also argued in the literature that small world properties follow the form of an inverted U-shape (e.g. [53]). That means an increase in the small world properties will be followed by a later decrease. According to Fig. 6, the trend of SW indicator in the examined network had a local maximum in the period of [1998–2000] and then after a sudden decline it started to rise again till the period of [2006–2008] where the second local maximum is seen. Hence, a declining trend is expected to be seen after 2007 and a reassessment of the small world properties is suggested for the future. In a small world network, researchers can get access to the pools of knowledge in diverse clusters and communities through knowledge brokers who are the actors in the network that connect different clusters. Therefore, other actors can retain or even improve their position in the network by accessing continuously to the flows of diverse information and knowledge or other resources [54,55]. The reason for the inverted U-shaped form of the small world property is that as the network evolves the knowledge brokers become less important gradually due to the limited advantages of the brokerage positions that will lead to the decline of the small world. In other words, as the network evolves different clusters gradually get familiar with the information pools of the other clusters through the existing knowledge brokers, hence making the knowledge generated in different clusters more homogeneous. Facilitating the knowledge exchange reduces the diversity in the whole network gradually [56], making the role of the knowledge brokers less important. As a result of the decline in the entrance of new knowledge brokers along with the decay of the old brokers, network becomes more separated. Hence, actors collaborate with their stable and familiar partners within their own clusters and communities. This will lead to multiple isolated clusters and consequently lower small-worldiness [53].

To put the small world structure of the examined collaboration network in context, a list of previously identified small world co-authorship networks and their network properties is presented in Table 2. In terms of network size, the NSERC researchers’ co-authorship network is similar to the SPIRES and LANL co-authorship networks of Newman [49], and the MATH co-authorship network of Barabási et al. [48]. The NSERC network is significantly more cliquish (i.e. it has a much higher clustering coefficient) than the SPIRES network, whereas its value is quite comparable to the one for the LANL network. However, it is less cliquish than the MATH network, which has the highest clustering coefficient among all the listed networks. Comparing the path length of our examined network with the mentioned networks, the NSERC network is most similar to the LANL network of Newman [49]. As can be seen in Table 2, all the previously studied small world networks have a path length ratio lower than 2, ideally closer to 1. In the case of our examined co-authorship network, the path length ratio is declining toward 1 and reaches 1.47 in the final period. However, different clustering coefficient ratios are observed in the previous studies, which led to a wide range of values for the SW indicator.

Table 2. Comparison of previously studied co-authorship networks with the last period of our network (NSERC).

https://doi.org/10.1371/journal.pone.0121129.t002
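The two ratios compared above can be combined into a single indicator. The sketch below is only an illustration under stated assumptions: it uses the common quotient form (clustering coefficient ratio divided by path length ratio) and the standard random-graph approximations CC_random ≈ k/n and PL_random ≈ ln(n)/ln(k) for a network of n nodes with mean degree k. The function names and the input numbers are hypothetical and are not taken from Table 1 or Table 2; the exact definition used in this study is the one given in its methodology.

```python
import math


def random_baseline(n, k):
    """Approximate clustering coefficient and path length of a random
    graph with n nodes and mean degree k (standard approximations)."""
    if n <= 1 or k <= 1:
        raise ValueError("need n > 1 and mean degree k > 1")
    cc_random = k / n
    pl_random = math.log(n) / math.log(k)
    return cc_random, pl_random


def small_world_indicator(cc, pl, n, k):
    """Return (CC ratio, PL ratio, SW quotient) for an observed network.

    cc: observed clustering coefficient
    pl: observed average path length (largest component)
    """
    cc_random, pl_random = random_baseline(n, k)
    cc_ratio = cc / cc_random
    pl_ratio = pl / pl_random
    return cc_ratio, pl_ratio, cc_ratio / pl_ratio


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
cc_ratio, pl_ratio, sw = small_world_indicator(cc=0.5, pl=6.0, n=10_000, k=10.0)
print(f"CC ratio={cc_ratio:.1f}, PL ratio={pl_ratio:.2f}, SW={sw:.1f}")
```

A path length ratio near 1 combined with a clustering ratio far above 1 yields a large quotient, which is the pattern described for the networks in Table 2.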

Statistical Analysis

After observing the small world structure in the examined co-authorship network, the inter-relations between the small world property and several bibliometric measures were statistically analyzed. As the first step, pairwise correlations among the independent variables were checked and no significant correlation was found among them.

A negative binomial regression model was used for our first dependent variable, i.e. the number of articles in the following year. Since this dependent variable is a count measure, the natural starting point is the Poisson model [57]. However, Poisson regression assumes that the variance and mean of the sample do not differ significantly. Hence, the data should be tested for over-dispersion or under-dispersion, which would lead the Poisson model to underestimate or overestimate the standard errors and give misleading estimates of the statistical significance of the variables [58]. Therefore, a likelihood ratio test was performed to see whether the Poisson model fits our data. The results show that the over-dispersion coefficient (α) is significantly different from zero, which means that the Poisson distribution is not an appropriate choice and negative binomial regression is a better estimator. For the remaining 4 dependent variables, i.e. normalized citation count in the following year, average impact factor of journals in which the articles have been published in the following year, number of articles per author in the following year, and number of authors per article in the subsequent year, linear regression models were used.
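The over-dispersion check can be illustrated with a simple diagnostic. The sketch below is not the likelihood ratio test used in this study; it is a lighter moment-based check that assumes the usual negative binomial variance function Var(y) = μ + αμ², so that α can be roughly estimated as (variance − mean) / mean². The function name and the count data are hypothetical.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Moment-based over-dispersion diagnostic for count data.

    A dispersion index (variance / mean) well above 1 suggests that a
    Poisson model would understate the standard errors.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    if m <= 0:
        raise ValueError("mean of counts must be positive")
    return {
        "mean": m,
        "variance": v,
        "dispersion_index": v / m,
        # Method-of-moments estimate of alpha, floored at zero.
        "alpha_moments": max(0.0, (v - m) / m**2),
    }


# Hypothetical yearly article counts, for illustration only.
counts = [0, 1, 1, 2, 3, 5, 8, 13, 21, 40]
summary = dispersion_summary(counts)
print(summary)
```

When the estimated α is clearly positive, as with these illustrative counts, a negative binomial model is the more cautious choice; a formal test would still be needed to establish significance.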

Table 3 shows the results for the inter-relation analysis between the small world property and productivity of the researchers in terms of the number of their publications. The results show that both of the independent variables (small world and network connectivity) can be regarded as significant predictors of the scientific productivity in the following year.

According to the results, the small world property and network connectivity are positively correlated with the number of publications of the researchers in the subsequent year. These results were expected since as the network becomes more connected, researchers get more familiar with other scientists’ fields of research that may lead to the establishment of more collaboration links. In addition, the small world structure can accelerate the exchange of knowledge and expertise among the researchers that may result in higher productivity. The reason is that small world networks allow access to distant information and the knowledge is transferred more efficiently in such networks [25]. Our results are in accordance with major conclusions of the previous studies (e.g. [5,6,22]).

Collaboration among the researchers in a small world network was also tested. For this purpose, the number of authors per article was considered as a proxy of the researchers’ team size and its relation with the small world structure was assessed. Number of authors per article is a common indicator in scientometrics and has been widely used in the literature as a proxy for scientific collaboration (e.g. [59,60]). As can be seen in Table 4, only the small world variable is significant, reflecting a small positive relation with the team size. Hence, it seems that researchers might benefit from the shorter path length and more clustered sub-networks to get in touch with other researchers who are working in the same scientific area. This may result in the establishment of new collaboration links and the expansion of their team size. Moreover, high clustering creates more repeated links among the researchers, causing the risk to be shared among them, which might lead to an increase of the trust level in the community [61]. Of course, other factors like changes in society, funding, regulations, etc. can also play a role here. In addition, it may also be possible that the author effect is a cause and the small world effect is a result. As the next step, the relation with the average productivity of the researchers was assessed.

Since the number of authors has an increasing trend over the examined period, the average number of articles per author was examined to assess the productivity of the researchers more accurately. The result of the linear regression model is shown in Table 5. As can be seen, the small world property is negatively correlated with the average productivity of the researchers, which is an interesting finding. Although a positive relation was found between the small world structure and the total number of articles, the structure may harm the average publication rate. Hence, it seems that in a small world structure researchers start to collaborate more by forming bigger scientific teams, which may increase their overall productivity. However, the average productivity per researcher becomes lower since the team sizes have grown. The other aspect to be analyzed is the quality of the papers that are produced. Therefore, in the next part the relation between the small world property and the quality of papers is analyzed.

Table 5. Regression results for average number of articles per author model.

https://doi.org/10.1371/journal.pone.0121129.t005
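The opposite signs found for total output and per-author output are arithmetically compatible. As a minimal sketch, assume (in notation introduced here, not in the original analysis) that the average productivity in period t is the total number of articles P_t divided by the number of distinct authors N_t, and that both grow between consecutive periods at rates g_P and g_N with g_N > -1:

```latex
\frac{P_{t+1}}{N_{t+1}} \;=\; \frac{P_t\,(1+g_P)}{N_t\,(1+g_N)}
\qquad\Longrightarrow\qquad
\frac{P_{t+1}}{N_{t+1}} < \frac{P_t}{N_t}
\;\iff\; g_N > g_P .
```

Under these assumptions, total output can rise (g_P > 0) while the per-author average falls, as long as the pool of authors grows faster than the number of articles.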

Two linear regression models were considered to check the relation between small world and quality of the publications: one is based on the number of citations the articles received, and the other on the average impact factor of the journals in which the articles were published. Both measures can serve as a proxy for quality, but with a slightly different meaning. Impact factor indicates the respectability of the journal, i.e. the quality and the level of contribution perceived by the authors and the reviewers of the paper, whereas the citations show the impact of the article on the scientific community and on the subsequent research. Since both proxies have some flaws, it was decided to analyze both of them. Table 6 shows the regression results for the relation between the small world structure and the normalized number of citations received in the subsequent year. The number of citations was normalized based on the year of publication since older articles generally have a higher total number of citations. According to the results, the linear regression fits our data well. In addition, both variables are significant at the 95% confidence level and, based on the resulting R², the independent variables are relatively good predictors of the dependent variable. Controlling for the network connectivity, the small world property is positively related with the quality of the papers in the following year in terms of the number of citations received. Hence, it can be said that researchers may benefit from the small world structure to exchange ideas more easily, and since they get connected to other researchers they can improve the quality of their work by internal referring among the team members and other researchers in the network. This is consistent with other studies that analyzed the impact of network centrality measures (not specifically small world properties) on the quality of the papers measured by number of citations and found positive relations (e.g. [43]).
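The text above states that citations were normalized by year of publication but does not spell out the procedure in this section. One common way to do this, shown here purely as an assumption, is to divide each article's citation count by the mean citation count of all articles published in the same year. The function name and the sample records are hypothetical.

```python
from collections import defaultdict


def normalize_citations_by_year(articles):
    """articles: list of (publication_year, citation_count) tuples.

    Returns one score per article: its citations divided by the mean
    citations of articles published in the same year (assumed scheme).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for year, cites in articles:
        totals[year] += cites
        counts[year] += 1

    scores = []
    for year, cites in articles:
        cohort_mean = totals[year] / counts[year]
        # A cohort with no citations at all gets a score of 0.0.
        scores.append(cites / cohort_mean if cohort_mean > 0 else 0.0)
    return scores


# Hypothetical records: older articles have accumulated more citations.
articles = [(2000, 30), (2000, 10), (2008, 6), (2008, 2)]
scores = normalize_citations_by_year(articles)
print(scores)
```

With this scheme an older article with 30 citations and a newer one with 6 can receive the same score if each is equally far above its own cohort's mean, which removes the advantage of age.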

The same analysis was performed using a different proxy for the quality of the papers, namely the average impact factor of the journals in which the articles were published. According to Table 7, a significant positive relation is observed between the average journal impact factor and the small world structure. Together with our findings from Table 6, this supports the important role of small world systems in helping researchers to produce higher quality publications. From the results it can be said that although a small world network may harm the average rate of publications, team members may benefit from such a system to increase the overall quality of their publications.

Conclusion

This study focused on the co-authorship network of the Canadian researchers in engineering and natural sciences and investigated the existence of the small world structure and its relation with the researchers’ productivity, quality of their publications, and their team size. Several previous studies analyzed different co-authorship networks and found correlations between network centrality measures and researchers’ productivity (e.g. [19,20,22,23]). However, to our knowledge, no study has focused specifically on the relation between the small world properties and the quality of the publications and scientific team size.

Our results show that the examined network exhibits significant small world properties: it has a very high clustering coefficient in comparison with random networks of equal size, while the path lengths are almost the same. The degree of separation among scientists decreases to around five in the final period, when it becomes even lower than Milgram’s famous finding of six degrees of separation [62]. Hence, the networks in the final periods become more connected and the low path length among the researchers allows them to exchange knowledge more easily. Moreover, in comparison with most of the other co-authorship networks that have been studied, our examined co-authorship network has a relatively larger clustering coefficient, a smaller average path length, and a larger proportion of the largest component. Specifically, the size of the largest component is critical since the path length (and consequently the small world measure) can only be calculated in the connected sub-network. Hence, this study benefited from the large share of its largest component to obtain better estimates of the small world variables. On the other hand, the enormous largest component in our examined co-authorship network may reflect the fact that the core research activity is being done in a large inter-connected cluster of researchers. Of course, the size of the largest component also depends on the nature of the research activity and the level of its interdisciplinarity.
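The dependence of the path length on the largest component can be made concrete with a small example. The sketch below is an illustration under stated assumptions, not the software used in this study: it represents an undirected co-authorship network as a hypothetical adjacency dictionary (symmetric, with every author present as a key), finds the connected components by breadth-first search, and computes the average shortest path length inside the largest one.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an undirected graph."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(comp)
    return components


def average_path_length(adj, nodes):
    """Mean shortest path length over all pairs within a connected node set."""
    nodes = set(nodes)
    total = pairs = 0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb in nodes and nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1  # ordered pairs reachable from source
    return total / pairs if pairs else 0.0


# Hypothetical toy network: one group of four authors and one isolated pair.
adj = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"},
    "E": {"F"}, "F": {"E"},
}
largest = max(connected_components(adj), key=len)
share = len(largest) / len(adj)
apl = average_path_length(adj, largest)
print(sorted(largest), round(share, 3), round(apl, 3))
```

Authors outside the largest component (E and F here) have no finite path to the rest of the network, so they cannot contribute to the average; the larger the share of the largest component, the less information is lost by this restriction.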

The results show that although the small world structure has a positive relation with the total number of publications, it is negatively correlated with the average productivity of the researchers. Since a positive relation was observed between the small world and the researchers’ team size, it can be concluded that researchers may benefit from the small world properties to get familiar with other active researchers in their field and expand their scientific team. This team expansion can bring them several advantages such as internal referring, better and faster access to expertise and other resources, new sources of funding, etc. that may result in a higher rate of publication for the whole team. But since the size of the team has grown, the average productivity becomes lower. Therefore, it seems that even though a small world network is not positively related to the individual productivity of the researchers, it might help them to invest their efforts in a more efficient way. Being involved in larger teams and getting in contact with other experts in the field allow them to not only gain new skills but also employ their skills more efficiently. Tighter collaboration among the team members can also create a synergy among them that is likely to result in higher productivity of the team. The positive relation between the small world structure and the papers’ quality also supports the idea that the small world structure may facilitate more effective exchange of knowledge among the team members that may result in higher quality works. However, it is also possible that the relation between scientific performance and small world structure depends more on the team size than on the small world property itself. For example, van Raan [63] found a positive relation between team size and quality of research.

As discussed before, small world properties were reported to follow the form of an inverted U-shape [53]. According to our results, the Canadian natural science and engineering network saw its latest peak in the period of [2006–2008], after which the article production started to decrease. Hence, according to Gulati et al. [53], a decreasing trend is predicted for the years after 2010, resembling an inverted U-shape curve. Considering the observed relations with the examined bibliometric measures, it is suggested that the structure of the network be reassessed periodically. In general, knowing the structure of the collaboration network and its relation with the performance measures may help decision makers to set better strategies for supporting collaborative activities.

Limitations and Future Work

The main limitation concerned the sample size. The time interval of 1996 to 2010 was selected because SCOPUS has weaker coverage before 1996. Moreover, articles need at least three years to be well cited, and as a result the periods after 2010 were not included. Future work can address this limitation by using other databases with a larger number of observations. More observations would allow analyzing the interrelations between the small world property and other network centrality measures to assess the combined impacts.

Another limitation concerned the calculation of the small world variable, for which only the largest component was considered. Although there are some suggestions in the literature for overcoming this limitation (e.g. [9]), they are applicable only when special-purpose customizable software for social network analysis is available to code a program that calculates the small world indicator over the whole network. However, as mentioned before, the proportion of the largest component in this study was larger than in other similar studies, which allowed us to make more realistic estimates of the small world measures.

Furthermore, we faced some limitations in measuring scientific collaboration among the researchers, as we were unable to capture other links that might exist among them, such as informal relationships. These types of connections are never recorded and thus cannot be quantified, but there are certainly some knowledge exchanges occurring in such associations that could affect the network performance. In addition, there are also some drawbacks in using co-authorship as an indicator of collaboration since collaboration does not necessarily result in a joint article [64]. An example could be the case when two scientists cooperate on a research project and then decide to publish their results separately [65]. Hence, future work can address this issue by taking other indicators into account. Finally, since the analysis presented in this paper was performed at the aggregate level, a future research direction could use a large dataset to investigate the relation between network structure variables and researchers’ performance indicators at the individual level.

Author Contributions

Conceived and designed the experiments: AE AS. Performed the experiments: AE. Analyzed the data: AE AS. Contributed reagents/materials/analysis tools: AE AS. Wrote the paper: AE AS.

References

  1. Guare J (1992) Six degrees of separation. Dramatists Play Service.
  2. Travers J, Milgram S (1969) An experimental study of the small world problem. Sociometry: 425–443.
  3. Uzzi B, Spiro J (2005) Collaboration and creativity: The small world problem. American Journal of Sociology 111(2): 447–504.
  4. Fleming L, Marx M (2006) Managing creativity in small worlds. California Management Review 48(4): 6–27.
  5. Cowan R, Jonard N (2004) Network structure and the diffusion of knowledge. Journal of Economic Dynamics and Control 28(8): 1557–1575.
  6. Kogut B, Walker G (2001) The small world of Germany and the durability of national networks. American Sociological Review: 317–335.
  7. Sullivan BN, Tang Y (2012) Small–world networks, absorptive capacity and firm performance: Evidence from the US venture capital industry. International Journal of Strategic Change Management 4(2): 149–175.
  8. Baum JA, Rowley TJ, Shipilov AV (2004) The small world of Canadian capital markets: Statistical mechanics of investment bank syndicate networks, 1952–1989. Canadian Journal of Administrative Sciences/Revue Canadienne des Sciences de l'Administration 21(4): 307–325.
  9. Schilling MA, Phelps CC (2007) Interfirm collaboration networks: The impact of large-scale network structure on firm innovation. Management Science 53(7): 1113–1126.
  10. Fleming L, King C, Juda AI (2007) Small worlds and regional innovation. Organization Science 18(6): 938–954.
  11. Glänzel W (2001) National characteristics in international scientific co-authorship relations. Scientometrics 51(1): 69–115.
  12. Savanur K, Srikanth R (2010) Modified collaborative coefficient: A new measure for quantifying the degree of research collaboration. Scientometrics 84(2): 365–371.
  13. Newman ME (2004) Who is the best connected scientist? A study of scientific coauthorship networks. Complex networks (Springer): 337–370.
  14. Goyal S, Van Der Leij MJ, Moraga‐González JL (2006) Economics: An emerging small world. Journal of Political Economy 114(2): 403–412.
  15. Moody J (2004) The structure of a social science collaboration network: Disciplinary cohesion from 1963 to 1999. American Sociological Review 69(2): 213–238.
  16. Guimerà R, Uzzi B, Spiro J, Amaral LAN (2005) Team assembly mechanisms determine collaboration network structure and team performance. Science 308(5722): 697–702. pmid:15860629
  17. Wuchty S, Jones BF, Uzzi B (2007) The increasing dominance of teams in production of knowledge. Science 316(5827): 1036–1039. pmid:17431139
  18. Lissoni F, Llerena P, Sanditov B (2013) Small worlds in networks of inventors and the role of academics: An analysis of France. Industry and Innovation 20(3): 195–220. pmid:24151698
  19. Yan E, Ding Y (2009) Applying centrality measures to impact analysis: A coauthorship network analysis. Journal of the American Society for Information Science and Technology 60(10): 2107–2118.
  20. Abbasi A, Altmann J, Hossain L (2011) Identifying the effects of co-authorship networks on the performance of scholars: A correlation and regression analysis of performance measures and social network analysis measures. Journal of Informetrics 5(4): 594–607.
  21. Kumar S, Jan JM (2013) Mapping research collaborations in the business and management field in Malaysia, 1980–2010. Scientometrics 97(3): 491–517.
  22. Eslami H, Ebadi A, Schiffauerova A (2013) Effect of collaboration network structure on knowledge creation and technological performance: The case of biotechnology in Canada. Scientometrics 97(1): 99–119.
  23. Kumar S, Jan JM (2014) Research collaboration networks of two OIC nations: Comparative study between Turkey and Malaysia in the field of ‘Energy fuels’, 2009–2011. Scientometrics 98(1): 387–414. pmid:24996737
  24. Fowler J (2005) Turnout in a small world. Temple University Press, Philadelphia: 269–287.
  25. Uzzi B, Amaral LA, Reed-Tsochas F (2007) Small‐world networks and management science research: A review. European Management Review 4(2): 77–91.
  26. Larivière V, Gingras Y, Archambault É (2006) Canadian collaboration networks: A comparative analysis of the natural sciences, social sciences and the humanities. Scientometrics 68(3): 519–533.
  27. Godin B (2003) The impact of research grants on the productivity and quality of scientific research. Ottawa: INRS Working Paper.
  28. Baum JA, Shipilov AV, Rowley TJ (2003) Where do small worlds come from? Industrial and Corporate Change 12(4): 697–725.
  29. Beaudry C, Allaoui S (2012) Impact of public and private research funding on scientific production: The case of nanotechnology. Research Policy 41(9): 1589–1606.
  30. Centra JA (1983) Research productivity and teaching effectiveness. Research in Higher Education 18(4): 379–389.
  31. Okubo Y (1997) Bibliometric indicators and analysis of research systems: Methods and examples. OECD Science, Technology and Industry Working Papers (1997/01), OECD Publishing, Paris.
  32. Kostoff RN (2002) Citation analysis of research performer quality. Scientometrics 53(1): 49–71.
  33. Couto FM, Grego T, Pesquita C, Verissimo P (2009) Handling self-citations using Google scholar.
  34. Seglen PO (1992) The skewness of science. Journal of the American Society for Information Science 43(9): 628–638.
  35. Gingras Y (1996) Bibliometric analysis of funded research. A feasibility study. Report to the Program Evaluation Committee of NSERC.
  36. Hanneman RA, Riddle M (2011) Concepts and measures for basic network analysis. The Sage Handbook of Social Network Analysis: 340–369.
  37. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684): 440–442. pmid:9623998
  38. De Nooy W, Mrvar A, Batagelj V (2005) Exploratory social network analysis with Pajek. Cambridge University Press.
  39. He J, Hosein Fallah M (2009) Is inventor network structure a predictor of cluster evolution? Technological Forecasting and Social Change 76(1): 91–106.
  40. Newman ME (2000) Models of the small world. Journal of Statistical Physics 101(3–4): 819–841.
  41. Liu X, Bollen J, Nelson ML, Van de Sompel H (2005) Co-authorship networks in the digital library research community. Information Processing & Management 41(6): 1462–1480.
  42. Fatt CK, Ujum EA, Ratnavelu K (2010) The structure of collaboration in the journal of finance. Scientometrics 85(3): 849–860.
  43. Yan E, Ding Y, Zhu Q (2010) Mapping library and information science in China: A coauthorship network analysis. Scientometrics 83(1): 115–131.
  44. Nascimento MA, Sander J, Pound J (2003) Analysis of SIGMOD's co-authorship graph. ACM Sigmod Record 32(3): 8–10.
  45. Wasserman S (1994) Social network analysis: Methods and applications. Cambridge University Press.
  46. Erdős P, Rényi A (1960) On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci. 5: 17–61.
  47. Albert R, Barabási A (2002) Statistical mechanics of complex networks. Reviews of Modern Physics 74(1): 47–97.
  48. Barabási A, Jeong H, Néda Z, Ravasz E, Schubert A, Vicsek T (2002) Evolution of the social network of scientific collaborations. Physica A: Statistical Mechanics and its Applications 311(3): 590–614.
  49. Newman ME (2001) Clustering and preferential attachment in growing networks. Physical Review E 64(2): 025102.
  50. Newman ME (2001) Scientific collaboration networks. I. network construction and fundamental results. Physical Review E 64(1): 016131. pmid:11461355
  51. Newman ME (2001) Scientific collaboration networks. II. shortest paths, weighted networks, and centrality. Physical Review E 64(1): 016132. pmid:11461356
  52. Davis GF, Yoo M, Baker WE (2003) The small world of the American corporate elite, 1982–2001. Strategic Organization 1(3): 301–326.
  53. Gulati R, Sytch M, Tatarynowicz A (2012) The rise and fall of small worlds: Exploring the dynamics of social structure. Organization Science 23(2): 449–471.
  54. Eisenhardt KM, Tabrizi BN (1995) Accelerating adaptive processes: Product innovation in the global computer industry. Administrative Science Quarterly 40(1): 84–110.
  55. Lin N (2002) Social capital: A theory of social structure and action. Cambridge University Press.
  56. Lazer D, Friedman A (2007) The network structure of exploration and exploitation. Administrative Science Quarterly 52(4): 667–694.
  57. Hausman JA, Hall BH, Griliches Z (1984) Econometric models for count data with an application to the patents-R&D relationship. Econometrica 52(4): 909–938.
  58. Coleman JS, Lazarsfeld PF (1981) Longitudinal data analysis. Basic Books, New York.
  59. Beaver DD, Rosen R (1979) Studies in scientific collaboration part III. professionalization and the natural history of modern scientific co-authorship. Scientometrics 1(3): 231–245.
  60. Rosenzweig JS, Van Deusen SK, Okpara O, Datillo PA, Briggs WM, Birkhahn RH (2008) Authorship, collaboration, and predictors of extramural funding in the emergency medicine literature. The American Journal of Emergency Medicine 26(1): 5–9. pmid:18082774
  61. Chen Z, Guan J (2010) The impact of small world on innovation: An empirical study of 16 countries. Journal of Informetrics 4(1): 97–106.
  62. Milgram S (1967) The small world problem. Psychology Today 2(1): 60–67.
  63. van Raan AF (2006) Performance-related differences of bibliometric statistical properties of research groups: Cumulative advantages and hierarchically layered networks. Journal of the American Society for Information Science and Technology 57(14): 1919–1935.
  64. Tijssen RJ (2004) Is the commercialisation of scientific research affecting the production of public knowledge?: Global trends in the output of corporate research articles. Research Policy 33(5): 709–733.
  65. Katz JS, Martin BR (1997) What is research collaboration? Research Policy 26(1): 1–18.