Abstract
Multiple studies have linked diversity in scientific collaborations to innovative and impactful research. Here, we explore how different diversity indices—ethnicity, gender, academic age, and topical expertise—interact and thereby influence scientific impact. Leveraging nearly 900,000 biomedical journal articles from PubMed, published in major journals between 1991 and 2014, we investigate the nuanced relationships among these diversity indices and their collective influence on research outcomes. By systematically varying model parametrizations, we assess the robustness of the observed relationships and examine multiple methodological choices. Our findings reveal a consistent pattern of demographic homophily, where scientists tend to collaborate with others who share similar ethnic and gender backgrounds. While each diversity index correlates significantly with impact when considered individually, gender diversity and topical expertise emerge as the strongest positive predictors of impact after accounting for key covariates. However, the association between diversity and impact is moderated by the number of collaborating authors, with larger teams sometimes showing opposite trends due to interactions between the computed diversity indices and team size. Despite this complexity, the practical drivers of scientific impact for an article remain the journal of publication, authors’ prior citation rate, and the number of co-authors. On further examining expertise diversity through three separate dimensions: variety, balance, and disparity, our findings indicate that impactful teams balance a wide range of subject matter expertise while maintaining a focused connection on closely related topics. These findings highlight the importance of strategic team composition and underline the significance of team diversity in scientific research.
Citation: Mishra A, Lee H, Jeoung S, Torvik VI, Diesner J (2025) Patterns of diversity in biomedical coauthorships: An analysis across authors’ ethnicity, gender, age, and expertise. PLoS ONE 20(1): e0316890. https://doi.org/10.1371/journal.pone.0316890
Editor: Sonia Vasconcelos, Institute of Medical Biochemistry Leopoldo de Meis (IBqM) - Federal University of Rio de Janeiro (UFRJ), BRAZIL
Received: March 26, 2024; Accepted: December 16, 2024; Published: January 31, 2025
Copyright: © 2025 Mishra et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data used for analysis is available in a public repository: https://doi.org/10.13012/B2IDB-5259667_V3.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Modern society places significance on the concept of diversity, as evidenced by various scientific articles [1, 2]. Diversity, which reflects differences among individuals across a range of characteristics or attributes, is recognized for its role in improving economic productivity [3], social cohesion [4], innovation, and creativity [5]. However, the effectiveness of diversity initiatives, particularly those targeting intersections of socio-demographic factors like ethnicity, gender, age, and socio-economic status, remains unclear [5, 6]. It is possible that continually increasing diversity may not positively impact knowledge integration and social engagement [7]. For example, ethnic heterogeneity among groups can create polarization, leading to reduced economic growth and lower levels of economic investment [3, 8]. While some studies show that diversity benefits mental health and community well-being, diversity that is not managed effectively can lead to adverse effects such as conflict and feelings of resentment [4, 9]. Highly heterogeneous teams may have different perceptions of the assigned task, creating knowledge boundaries among team members. Furthermore, an individual has multiple facets of identity (like gender, ethnicity, class), and the intersectionality of identities is essential to explore because diversity dimensions are interconnected [10]. Thus, diversity can boost creativity and economic benefits, but it also presents challenges, such as discomfort and distrust in interactions across different backgrounds [3, 11, 12].
The practice of scientific collaboration in academia illustrates the diversity in action, as co-authorship reflects the attributes of researchers involved [12, 13]. Researchers come together across disciplines, skills, community backgrounds, and varied institutions to achieve higher-quality results and solve critically important issues. Such collaborative efforts often result in more cited papers (a crude measure of higher impact science) and can also significantly influence societal outcomes and social media visibility [14, 15]. Consequently, many socio-cultural elements in research collaborations have been examined to determine their relative significance and association with scientific impact. Generally, research papers exhibit a positive correlation between the number of collaborating authors and both the article’s impact and the quality of peer review [16, 17]. Previous studies have also indicated that ethnic diversity has the strongest association with scientific impact compared to factors like gender, academic experience, or scientific field [18, 19]. Additionally, researchers have a tendency towards ethnic homophily, indicating that individuals collaborate more with others of the same ethnic background [20]. However, in some fields, ethnically diverse teams may receive fewer citations [21], and ethnic homophily often correlates with lower-impact publications [20]. For gender, productive research teams usually include a mix of genders, with gender diversity being associated with a broader group of research questions being addressed [22–24]. Male researchers are more likely to collaborate with each other and across international borders, while women are disadvantaged in citation networks, especially when holding first or last author positions [25]. 
The gap in scientific productivity for women further increases with the increase in age, with potential factors including smaller collaborative networks, unfair division of labor, and limited access to funding [26]. The best-performing teams usually have a strong core of authors from the same department or institution, combined with a moderate level of international collaboration [27]. International collaboration tends to enhance the visibility of research articles; however, women being disadvantaged in collaboration networks often leads to receiving fewer citations [25]. At the same time, some countries might contradict this general trend; women researchers in India tend to have more international collaboration [28]. Teams with multiple institutions tend to be more creative, resulting in more significant research regarding the article impact and citation count [29, 30]. The increased global mobility among researchers also contributes to diverse collaborations, mainly due to the presence of elite institutions that drive scientific breakthroughs [31, 32].
Scientific impact and its relationship with collaboration diversity
Scientific collaborations among researchers present an important domain for studying co-authorship diversity, with citation-based proxies frequently used to measure scientific impact [33]. At the article level, several methodologies have been developed to evaluate scientific impact, including citation percentiles, eigenvector normalizations, and source-normalized article metrics [34–37]. Such approaches reflect the wide range of quantitative metrics used in previous studies but also highlight the lack of standardization in defining a publication’s impact. Commonly used measures, such as the five-year citation window and the Hirsch index (h-index), are often applied to assess the impact of articles, journals, and researchers [18, 38, 39]. When examining the relationship between ethnic diversity in research teams and citation impact, researchers have utilized citation counts over three and six years, as well as the total citation count up to the date of data retrieval [19, 20]. Some researchers compare citation ratios in relation to the impact factors of journals [40], while others have opted for field-normalized citation scores [41]. Gender diversity has similarly been studied using raw citation counts [21]. Studies have also adopted different time frames for assessing the long-term impact of citations, with ranges including 13 years, 18 years, 21 years, and some even focusing on citations within the first one or two years after publication [42–44]. The choice of citation window and the use of different citation normalization methods can significantly affect research outcomes. Shorter time windows tend to capture more immediate discourses and visibility [45], whereas citations accumulated over longer periods are more reflective of established knowledge and lasting influence. Despite such variations, all forms of citations are generally regarded as indicators of a publication’s scientific impact [46].
Citation practices also vary by discipline, with each field having its own growth rate, delay, and citation pattern [47]. The timing of when an article reaches its peak citation count can also differ depending on the journal of publication [48]. To address the limitations of traditional metrics, this study uses the relative citation ratio (denoted as RCR), a field-normalized and article-level bibliometric assessment of scientific productivity [49]. RCR offers a more nuanced approach to citation analysis, accounting for the variations in distinct academic fields through a co-citation network, and is freely available through a web-based tool, iCite [50]. ‘Scientific impact,’ in this study, thus refers to the paper-level citation impact computed using the relative citation ratio.
With this framework, we aim to answer whether diverse author characteristics in collaborative research are linked to a greater scientific impact of publications. Although the previous studies outlined above have examined the impact of diversity within various research groups, this analysis specifically focuses on biomedical data at the article level, using a snapshot of PubMed, an online database hosted by the U.S. National Library of Medicine. The relative importance of the four diversity indices, ethnicity, gender, academic age, and topical expertise, is evaluated in relation to impact, adjusting for confounders and interaction effects. Using PubMed uniquely allows us to leverage the capabilities of specific “MeSH” keywords (Medical Subject Headings), used to define an author’s “expertise.” MeSH constitutes a controlled and hierarchically organized vocabulary employed in PubMed to categorize research papers, thereby enabling efficient information indexing, retrieval, and classification [51]. Expertise thus covers the author-level contribution to an article and is a temporal quantification of the author based on common concepts (such as diseases, therapies, and procedures) covered in prior papers. Therefore, this study first computes author-level attributes, then calculates an article-level diversity value (the diversity index) for each attribute, and finally examines the relationship of these indices with the article’s impact. The MeSH terms are considerably different from broader classifications like subject categories (in Web of Science) utilized in studies related to the topic of ‘Interdisciplinarity,’ which is an article-level measure generally based on references [52, 53].
The data reveal strong evidence of homophily in ethnicity and gender in research co-authorships, indicating that researchers tend to collaborate with those with similar demographic backgrounds. Using a robust analysis framework that accounts for multiple sources of variation, we utilize generalized linear models to identify the relative importance of the diversity indices for scientific impact while adjusting for confounders and interaction effects. For most teams, gender and expertise diversity show a significant and consistent positive association with scientific impact, whereas ethnicity and age diversity show a negative relationship. The interaction between diversity and author count complicates this relationship, with author count acting as a moderating factor, especially at higher levels of diversity. At the same time, the practical significance of the results is influenced by the inclusion of covariates and the interaction between terms, as the strength of the diversity estimates decreases when they are included. Additionally, we explore the abstract nature of expertise diversity through a multidimensional conceptualization, capturing three distinct attributes: variety, balance, and disparity, each representing an essential but individually incomplete aspect of diversity [12, 52, 54, 55].
- Variety represents the number of distinct elements contributing to expertise diversity, measured by the unique count of MeSH terms for an article.
- Balance reflects the distribution of those elements within expertise diversity, indicating the relative proportions of MeSH terms across the article.
- Disparity captures the topical distance between elements, measuring how distant the MeSH terms are from one another.
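The three attributes listed above can be made concrete with a small sketch. The formulas here are assumptions chosen for illustration, not the definitions used in the study (those are given in the S1 File): balance is taken as Shannon evenness, and disparity as the mean path length between dot-separated tree numbers. The tree numbers in the example are made up and do not refer to real MeSH descriptors.

```python
import math
from collections import Counter
from itertools import combinations


def variety(terms):
    """Number of distinct terms attached to an article."""
    return len(set(terms))


def balance(terms):
    """Shannon evenness in [0, 1]; 1 means all terms are equally frequent.
    (Assumed formula, for illustration only.)"""
    counts = Counter(terms)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def tree_distance(a, b):
    """Edges between two tree numbers via their deepest shared prefix."""
    parts_a, parts_b = a.split("."), b.split(".")
    shared = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        shared += 1
    return (len(parts_a) - shared) + (len(parts_b) - shared)


def disparity(terms):
    """Mean pairwise tree distance between distinct terms (assumed formula)."""
    pairs = list(combinations(sorted(set(terms)), 2))
    if not pairs:
        return 0.0
    return sum(tree_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical tree numbers pooled from the prior papers of an article's authors.
example_terms = ["A01.100", "A01.100", "A01.200", "B02.300.400"]
example_variety = variety(example_terms)
example_balance = balance(example_terms)
example_disparity = disparity(example_terms)
```

In this toy input, two terms share a parent and one sits in a different branch, so disparity is driven mostly by the distant term while variety only counts how many distinct terms there are.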
Our findings show that scientific impact initially declines with increasing variety but rises at higher levels, suggesting that research benefits when collaborators, as a group, have collectively explored a broad range of topics. However, scientific impact tends to decrease with higher levels of balance and disparity, indicating that teams composed of researchers with subject expertise that is too diverse or focused on diverging unrelated areas are less likely to produce highly cited work. Collaborators with a shared, well-defined disciplinary focus in their past research tend to have more citation success. These results suggest that while diversity is valuable, a more cohesive and focused collaboration leads to higher-impact research.
Results
Descriptive statistics
This study examines the diversity among co-authors of published articles in terms of four different measures: ethnicity (eth), gender (gen), academic age (age), and expertise (exp), with each measure representing a social factor that can potentially influence collaboration. Here, ethnicity and gender (M, F, and Unknown) are values algorithmically predicted from an author’s name. Academic age refers to the experience a researcher has in terms of the count of prior published papers, and expertise indicates a set of topical keywords that best reflect a researcher’s specific knowledge and skills within PubMed. Table 1 presents the descriptive statistics and correlation values between the different diversity indices, dx: x ∈ (eth, gen, age, exp). The results show no substantial correlation between them, indicating the absence of a significant linear relationship, with minor correlations likely due to the rise in diversity as the number of authors increases. The distribution of expertise (exp) is right-skewed, so a logarithmic transformation (denoted as Log.dexp) is applied to normalize the distribution and improve the reliability of statistical modeling. Taking this into account, we proceed with further testing and regression analysis to understand the relationship between diversity and scientific impact.
Examining patterns of homophily
To begin our analysis, we first investigate the presence of homophily in author collaborations, focusing on whether researchers tend to collaborate more frequently with individuals who share similar characteristics. The process involves generating randomly shuffled datasets for each diversity index across all considered articles, conditioned on publication years and author counts. Repeating this shuffling process multiple times allows for calculating the diversity distribution under random conditions, thereby enabling a comparison with observed trends (see Note 2.3 in S1 File for details defining this process). Fig 1 compares the observed data value to the randomized data distribution for each diversity index, dx: x ∈ (eth, gen, age, Log. exp). The first row, Fig 1A, indicates the cumulative distribution for each diversity index, revealing pronounced homophily with respect to ethnic diversity, moderate homophily in gender diversity, and no significant patterns in age and expertise. This indicates that researchers collaborate with others of similar ethnic and gender backgrounds more often than would occur by chance. The temporal analysis, reflected in Fig 1C, indicates that both real dgen and deth have increased over the period of the dataset. This finding suggests a growing inclusion of researchers from a wider range of ethnic backgrounds as well as a gradual increase in female participation in research. In addition, team size correlates positively with diversity across all indices, indicating a potential interaction between diversity and author count. Lastly, papers with a lower author count (two or three) often display higher-than-expected age diversity, likely reflecting collaborations between junior students and more experienced researchers.
Each column corresponds to a diversity index, dx:x ∈ (eth, gen, age, Log. exp), and each row compares the observed to the randomized data based on a specific experiment. A: Cumulative distribution of dx. B: Count values of dx across the span of possible values. C: Change in mean diversity index value over time. D: Change in mean diversity index value over author count.
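The shuffling procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation (see Note 2.3 in S1 File for that). The Gini-Simpson index is used here as a stand-in diversity measure, and the function names and the synthetic team generator are hypothetical.

```python
from collections import defaultdict

import numpy as np


def gini_simpson(labels):
    """1 - sum(p_k^2): 0 when all authors share a label (assumed index)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))


def shuffled_mean_diversity(articles, rng):
    """One random reassignment of author labels to articles, shuffling only
    within strata that share publication year and author count."""
    strata = defaultdict(list)
    for idx, (year, labels) in enumerate(articles):
        strata[(year, len(labels))].append(idx)
    values = []
    for (_, size), idxs in strata.items():
        pool = np.concatenate([articles[i][1] for i in idxs])
        rng.shuffle(pool)  # permute authors across the articles in the stratum
        for team in pool.reshape(len(idxs), size):
            values.append(gini_simpson(team))
    return float(np.mean(values))


# Synthetic homophilous data: each 3-author team leans toward one label.
rng = np.random.default_rng(0)
articles = []
for _ in range(200):
    core = str(rng.choice(["F", "M"]))
    other = "M" if core == "F" else "F"
    labels = [core if rng.random() < 0.9 else other for _ in range(3)]
    articles.append((2000, labels))

observed_mean = float(np.mean([gini_simpson(labels) for _, labels in articles]))
null_means = [shuffled_mean_diversity(articles, rng) for _ in range(200)]
```

If the observed mean diversity falls below essentially all of the shuffled means, as it does for this synthetic input, the teams are less diverse than random assembly would produce, which is the signature of homophily.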
Relationship between diversity and scientific impact
While homophily is observed for deth and dgen, a linear model cannot be confirmed to capture the optimal fit. Subsequently, as plotted in Fig 2, a quadratic term with a dummy variable captures non-linearities and facilitates a better fit to the data. A higher adjusted R-squared for the quadratic model and a significant second-order coefficient confirm the superior fit for each diversity index (Table B in the S1 File confirms the improved fit for the higher-order model). Additionally, a positive and statistically significant correlation is observed between the mean grouped diversity value (grouped on author count and year of publication) and scientific impact (see S1 Fig). However, the range of association, varying between 0.48 and 0.72, indicates that the relationship between scientific impact and the diversity indices could be of a higher order. Also, S2 Fig highlights the interaction between the diversity indices and author count, underscoring the importance of author count as a crucial variable in our regression model. By controlling for author count, we account for the differences in collaborative scale and are able to capture team-size relationships. The line plots for different curves based on varying author counts reveal the need to account for author interactions and ensure that the relationship with impact is not biased or inflated. Furthermore, papers are segmented by journal domain (biology, medicine, and science) to obtain more structured insights among broad subject categories in PubMed, as evidenced by a previous study [56] (see Table A in S1 File to view journal distribution in the dataset).
Each curve is overlaid on a histogram of the diversity distribution, with an asterisk (*) marking the mean for each bin. The y-axis indicates the logarithmic value of RCR as used in the final model.
Considering the right-skewed distribution of the dependent variable (RCR), a Tweedie regression model with a natural logarithm link function is applied to appropriately capture the relationship between impact and the independent variables [57]. The Tweedie model, suited for data with a mix of zero and positive values, is fitted using the ideal power parameter (1.95), minimizing the Akaike Information Criterion (AIC) for optimal fit. This choice is supported by the observed heteroscedasticity and non-normal residuals, which violate ordinary least squares (OLS) regression assumptions. The variance inflation factor (VIF) less than 5 indicates that multicollinearity is not a concern, meaning that the model can accurately estimate regression coefficients. Additionally, the following variables known to be related to citations from previous studies were incorporated as controls—author counts, time since publication, abstract length, institutional impact, authors’ prior impact, paper novelty, journal impact, funding presence, and international collaboration [58–60] (Table C in S1 File justifies the added covariates improving the model fit, and Table D in S1 File explains the optimal order of the added covariates). The low correlation between the independent variables supports their inclusion in the regression model (presented in Table E in S1 File). Table 2 presents the regression results for both the simple models, where scientific impact is regressed individually against each diversity index, and the full model includes all covariates. In the simple models, all diversity indices are statistically significant, with relatively larger effect sizes. However, when controlling for additional factors in the full model, we observe significant coefficient changes, suggesting potential confounding effects. Our analysis reveals that while all diversity indices are statistically significant, their effect sizes vary depending on the model’s included factors (see Fig 3). 
The regression is justified by the fits shown in S3 Fig, and the effect sizes capture the multiplicative effects of the diversity indices and the confounding variables on scientific impact.
Each subplot shows the contribution of the specific diversity index (linear, quadratic, and author interaction) at each iterative model fitting process, post-evaluating the best combination of variables. In all, confounding factors minimize the effect size of the diversity index and the diversity-author interaction.
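To illustrate the kind of model described above, the following sketch fits a generalized linear model with a log link and a Tweedie-style variance function V(mu) = mu^1.95 using iteratively reweighted least squares. It is a hedged illustration on synthetic data: the software the authors used is not stated here, the coefficient values are invented, and gamma-distributed draws merely stand in for a right-skewed impact score.

```python
import numpy as np


def fit_log_link_glm(X, y, var_power=1.95, max_iter=100, tol=1e-10):
    """IRLS for a GLM with log link and variance V(mu) = mu**var_power.
    Column 0 of X is expected to be the intercept column of ones."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start from the intercept-only fit
    for _ in range(max_iter):
        eta = X @ beta                       # linear predictor
        mu = np.exp(eta)                     # inverse of the log link
        weights = mu ** (2.0 - var_power)    # 1 / (V(mu) * g'(mu)^2)
        z = eta + (y - mu) / mu              # working response
        xtw = X.T * weights
        new_beta = np.linalg.solve(xtw @ X, xtw @ z)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


rng = np.random.default_rng(0)
n = 20_000
d = rng.normal(size=n)        # standardized diversity index (synthetic)
team = rng.normal(size=n)     # standardized author count (synthetic)
true_beta = np.array([0.10, 0.04, 0.02, -0.03])  # illustrative values only
X = np.column_stack([np.ones(n), d, d ** 2, d * team])
mu_true = np.exp(X @ true_beta)
y = rng.gamma(shape=2.0, scale=mu_true / 2.0)    # right-skewed outcome

beta_hat = fit_log_link_glm(X, y)
rate_ratios = np.exp(beta_hat[1:])  # multiplicative change in expected impact
```

Because of the log link, each coefficient acts multiplicatively: a coefficient b corresponds to a factor of exp(b) in expected impact per unit change of its term. In practice a statistics package with a Tweedie family would normally be used instead of hand-written IRLS; the loop is spelled out here only to keep the example self-contained.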
Ethnic diversity shows a positive coefficient in the simple model (0.124) but shifts to a negative association (-0.0258) in the full model. This significant change suggests that, while ethnic diversity initially appears to boost impact, the inclusion of control variables reveals a negative relationship. The quadratic term for ethnic diversity follows a similar pattern, decreasing from -0.013 to -0.0058, indicating a stronger diminishing return when additional factors are considered. However, a significant positive interaction with the author count in the final model (coefficient: 0.0569) suggests that larger research teams (greater than the median team size of 5) would benefit from greater ethnic diversity. Similarly, gender diversity presents a slight negative coefficient in the simple model (-0.008), but in the full model, this shifts to a positive and significant coefficient (0.0372). The quadratic term for gender diversity also reverses, from -0.016 to 0.0249, highlighting that the positive impact of gender diversity grows as the diversity value increases. Interestingly, while combining ethnic and gender diversity tends to reduce impact (coefficient: -0.0122), larger author teams mitigate this estimate in terms of practical significance, suggesting that team size plays a key role in effect size. Age diversity shows a consistent negative association across both models, though the magnitude intensifies in the full model (-0.041 to -0.0627), suggesting that as team members’ academic ages diverge, the negative impact on scientific outcomes becomes more pronounced when other factors are controlled. However, the positive interaction with gender diversity indicates that mixed-gender teams with junior and senior researchers yield better research outcomes. 
Expertise diversity, on a logarithmic scale, shows a positive relationship with scientific impact (coefficient: 0.0136), with this effect size increasing at higher levels (positive quadratic term of 0.0192), highlighting the value of team members with complementary knowledge backgrounds. However, a negative interaction of expertise diversity with author count reveals that this benefit is most pronounced in smaller and medium-scale teams. As the author count increases, there exists a diminishing result due to the interaction effect, suggesting that while diversity is advantageous, larger teams may encounter challenges such as conflicts or inefficiencies, which can reduce their overall scientific impact.
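The interplay between a linear term, a quadratic term, and an interaction with author count can be read off from the marginal effect on the log-impact scale. The sketch below uses the full-model ethnic diversity coefficients quoted above, under two assumptions that this section does not confirm: that the terms enter the linear predictor as b1·d + b2·d² + b_int·d·n, and that both the diversity index d and the author count n are expressed in the model's own (for example, standardized) units. The resulting numbers are therefore illustrative only.

```python
# Full-model ethnic-diversity coefficients quoted in the text.
B1, B2, B_INT = -0.0258, -0.0058, 0.0569


def marginal_effect(d, n, b1=B1, b2=B2, b_int=B_INT):
    """Derivative of b1*d + b2*d**2 + b_int*d*n with respect to d."""
    return b1 + 2.0 * b2 * d + b_int * n


def break_even_author_count(d, b1=B1, b2=B2, b_int=B_INT):
    """Author-count value (in model units) where the marginal effect is zero."""
    return -(b1 + 2.0 * b2 * d) / b_int


effect_at_reference = marginal_effect(d=0.0, n=0.0)
effect_larger_team = marginal_effect(d=0.0, n=1.0)
threshold = break_even_author_count(d=0.0)
```

Under these assumptions, the marginal association of ethnic diversity with impact is negative at the reference author count and becomes positive once the author count exceeds the break-even value, which mirrors the qualitative statement that larger teams benefit from greater ethnic diversity.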
Beyond the explored diversity indices, our analysis reveals the other key factors driving scientific impact. The journal’s impact emerges as the strongest predictor, showing a non-linear positive relationship with scientific impact: higher-impact journals increase influence, but this association might taper off, suggesting that other factors become more important in determining scientific impact at the highest levels. The other variables positively correlated with impact include the authors’ prior citation count and the author count, indicating that established researchers and larger collaborative teams produce more influential work, as also evidenced by prior studies [61, 62]. Abstract length demonstrates a curvilinear relationship with impact, suggesting an optimal length for maximizing visibility and citations. International collaboration and funding have modest yet significant positive effects, highlighting the importance of diverse partnerships and financial backing in improving research outcomes [27]. Scientific impact also increases with time since publication and institutional prestige, though the latter follows a slight non-linear trend. This relationship could imply that extremely prestigious institutions publish a larger volume of research, some of which may not achieve a high impact compared to their other work. Novel research follows a non-linear relationship with impact, where highly novel papers may struggle initially to gain recognition but tend to accumulate more citations as time passes. The findings further highlight the multifaceted nature of scientific impact, emphasizing the interplay between researcher attributes, institutional prestige, collaboration patterns, and publication strategies in shaping the influence of scientific work.
Robustness tests
To ensure the robustness and validity of the results, we conduct a specification curve analysis, evaluating model results across 81 plausible specifications [63]. This methodological approach tests a comprehensive range of reasonable specifications, namely those that enable sensible testing of the research question, are expected to be statistically valid, and avoid redundancy. Additionally, this analysis is complemented by an inferential component combining all specifications into a joint statistical test, assessing the combined evidence supporting the estimate of interest. The model choices evaluated are detailed in Table J of the S1 File, and a comprehensive process description is provided in Note 2.5 of the S1 File.
The results show consistent patterns, with stable median effect sizes across specifications, and highlight the necessity of including author count to avoid biased outcomes (S4 Fig illustrates the marginal ordered effect across all specifications). Additionally, the robustness is further assessed by comparing the results to a shuffled null distribution, where the diversity index values were randomized. This process simulates what the findings would look like if no true effect existed. By contrasting the observed specification curve (showing actual effect sizes) with the null distribution, we can determine whether the observed effects are genuine or merely due to random variation. Our findings indicate that gender and expertise diversity have a positive, consistent impact on scientific outcomes, while ethnicity and age diversity tend to show negative relationships. Table 3 summarizes our key tests, comparing the observed effects to the null hypothesis of no effect (S5 Fig illustrates the observed and the under-the-null curve for each diversity index across specifications).
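The logic of a specification curve with a shuffled null can be sketched in a few lines. The example below is a simplified illustration on synthetic data: it uses ordinary least squares and only eight hypothetical specifications (two optional controls and an optional outcome trim), not the 81 specifications or the Tweedie model of the study, which are described in Table J and Note 2.5 of the S1 File.

```python
import itertools

import numpy as np

rng = np.random.default_rng(1)
n = 3000
d = rng.normal(size=n)         # diversity index (synthetic)
team = rng.normal(size=n)      # author count (synthetic)
journal = rng.normal(size=n)   # journal impact (synthetic)
y = 0.2 * d + 0.3 * team + 0.5 * journal + rng.normal(size=n)


def diversity_coef(outcome, diversity, controls):
    """OLS coefficient on the diversity index for one specification."""
    X = np.column_stack([np.ones(len(outcome)), diversity] + controls)
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1]


def specification_curve(outcome, diversity):
    """Sorted diversity coefficients over all combinations of model choices."""
    effects = []
    for use_team, use_journal, clip in itertools.product([False, True], repeat=3):
        controls = ([team] if use_team else []) + ([journal] if use_journal else [])
        target = np.clip(outcome, None, np.quantile(outcome, 0.99)) if clip else outcome
        effects.append(diversity_coef(target, diversity, controls))
    return np.sort(effects)


observed_curve = specification_curve(y, d)
observed_median = float(np.median(observed_curve))

# Null distribution: shuffle the diversity index so no true effect remains.
n_shuffles = 200
null_medians = np.array([
    np.median(specification_curve(y, rng.permutation(d)))
    for _ in range(n_shuffles)
])
p_value = (1 + np.sum(np.abs(null_medians) >= abs(observed_median))) / (1 + n_shuffles)
```

The joint test asks how often a median effect at least as extreme as the observed one arises when the diversity values are randomized. A small share indicates that the observed curve is unlikely to be produced by chance alone.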
Multidimensional conceptualization of expertise diversity
Considering the significant and positive relationship of expertise diversity with scientific impact, it is important to present a fine-grained analysis of this measure of diversity. Thus, here we study expertise diversity as a diversity of categories consisting of three separate dimensions: variety, balance, and disparity [12, 55]. Table 4 confirms that the correlations among the three attributes, variety, balance, and disparity, are not significant, indicating distinct attributes of expertise diversity, dexp. Specifically, we observe negative correlations between variety and balance (-0.319) and between balance and disparity (-0.278), and a positive correlation between variety and disparity (0.058). This confirms that the three attributes reflect distinct properties of expertise diversity, dexp, and provides evidence for examining their individual relationships with scientific impact.
Table F in S1 File presents the results of the complete regression model that includes all possible confounders. Utilizing a Tweedie regression model (with a logarithmic link function), we observe a U-shaped relationship with variety initially reducing impact (-0.026) but eventually enhancing it (0.011) as the range of topics widens. This pattern suggests that adding more topics may introduce complexities leading to lower article impact; however, as the number of topics increases, with additional authors who bring a broader range of topics, the article’s relevance could improve, enhancing its impact. Conversely, both balance (first order: -0.017 and second order: -0.005) and disparity (first order: -0.059 and second order: 0.009) consistently show a negative association with scientific impact. Therefore, articles with a balanced spread of topics or too distant in the MeSH tree generally see reduced impact. Most articles would benefit from lower disparity values, as a higher disparity suggests that authors from highly diverse fields struggle with communication and methodological alignment, thus lowering the article’s overall impact. However, the negative relationship between disparity and impact weakens beyond a certain threshold (positive second order: 0.009) as the marginal negative estimate gradually decreases. This trend indicates that the relationship between disparity and impact might turn positive, but the threshold is about four standard deviations from the mean and is only reached for a minority of articles. The curvilinear relationships between the three expertise diversity attributes and scientific impact are illustrated in Fig 4.
(a) Second-order regression fit of standardized expertise diversity attributes—variety, balance, and disparity against scientific impact (RCR). (b) Distribution of articles over variety, balance, and disparity, highlighting an initial decline then increase in impact with variety, while balance and disparity mostly reduce impact.
Discussion
Main findings
This study investigates how different types of diversity in author collaborations are associated with scientific impact in PubMed data. We first establish homophily in author collaboration in the case of ethnicity, deth and gender, dgen, implying that researchers tend to associate with other individuals of the same ethnicity and gender. Moreover, overall gender and ethnic diversity in research teams has increased as the researcher pool has become more diverse. Since research collaborations are a form of social interaction, it is essential to include adequate covariates to explain their impact. Examining diversity in isolation, without proper controls, can lead to inflated or incorrect estimates. The results indicate that interactions between diversity indices and significant covariates (such as author count) are crucial for accurately identifying their true relationship with scientific impact.
Our analysis shows that gender and expertise diversity enhance scientific impact for most publications. The positive relationship between gender diversity and scientific impact suggests that greater representation of women likely contributes to improved creativity and team dynamics, leading to improved article impact. Similarly, expertise diversity provides a broader range of knowledge (conceptual MeSH terms) and leads to impactful publications. However, the diminishing association of expertise diversity in larger teams (with six or more authors), reflected in its negative interaction with author count, highlights potential coordination and communication challenges in managing diverse expertise at scale. This suggests that while team gender and expertise diversity drive impactful papers, their benefits may plateau as the author count increases. Larger teams may struggle to fully leverage diverse expertise, suggesting that novelty from expertise diversity doesn’t always translate into high-impact papers in larger collaborations. Ethnic and age diversity show more complex effects, depicting a negative relationship that could reflect systemic biases or challenges in integrating diverse cultural backgrounds and age gaps. However, the positive interaction between ethnic diversity and author count suggests that larger teams (with seven or more authors) may better manage and leverage ethnic diversity, likely due to more resources or stronger support structures. While all diversity indices are statistically significant in the final regression model, the primary practical drivers of scientific impact are high-impact journals, larger teams, authors’ prior citations, and affiliations with prominent institutions. Established researchers in well-resourced teams consistently produce more influential work. Some studies have linked collaboration diversity with improved collective decision-making, enhanced team creativity, and even greater brain synchronization [65]. 
However, other research reports little to no positive relationship between demographic diversity and team performance [66]. Our findings show that while diversity matters, it has a modest correlation with overall impact.
Lastly, an important contribution of this research is the operationalization of expertise diversity through three distinct attributes: variety, balance, and disparity. Variety, defined by the number of unique MeSH subcategories, initially has a negative association with impact, but the relationship reverses once a certain threshold is surpassed, likely due to the integration of diverse perspectives. Balance, representing the spread of expertise (distribution of MeSH terms across all authors), consistently has a negative relationship with impact, especially at higher values. This suggests that spreading co-authors’ knowledge and skills evenly across too many areas can hinder effective collaboration, likely due to coordination difficulties or a lack of deep focus on key topics. Disparity, or the cognitive distance between MeSH terms, begins with a negative association with scientific impact but stabilizes at higher values. This trend indicates that while too much distance between collaborators regarding knowledge concepts may impede cohesion, a moderate level of disparity can bring fresh insights into research. Together, the results suggest that high-impact publications benefit from a delicate balance of expertise diversity: teams should engage collaborators with a broad range of expertise (high variety), but concentrate on a few core disciplines (low balance) and maintain a low to moderate cognitive distance (disparity).
Conclusion
Our findings demonstrate that diversity in research teams spanning expertise, gender, ethnicity, and age uniquely shapes scientific impact in complex ways. Here, we align with perspectives from social epistemology that knowledge production is social, where team composition plays a crucial role in research outcomes [67]. Groups with varied backgrounds and training lead to better data processing and decision-making, a concept studied as cognitive diversity [68]. However, while we examined demographic diversity (ethnicity, gender, age) and diversity in knowledge (expertise), other dimensions of cognitive diversity, such as methodological preferences and personality influences, could also be beneficial toward scientific impact.
The results indicate the positive impact of expertise diversity, aligning with arguments that teams with diverse knowledge bases and thinking styles often outperform more homogeneous groups in addressing complex problems. When researchers from various specialties collaborate, they contribute distinct analytical tools and knowledge bases, and tasks are essentially distributed extending beyond individuals [68, 69]. The positive association between gender diversity and scientific impact aligns with epistemological frameworks that advocate for integrating diverse viewpoints, suggesting that social positions bring necessary perspectives to the table [70]. Diverse teams may identify different research questions or interpret data through various theoretical perspectives. Furthermore, the positive outcomes associated with ethnic and age diversity in larger teams imply that well-resourced teams can more effectively distribute ‘cognitive tasks’ [69]. Our analysis of team size and collaboration patterns highlights how diverse methods and knowledge bases within research networks help avoid incorrect or suboptimal conclusions [71]. This exchange appears more effective in larger, well-structured teams, where varied perspectives are more readily managed and integrated. For instance, researchers from different disciplines may conceptualize the same problem through distinct theoretical lenses, employ varied methodological approaches, or draw insights from different bodies of knowledge [4]. Our findings on expertise diversity attributes show that successful collaboration depends not just on assembling cognitively diverse teams but on creating conditions where different thinking styles can effectively combine. These results have practical implications for research practices. We propose that teams enhance cognitive diversity by including members with different backgrounds, problem-solving approaches, and analytical frameworks. 
Institutions can support this by integrating diverse perspectives, and team leaders can adopt strategies to maximize their impact. Future research could explore how new communication patterns relate to integration [70, 71]. Overall, our study shows that fostering diversity with the right structures is key to improving research impact and advancing scientific knowledge [68].
Limitations
This study investigates whether diversity within scientific teams can positively influence the impact of their publications. While the results are consistent across a set of model specifications, they should be interpreted with caution, given various limitations. First, our data was sampled from three broad areas (biology, medicine, and science), represented by the top 40 most frequent journals in PubMed, and restricted to articles with 2–12 authors. Although this selection offers robust and accessible data on authorship and publications, PubMed primarily aggregates articles from biomedical and life sciences journals (e.g., Nature, PNAS, Science). As a result, the findings may not extend to disciplines outside biomedicine, health, or life sciences. Next, to ensure the robustness of our results, we employed specification curve analysis, testing our model findings across multiple parametrizations. However, some researchers may consider certain specifications superior to others. Furthermore, while our analysis encompasses numerous valid specifications, it cannot exhaustively capture every possible analytical approach. Thus, this analysis reduces analytical ambiguity but cannot eliminate it.
Next, we used the Rao-Stirling index to measure diversity, but this is just one option among several viable alternatives, each with its strengths [72–74]. The gender classification tool Genni [75] accounts for the complexities of predicting gender from English names, but it may misclassify many Asian names as “unknown/unisex.” This suggests a potential link between predicted gender and ethnicity, especially given the unique nature of the categories in our dataset (presented in Table H in S1 File). Also, the ethnicity prediction tool, Ethnea [76], categorizes authors into a limited number of groups, which may not fully capture the complexity of ethnicity as a broad concept encompassing shared cultural practices. For example, two authors classified as ‘German’ may have very different cultural backgrounds: one might be an American of German descent, while the other could be a resident of Germany. Additionally, the tool does not recognize ‘African’ as a distinct ethnicity. Another limitation is that Ethnea treats all ethnic categories as equally distant. For instance, it assumes that the distance between German and French is the same as between German and Indian, which may not accurately reflect cultural similarities and differences. Similarly, our understanding of expertise diversity relies on MeSH categories, which may miss important nuances present in finer sub-categories specific to PubMed.
Our scientific impact metric, the Relative Citation Ratio (RCR), also differs from other approaches that use citation normalizations or total citation counts over varying periods, potentially leading to variations in the trends and outcomes observed [38, 42]. Citations are only one measure of research success, and various social factors make attributing specific effects to each variable challenging. Other impact metrics, such as mentions in white papers, policy documents, technical reports, or social media, may provide a different understanding of research influence [77]. Finally, while our analysis focuses on the diversity within individual papers, a more accurate approach might involve examining diversity at the research group level, which could offer a better unit of analysis for social dynamics. Researchers increasingly seek strategic collaborations to enhance their impact amid competitive bibliometric pressures [78]. Our study reveals how team dynamics influence scientific impact while highlighting the need to examine further individual integration within research teams. We recommend further exploring collaboration patterns using established measures and qualitative analysis for stronger comparative evidence.
Data and methods
Dataset
This research uses a snapshot of PubMed data, including articles in biomedicine and life sciences. To obtain author-disambiguated data, we utilize ‘Author-ity 2018’ [79], which enables author identification across publications. Disambiguation helps to 1) identify the same author even when the author has published under multiple name variants, 2) model each distinct researcher’s history, and 3) derive author-level data, such as gender, ethnicity, and prior citation count, and compute the diversity indices. Additionally, we incorporate the MapAffil dataset [80] to obtain author-disambiguated affiliation data, including institutional, state, and country information. Moreover, before December 2013, PubMed did not comprehensively record author affiliation and characteristics for every contributor to a publication. Consequently, the author-disambiguated dataset enables the prediction of reliable imputed information that is author-specific. To identify the relative scientific contribution of a journal, we utilize Scimagojr [81], a publicly available online platform to access the journals’ h-indices. To construct the dataset, we first collected the top 40 journals from the entirety of the dataset. The journals collected were divided into three subject categories: medicine, biology, and science as per previous literature [82]. Next, to get a representative dataset, we sampled papers from the collected journals where the author count was between 2 and 12 (inclusive) and the publication year was between 1991 and 2014 (inclusive). Our analysis spans multiple publication years, allowing articles sufficient time to accumulate citations and ensure robust citation-based analysis [83]. Articles with a single author are not included as they would be assigned a ‘null’ value for diversity. We restricted our study to articles categorized as ‘journal articles,’ excluding review letters, letters to the editor, and news articles. Overall, we analyze 907,024 unique papers and 1,316,838 unique authors.
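The inclusion criteria described above can be expressed as a simple record filter. The sketch below is illustrative only: the `Article` record and its field names are hypothetical and do not reflect the schema of the PubMed snapshot or the Author-ity data, and the publication-type label is an assumed string.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical record layout, used only for this illustration.
    pmid: int
    year: int
    n_authors: int
    pub_type: str  # assumed label, e.g. "journal article"


def in_sample(article, min_authors=2, max_authors=12,
              first_year=1991, last_year=2014):
    """Return True if the article meets the stated inclusion criteria."""
    return (
        article.pub_type == "journal article"
        and min_authors <= article.n_authors <= max_authors
        and first_year <= article.year <= last_year
    )


if __name__ == "__main__":
    candidates = [
        Article(1, 1995, 4, "journal article"),
        Article(2, 1995, 1, "journal article"),   # single author: excluded
        Article(3, 2016, 5, "journal article"),   # outside the year window
        Article(4, 2003, 6, "letter"),            # not a journal article
    ]
    print([a.pmid for a in candidates if in_sample(a)])
```

Both bounds are inclusive, matching the description of the author-count and publication-year windows.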
Measures
The study uses the Relative Citation Ratio (RCR) as the primary variable of interest, obtained from the iCite web application [50], providing bibliometric data for individual scientific publications. Since RCR exhibits a right-skewed distribution, we apply a natural logarithm transformation in the modeling process. Utilizing a logarithmic link function, the coefficients reflect the proportional change in RCR rather than absolute changes. The control variables are intended to represent author-specific, paper-specific, and journal-specific factors that could influence the number of citations an article receives. The features used for modeling are represented in Table 5.
Diversity measures.
Our analysis explores the dynamics of collaboration diversity among researchers regarding the authors’ gender, ethnicity, academic age, and expertise. Functional diversity is a widely studied research topic in biological communities to examine ecosystem processes [85, 86]. To compute dgen, dage, and dexp, we employ the Rao-Stirling index (Q), which uses the abundance of entities and their pairwise distance to compute diversity [55, 87]. For dgen, ‘Male’ and ‘Female’ are assigned a unit distance, while authors labeled ‘unknown/unisex’ are positioned at a half-unit distance from the other categories. For dage, the numerical age value is categorized into discrete bins, facilitating the calculation of co-authors’ age diversity value. The Rao-Stirling index (Q) for any article p is computed using Eq 2, where S is the total number of diversity elements. The term dkl quantifies the distance between categories k and l, and Pk and Pl denote the proportions of the individual categories k and l.
$$Q_p = \sum_{\substack{k,l=1 \\ k \neq l}}^{S} d_{kl}\, P_k P_l \tag{2}$$
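As an illustration of the Rao-Stirling computation for gender diversity, the sketch below sums the distance-weighted product of category proportions over ordered pairs of distinct categories. The ordered-pair convention is an assumption (some formulations count each unordered pair once, which halves the value), and the function names are illustrative. The distances follow the description above: unit distance between ‘Male’ and ‘Female’, and half a unit between ‘unknown/unisex’ and either of them.

```python
from collections import Counter
from itertools import permutations


def rao_stirling(labels, distance):
    """Sum of d_kl * P_k * P_l over ordered pairs of distinct categories."""
    counts = Counter(labels)
    total = sum(counts.values())
    proportions = {k: c / total for k, c in counts.items()}
    return sum(
        distance(k, l) * proportions[k] * proportions[l]
        for k, l in permutations(proportions, 2)
    )


def gender_distance(a, b):
    """Distances as described in the text; the label strings are assumed."""
    if a == b:
        return 0.0
    if "unknown/unisex" in (a, b):
        return 0.5
    return 1.0


if __name__ == "__main__":
    team = ["Male", "Male", "Female", "unknown/unisex"]
    print(rao_stirling(team, gender_distance))
```

A team whose authors all share one label has no distinct category pairs, so its index is zero.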
To further understand expertise diversity, we adopt a multidimensional approach of decomposing this diversity index into three attributes: variety, balance, and disparity [55, 87]. Let p denote a research article in our dataset. For each author i of article p, we define their expertise aip using the top-k Medical Subject Heading (MeSH) terms from their previous publications. For each article p, we aggregated the collected list of MeSH terms (e.g., ‘B01.050.150.900.649’, ‘A08.186.211’, and ‘F03.615.400.100’), with each unique term individually denoted as m′. Additionally, the first-order primary subcategories referred to as ‘qualifiers,’ such as ‘B01’, ‘A08’, and ‘F03’ within the MeSH hierarchy, highlight the primary area of expertise. Let M denote the aggregated set of qualifiers associated with the article, p, and each qualifier be denoted as mi. Using these notations, Eq 3 provides the formulae for each attribute.
(3)
where the distance term represents the edge distance between two MeSH terms, and c is a normalization factor, defined as the product of the maximum possible distance dmax between any two terms and the logarithm of the tree depth h, ensuring that the disparity measure is appropriately scaled to reflect the data’s complexity.
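The handling of MeSH tree numbers described above can be sketched as follows. The example extracts the first-order qualifier from each tree number, counts unique qualifiers as variety, and computes an edge distance between two tree numbers through their deepest shared ancestor. Two points are assumptions and not taken from the study: terms in different top-level trees are joined through a virtual root, and the normalization factor c is not applied. All function names are illustrative.

```python
def qualifier(tree_number):
    """First-order subcategory of a MeSH tree number, e.g. 'B01'."""
    return tree_number.split(".")[0]


def variety(tree_numbers):
    """Number of unique first-order qualifiers among the tree numbers."""
    return len({qualifier(t) for t in tree_numbers})


def edge_distance(a, b):
    """Edges on the path between two tree numbers via their deepest shared ancestor.

    Assumption: tree numbers with no shared prefix meet at a virtual root.
    """
    parts_a, parts_b = a.split("."), b.split(".")
    shared = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        shared += 1
    return (len(parts_a) - shared) + (len(parts_b) - shared)


if __name__ == "__main__":
    terms = ["B01.050.150.900.649", "A08.186.211", "F03.615.400.100"]
    print(sorted(qualifier(t) for t in terms))
    print(variety(terms))
    print(edge_distance(terms[0], terms[1]))
```

Dividing these raw distances by c, as described in the text, would rescale them before they enter the disparity and expertise diversity calculations.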
Finally, to compute expertise diversity, dexp, we utilize Eq 2, where Pk and Pl denote the proportion of paired MeSH qualifiers within the article’s aggregated list of MeSH qualifiers, m ∈ M, and dkl denotes the edge distance between the MeSH terms, normalized by the same factor utilized in Eq 3, c. Lastly, for ethnic diversity, denoted as deth, the Rao-Stirling index generalizes to the Gini-Simpson index (G) when any two entities are entirely dissimilar. This measure (G) is defined in Eq 4, where qi signifies the proportion of each distinct diversity element, with R representing the aggregated count of unique elements.
$$G = 1 - \sum_{i=1}^{R} q_i^{2} \tag{4}$$
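For ethnic diversity, the Gini-Simpson index can be computed directly from the category proportions. The sketch below also checks numerically that it coincides with a pairwise sum in which every pair of distinct categories is at unit distance, which is the special case described above. The ethnicity labels are example inputs and the function names are illustrative.

```python
from collections import Counter
from itertools import permutations


def gini_simpson(labels):
    """One minus the sum of squared category proportions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def unit_distance_pairwise(labels):
    """Sum of P_k * P_l over ordered pairs of distinct categories (d_kl = 1)."""
    counts = Counter(labels)
    total = sum(counts.values())
    proportions = {k: c / total for k, c in counts.items()}
    return sum(proportions[k] * proportions[l]
               for k, l in permutations(proportions, 2))


if __name__ == "__main__":
    team = ["German", "German", "French", "Indian"]
    print(gini_simpson(team), unit_distance_pairwise(team))
```

The two quantities agree because the proportions sum to one, so the sum over distinct pairs equals one minus the sum of squared proportions.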
See Note 2.1 in S1 File for detailed descriptions and definitions of all diversity indices, including the decomposed attributes of expertise diversity. Note 2.2 in S1 File further explains their application in calculating the diversity indices.
Supporting information
S1 File. Supplementary file.
This file includes the additional referenced tables and text definitions.
https://doi.org/10.1371/journal.pone.0316890.s001
(PDF)
S1 Fig. Mean scientific impact against diversity indices across journal types.
Relationship between scientific impact (RCR) and all mean diversity indices. Each subplot includes individual data points based on a unique author count value and year of publication. Each regression has also been annotated with Pearson’s r and p values. This correlation is grouped by year and author count, and a correlation value of 0.48–0.7 suggests the presence of a quadratic relation.
https://doi.org/10.1371/journal.pone.0316890.s002
(TIF)
S2 Fig. Interaction between diversity indices and author count.
Each curve indicates the varying estimates for diversity indices, dx:x ∈ (eth, gen, age, log. exp) for different values of the number of authors, indicating the presence of an interaction.
https://doi.org/10.1371/journal.pone.0316890.s003
(TIF)
S3 Fig. Model diagnostics.
The plots indicate that residuals are approximately normal, with some deviations observed due to outliers. Deviance residuals are largely normal across fitted values, with a few high residuals.
https://doi.org/10.1371/journal.pone.0316890.s004
(TIF)
S4 Fig. Descriptive specification curve for each diversity index.
Each subplot displays the ordered array of marginal estimates (including their interaction with author count) for the diversity indices across all specifications, with the horizontal line marking the estimate for the observed data. For diversity indices dependent on author count, dx ∈ (eth, age, exp), the estimates total 81. In contrast, dx = (gen), which is not dependent on author count, totals 27. The asterisk * indicates non-significant specifications.
https://doi.org/10.1371/journal.pone.0316890.s005
(TIF)
S5 Fig. Observed and expected specification curves for each diversity index.
Each subplot displays the ordered estimates for the diversity indices across all specifications, comparing the observed and the expected under-the-null distribution. The expected curves are based on 50 shuffled samples where the key predictor, the diversity index value, is shuffled. All specifications are estimated on each shuffled sample, and the dashed lines depict the 2.5th, 50th, and 97.5th percentiles for each of these ordered estimates. The narrow confidence bands under the null for all diversity indices and the consistently low p-values indicate strong evidence for a robust and significant relationship for the observed data.
https://doi.org/10.1371/journal.pone.0316890.s006
(TIF)
References
- 1. Wagner C, Jonkers K. Open countries have strong science. Nature. 2017;550:32–33. Available from: https://doi.org/10.1038/550032a. pmid:28980660
- 2. Lee N. Migrant and ethnic diversity, cities and innovation: Firm effects or city effects? Journal of Economic Geography. 2014;15(4):769–796.
- 3. Montalvo JG, Reynal-Querol M. Ethnic diversity and economic development. Journal of Development Economics. 2005;76(2):293–323.
- 4. Galinsky AD, Todd AR, Homan AC, Phillips KW, Apfelbaum EP, Sasaki SJ, et al. Maximizing the Gains and Minimizing the Pains of Diversity: A Policy Perspective. Perspectives on Psychological Science. 2015;10(6):742–748. pmid:26581729
- 5. Parrotta P, Pozzoli D, Pytlikova M. The nexus between labor diversity and firm’s innovation. Journal of Population Economics. 2014;27(2):303–364.
- 6. Dahlan M, Al-Atwi AA, Alshaibani E, Bakir A, Maher K. Diverse group effectiveness: co-occurrence of task and relationship conflict, and transformational leadership. International Journal of Productivity and Performance Management. 2021.
- 7. Zhang L, Li X. How to reduce the negative impacts of knowledge heterogeneity in engineering design team: Exploring the role of knowledge reuse. International Journal of Project Management. 2016;34(7):1138–1149.
- 8. Couzin-Frankel J. Shaking Up Science. Science. 2013;339(6118):386–389. pmid:23349264
- 9. Shaw RJ, Atkin KM, Bécares L, Albor C, Stafford M, Kiernan KE, et al. Impact of ethnic density on adult mental disorders: narrative review. British Journal of Psychiatry. 2012;201:11–19. pmid:22753852
- 10. Beck E, Williams I, Hope L, Park W. An intersectional model: Exploring gender with ethnic and cultural diversity. Journal of Ethnic and Cultural Diversity in Social Work. 2001;10(4):63–80.
- 11. Smith DG, Schonfeld NB. The benefits of diversity what the research tells us. About campus. 2000;5(5):16–23.
- 12. Hong L, Page SE. Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proceedings of the National Academy of Sciences. 2004;101(46):16385–16389. pmid:15534225
- 13. Woolley AW, Chabris CF, Pentland A, Hashmi N, Malone TW. Evidence for a Collective Intelligence Factor in the Performance of Human Groups. Science. 2010;330(6004):686–688. pmid:20929725
- 14. Figg WD, Dunn L, Liewehr DJ, Steinberg SM, Thurman PW, Barrett JC, et al. Scientific collaboration results in higher citation rates of published articles. Pharmacotherapy: The Journal of Human Pharmacology and Drug Therapy. 2006;26(6):759–767. pmid:16716129
- 15. Hou J, Ma D. How the high-impact papers formed? A study using data from social media and citation. Scientometrics. 2020;125:2597–2615.
- 16. Franceschet M, Costantini A. The effect of scholar collaboration on impact and quality of academic papers. Journal of informetrics. 2010;4(4):540–553.
- 17. Wuchty S, Jones BF, Uzzi B. The Increasing Dominance of Teams in Production of Knowledge. Science. 2007;316(5827):1036–1039. pmid:17431139
- 18. AlShebli BK, Rahwan T, Woon WL. The preeminence of ethnic diversity in scientific collaboration. Nature Communications. 2018;9(1):5163. pmid:30514841
- 19. Ding J, Shen Z, Ahlgren P, Jeppsson T, Minguillo D, Lyhagen J. The link between ethnic diversity and scientific impact: the mediating effect of novelty and audience diversity. Scientometrics. 2021;126(9):7759–7810.
- 20. Freeman RB, Huang W. Collaborating with People Like Me: Ethnic Coauthorship within the United States. Journal of Labor Economics. 2015;33(S1):S289–S318.
- 21. Lerback J, Hanson B, Wooden P. Association between author diversity and acceptance rates and citations in peer-reviewed earth science manuscripts. Earth and Space Science. 2020;7(5):e2019EA000946.
- 22. Nielsen MW, Alegria S, Börjeson L, Etzkowitz H, Falk-Krzesinski HJ, Joshi A, et al. Gender diversity leads to better science. Proceedings of the National Academy of Sciences. 2017;114(8):1740–1742.
- 23. Vedres B, Vasarhelyi O. Inclusion unlocks the creative potential of gender diversity in teams; 2022.
- 24. Abramo G, D’Angelo CA, Di Costa F, Solazzi M. University–industry collaboration in Italy: A bibliometric examination. Technovation. 2009;29(6-7):498–507.
- 25. Maddi A, Gingras Y. Gender diversity in research teams and citation impact in economics and management. Journal of Economic Surveys. 2021;35(5):1381–1404.
- 26. Larivière V, Vignola-Gagné E, Villeneuve C, Gélinas P, Gingras Y. Sex differences in research funding, productivity and impact: an analysis of Québec university professors. Scientometrics. 2011;87(3):483–498.
- 27. Abbasi A, Jaafari A. Research impact and scholars’ geographical diversity. Journal of Informetrics. 2013;7(3):683–692.
- 28. Paswan J, Singh VK. Gender and research publishing analyzed through the lenses of discipline, institution types, impact and international collaboration: a case study from India. Scientometrics. 2020;123(1):497–515.
- 29. Dong Y, Ma H, Tang J, Wang K. Collaboration Diversity and Scientific Impact; 2018. Available from: https://arxiv.org/abs/1806.03694.
- 30. Iribarren-Maestro I, Lascurain-Sánchez M, Sanz-Casado E. Are multi-authorship and visibility related? Study of ten research areas at Carlos III University of Madrid. Scientometrics. 2009;79(1):191–200.
- 31. Chinchilla-Rodríguez Z, Miao L, Murray D, Robinson-García N, Costas R, Sugimoto CR. A global comparison of scientific mobility and collaboration according to national scientific capacities. Frontiers in research metrics and analytics. 2018;3:17.
- 32. Zhang S, Wapman KH, Larremore DB, Clauset A. Labor advantages drive the greater productivity of faculty at elite universities. Science Advances. 2022;8(46):eabq7056. pmid:36399560
- 33. Wang D, Song C, Barabási AL. Quantifying Long-Term Scientific Impact. Science. 2013;342(6154):127–132. pmid:24092745
- 34. Bornmann L, Leydesdorff L. The validation of (advanced) bibliometric indicators through peer assessments: A comparative study using data from InCites and F1000. Journal of informetrics. 2013;7(2):286–291.
- 35. Bornmann L, Marx W. How to evaluate individual researchers working in the natural and life sciences meaningfully? A proposal of methods based on percentiles of citations. Scientometrics. 2014;98:487–509.
- 36. Bergstrom CT, West JD, Wiseman MA. The eigenfactor™ metrics. Journal of neuroscience. 2008;28(45):11433–11434. pmid:18987179
- 37. Bollen J, Van de Sompel H, Hagberg A, Chute R. A principal component analysis of 39 scientific impact measures. PloS one. 2009;4(6):e6022. pmid:19562078
- 38. Yegros-Yegros A, Rafols I, D’este P. Does interdisciplinary research lead to higher citation impact? The different effect of proximal and distal interdisciplinarity. PloS one. 2015;10(8):e0135095. pmid:26266805
- 39. Larivière V, Ni C, Gingras Y, Cronin B, Sugimoto CR. Bibliometrics: Global gender disparities in science. Nature. 2013;504(7479):211–213. pmid:24350369
- 40. Rigby J, Edler J. Peering inside research networks: Some observations on the effect of the intensity of collaboration on the variability of research quality. Research policy. 2005;34(6):784–794.
- 41. Nielsen MW. Gender and citation impact in management research. Journal of Informetrics. 2017;11(4):1213–1228.
- 42. Wang J, Thijs B, Glänzel W. Interdisciplinarity and impact: Distinct effects of variety, balance, and disparity. PloS one. 2015;10(5):e0127298. pmid:26001108
- 43. Glänzel W. Characteristic scores and scales: A bibliometric analysis of subject characteristics based on long-term citation observation. Journal of Informetrics. 2007;1(1):92–102.
- 44. Stegehuis C, Litvak N, Waltman L. Predicting the long-term citation impact of recent publications. Journal of informetrics. 2015;9(3):642–657.
- 45. Moed HF, Burger W, Frankfort J, Van Raan AF. The use of bibliometric data for the measurement of university research performance. Research policy. 1985;14(3):131–149.
- 46. Martin BR, Irvine J. Assessing basic research: some partial indicators of scientific progress in radio astronomy. Research policy. 1983;12(2):61–90.
- 47. Clermont M, Krolak J, Tunger D. Does the citation period have any effect on the informative value of selected citation indicators in research evaluations? Scientometrics. 2021;126:1019–1047.
- 48. Leydesdorff L, Bornmann L, Comins JA, Milojević S. Citations: Indicators of quality? The impact fallacy. Frontiers in Research metrics and Analytics. 2016;1:1.
- 49. Hutchins BI, Yuan X, Anderson JM, Santangelo GM. Relative Citation Ratio (RCR): A New Metric That Uses Citation Rates to Measure Influence at the Article Level. PLOS Biology. 2016;14(9):1–25. pmid:27599104
- 50. Hutchins B, Santangelo G. iCite Database Snapshots (NIH Open Citation Collection) [Internet]; 2019.
- 51. Lowe HJ, Barnett GO. Understanding and using the medical subject headings (MeSH) vocabulary to perform literature searches. JAMA. 1994;271(14):1103–1108. pmid:8151853
- 52. Porter A, Rafols I. Is science becoming more interdisciplinary? Measuring and mapping six research fields over time. Scientometrics. 2009;81(3):719–745.
- 53. Rafols I, Meyer M. Diversity and network coherence as indicators of interdisciplinarity: case studies in bionanoscience. Scientometrics. 2010;82(2):263–287.
- 54. Stirling A. Diversity and ignorance in electricity supply investment: Addressing the solution rather than the problem. Energy Policy. 1994;22(3):195–216.
- 55. Stirling A. A general framework for analyzing diversity in science, technology, and society. Journal of the Royal Society interface. 2007;4(15):707–719. pmid:17327202
- 56. Mishra S, Fegley BD, Diesner J, Torvik VI. Self-citation is the hallmark of productive authors, of any gender. PloS one. 2018;13(9):e0195773. pmid:30256792
- 57. Tweedie MC, et al. An index which distinguishes between some important exponential families. In: Statistics: Applications and new directions: Proc. Indian statistical institute golden Jubilee International conference. vol. 579; 1984. p. 579–604.
- 58. Tahamtan I, Safipour Afshar A, Ahamdzadeh K. Factors affecting number of citations: a comprehensive review of the literature. Scientometrics. 2016;107:1195–1225.
- 59. Onodera N, Yoshikane F. Factors affecting citation rates of research articles. Journal of the Association for Information Science and Technology. 2015;66(4):739–764.
- 60. Bornmann L, Leydesdorff L, Wang J. How to improve the prediction based on citation impact percentiles for years shortly after the publication date? Journal of Informetrics. 2014;8(1):175–180.
- 61. Weinberger CJ, Evans JA, Allesina S. Ten simple (empirical) rules for writing science; 2015.
- 62. Larivière V, Gingras Y, Sugimoto CR, Tsou A. Team size matters: Collaboration and scientific impact since 1900. Journal of the Association for Information Science and Technology. 2015;66(7):1323–1332.
- 63. Simonsohn U, Simmons JP, Nelson LD. Specification curve analysis. Nature Human Behaviour. 2020;4(11):1208–1214. pmid:32719546
- 64. Stouffer SA, Suchman EA, DeVinney LC, Star SA, Williams RM Jr. The American Soldier: Adjustment During Army Life. (Studies in Social Psychology in World War II), vol. 1. Princeton Univ. Press; 1949.
- 65. Xue H, Lu K, Hao N. Cooperation makes two less-creative individuals turn into a highly-creative pair. Neuroimage. 2018;172:527–537. pmid:29427846
- 66. Bell ST, Villado AJ, Lukasik MA, Belau L, Briggs AL. Getting specific about demographic diversity variable and team performance relationships: A meta-analysis. Journal of management. 2011;37(3):709–743.
- 67. Fuller S. Social Epistemology; 2002.
- 68. Page S. The difference: How the power of diversity creates better groups, firms, schools, and societies-new edition; 2008.
- 69. Hutchins E. Cognition in the wild; 1995.
- 70. Longino HE. Science as social knowledge: Values and objectivity in scientific inquiry. Princeton University Press; 2020.
- 71. Zollman KJ. The epistemic benefit of transient diversity. Erkenntnis. 2010;72(1):17–35.
- 72. Galinsky AD, Todd AR, Homan AC, Phillips KW, Apfelbaum EP, Sasaki SJ, et al. Maximizing the gains and minimizing the pains of diversity: A policy perspective. Perspectives on Psychological Science. 2015;10(6):742–748. pmid:26581729
- 73. Zhang L, Rousseau R, Glänzel W. Diversity of references as an indicator of the interdisciplinarity of journals: Taking similarity between subject fields into account. Journal of the association for information science and technology. 2016;67(5):1257–1265.
- 74. Leinster T, Cobbold CA. Measuring diversity: the importance of species similarity. Ecology. 2012;93(3):477–489. pmid:22624203
- 75. Smith BN, Singh M, Torvik VI. A search engine approach to estimating temporal changes in gender orientation of first names. In: Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries; 2013. p. 199–208.
- 76. Torvik VI, Agarwal S. Ethnea–an instance-based ethnicity classifier based on geo-coded author names in a large-scale bibliographic database; 2016.
- 77. Bornmann L. Do altmetrics point to the broader impact of research? An overview of benefits and disadvantages of altmetrics. Journal of Informetrics. 2014;8(4):895–903.
- 78. Sarewitz D. How science makes environmental controversies worse. Environmental science & policy. 2004;7(5):385–403.
- 79. Torvik V, Smalheiser N. Author-ity 2018—PubMed author name disambiguated dataset; 2021. Available from: https://doi.org/10.13012/B2IDB-2273402_V1.
- 80. Torvik V. MapAffil 2018 Dataset: PubMed Author Affiliations Mapped to Cities and their Geocodes Worldwide; 2021. Available from: https://doi.org/10.13012/B2IDB-2556310_V1.
- 81. SCImago Journal Rank. SCImago SJR—Scimago Journal and Country Rank [Portal]; 2023. Available from: http://www.scimagojr.com.
- 82. Mishra S, Torvik V. Quantifying conceptual novelty in the biomedical literature. D-Lib Magazine. 2016;22(9-10). pmid:27942200
- 83. Wang J. Citation time window choice for research impact evaluation. Scientometrics. 2013;94(3):851–872.
- 84. Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. In: D-Lib magazine: the magazine of the Digital Library Forum. vol. 21. NIH Public Access; 2015.
- 85. Cadotte MW, Cavender-Bares J, Tilman D, Oakley TH. Using phylogenetic, functional and trait diversity to understand patterns of plant community productivity. PloS one. 2009;4(5):e5695. pmid:19479086
- 86. Ricotta C. A note on functional diversity measures. Basic and applied Ecology. 2005;6(5):479–486.
- 87. Rao CR. Diversity and dissimilarity coefficients: A unified approach. Theoretical Population Biology. 1982;21(1):24–43.