Artificial intelligence and network science as tools to illustrate academic research evolution in interdisciplinary fields: The case of Italian design

Daniele Pretolesi; Ilaria Stanzani; Stefano Ravera; Andrea Vian; Annalisa Barla

doi:10.1371/journal.pone.0315216

Abstract

In this paper, we explore the application of Artificial Intelligence and network science methodologies in characterizing interdisciplinary disciplines, with a specific focus on the field of Italian design, taken as a paradigmatic example. Exploratory data analysis and the study of academic collaboration networks highlight how the field is evolving towards increased collaboration. Text analysis and semantic topic modelling identified the evolution of research interest over time, defining a ranking of pairs of keywords and three prominent research topics: User-Centric Experience Design, Innovative Product Design and Sustainable Service Design. Our results revealed a significant transformation in the field, with a shift from individual to collaborative research, as evidenced by the increasing complexity and collaboration within groups. We acknowledge the limitations faced by this work, suggesting that the methodology may be primarily suitable for bibliometric and more silos-like disciplines. However, we emphasize the urgency for the scientific community to address the future of research not indexed by large open-access databases like OpenAlex.

Citation: Pretolesi D, Stanzani I, Ravera S, Vian A, Barla A (2025) Artificial intelligence and network science as tools to illustrate academic research evolution in interdisciplinary fields: The case of Italian design. PLoS ONE 20(1): e0315216. https://doi.org/10.1371/journal.pone.0315216

Editor: Diego R. Amancio, Universidade de Sao Paulo, BRAZIL

Received: May 3, 2024; Accepted: November 21, 2024; Published: January 14, 2025

Copyright: © 2025 Pretolesi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data are available from https://github.com/annalisabarla/OA-ItalianDesign.

Funding: This work is partially funded by the European Union - NextGenerationEU and by the Ministry of University and Research (MUR), National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 1.5, project “RAISE - Robotics and AI for Socio-economic Empowerment” (ECS00000035) as A. Barla is part of the RAISE Innovation Ecosystem. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

While facing an increasing number of multidisciplinary wicked problems [1, 2], science continues to operate under a traditional division into disciplines; silos that were historically structured to serve the needs of organization and specialization rather than innovation. This logic of specialization, which was once highly valuable for advancing knowledge in specific fields, now shows diminishing returns in an era defined by complexity. In a world where problems are interconnected and span multiple domains, the siloed development of knowledge limits the ability to address complex, real-world issues. Integration across disciplines is increasingly necessary, as the incremental utility of further specialization is often insufficient to tackle the challenges posed by today’s multifaceted problems. Historically, integration was the responsibility of decision-makers; however, artificial intelligence now offers the opportunity to apply integration at various stages of research, including problem definition, framing, modeling, and supporting problem-solving. Indeed, the definition of disciplines and their subdivisions usually trail behind the progress of science and the emergence of new discoveries. Also, they often reflect cultural influences and respond to the growing demand for specialization in various fields. In this scenario, many scientists often limit their collaborative efforts within their fields rather than seeking connections with other disciplines, limited, for instance, by the additional time and effort required to establish common ground and frameworks for interdisciplinary projects, and by how they will be evaluated during their academic careers. Nevertheless, science aims to comprehend the world we live in through a collaborative and social process [3, 4] and it is bound to do so in an ever-increasing set of complex challenges. For this reason, the pursuit of monocultural knowledge appears constrained and helpless.

Science of Science (SciSci) [5] emerges as a discipline able to provide the tools and the methods to unravel scientific complexity. Stemming from the computational domain, SciSci aims to provide all disciplines with rigorous scientific and analytical methodologies, so that each discipline can develop the self-understanding it needs to build its future in the academic community. Ultimately, SciSci seeks to map the evolution of science and to provide a design tool for think-tank-like initiatives that influence how our society deals with the wicked problems of our time.

In this work, we investigate how Artificial Intelligence (AI) approaches may be used to characterize interdisciplinary disciplines. The data-driven approaches we use can, in principle, be applied to any discipline, especially those where cross-fertilization is paradigmatic. The task becomes harder when moving from the hard sciences to the social sciences and humanities, which rely on specific scholarly output formats such as books and book chapters. Indeed, a key issue with public publication and citation databases is their emphasis on journals, often overlooking other forms of scientific knowledge dissemination such as books, proceedings and reports. Additionally, these tools do not yet provide comprehensive geographic coverage or adequately capture the breadth and depth of subject matter in thematic journals.

As a paradigmatic case study, we consider the discipline of design, a growing cross-boundary research field spanning from Engineering and Computer Science to Economics, Social Sciences, and Arts. Here, the fragmentation across disciplines may be particularly pronounced, as reflected in the fact that even domain experts might not possess a comprehensive grasp of the entire field. This very dynamism contributes to design’s rapid evolution. Furthermore, the influence of local academic and industrial contexts suggests a potentially stronger geographical influence compared to other fields. As a result, the field of design has a heightened interest in self-reflection and, consequently, a greater imperative to effectively communicate its scientific foundation. This focus on understanding and communicating its own knowledge base positions design research as a valuable case study for exploring the challenges and opportunities presented by the interdisciplinary landscape. As we shift from purely exploratory data analysis [6] towards AI-driven methods, we restrict our focus to Italian academic design research, as we can conduct a direct inquiry and assessment within the Italian design community to validate the results and observed trends. Since we employ a data-driven approach, we looked for repositories that ideally hold all research outputs. Initially, we looked into IRIS [7], the official repository used by Italian academia, which would have been the optimal source for the analysis, as all Italian faculty members have to periodically submit the list of their research products to the system. Unfortunately, IRIS is not designed to collect standardized reusable data, preventing it from becoming a reliable resource for research development. Hence, we considered different platforms such as Scopus [8], WoS [9], and OpenAlex [10]. We chose OpenAlex as it provides free APIs and metadata, and it is shaping up as a solid standard in the field of scientometrics [11]. Moreover, OpenAlex is comparable in size to Scopus and WoS in terms of collected works and authors [12].

To carry out our analysis, we leverage cutting-edge tools and methodologies from the fields of SciSci and Network Science (NetSci) to analyze scientific production data. These AI-driven approaches play a pivotal role in unravelling the Academic Collaboration Network (ACN) within the design field, facilitating a holistic examination of its multifaceted nature.

The remainder of this paper is organized as follows: first, we illustrate the state-of-the-art in the science of science for interdisciplinary research, then we describe materials, methods and the experimental setting and, finally, we illustrate the obtained results. We conclude the paper by discussing what we found and the possible implications for future works.

Related works

Researchers have long been addressing interdisciplinarity and its role in fostering successful scientific endeavours. In [13] the assumption is that researchers are often driven towards boundary-crossing research, looking for a trade-off between high productivity versus maintaining a broad perspective. Information initiatives can offer flexibility, enabling researchers to redirect their focus from their primary specialization towards the peripheral areas that enrich their interdisciplinary work. For example, the authors of [14] define a framework of problem-solving agents, showing how diverse problem solvers can outperform groups of high-ability problem solvers. Similarly, [15] explores the relationship between interdisciplinary research and research impact, showing that higher interdisciplinarity is significantly associated with increased research impact. More recently, [16] highlights the risk of scientific monocultures, including the one induced by AI, which may lead to a more biased and error-prone understanding of the world hence preventing innovation.

Despite the long-standing evidence highlighting the importance of interdisciplinarity for the advancement of research, it was also found that interdisciplinary proposals tend to exhibit lower rates of funding success [17, 18] and are often perceived as high-risk proposals. Also, interdisciplinary research entails significant costs, including time investment in fostering collaborative relationships and aligning diverse perspectives. This may yield fewer and more heterogeneous research outputs compared to those from more discipline-specific efforts, at the risk of being under-evaluated by traditional evaluation metrics [19–21].

Often supported by network science [22], the study of interdisciplinary behaviours in research has been mostly conducted by focusing on two aspects: citation patterns and collaboration networks.

By representing a network of publications and citations, in [23] the authors exploit citations to explore interdisciplinary patterns in six different research fields over a span of 30 years, noting how research has mostly become a team effort and assessing how science is becoming more cross-disciplinary even if knowledge transfer appears to occur in small steps and only towards neighbouring fields.

By defining authors as nodes and collaborations as edges, network analysis unveils crucial insights into the structure and dynamics of academic collaboration [24]. Metrics like degree and betweenness centrality may help identify key players and influential groups, while community detection algorithms may reveal cohesive clusters within the network. Finally, through evolutionary analysis, researchers can track changes over time, shedding light on emerging trends and policy impacts. For example, in [25] co-authorship networks are used to predict authors’ research impact.

More recently, the evolution of research trends and interdisciplinarity are studied within the current widespread use of deep learning methods and large language models in different subfields of research [26, 27], spanning from medicine [28, 29] to economy [30] to material science [31] and to artificial intelligence [32].

Materials

To carry out our investigation, we put together a dataset of Italian design publications, starting from the official list of the 243 faculty members currently affiliated with the ICAR/13 scientific disciplinary sector defined by the Italian Research Ministry (https://cercauniversita.mur.gov.it/php5/docenti/cerca.php). Each author is identified by a set of attributes including their first name, last name, and affiliation. Using this list, we queried the OpenAlex online catalogue APIs, retrieving the list of all their publications. Evoking the renowned Library of Alexandria, a cultural and literary beacon of the ancient world, OpenAlex emerged as a successor to Microsoft Academic Graph (MAG) [33], a vast repository of scientific research publications that ceased operations on December 31, 2021. Recognizing the significance of unfettered access to knowledge, the architects of OpenAlex opted to adopt and adapt MAG’s models, creating a freely available and universally accessible database unshackled from commercial constraints. OpenAlex’s raison d’être, in essence, is the dissemination of knowledge. With over 240 million works readily accessible, OpenAlex expands daily with approximately 50,000 new data points. This immense body of knowledge is organized into a heterogeneous and directed graph, employing eight distinct node types:

Works: Encompassing abstracts of articles, books, patents, datasets, and theses, these entities represent the foundation of scholarly output.
Authors: Every individual contributing to the creation of a work. Note that each author is complemented with an affiliation field listing all known affiliations throughout the years. The affiliations have a country field which allows us to associate the publication with a country of interest.
Sources: Journals, archives, and other repositories preserving works form the backbone of this category.
Institutions: Universities, research centers, and organizations where authors hold affiliations.
Concepts: Abstract ideas addressed in works, organized hierarchically. OpenAlex draws on a vocabulary of approximately 65,000 concepts and assigns several of them to each work.
Publishers: Companies and organizations responsible for disseminating works.
Research Funders: Those who provide financial support for research endeavours.
Geographic Areas: The locations where authors conduct their research or where works are produced.

Methods

Data handling

Our data collection process began with identifying relevant researchers using the OpenAlex search-by-name API. To account for potential name ambiguity, we employed strategies that considered researchers’ affiliations. This initial search yielded a total of 5317 publications. Following duplicate removal, the corpus was reduced to 4503 works. To refine our focus, we further restricted the publication window to years with at least 10 published works, resulting in a dataset spanning from 2000 to 2024 and containing 4365 publications. Of these, we kept only the publications associated with at least one concept from a list of concepts characterizing the field of design. The list was compiled by a pool of academic researchers in the field, including one of our co-authors, who selected a set of 68 relevant concepts within the OpenAlex catalogue, listed in Table 1. We then extended the list following the OpenAlex concept hierarchy, incorporating all children nodes of the initial set, leading to a total of 230 concepts. After this filter, we were left with 860 publications.
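
The filtering steps above can be sketched as follows. This is only an illustration under stated assumptions: each work is a plain dict with hypothetical keys `id`, `year` and `concepts`, and the concept hierarchy is a hypothetical parent-to-children mapping. None of these names come from the OpenAlex API or from the authors' code. The text does not say whether the expansion covered one level or the full subtree; the sketch assumes the full subtree.

```python
from collections import Counter


def expand_concepts(seed, children):
    """Return the seed concepts plus all their descendants (assumed: full subtree)."""
    expanded, stack = set(seed), list(seed)
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in expanded:
                expanded.add(child)
                stack.append(child)
    return expanded


def filter_works(works, seed_concepts, children, min_per_year=10):
    """Deduplicate, keep well-populated years, then keep design-related works."""
    # 1. Remove duplicates, keeping the first record seen for each work id.
    seen, unique = set(), []
    for w in works:
        if w["id"] not in seen:
            seen.add(w["id"])
            unique.append(w)
    # 2. Keep only years with at least `min_per_year` published works.
    per_year = Counter(w["year"] for w in unique)
    recent = [w for w in unique if per_year[w["year"]] >= min_per_year]
    # 3. Keep works tagged with at least one concept from the expanded list.
    allowed = expand_concepts(seed_concepts, children)
    return [w for w in recent if allowed & set(w["concepts"])]
```

Applying the three steps in this order mirrors the counts reported in the text (5317, 4503, 4365 and 860 works).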

Table 1. Design concepts identified by experts.

https://doi.org/10.1371/journal.pone.0315216.t001

Next, we removed all publications not written in English, as well as those whose publication type accounted for less than 1% of the corpus, obtaining a total of 834 works. In Table 2 we provide an overview of the types and frequency of works included in the dataset.

Table 2. Publication type split.

https://doi.org/10.1371/journal.pone.0315216.t002

For the text analysis, we further excluded publications missing titles or abstracts, leading to a dataset of 708 works. Lastly, before proceeding with the analysis, we examined the distribution of abstract lengths: the 5th and 95th percentiles were 54 and 372 words, respectively. To prevent biased results, we excluded works whose abstracts fell outside these two thresholds, whether shorter or longer, narrowing the dataset to a total of 637 works.
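
A minimal sketch of the percentile-based trimming, assuming abstracts are plain strings and that length is measured in whitespace-separated words. The function name, the inclusive treatment of the two boundaries and the interpolation method are assumptions; the text does not specify them.

```python
from statistics import quantiles


def trim_by_abstract_length(abstracts, lower_pct=5, upper_pct=95):
    """Keep abstracts whose word count lies within the given percentiles."""
    lengths = [len(a.split()) for a in abstracts]
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(lengths, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    # Assumption: works exactly on a threshold are kept.
    kept = [a for a, n in zip(abstracts, lengths) if low <= n <= high]
    return kept, (low, high)
```

On the corpus described above, the two thresholds would correspond to the reported 54 and 372 words.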

Heterogeneous graphs as a representation of academic collaboration networks

Heterogeneous graphs [34] are efficient abstractions for complex datasets: they model data as a graph that allows multiple types of nodes and/or edges. As depicted in Fig 1, an ACN may be represented as a heterogeneous graph with authors (A), papers (P), years (Y) and concepts (C) as nodes, wherein edges indicate the “co-authorship” (A–A), “write” (A–P), “discusses of” (P-C), “published in” (P-Y) relationships. The characterisation and modelling of ACNs is an ongoing and open problem relevant to understanding how scientific research is evolving and adapting to the ever-increasing complexity of our world [35]. Once we built the heterogeneous graph G, we selected the homogeneous sub-graph of authors, where we only consider links representing co-authorships. To take the temporal evolution into account, we also generated a set of heterogeneous sub-graphs, one for every 5-year period, as shown in Fig 7. Similarly, we also considered the corresponding co-authorship sub-graphs to elucidate how the collaboration among design researchers has evolved.
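
The construction described above can be sketched with plain Python sets. The record keys (`id`, `year`, `authors`, `concepts`) and the function names are hypothetical, and the windowing rule (consecutive 5-year windows from a start year) is an assumption about how the periods were cut.

```python
from itertools import combinations


def build_hetero_graph(works):
    """Return typed node sets and typed edge sets for a list of work records."""
    nodes = {"A": set(), "P": set(), "Y": set(), "C": set()}
    edges = {"A-A": set(), "A-P": set(), "P-C": set(), "P-Y": set()}
    for w in works:
        nodes["P"].add(w["id"])
        nodes["Y"].add(w["year"])
        edges["P-Y"].add((w["id"], w["year"]))
        for c in w["concepts"]:
            nodes["C"].add(c)
            edges["P-C"].add((w["id"], c))
        for a in w["authors"]:
            nodes["A"].add(a)
            edges["A-P"].add((a, w["id"]))
        # Co-authorship: one undirected edge per pair of authors on a work.
        for a, b in combinations(sorted(set(w["authors"])), 2):
            edges["A-A"].add((a, b))
    return nodes, edges


def coauthorship_subgraph(nodes, edges):
    """Homogeneous author-only view: author nodes and A-A edges."""
    return nodes["A"], edges["A-A"]


def split_by_period(works, start, width=5):
    """Group works into consecutive `width`-year windows starting at `start`."""
    periods = {}
    for w in works:
        lo = start + ((w["year"] - start) // width) * width
        periods.setdefault((lo, lo + width - 1), []).append(w)
    return periods
```

Calling `build_hetero_graph` on each group returned by `split_by_period` yields one heterogeneous graph per period, from which the co-authorship view can be extracted.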

Fig 1. Scheme of the heterogeneous graph, connecting authors (A), publications (P), years (Y), and concepts (C).

https://doi.org/10.1371/journal.pone.0315216.g001

To evaluate such complex structures we resorted to several state-of-the-art metrics [22]. The density of a graph ranges from 0 to 1. A density of 0 means there are no edges in the graph, while a density of 1 means that every possible edge is present in the graph. Density provides insight into how many connections exist in a graph relative to the total number of possible connections. Higher density indicates a more densely connected graph, while lower density suggests a sparser one.

The average degree provides a measure of the overall connectivity of the graph. A higher average degree indicates that, on average, each vertex has more connections, while a lower average degree indicates fewer connections per vertex.

In graph theory, transitivity is a measure of how interconnected a graph is, specifically in terms of the existence of triangles within the graph. The transitivity of a graph ranges from 0 to 1. A transitivity of 1 means that every connected triple in the graph forms a triangle, while a transitivity of 0 means that there are no triangles in the graph.

The clustering coefficient for an individual node measures the likelihood that its neighbours are also connected to each other. The average clustering coefficient of a graph is then the average of these individual clustering coefficients across all nodes in the graph.
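
The four metrics described above can be computed directly from an adjacency structure. The sketch below assumes a simple undirected graph given as a node set and an edge set, and it assigns a clustering coefficient of 0 to nodes with fewer than two neighbours (a common convention, assumed here). The function name is illustrative and not taken from the authors' code.

```python
def graph_metrics(nodes, edges):
    """Density, average degree, transitivity and average clustering
    of a simple undirected graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    avg_degree = 2 * m / n if n else 0.0

    closed = 0   # closed triples: sums to 3 x the number of triangles
    triples = 0  # connected triples centred on each node
    local = []   # per-node clustering coefficients
    for nb in adj.values():
        k = len(nb)
        possible = k * (k - 1) // 2
        # Number of edges among the neighbours of the current node.
        links = sum(len(adj[u] & nb) for u in nb) // 2
        closed += links
        triples += possible
        local.append(links / possible if possible else 0.0)

    return {
        "density": density,
        "average_degree": avg_degree,
        "transitivity": closed / triples if triples else 0.0,
        "average_clustering": sum(local) / n if n else 0.0,
    }
```

Transitivity and average clustering generally differ: the former weights every connected triple equally, while the latter weights every node equally.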

Text analysis and semantic topic modeling

Moving from the exploratory analysis of the dataset towards a semantic approach, we narrowed our focus to publications with accessible abstracts. This approach aims to explore the evolution of the research interests of the entire field over time. In doing so, we will provide objective results of the changing trends that have characterised the last 25 years of Italian design research. We proceeded with two experiments, one based on standard Natural Language Processing (NLP) methods, and the other exploiting deep learning methods and large language models (LLM).

Bigram ranking over time.

In NLP, an N-gram refers to a consecutive sequence of n items (or units) extracted from a particular sample of text or speech. N-grams hold significant importance in text mining and diverse applications within NLP. They encapsulate a set of words that commonly co-occur within a defined context. N-grams find extensive application in computational linguistics, serving various purposes such as text analysis, language modelling, and machine learning [36]. Here, we defined the top bigrams for the subset of publications within a given timespan of 5 years, ranked according to their frequency. Using data visualization, we built a bump chart illustrating how bigrams have evolved over time, possibly shedding light on the evolution of design research.
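
A minimal sketch of the bigram ranking, assuming lowercase ASCII tokenization and treating a bigram as an unordered pair of adjacent words. The tokenizer, the optional stop-word list and the function names are assumptions, not the authors' implementation.

```python
import re
from collections import Counter


def top_bigrams(texts, k=20, stopwords=frozenset()):
    """Rank unordered pairs of adjacent words by frequency."""
    counts = Counter()
    for text in texts:
        tokens = [t for t in re.findall(r"[a-z]+", text.lower())
                  if t not in stopwords]
        for first, second in zip(tokens, tokens[1:]):
            # Sorting the pair makes "design product" equal to "product design".
            counts[tuple(sorted((first, second)))] += 1
    return counts.most_common(k)


def bigram_ranks_by_period(texts_by_period, k=20, stopwords=frozenset()):
    """Map each period to {bigram: rank}, the input a bump chart needs."""
    return {
        period: {bg: rank for rank, (bg, _) in
                 enumerate(top_bigrams(texts, k, stopwords), start=1)}
        for period, texts in texts_by_period.items()
    }
```

Plotting the rank of each bigram against the period, one line per bigram, gives the bump chart.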

Semantic topic modeling.

To capture the essence of the academic research, we proceeded by setting up a semantic topic modelling problem that takes as input an embedding that is a combination of Bidirectional Encoder Representations from Transformers (BERT) [37] and Latent Dirichlet Allocation (LDA) [38, 39].

BERT is a transformer-based deep learning model pretrained by Google. It provides a robust representation of the semantic content of documents. For each publication, the title and abstract were fed into the pre-trained BERT model to derive fixed-size embeddings. At the same time, we used LDA to infer a more contextual embedding that may capture the nuances of individual themes within each publication. The concatenated embedding vectors, weighted according to a hyperparameter to balance their relative significance, were then fed into an autoencoder to reduce input dimensionality, forcing the encoder to compress the input data into a smaller latent space [40]. This let us obtain a latent representation that captures the most important features or patterns in the data, while discarding noise and redundancy.
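
The weighted concatenation can be illustrated as below. The name `gamma` and the choice to scale the LDA part rather than the BERT part are assumptions; the text only states that a hyperparameter balances the two embeddings.

```python
def combine_embeddings(lda_vec, bert_vec, gamma=1.0):
    """Concatenate an LDA topic vector, scaled by gamma, with a BERT vector."""
    # Assumption: gamma scales the LDA part so that a short probability
    # vector is not drowned out by the much longer dense embedding.
    return [gamma * x for x in lda_vec] + list(bert_vec)
```

The resulting vector has length `len(lda_vec) + len(bert_vec)` and would be the input to the autoencoder.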

Finally, we set up a clustering problem in the latent space, which we solved with KMeans [41]. The optimal number of clusters K was determined among a set of values based on the maximization of the silhouette score, a metric that assesses the quality of the clustering by measuring the cohesion within clusters and separation between clusters [42]. A higher silhouette score indicates that objects are well matched to their own cluster and poorly matched to neighbouring clusters, suggesting a good clustering configuration.
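
The selection criterion can be sketched without any external library. The code below computes the mean silhouette coefficient with Euclidean distance and picks the K whose labelling scores highest. It assumes the candidate labellings (one per K) have already been produced by a clustering step, which is not shown; function names are illustrative.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, labelings):
    """Pick the K whose labelling maximises the silhouette score."""
    scores = {k: silhouette_score(points, labs) for k, labs in labelings.items()}
    return max(scores, key=scores.get), scores
```

Scores lie between -1 and 1; values close to 1 indicate compact, well-separated clusters.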

This resulted in a list of topics, each represented by a set of keywords. To provide meaningful labels for these topics, we identified the 20 most frequent words within each topic and leveraged ChatGPT, the chatbot service powered by the GPT language model from OpenAI [43], to generate appropriate names based on such keywords. Finally, we measured researchers’ interest in these topics by counting how many publications were assigned to each topic across all years within the considered time span.
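
The two bookkeeping steps above (top keywords per topic, publications per topic per year) can be sketched as follows. Whitespace tokenization, the optional stop-word list and the function names are assumptions; the labelling step with ChatGPT is not reproduced here.

```python
from collections import Counter, defaultdict


def top_words_per_topic(docs, labels, k=20, stopwords=frozenset()):
    """Return the k most frequent words among the documents of each topic."""
    words = defaultdict(Counter)
    for text, lab in zip(docs, labels):
        words[lab].update(t for t in text.lower().split() if t not in stopwords)
    return {lab: [w for w, _ in c.most_common(k)] for lab, c in words.items()}


def publications_per_topic_year(labels, years):
    """Count how many publications fall in each (topic, year) cell."""
    counts = defaultdict(Counter)
    for lab, year in zip(labels, years):
        counts[lab][year] += 1
    return {lab: dict(c) for lab, c in counts.items()}
```

The keyword lists would be the input for naming the topics, and the per-year counts are what a line chart of topic prevalence displays.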

The pipeline (see Fig 2) is inspired by the work of [44]. It was implemented in Python and run on the L4 GPU High-RAM hardware on Google Colab. The data used in this project are available on GitHub (https://github.com/annalisabarla/OA-ItalianDesign).

Fig 2. Overview of the pipeline used in this work, highlighting sources and methods to characterise the landscape of Italian design research.

https://doi.org/10.1371/journal.pone.0315216.g002

Results

Data characterisation

Through exploratory data analysis and visual representation, we examined the dataset to gain a comprehensive insight into the collection of design publications and the possible connections among variables. Fig 3 illustrates the yearly distribution of publications, with a visible surging trend since 2000.

Fig 3. Publication intensity over time.

https://doi.org/10.1371/journal.pone.0315216.g003

Fig 4 depicts the annual publication count for articles, books and book chapters, the categories with significant representation, defined as those contributing over 1% to the overall publication corpus. Consequently, our analysis focused exclusively on these publication types. The cut-off date in both these figures is set to 2025, since a few journal articles in our dataset are already available at the preprint stage. Fig 5 illustrates a heatmap delineating the progression of the median author count across different publication types. For articles and book chapters, we noted a steady increase in the number of authors, as their series are associated with a positive linear regression coefficient (0.04 and 0.03, respectively).
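
The trend estimate mentioned above can be sketched as an ordinary least-squares slope fitted to the yearly median author count. The record keys (`year`, `authors`) and function names are hypothetical, and fitting one line per publication type over calendar years is an assumption about how the coefficient was obtained.

```python
from statistics import mean, median


def median_authors_per_year(works):
    """Median number of authors per work, for each publication year."""
    by_year = {}
    for w in works:
        by_year.setdefault(w["year"], []).append(len(w["authors"]))
    return {year: median(counts) for year, counts in sorted(by_year.items())}


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
```

A positive slope means the median number of authors per work grows over the years; the slope is expressed in authors per year.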

Fig 4. Publications type trend above 1%.

https://doi.org/10.1371/journal.pone.0315216.g004

Fig 5. Publication type over time.

https://doi.org/10.1371/journal.pone.0315216.g005

We then explored the relationship between the concepts that OpenAlex assigns to each publication and the design domain. We present the results in Fig 6. The top panel shows a heatmap displaying the average number of design-related OpenAlex concepts per publication type over time, whereas the bottom panel illustrates a heatmap considering collateral interdisciplinary concepts. To define concepts related to design, we used the same criterion used to filter the publications in the preprocessing, referring to the concepts listed in Table 1.

Fig 6. Average number of OpenAlex concepts per publication type over time: (a) concepts related to design, and (b) collateral concepts.

https://doi.org/10.1371/journal.pone.0315216.g006

Design landscape with network analysis and graph structures

After data exploration, we exploited graph structures to trace how collaborations have evolved over time in the landscape of Italian design. To this aim, we considered a set of 5-year periods and, for each, we built a heterogeneous graph, following the schema in Fig 1. The heterogeneous graphs are shown in Fig 7, while the statistics computed on each graph are reported in Table 3.

Fig 7. Heterogeneous graphs of ACNs in Italian design over the past 25 years.

Each graph displays the network of publications, authors, years, and concepts.

https://doi.org/10.1371/journal.pone.0315216.g007

Table 3. Summary of heterogeneous graph analytics for different time periods.

https://doi.org/10.1371/journal.pone.0315216.t003

For each ACN we then extracted the co-authorship subgraph, which considers only the authors and the author-author edges and consists of several connected components (CCs). This allows us to better understand the evolution of collaboration patterns. We report the corresponding network analytics statistics in Table 4. Finally, we computed the distribution of the number of authors per connected component over time, as shown in Fig 8. This highlights a pattern of increasingly larger groups: both the median number of co-authors per CC and the maximum observed value increase.
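
The size of each connected component of a co-authorship subgraph can be obtained with a breadth-first search. The sketch below assumes the subgraph is given as a set of author nodes and a set of undirected author pairs; the function name is illustrative.

```python
from collections import deque


def component_sizes(nodes, edges):
    """Return the number of authors in each connected component, largest first."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search from an unvisited author.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)
    return sorted(sizes, reverse=True)
```

Computing these sizes for each 5-year period and comparing their median and maximum gives the kind of distribution summarized per period.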

Fig 8. Distributions of the number of authors per connected component over time.

https://doi.org/10.1371/journal.pone.0315216.g008

Table 4. Summary of graph analytics for different time periods.

https://doi.org/10.1371/journal.pone.0315216.t004

Painting the landscape of research interests evolution over time

The next part of our analysis consisted of exploiting state-of-the-art NLP to trace the evolution of research interests among the design community.

First, we considered the subset of publications for which the abstract is available. For the same five time intervals defined above, we extracted the most relevant bigrams, that is, the most frequently occurring word pairs regardless of word order. We illustrate how the top-20 bigrams change over time in the bump chart in Fig 9.

Fig 9. Bump chart displaying the bigram ranking variation over five 5-year periods from 2000 to 2024.

https://doi.org/10.1371/journal.pone.0315216.g009

Then, we set up a topic modelling experiment where the goal was to consider all publications in the subset and assign each of them to a cluster representing a topic. We take as input title and abstract textual information, proceed with embedding them with two complementary approaches, and find a latent representation in a smaller space, where we solve the clustering problem. The process identified three topics: User-Centric Experience Design, Innovative Product Design and Sustainable Service Design. We visualized the prevalence of publications per topic over time in Fig 10.

Fig 10. Line chart displaying the number of publications per topic from 2000 to 2024.

https://doi.org/10.1371/journal.pone.0315216.g010

Discussion

Our study provides evidence that the field of design is undergoing a significant transformation. The increasing complexity and collaboration within groups suggest a shift from individual researchers conducting studies to larger groups working together. This is further supported by the positive linear regression coefficient (Fig 5 and Table 4), which may indicate a transition towards collaborative research.

Contrary to our initial expectations, we did not observe an increase in the number of concepts per publication over time. This could be attributed to the limited timeframe of our study, which in turn results from the scarce availability of publications. Another potential explanation could be the limitations of OpenAlex in accurately assigning concepts, possibly due to a lack of comprehensive data on this subject.

The analysis of the heterogeneous graph structure of the Academic Collaboration Network (ACN) and the distribution of authors per connected component (Figs 7 and 8) shows an increase in the size of groups of authors collaborating together. These findings are further corroborated by the graph analytics reported in Table 4.

Our topic modelling analysis identified three prominent clusters of related concepts: User-Centric Experience Design, Innovative Product Design and Sustainable Service Design. These clusters also align with the findings of the bigram ranking variation (Fig 9). For instance, the number of publications over time (see Fig 10) in User-Centric Experience Design between 2015 and 2020 is mirrored by the appearance of keywords such as ‘user experience’ and ‘design thinking’ on the plot. Similarly, the descending trend in Sustainable Service Design is reflected by the slight decrease of the ‘design sustainability’ keyword. Lastly, changes in the Innovative Product Design cluster are almost exactly mirrored by changes in keywords such as ‘product design’, ‘product service’, and ‘virtual reality’.

In conclusion, our findings suggest a paradigm shift in the field of design towards more collaborative and group-oriented research. However, further studies are needed to confirm these trends and explore their implications in greater depth.

Limitations

This scientific work acknowledges several limitations. Certain disciplines, particularly in the humanities, heavily favour specific scholarly work formats such as books and book chapters. Regrettably, these formats are not adequately tracked by tools like OpenAlex, WoS, and Scopus. Furthermore, the coverage of humanities and social science journals within these tools lacks comprehensiveness in terms of subject variety and depth.

OpenAlex, despite its high coverage of publications over time and comparable coverage with Scopus and WoS in the past decade, has its limitations. The concepts provided by OpenAlex curators are limited and not accurate, originating from MAG. These are being replaced by more accurate and complex metadata known as Topics and Fields. The coverage of affiliation fields is poor, and the identification process for authors is still ongoing.

The dataset, originating from the Ministry of University and Research (MUR) faculty list, incorporates some bias. It inevitably favours the present and recent past over the distant past, because it considers only academics currently employed by Italian universities. The identifiers used are limited to first name, last name, and current affiliation, as MUR does not share more reliable identifiers such as ORCID [45]. This results in many namesakes and necessitates filtering using a set of arbitrarily chosen concepts related to design. Consequently, the dataset is very narrow and much smaller than initially expected.
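The concept-based filtering of namesakes can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the record fields (`first_name`, `last_name`, `concepts`, `id`), the `DESIGN_CONCEPTS` set, and the `match_candidates` function are hypothetical and do not reproduce the actual OpenAlex schema or the concept list used in this study.

```python
# Hypothetical set of design-related concepts used as a filter.
DESIGN_CONCEPTS = {"design", "industrial design", "user experience"}


def match_candidates(faculty, candidates, concepts=DESIGN_CONCEPTS):
    """Return the ids of candidate profiles that share the faculty member's
    name and carry at least one of the given concepts."""
    key = (faculty["first_name"].casefold(), faculty["last_name"].casefold())
    kept = []
    for cand in candidates:
        same_name = (
            cand["first_name"].casefold(),
            cand["last_name"].casefold(),
        ) == key
        # A namesake is kept only if its concepts overlap the filter set.
        on_topic = bool(concepts & {c.casefold() for c in cand["concepts"]})
        if same_name and on_topic:
            kept.append(cand["id"])
    return kept


# Hypothetical faculty record and candidate author profiles.
faculty = {"first_name": "Maria", "last_name": "Rossi"}
candidates = [
    {"id": "a1", "first_name": "Maria", "last_name": "Rossi",
     "concepts": ["Industrial design", "Ergonomics"]},
    {"id": "a2", "first_name": "Maria", "last_name": "Rossi",
     "concepts": ["Cardiology"]},
    {"id": "a3", "first_name": "Mario", "last_name": "Rossi",
     "concepts": ["Design"]},
]

print(match_candidates(faculty, candidates))  # ['a1']
```

The sketch also shows why this filter narrows the dataset: a genuine design scholar whose profile carries none of the chosen concepts would be discarded along with the namesakes.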

Conclusions

This study presents the promises and limits of data-driven approaches in describing the evolution of interdisciplinary research, with a comprehensive view of the scientific collaboration landscape within Italian design research. It specifically highlights the growth of design into broader communities of researchers, focusing on a unique range of topics that have evolved over time.

We have shown how a pipeline that combines data-driven and network science approaches can indeed provide a cohesive mapping of research. Looking ahead, we can imagine that with the help of Deep Learning and improvements in the OpenAlex data structure, this task could yield even more detailed representations of an interdisciplinary research field. Moreover, AI presents a dual opportunity: it can strengthen the integration of knowledge across disciplines by uncovering new connections and patterns, and it can enhance specialization by enabling more refined and focused analysis within specific domains. This flexibility positions AI as a powerful tool in navigating the complexity of modern research.

While this work examines the discipline of design as a whole, the long-term goal is to track the evolution of trends to better understand interdisciplinary scientific production and impact, even for individual authors. To balance the effects of an increase in scientific production (i.e., quantity over quality), objective metrics could also be identified for research contexts where citations are not used to evaluate scientific production. Therefore, the scientific community should strive to be able to analyze and understand the evolution of the disciplines that make it up.

Although our investigation effectively depicted this scenario, we acknowledge that the limitations we faced may make this methodology primarily suitable for bibliometric disciplines, such as computer science or medicine. However, we believe it is crucial for the scientific community to urgently address the question of the future of research not indexed by large open-access databases like OpenAlex.

References

1. Buchanan R. Wicked problems in design thinking. Design issues. 1992;8(2):5–21.
2. Ledford H. Team science. Nature. 2015;525(7569):308.
3. Viseu A. Integration of social science into research is crucial. Nature. 2015;525(7569):291–291. pmid:26381948
4. Alvarez A, Caliskan A, Crockett M, Ho SS, Messeri L, West J. Science communication with generative AI. Nature Human Behaviour. 2024; p. 1–3. pmid:38438654
5. Fortunato S, Bergstrom CT, Börner K, Evans JA, Helbing D, Milojević S, et al. Science of science. Science. 2018;359(6379):eaao0185. pmid:29496846
6. Vian A, Carella G, Pretolesi D, Barla A, Zurlo F. Mapping the evolution of design research: A Data-Driven analysis of interdisciplinary trends and intellectual landscape. In: Proc. of DRS 2024—to appear. Boston, MA USA; 2024.
7. IRIS. Online. Available from: https://www.cineca.it/sites/default/files/IRIS_Cineca_web.pdf.
8. Scopus. Online. Available from: https://www.scopus.com/.
9. Web of Science. Online. Available from: https://www.webofscience.com/wos/.
10. Priem J, Piwowar H, Orr R. OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts; 2022.
11. Hood WW, Wilson CS. The literature of bibliometrics, scientometrics, and informetrics. Scientometrics. 2001;52:291–314.
12. Culbert J, Hobert A, Jahn N, Haupka N, Schmidt M, Donner P, et al. Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus. arXiv preprint arXiv:2401.16359. 2024.
13. Palmer CL. Structures and strategies of interdisciplinary science. Journal of the American society for information science. 1999;50(3):242–253.
14. Hong L, Page SE. Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proceedings of the National Academy of Sciences. 2004;101(46):16385–16389. pmid:15534225
15. Okamura K. Interdisciplinarity revisited: evidence for research impact and dynamism. Palgrave Communications. 2019;5(1).
16. Messeri L, Crockett M. Artificial intelligence and illusions of understanding in scientific research. Nature. 2024;627(8002):49–58. pmid:38448693
17. Bromham L, Dinnage R, Hua X. Interdisciplinary research has consistently lower funding success. Nature. 2016;534(7609):684–687. pmid:27357795
18. Bellotti E, Kronegger L, Guadalupi L. The evolution of research collaboration within and across disciplines in Italian Academia. Scientometrics. 2016;109:783–811. pmid:27795593
19. Archambault É, Larivière V. The limits of bibliometrics for the analysis of the social sciences and humanities literature. World social science report 2009/2010. 2010; p. 251–254.
20. Haustein S, Larivière V. The use of bibliometrics for assessing research: Possibilities, limitations and adverse effects. In: Incentives and performance: Governance of research organizations. Springer; 2014. p. 121–139.
21. Laursen BK, Motzer N, Anderson KJ. Pathways for assessing interdisciplinarity: A systematic review. Research Evaluation. 2022;31(3):326–343.
22. Barabási AL. Network science. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 2013;371(1987):20120375. pmid:23419844
23. Porter A, Rafols I. Is science becoming more interdisciplinary? Measuring and mapping six research fields over time. Scientometrics. 2009;81(3):719–745.
24. Barabási AL, Jeong H, Néda Z, Ravasz E, Schubert A, Vicsek T. Evolution of the social network of scientific collaborations. Physica A: Statistical mechanics and its applications. 2002;311(3-4):590–614.
25. Grodzinski N, Grodzinski B, Davies BM. Can co-authorship networks be used to predict author research impact? A machine-learning based analysis within the field of degenerative cervical myelopathy research. Plos one. 2021;16(9):e0256997. pmid:34473796
26. Nichols LG. A topic model approach to measuring interdisciplinarity at the National Science Foundation. Scientometrics. 2014;100:741–754.
27. Antons D, Grünwald E, Cichy P, Salge TO. The application of text mining methods in innovation research: current state, evolution patterns, and development priorities. R&D Management. 2020;50(3):329–351.
28. Urru S, Sciannameo V, Lanera C, Salaris S, Gregori D, Berchialla P. A topic trend analysis on COVID-19 literature. Digital health. 2022;8:20552076221133696. pmid:36325437
29. Dalla Costa G, Comi G. Emerging trends in multiple sclerosis research. Multiple Sclerosis and Related Disorders. 2022;68:104124. pmid:36063731
30. Nobre GC, Tavares E. Scientific literature analysis on big data and internet of things applications on circular economy: a bibliometric study. Scientometrics. 2017;111:463–492.
31. Kim E, Huang K, Saunders A, McCallum A, Ceder G, Olivetti E. Materials synthesis insights from scientific literature via text extraction and machine learning. Chemistry of Materials. 2017;29(21):9436–9444.
32. Pretolesi D, Garbarino D, Giampaoli D, Vian A, Barla A. Geometric Deep Learning Strategies for the Characterization of Academic Collaboration Networks. IEEE Transactions on Emerging Topics in Computing. 2023.
33. Sinha A, Shen Z, Song Y, Ma H, Eide D, Hsu BJP, et al. An Overview of Microsoft Academic Service (MAS) and Applications. In: Proceedings of the 24th International Conference on World Wide Web. WWW ’15 Companion. New York, NY, USA: Association for Computing Machinery; 2015. p. 243–246. Available from: https://dl.acm.org/doi/10.1145/2740908.2742839.
34. Sun Y, Han J. Mining heterogeneous information networks: principles and methodologies. Morgan & Claypool Publishers; 2012.
35. Kozlov M. ‘Disruptive’ science has declined—and no one knows why. Nature. 2023;613(7943):225. pmid:36599999
36. Chowdhary KR. Natural Language Processing. In: Chowdhary KR, editor. Fundamentals of Artificial Intelligence. New Delhi: Springer India; 2020. p. 603–649. Available from: https://doi.org/10.1007/978-81-322-3972-7_19.
37. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 2018.
38. Blei DM, Ng AY, Jordan MI. Latent dirichlet allocation. Journal of machine Learning research. 2003;3(Jan):993–1022.
39. Blei DM, Lafferty JD. Topic models. In: Text mining. Chapman and Hall/CRC; 2009. p. 101–124.
40. Goodfellow I, Bengio Y, Courville A. Deep Learning. Bach F, editor. Adaptive Computation and Machine Learning series. Cambridge, MA, USA: MIT Press; 2016.
41. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning; 2009.
42. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of computational and applied mathematics. 1987;20:53–65.
43. Achiam J, Adler S, Agarwal S, Ahmad L, Akkaya I, Aleman FL, et al. GPT-4 technical report. arXiv preprint arXiv:2303.08774. 2023.
44. Green W, Mitchell F, Chen S, Aarti. Topic Modeling Bert+LDA; 2020. Available from: https://www.kaggle.com/code/dskswu/topic-modeling-bert-lda.
45. Haak LL, Fenner M, Paglione L, Pentz E, Ratner H. ORCID: a system to uniquely identify researchers. Learned publishing. 2012;25(4):259–264.
