Knowledge flows from science to AI technology: Identifying core and brokerage technological roles

doi:10.1371/journal.pone.0341005

Fig 1.

Research framework.

This figure presents the overall research framework used to identify AI technologies and examine science-to-technology knowledge flow across four five-year periods spanning 2002 to 2021. The framework combines CPC-based network analysis for classifying core and brokerage technologies, BERTopic modeling for extracting topic clusters from both technological patents and their cited scientific publications, and generative-AI-based topic labeling to enhance interpretability. By aligning patent topics with those of the scientific literature they reference, the framework traces how scientific knowledge develops and transforms into technological knowledge within the AI domain.

Fig 2.

Designed prompt structure.

This figure illustrates the structured prompt design used to generate interpretable topic labels for BERTopic clusters derived from both patents and scientific publications. The prompt is composed of an input section containing extracted keywords, a task description outlining the labeling objective, and an output instruction specifying the required format and constraints. This structured approach guides the generative AI model toward producing clear, relevant, and consistent topic labels across all clusters.
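
The three-part structure described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_label_prompt`, the example keywords, and the exact wording of the task and output sections are assumptions, since the caption describes the prompt's structure but not its text.

```python
def build_label_prompt(keywords, source="patents"):
    """Assemble a prompt with input, task-description, and output-instruction sections.

    The wording of each section is hypothetical; only the three-part
    layout follows the caption.
    """
    # Input section: the keywords extracted for one BERTopic cluster.
    input_section = "Input keywords ({}): {}".format(source, ", ".join(keywords))
    # Task description: the labeling objective.
    task_section = (
        "Task: Propose one topic label that summarizes the theme "
        "shared by these keywords."
    )
    # Output instruction: required format and constraints.
    output_section = (
        "Output: Return only the label, as a short noun phrase "
        "of at most six words."
    )
    return "\n\n".join([input_section, task_section, output_section])


# Example keywords are made up for illustration.
prompt = build_label_prompt(["neural network", "image", "segmentation"])
print(prompt)
```

Keeping the output instruction fixed across clusters is what makes the resulting labels comparable between the patent topics and the publication topics.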

Table 1.

Top 10 main-group CPCs by centrality in network analysis (5-year periods).

This table presents the top ten main-group CPC codes for each five-year period of AI patent filings, identified through analysis of the CPC co-occurrence network and ranked according to weighted degree centrality and betweenness centrality. Weighted degree centrality captures the overall connectivity of a CPC with other CPCs and serves as an indicator of core technologies, while betweenness centrality captures the brokerage technologies that link otherwise heterogeneous technological areas.
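
As a rough illustration of the two measures, the sketch below builds a CPC co-occurrence network from toy patent records and computes both centralities with the standard library only. The patent lists are invented, the function names are hypothetical, and the betweenness here uses unweighted shortest paths (Brandes' algorithm) as a simplifying assumption; the study may weight or normalize the measure differently.

```python
from collections import defaultdict, deque
from itertools import combinations


def cooccurrence_edges(patents):
    """Edge weight = number of patents that list both CPC codes."""
    weights = defaultdict(int)
    for cpcs in patents:
        for a, b in combinations(sorted(set(cpcs)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def weighted_degree(weights):
    """Sum of co-occurrence weights over all edges touching each CPC."""
    degree = defaultdict(int)
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return dict(degree)


def betweenness(weights):
    """Unnormalized betweenness on the unweighted graph (Brandes' algorithm)."""
    adj = defaultdict(set)
    for a, b in weights:
        adj[a].add(b)
        adj[b].add(a)
    adj = dict(adj)
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: b / 2.0 for v, b in score.items()}


# Toy records: each inner list holds the main-group CPCs of one patent.
patents = [
    ["G06N3/00", "G06T7/00"],
    ["G06N3/00", "G06T7/00"],
    ["G06N3/00", "A61B5/00"],
    ["A61B5/00", "G16H50/00"],
]
edges = cooccurrence_edges(patents)
degree = weighted_degree(edges)
between = betweenness(edges)
```

In this toy network `G06N3/00` has the largest weighted degree because it co-occurs most often, while `A61B5/00` scores on betweenness because it is the only link to `G16H50/00`; the two rankings need not agree, which is why the table reports them separately.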

Fig 3.

Technology Identification for AI Technologies.

This figure visualizes major AI technology topics derived from centrality analysis of the CPC co-occurrence network. The horizontal axis indicates core-technology intensity (weighted degree centrality), ranging from domain-centric technologies on the left to AI-centric technologies on the right. The vertical axis represents brokerage (betweenness centrality), with higher positions showing stronger cross-domain connectivity. Based on these two dimensions, patents are grouped into four structural categories: mainstream, main AI application, AI-specific, and peripheral application technologies. Topic labels were generated using BERTopic and refined with generative AI. The figure summarizes consistent structural patterns across four five-year periods of AI patent data over 20 years.
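
A minimal sketch of the two-dimensional grouping follows. Two things in it are assumptions rather than details taken from the caption: the cut-off (the median of each measure) and the assignment of category names to quadrants, which is inferred from the axis descriptions. The CPC codes and scores are toy values.

```python
from statistics import median

# ASSUMED quadrant-to-category mapping, keyed by (high_core, high_brokerage).
# The caption names the four categories but does not state this assignment.
CATEGORY = {
    (True, True): "mainstream",
    (False, True): "main AI application",
    (True, False): "AI-specific",
    (False, False): "peripheral application",
}


def classify(core, brokerage):
    """Assign each code to a quadrant using median splits (an assumed threshold)."""
    core_cut = median(core.values())
    brokerage_cut = median(brokerage.values())
    return {
        code: CATEGORY[(core[code] > core_cut, brokerage[code] > brokerage_cut)]
        for code in core
    }


# Toy centrality scores for four hypothetical CPC codes.
core = {"A": 10, "B": 8, "C": 2, "D": 1}
brokerage = {"A": 5.0, "B": 0.0, "C": 4.0, "D": 0.5}
categories = classify(core, brokerage)
```

Any other threshold (a top-ten cut-off per period, for example) can be swapped in by changing how `core_cut` and `brokerage_cut` are computed.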

Fig 4.

Knowledge Flow from science to technology in AI Technologies.

This figure compares the semantic alignment between technological topics extracted from AI patents and scientific topics from their cited publications, highlighting category-specific patterns of science-to-technology knowledge flow. The horizontal axis represents the degree of core technological orientation, ranging from domain-centric flows on the left to AI-centric flows on the right. The vertical axis distinguishes two modes of scientific utilization: Science-embedded Technological Development at the top, where technological objectives guide the embedding and adaptation of scientific knowledge, and Science-based Technological Realization at the bottom, where scientific findings are directly translated into technological solutions. The four resulting quadrants capture the intersections of these dimensions and reveal distinct structural patterns of knowledge flow. The figure synthesizes stable patterns observed across four consecutive five-year periods covering 20 years of AI patent data.

Table 2.

Top 10 main-group CPCs by centrality in network analysis (10-year periods).

This table presents the top ten main-group CPC codes for each ten-year period of AI patent filings, identified through CPC co-occurrence network analysis and ranked by weighted degree centrality and betweenness centrality. Weighted degree centrality reflects the extent to which a CPC connects to other CPCs and indicates its role as a core technology, while betweenness centrality captures its role as a brokerage technology that bridges distinct technological domains.

Fig 5.

Knowledge Flow from science to technology in AI Technologies (10-year periods).

Unlike the earlier five-year analyses, this figure presents a ten-year aggregated analysis to assess the robustness of science-to-technology knowledge flow patterns. Each category is divided into two stacked panels, with the upper and lower boxes representing the first and second ten-year periods, respectively. As before, the horizontal axis reflects core technological centrality, and the vertical axis captures brokerage-based modes of knowledge integration. The results show that core technologies maintain AI-centric knowledge flows, while brokerage technologies exhibit science-embedded patterns in which technological objectives guide the embedding and use of scientific knowledge. The figure confirms that these structural patterns remain stable even when the temporal aggregation window is expanded.

Table 3.

ZINB regression results for scientific citation intensity by technology category.

This table presents the Zero-Inflated Negative Binomial (ZINB) regression results examining the relationship between four technology categories, defined by their core and brokerage characteristics, and the number of scientific publications cited at the patent level. The dependent variable is the number of scientific publications cited by each patent, while the independent variables are dummy indicators for the four technology categories. The control variables include CLAIMS, FAMILY, INVENTORS, APPLICANTS, and PAT AGE, all of which were log-transformed to adjust for skewness and scale differences. Period dummies are incorporated in Model 3. The results indicate that Category 2 patents exhibit the highest level of scientific citation, followed by Categories 1, 3, and 4.
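
For readers unfamiliar with the model, the sketch below evaluates the probability mass function of a zero-inflated negative binomial in its common NB2 form: a patent has no scientific citations either because it is a structural zero (probability `pi`) or because the negative binomial count component happens to yield zero. The parameter names (`mu`, `theta`, `pi`) and the values used are illustrative assumptions; in a regression, `mu` would typically be tied to the category dummies and controls through a log link and `pi` through a logit link, and the exact specification behind the table is not given in this caption.

```python
import math


def zinb_pmf(y, mu, theta, pi):
    """P(Y = y) under a zero-inflated negative binomial (NB2 parameterization).

    mu    : mean of the count component (mu > 0)
    theta : dispersion (size) parameter (theta > 0)
    pi    : probability of a structural zero (0 <= pi < 1)
    """
    # Log-probability of y under the negative binomial count component.
    log_nb = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    nb = math.exp(log_nb)
    # Zeros come from both the inflation part and the count part.
    inflation = pi if y == 0 else 0.0
    return inflation + (1.0 - pi) * nb


def zinb_mean(mu, pi):
    """Expected count: the count-component mean scaled by the non-zero share."""
    return (1.0 - pi) * mu


# Illustrative parameter values, not estimates from the table.
p_zero = zinb_pmf(0, mu=3.0, theta=1.5, pi=0.4)
expected = zinb_mean(mu=3.0, pi=0.4)
```

The zero-inflation term is what separates this model from a plain negative binomial: it lets the share of patents with no scientific citations exceed what overdispersion alone would predict.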
