The increasing use of mathematical techniques in scientific research leads to the interdisciplinarity of applied mathematics. This viewpoint is validated quantitatively here by statistical and network analysis of the corpus PNAS 1999–2013. A network giving a panoramic view of the interdisciplinary relationships between disciplines is built from the corpus. Specific network indicators show the hub role of applied mathematics in interdisciplinary research. The statistical analysis of the corpus content finds that algorithms, a primary topic of applied mathematics, correlate positively, co-occur increasingly, and maintain a long-run equilibrium relationship with certain typical research paradigms and methodologies. This finding can be understood as an intrinsic cause of the interdisciplinarity of applied mathematics.
Citation: Xie Z, Duan X, Ouyang Z, Zhang P (2015) Quantitative Analysis of the Interdisciplinarity of Applied Mathematics. PLoS ONE 10(9): e0137424. https://doi.org/10.1371/journal.pone.0137424
Editor: Eduardo G. Altmann, Max Planck Institute for the Physics of Complex Systems, GERMANY
Received: March 16, 2015; Accepted: August 17, 2015; Published: September 9, 2015
Copyright: © 2015 Xie et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data discussed in this report come from the corpus of papers published in the Proceedings of the National Academy of Sciences (PNAS) in 1999–2013. Interested researchers may locate the data on the web pages from http://www.pnas.org/content/96/1.toc to http://www.pnas.org/content/110/52.toc.
Funding: This work is funded by the program for new century excellent talents in university, state education ministry in China (No. 10-0893), key laboratory of high performance computing (No. 201403-01), and the national university of defense technology graduate teaching reform project (No. 201406-01).
Competing interests: The authors have declared that no competing interests exist.
Interdisciplinary research means that data, techniques, concepts, and theories from two or more disciplines are integrated to solve problems whose solutions are beyond the scope of a single discipline or area of research practice [1, 2]. Mathematical science plays an important role in interdisciplinary research, because many problems in the various disciplines of physical science, biological science, and social science increasingly rely on mathematical techniques. The increasing application of mathematical theories and methods to other disciplines has therefore led to the development of mathematical science, especially applied mathematics.
The panoramic view of the relationships between disciplines can be drawn as a network, with the disciplines as nodes and the interdisciplinary relationships as edges. The network is built here from the disciplinary information of the papers published in the Proceedings of the National Academy of Sciences (PNAS, http://www.pnas.org) in 1999–2013. Two disciplines are connected if there is a paper belonging to both of them. The interdisciplinarity of disciplines is then expressed quantitatively by network indicators that capture the strength and breadth of the connections between disciplines, such as degree, betweenness centrality, etc. Those indicators show that applied mathematics not only participates widely and directly in interdisciplinary research, but also builds bridges for carrying out interdisciplinary research between other disciplines.
In order to get a more comprehensive understanding of the interdisciplinarity of applied mathematics, we analyze the contents of the papers. The tests of cointegration and correlation on the quarterly numbers of papers containing certain topic words, e.g. “algorithm”, show that the development of algorithms and that of certain research paradigms [6–9] (model, experiment, simulation, and data-driven) and transdisciplinary topics [10–12] (system, network, and control) obey long-run equilibrium relationships and are positively correlated. The co-word occurrence analysis shows increasing trends of algorithmization of those research paradigms and transdisciplinary topics. The relationships found can be considered as causes of the interdisciplinarity of applied mathematics.
This paper is organized as follows. The data processing is introduced in Section 2. The network analysis is shown in Section 3. The statistical analysis is presented in Section 4. The conclusion is drawn in Section 5.
The journal PNAS publishes high quality research reports, commentaries, reviews, perspectives and letters. The corpus analyzed here consists of 52,803 papers published in PNAS in 1999–2013. The journal provided the discipline information of the papers (Fig 1). There are 3 first level disciplines, viz. biological science, physical science, and social science, and 39 second level disciplines, such as mathematics, computer science, etc. So the papers can be classified according to their discipline information.
The panels (a,b) respectively come from http://www.pnas.org/content/110/18.toc, http://www.pnas.org/content/110/18.toc#PhysicalSciences.
Most of the papers have been classified by the first and second level disciplines. Some papers are only classified by the first level disciplines. For those papers, we considered their second level discipline to be the same as their first level one. Hence we added the first level disciplines into the set of second level disciplines. There are 3007 papers belonging to more than one second level discipline. For example, Ref  belongs to applied mathematics and ecology. Those papers can be considered to be interdisciplinary papers. The discipline information of the papers will be used to build a network describing the interdisciplinary relationships between disciplines in Section 3.
Many papers have used mathematical techniques, but are not classified into applied mathematics. Thus, we should analyze the contents of the papers. The Python package Natural Language Toolkit (NLTK, http://www.nltk.org) is used to build the dictionary for the corpus, using its morphological reduction function. The dictionary contains 31,542 words (S1 Text). Those words belong to the lexicon of NLTK, which includes the English WordNet. Based on the dictionary, the document-term matrix for the corpus is generated, in which the rows correspond to the papers in the corpus and the columns correspond to the words. Together with the publication dates of the papers, the quarterly numbers of the papers containing certain words are extracted for analyzing the relationships of algorithms to certain research paradigms and transdisciplinary topics in Section 4.
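The extraction of quarterly counts described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it replaces the NLTK-based morphological reduction with a plain lowercase tokenizer, and the function names and the four-paper toy corpus are hypothetical.

```python
from collections import Counter
from datetime import date
import re

def quarter_of(d):
    """Return a (year, quarter) label for a publication date."""
    return (d.year, (d.month - 1) // 3 + 1)

def tokenize(text):
    """Set of lowercase word tokens (assumption: stands in for NLTK reduction)."""
    return set(re.findall(r"[a-z]+", text.lower()))

def quarterly_counts(papers, word):
    """Count, per quarter, the papers whose text contains `word`."""
    counts = Counter()
    for pub_date, text in papers:
        if word in tokenize(text):
            counts[quarter_of(pub_date)] += 1
    return dict(counts)

# Hypothetical toy corpus: (publication date, paper text).
papers = [
    (date(1999, 1, 5), "A fast algorithm for sequence alignment"),
    (date(1999, 2, 16), "Experiment on protein folding"),
    (date(1999, 5, 11), "Network model and clustering algorithm"),
    (date(2000, 11, 7), "The algorithm converges in simulation"),
]
algorithm_counts = quarterly_counts(papers, "algorithm")
```

Without a reduction step the plural "algorithms" would not match "algorithm", which is why the dictionary in the text is built on reduced word forms.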
Network analysis of the interdisciplinarity of applied mathematics
Based on the discipline information of the corpus, a network describing the connections among disciplines is constructed (The discipline network, Fig 2), in which the nodes are the second level disciplines, and two disciplines are connected if there is a paper belonging to them both. For example, applied mathematics and ecology are connected, because Ref  belongs to them both. The network is connected, which means no discipline is isolated. The edges of the network can be assigned weights: the number of interdisciplinary papers between two connected disciplines. The network data is provided in S1 Network.
The discipline network contains 42 nodes and 354 edges. Two disciplines are connected if there is a paper in PNAS 1999–2013 belonging to both of them.
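The construction rule can be sketched in a few lines of Python. The sketch assumes each paper is given as a set of second level discipline labels; the function names and the toy papers are hypothetical and are not taken from S1 Network.

```python
from collections import Counter
from itertools import combinations

def build_discipline_network(paper_disciplines):
    """Return {frozenset({a, b}): weight}, where weight = number of shared papers."""
    weights = Counter()
    for disciplines in paper_disciplines:
        for a, b in combinations(sorted(set(disciplines)), 2):
            weights[frozenset((a, b))] += 1
    return weights

def degrees(weights):
    """Unweighted degree and weighted degree of each discipline."""
    k, kw = Counter(), Counter()
    for pair, w in weights.items():
        for node in pair:
            k[node] += 1
            kw[node] += w
    return k, kw

# Hypothetical toy input: one set of discipline labels per paper.
paper_disciplines = [
    {"Applied Mathematics", "Ecology"},
    {"Applied Mathematics", "Ecology"},
    {"Applied Mathematics", "Physics", "Computer Sciences"},
    {"Chemistry"},  # single-discipline paper: contributes no edge
]
weights = build_discipline_network(paper_disciplines)
k, kw = degrees(weights)
```

In this toy input, applied mathematics appears in three interdisciplinary papers but has weighted degree 4, because the paper with three disciplines contributes to two of its edges.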
The dense relationships between disciplines are described quantitatively by network indicators, viz. an average clustering coefficient of 0.55, a diameter of 3, an average (weighted) degree of 16.87 (148.38), and a graph density of 0.41. Those indicators also show the small-world property of the discipline network.
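As a rough check, the graph density follows directly from the reported node and edge counts, and the average clustering coefficient can be computed from an adjacency structure. The sketch below uses a hypothetical four-node graph and assumes the common convention that nodes with fewer than two neighbors have clustering coefficient 0; the software used for the reported values may follow a different convention.

```python
from itertools import combinations

def graph_density(n_nodes, n_edges):
    """Fraction of possible undirected edges (no self-loops) that are present."""
    return n_edges / (n_nodes * (n_nodes - 1) / 2)

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Number of edges among the neighbors of v.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

# Density from the reported size of the discipline network (42 nodes, 354 edges).
density = graph_density(42, 354)
print(round(density, 2))  # 0.41

# Hypothetical toy graph: a triangle A-B-C with a pendant node D attached to A.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
toy_clustering = average_clustering(adj)
```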
The interdisciplinary breadth and centrality of a discipline can be described quantitatively by the degree and the betweenness centrality, respectively, of the corresponding node in the unweighted discipline network. The degree of a node is the number of nodes connected to it. The betweenness centrality of a node relates to the number of shortest paths between all pairs of other nodes that pass through that node. If items are transferred through the network along shortest paths, a node with high betweenness centrality has a large influence on the transfer behavior.
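To make the two indicators concrete, the following sketch computes degree and unnormalized betweenness centrality with Brandes' algorithm on an unweighted, undirected graph. The five-node graph is hypothetical and is not the discipline network; the values reported in Table 1 may also use a different normalization.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm; each unordered pair of endpoints is counted once."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the nodes farthest from s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is visited from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy graph with a hub and one two-step branch.
adj = {
    "Applied Mathematics": {"Ecology", "Physics", "Economics"},
    "Ecology": {"Applied Mathematics"},
    "Physics": {"Applied Mathematics", "Chemistry"},
    "Economics": {"Applied Mathematics"},
    "Chemistry": {"Physics"},
}
degree = {v: len(nbrs) for v, nbrs in adj.items()}
bc = betweenness(adj)
```

In this toy graph the hub lies on the shortest paths of five node pairs, while the intermediate node of the branch lies on three.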
The interdisciplinary strength of a discipline can be expressed by the number of interdisciplinary papers involving that discipline, namely the degree of the corresponding node in the weighted discipline network. PageRank also gives a rough estimate of the importance of nodes in a given network, under the assumption that more important nodes receive more connections from other nodes. Hence the interdisciplinary breadth and strength of a discipline can be expressed by the PageRank value of the corresponding node in the unweighted and weighted discipline network respectively.
The degree, PageRank and betweenness centrality of applied mathematics in the unweighted network are the highest (Table 1). The degree of applied mathematics is 30, which means that the theories and methods of applied mathematics have been used directly by 73.17% of the second level disciplines listed by PNAS, including members of all 3 first level disciplines (Fig 3). The highest value of betweenness centrality means that applied mathematics is a hub node for transferring ideas, theories, and methods from one discipline to others, thereby building bridges for carrying out interdisciplinary research between other disciplines. For example, network cosmology and its application [14–17] are typical interdisciplinary works among the theory of relativity, network science, and scientometrics, which are connected by geometry.
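A minimal power-iteration sketch of PageRank on an undirected graph is given below. The damping factor 0.85 and the number of iterations are assumptions (the text does not state the settings used), the toy weights are hypothetical, and every node is assumed to have at least one neighbor, as in a connected network. Setting all weights to 1 gives the unweighted variant.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Power iteration on an undirected graph given as {node: {neighbor: weight}}."""
    nodes = list(weights)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    strength = {v: sum(weights[v].values()) for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # Each node passes its rank to its neighbors in proportion to edge weight.
            for w, wt in weights[v].items():
                new[w] += damping * rank[v] * wt / strength[v]
        rank = new
    return rank

# Hypothetical weighted toy graph: weight = number of shared papers.
weights = {
    "Applied Mathematics": {"Ecology": 2, "Physics": 1, "Economics": 1},
    "Ecology": {"Applied Mathematics": 2},
    "Physics": {"Applied Mathematics": 1},
    "Economics": {"Applied Mathematics": 1},
}
rank = pagerank(weights)
```

Here the hub receives the highest value, and among the leaves the one with the heavier edge ranks higher, which is the sense in which the weighted PageRank reflects strength.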
A discipline connects to applied mathematics if there is a paper in PNAS 1999-2013 belonging to that discipline and applied mathematics simultaneously.
The degree and PageRank of chemistry in the weighted network are the highest, which means that the interdisciplinary strength of chemistry is the highest. Those indicators of applied mathematics are low compared with those of chemistry. This is because PNAS published only a few applied mathematics papers (350 papers in 1999–2013) compared with chemistry papers (8,645 papers in 1999–2013). So we need a fairer indicator to measure the interdisciplinary strength, which is defined as follows.
The relative interdisciplinary strength S(i) of discipline i is defined here as S(i) = M(i)/N(i), where N(i) is the number of papers of discipline i in the corpus, and M(i) is the number of interdisciplinary papers in discipline i. A simple proxy considering both the interdisciplinary strength and breadth is C(i) = S(i)K(i), where K(i) is the degree of i in the discipline network. The proxy is named the cross indicator. Notice that, for certain disciplines i, e.g. applied mathematics, M(i) is slightly less than the weighted degree KW(i) (Table 1). This is because some papers belong to more than two disciplines: such a paper contributes to several edges of i but is counted only once in M(i).
When the disciplines are sorted by the cross indicator (Table 1), the top three are applied mathematics, statistics in mathematical science, and computer science (whose theory closely relates to mathematical science). The reasons for the high cross indicators differ across disciplines. Applied mathematics, statistics, computer science, and applied physical science are “output type” disciplines. The ideas and theories of those disciplines have provided a growing arsenal of methods for all of the sciences. Engineering, social science, and economic science are “input type” disciplines. Those disciplines integrate data, techniques, theories, etc. from other disciplines to create new approaches to problems whose solutions are beyond their own scope.
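The two definitions translate directly into code. In the sketch below the disciplines "A" and "B" and all of their counts are hypothetical, chosen only to show how a small discipline can have a higher cross indicator than a large one with more interdisciplinary papers in absolute terms.

```python
def relative_strength(n_papers, n_interdisciplinary):
    """S(i) = M(i) / N(i)."""
    return n_interdisciplinary / n_papers

def cross_indicator(n_papers, n_interdisciplinary, degree):
    """C(i) = S(i) * K(i)."""
    return relative_strength(n_papers, n_interdisciplinary) * degree

# Hypothetical counts: N(i) papers, M(i) interdisciplinary papers, degree K(i).
disciplines = {
    "A": {"N": 400, "M": 100, "K": 30},   # small discipline, broad connections
    "B": {"N": 8000, "M": 400, "K": 20},  # large discipline, more papers overall
}
cross = {
    name: cross_indicator(d["N"], d["M"], d["K"])
    for name, d in disciplines.items()
}
ranking = sorted(cross, key=cross.get, reverse=True)
```

Discipline "B" has four times as many interdisciplinary papers as "A", yet its relative strength is 0.05 against 0.25, so "A" ranks first by the cross indicator.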
The high values of the aforementioned indicators in applied mathematics are due to the increasing use of mathematical techniques in scientific research. A growing body of work in physics or computer science is indistinguishable from research done by mathematicians, and similar overlap occurs with medical science, astronomy, economic sciences, and an increasing number of fields. It is difficult today to find any discipline that does not have connections to mathematics, even political science.
Statistical analysis of the relationships of typical research paradigms and methodologies to algorithms
To understand the underlying causes of the interdisciplinarity of applied mathematics, we discuss the relationships of some typical research paradigms and methodologies to applied mathematics by statistically analyzing the corpus content. We assume that a paper containing a topic word uses or discusses the topic expressed by that word. The topic words expressing the four basic research paradigms (model, experiment, simulation, and data-driven) and the methodologies given by the three typical transdisciplinary topics (system, network, and control) are taken to be “model”, “experiment”, “simulation”, “data”, “system”, “network”, and “control” respectively. For each topic word, a high or increasing proportion of papers containing that word reflects, to a certain extent, the typicality of the corresponding research paradigm or transdisciplinary topic (Fig 4).
The topic words respectively represent four research paradigms, viz. model, experiment, simulation, and data-driven, and three transdisciplinary topics, viz. system, network, and control.
There are 31,542 words that appear in the corpus and also belong to the lexicon of NLTK, of which 976 appear in more than 10% of papers (S1 Text). We manually selected typical topic words of applied mathematics from the 976 words, and found the word “algorithm”, which appears in 11.34% of papers. The relationship of a research paradigm or a transdisciplinary topic to algorithms can be expressed, to a certain degree, by the cointegration and correlation between the quarterly numbers of the papers containing the corresponding word and those of the papers containing “algorithm” (S1 Table).
The nominal significance level of the following tests is set to 0.05. The augmented Dickey-Fuller test (maxlags = 3) shows that all of the time series in S1 Table are integrated of order one. The Johansen test shows that almost all of the time series pairs in Table 2 are cointegrated. This means that, based on the 60 quarters of data from PNAS 1999–2013, the development of algorithms and that of almost any one of the mentioned research paradigms or transdisciplinary topics obey a long-run equilibrium relationship in the academic system.
In general, correlation analysis of non-stationary series is likely to give spurious results, unless the series are cointegrated. Hence the cointegrations in Table 2 guarantee the validity of the correlation analysis: the Spearman’s rank correlation coefficients and the Pearson product-moment correlation coefficients show that the development of algorithms is positively correlated with that of the mentioned research paradigms and transdisciplinary topics (Table 3).
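The two correlation coefficients can be sketched in plain Python as follows. The two short series are hypothetical stand-ins for the quarterly counts in S1 Table, and the rank function assigns average ranks to ties, which is a common convention and an assumption about the implementation used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            r[order[pos]] = mean_rank
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical quarterly counts of papers containing each word.
algorithm = [10, 12, 15, 14, 20, 26]
model = [100, 110, 130, 125, 160, 240]
r_pearson = pearson(algorithm, model)
r_spearman = spearman(algorithm, model)
```

The two toy series rise and fall together but not linearly, so the rank correlation is 1 while the Pearson coefficient stays slightly below 1.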
The co-word occurrence analysis is another efficient method to measure the relationship between topic words. It is based on the assumption that a paper containing two topic words uses or discusses both of the topics expressed by those words. The proportions of the papers simultaneously containing “algorithm” and an aforementioned topic word, amongst the papers containing that word and amongst all of the papers, are calculated annually and quarterly (Fig 5). The time series needed for the calculation are listed in S2 Table. The positive slopes of the linear fitting of the annual proportions (Table 4), except “algorithm” + “simulation” in “simulation”, show the increasing trends of algorithmization of the research paradigms and the methodologies given by the transdisciplinary topics. The reason for this exception is that the slope of the linear fitting of the annual proportion of the papers containing “algorithm” in all of the papers (0.0030) is lower than that of “simulation” (0.0064).
Those cointegrations, positive correlations, and increasing trends of algorithmization arise naturally and can be considered some of the causes of the interdisciplinarity of applied mathematics. As simplifications of relevant aspects of research problems, models are generally described in mathematical concepts and language for systematic study. Simulation, especially numerical simulation, has become a common method to test algorithmically how well models agree with experimental results. The widespread availability of computers and economic considerations make many of today’s sciences increasingly reliant on simulation via mathematical models and algorithms. Data at the scale collected or generated from experiments and simulations can only be analyzed by algorithms [8, 9]. In fact, today’s science is becoming data-driven at a previously unimagined scale. Meanwhile, the theories of algorithms now guide researchers in mining results from the collected data.
System science gives a unified methodology to research the complexity in epistemology by expressing complex phenomena as complex systems, and is thus considered a transdisciplinary discipline. A variety of abstract complex systems are studied as a field of mathematics. Ignoring the functionalities and characteristics of the original systems, systems can be investigated by abstracting them as networks. Researchers from different fields can investigate their respective problems under the unified network framework. Algorithms play an important role in the analysis of the topological properties of networks, such as distance and centrality finding algorithms, graph partitioning and clustering algorithms, and so on [27, 28].
Understanding of a system is reflected in our ability to control it. Control theory has a distinctly transdisciplinary mission to provide theories and approaches for comprehending complex phenomena. The modern study of control uses various mathematical theories and approaches, such as neural networks, Bayesian probability, fuzzy logic, evolutionary computation, etc., which are all closely related to algorithms, e.g. genetic algorithms [29, 30].
The connections between applied mathematics and other disciplines are caused not only by algorithms, but also by other mathematical topics. In fact, certain mathematical topic words, such as “equation” and “statistic”, can be found in S1 Text. The relationships between them and research paradigms or methodologies can be analyzed quantitatively in the same way as above, so they are not addressed here.
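The two proportions and the slope of their linear fit can be sketched as follows. The annual counts are hypothetical and only illustrate the calculation; the fit is an ordinary least-squares line against the year index, which is an assumption about how the slopes in Table 4 were obtained.

```python
def linear_slope(y):
    """Least-squares slope of y against x = 0, 1, ..., len(y) - 1."""
    n = len(y)
    mx, my = (n - 1) / 2, sum(y) / n
    sxy = sum((x - mx) * (v - my) for x, v in enumerate(y))
    sxx = sum((x - mx) ** 2 for x in range(n))
    return sxy / sxx

# Hypothetical annual paper counts over four years.
both = [20, 26, 33, 42]         # papers containing "algorithm" and "model"
model = [200, 210, 220, 240]    # papers containing "model"
total = [400, 410, 420, 440]    # all papers

in_word = [b / w for b, w in zip(both, model)]  # proportion amongst "model" papers
in_all = [b / t for b, t in zip(both, total)]   # proportion amongst all papers

slope_in_word = linear_slope(in_word)
slope_in_all = linear_slope(in_all)
```

A positive slope of the first proportion means that the co-occurring papers grow faster than the papers containing the topic word alone; if the topic word itself grows faster than its co-occurrence with “algorithm”, that slope can turn negative even though both counts increase.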
The interdisciplinarity of applied mathematics is analyzed quantitatively by applying statistical and network methods to the corpus PNAS 1999–2013. A network is built from the discipline information of the corpus, which gives a panoramic view of the relationships between disciplines. Some network indicators, e.g. betweenness centrality, quantitatively describe the hub role of applied mathematics in interdisciplinary research. The statistical analysis of the corpus content finds that a primary topic of applied mathematics, algorithms, cointegrates, correlates, and increasingly co-occurs with certain typical research paradigms and methodologies. Those findings can be considered some of the underlying causes of the interdisciplinarity of applied mathematics.
S1 Table. The quarterly number of papers in total (papers) and the quarterly number of papers containing a certain topic word in PNAS 1999–2013.
S2 Table. The quarterly number of the papers simultaneously containing “algorithm” and a certain topic word in PNAS 1999–2013.
S1 Network. The weighted discipline network data.
Conceived and designed the experiments: ZX XJD. Performed the experiments: ZX ZZOY PYZ. Analyzed the data: ZX XJD. Contributed reagents/materials/analysis tools: ZZOY. Wrote the paper: ZX XJD.
- 1. Klein JT (1990) Interdisciplinarity: history, theory, and practice. Wayne state university press.
- 2. Augsburg T (2006) Becoming interdisciplinary: an introduction to interdisciplinary studies. 2nd edition. Kendall Hunt press.
- 3. National research council of the national academies (2013) The mathematical sciences in 2025. The national academies press.
- 4. Cojocaru M, Kotsireas IS, Makarov RN, Melnik R, Shodiev H. ed. (2015) Interdisciplinary topics in applied mathematics, modeling and computational science. Springer.
- 5. Newman M (2010) Networks: an introduction. Oxford University Press.
- 6. Sokolowski JA, Banks CM (2009) Principles of Modelling and Simulation. John Wiley and Sons.
- 7. Quarteroni A (2009) Mathematical models in science and engineering. Notices Amer Math Soc 56: 10–19.
- 8. Hey T, Tansley S, Tolle K, ed. (2009) The fourth paradigm: data-intensive scientific discovery, Microsoft research, Redmond, Washington.
- 9. Johannes L, Günter K, Terry S, ed. (2006) Simulation: Pragmatic Constructions of Reality. Springer.
- 10. Nicolescu B (2010) Methodology of transdisciplinarity–levels of reality, logic of the included middle and complexity. Transdisc J eng sci 1: 19–38.
- 11. Brier S (2013) Cybersemiotics: a new foundation for transdisciplinary theory of information, cognition, meaningful communication and the interaction between nature and culture. Integr Rev 9: 222–263.
- 12. Barabási AL (2012) The network takeover. Nat Phys 8: 14–16.
- 13. Roper M, Seminara A, Bandi MM, Cobb A, Dillard HR, Pringle A (2010) Dispersal of fungal spores on a cooperatively generated wind. Proc Natl Acad Sci USA 107: 17474–17479. pmid:20880834
- 14. Krioukov D, Kitsak M, Sinkovits RS, Rideout D, Meyer D, Boguñá M (2012) Network cosmology. Sci Rep 2: 793. pmid:23162688
- 15. Xie Z, Ouyang ZZ, Zhang PY, Yi DY, Kong DX (2015) Modeling the citation network by network cosmology. PLoS ONE 10(3): e0120687. pmid:25807397
- 16. Xie Z, Rogers T (2015) Scale-invariant geometric random graphs. arXiv:1505.01332.
- 17. Xie Z, Zhu J, Kong DX, Li JP (2015) A random geometric graph built on a time-varying Riemannian manifold. Physica A 436: 492–498.
- 18. Majumder SR, Diermeier D, Rietz TA, Amaral LAN (2009) Price dynamics in political prediction markets. Proc Natl Acad Sci USA 106: 679–684. pmid:19155442
- 19. Mane KK, Börner K (2004) Mapping topics and topic bursts in PNAS. Proc Natl Acad Sci USA 101: 5287–5290. pmid:14978278
- 20. Fuller WA (2009) Introduction to statistical time series. John Wiley & Sons.
- 21. Johansen S (1991) Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59: 1551–1580.
- 22. Li ZN, Pan WQ (2010) Econometrics (3rd ed.) (Chinese Edition). Higher Education Press.
- 23. Best DJ, Roberts DE (1975) Algorithm AS 89: the upper tail probabilities of Spearman’s rho. J. Roy. Statist. Soc. Ser. C 24: 377–379.
- 24. Pearson K (1895) Notes on regression and inheritance in the case of two parents. Proc Roy Soc Lond 58: 240–242.
- 25. Cohen J (1988) Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence erlbaum associates.
- 26. Yi L, Duan XJ, Zhao CL, Li DX (2012) Systems science, methodological approaches. CRC press.
- 27. Erciyes K (2014) Complex Networks: An Algorithmic Perspective. CRC Press.
- 28. Xie Z, Dong E, Li J, Kong DX, Wu N (2014) Potential links by neighbor communities. Physica A 406: 244–252.
- 29. Christopher K (2005) Modern control technology. Thompson delmar learning.
- 30. Sontag E (1998) Mathematical control theory: deterministic finite dimensional systems (2nd ed.). Springer.