On the advancement of highly cited research in China: An analysis of the Highly Cited database

This study investigates the progress of highly cited research in China from 2001 to 2016 through an analysis of the Highly Cited database. The Highly Cited database, compiled by Clarivate Analytics, comprises the world’s most influential researchers in the 22 Essential Science Indicator fields as catalogued by the Web of Science. The database is considered an international standard for measuring national and institutional highly cited research output. Overall, we found a consistent and substantial increase in Highly Cited Researchers from China during the timespan. The Chinese institutions with the most Highly Cited Researchers (the Chinese Academy of Sciences, Tsinghua University, Peking University, Zhejiang University, the University of Science and Technology of China, and BGI Shenzhen) are all top ten universities or primary government research institutions. Further evaluation of separate fields of research and of government funding data from the National Natural Science Foundation of China revealed disproportionate growth efficiencies among the Foundation's divisions. The most development occurred in the fields of Chemistry, Materials Science, and Engineering, whereas the least development occurred in Economics and Business, Health Sciences, and Life Sciences.


Introduction
The Highly Cited Researchers database (http://hcr.stateofinnovation.com/) comprises the most influential researchers in the 22 Essential Science Indicator (ESI) fields of the Web of Science [1]. Citation analysis measures the propagation of research results, and the Highly Cited Researchers are considered the most influential in their fields with respect to the dissemination of their publications. The database offers a convenient standard for measuring the performance of researchers, along with that of their institutions and nations, and has been commonly used in rankings [2][3][4]. The Highly Cited database was previously the property of Thomson Reuters and is now a product of Clarivate Analytics. It is created by cataloging the papers in the top 1% by citation count in the corresponding field in the Science Citation Index Expanded, which does not include conference proceedings. From this list of papers, authors are then selected based upon the number of highly cited papers they have published and the number of citations that their highly cited papers have received.
This study investigated the growth of highly cited research in China from 2001 to 2016 utilizing the 2001, 2014, 2015, and 2016 Highly Cited Researchers databases, along with government funding data for the divisions of science from the National Natural Science Foundation of China (NSF) [1,12]. The NSF grouped the fields of research into seven divisions prior to 2010; in 2010, it separated the preexisting Health Sciences branch from the Life Sciences Division, which led to the current eight divisions. The initial overall evaluation showed substantial growth. Further analysis of development in each of the divisions revealed disproportionate improvements across the separate fields of research, especially when economic factors such as government funding allocation and number of projects were considered.

Methodology
The 2001, 2014, 2015, and 2016 Highly Cited Researchers lists were downloaded as Excel spreadsheet files from the Clarivate Analytics Highly Cited Researchers website archive (http://www.hcr.stateofinnovation.com/page/archives). In Excel, we restricted the datasets to researchers with primary affiliations located in Mainland China. Researchers from Taiwan, Hong Kong, and Macau were not included. We tallied the total number of researchers and ranked the institutions by number of researchers in all fields for each year, generating Tables 1, 2, 3 and 4. Table 1 included all institutions with a Highly Cited Researcher, whereas Tables 2, 3 and 4 included only the organizations in Mainland China with more than one Highly Cited Researcher. For Table 4, the number of academic faculty staff was obtained, and the number of Highly Cited Researchers per 1000 academic faculty staff was calculated.
The number of projects and the amount of government general funding for each division were taken from NSF data [12]. We calculated the percentage of general government funding that was allocated to each of the divisions of research, as delineated by the NSF, and also the percentage of total projects in each division for 2001, 2006, 2011, and 2016, producing Table 9. To evaluate efficiency for each division, we grouped the 22 ESI fields according to the eight divisions of the NSF. Using the funding data, along with the percentage of the global Highly Cited Researcher frequency accounted for by each division from 2014 to 2016, we calculated an efficiency index for each division of the NSF, which is included in Table 10. As in Bornmann 2015, Highly Cited Researcher frequency is defined as the number of entries in the Highly Cited database [2]. Thus, a single researcher listed in both Chemistry and Materials Science counts twice. Because the database is large, with over 3,000 entries, the error this introduces is roughly uniform and should theoretically cancel out when frequencies are expressed as percentages. GDP data were taken from [13,14].
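The tallying steps described above were carried out in Excel; as a rough illustration only, they could be sketched in Python as follows. The row layout, the column names, and every value in ROWS are hypothetical and do not come from the Highly Cited database. Counting distinct researchers per institution is also an assumption about how the institutional tallies were made, whereas the field frequency follows the definition above, in which each database entry counts once.

```python
from collections import Counter

# Hypothetical rows standing in for one year's spreadsheet. The column
# names and all values are illustrative, not taken from the database.
ROWS = [
    {"name": "Researcher A", "field": "Chemistry",
     "institution": "Institute X", "region": "Mainland China"},
    {"name": "Researcher A", "field": "Materials Science",
     "institution": "Institute X", "region": "Mainland China"},
    {"name": "Researcher B", "field": "Physics",
     "institution": "University Y", "region": "Mainland China"},
    {"name": "Researcher C", "field": "Engineering",
     "institution": "University Z", "region": "Hong Kong"},
    {"name": "Researcher D", "field": "Chemistry",
     "institution": "Institute X", "region": "Mainland China"},
]


def mainland_rows(rows):
    """Keep only entries whose primary affiliation is in Mainland China."""
    return [r for r in rows if r["region"] == "Mainland China"]


def rank_institutions(rows):
    """Count distinct researchers per institution, largest count first.

    Assumption: a researcher listed in several fields is counted once
    for the institutional ranking.
    """
    people = {(r["name"], r["institution"]) for r in rows}
    counts = Counter(inst for _, inst in people)
    # Sort by descending count, then by name so that ties are stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def field_frequency(rows):
    """Count entries per field; a researcher in two fields counts twice."""
    return Counter(r["field"] for r in rows)


def per_thousand_faculty(n_researchers, n_faculty):
    """Highly Cited Researchers per 1000 academic faculty staff."""
    return 1000 * n_researchers / n_faculty
```

With these hypothetical rows, the Hong Kong entry is dropped by mainland_rows, Researcher A contributes one researcher to Institute X in the ranking, and the same researcher contributes two entries to the field frequency.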

Results
In 2001, China had seven Highly Cited Researchers, each with a different primary institution. These institutions were Beijing University of Aeronautics and Astronautics, the Chinese Academy of Sciences, Fudan University, Donghua University, Jilin University, Shanghai Jiao Tong University, and the University of Science and Technology of China, the last of which is very closely associated with the Chinese Academy of Sciences, the main government research institution of China (Table 1).
By 2014, China had 15 institutions with more than one Highly Cited Researcher. (The number in parentheses following the name of an institution is the number of Highly Cited Researchers from that institution for the given year.) The top five institutions were the Chinese Academy of Sciences (50), the University of Science and Technology of China (6), BGI Shenzhen (5), Peking University (5), and Tsinghua University (4). BGI Shenzhen is a genomics and bioinformatics institute that is the principal bioinformatics center of the Chinese Academy of Sciences (Table 2).
In 2015, China had 18 institutions with more than one Highly Cited Researcher. The top institutions were the Chinese Academy of Sciences (29), Peking University (7), Tsinghua University (5), BGI Shenzhen (4), China University of Geosciences (4), the University of Science and Technology of China (4), Harbin Institute of Technology (4), and Zhejiang University (4) (Table 3).
In 2016, China had 21 institutions with more than one Highly Cited Researcher. The top institutions were the Chinese Academy of Sciences (25), Tsinghua University (10), Peking University (6), Zhejiang University (6), South China University of Technology (5), and the University of Electronic Science and Technology of China (5). Here, we also included the number of academic faculty staff for each institution with two or more Highly Cited Researchers.
Dividing the number of Highly Cited Researchers by the number of academic faculty staff, we obtained a measure of efficiency for the organizations. However, this measure is likely unreliable, because most institutions have fewer than ten Highly Cited Researchers. Thus, Liaoning University of Technology, a second-tier institution with very few academic faculty staff, has the highest calculated efficiency. The ratios of the five institutions with the highest number of Highly Cited Researchers per faculty are bolded. The number of academic faculty staff is simply a rough metric of the sizes of the institutions (Table 4). The top Chinese institutions overall, the Chinese Academy of Sciences, Tsinghua University, Peking University, Zhejiang University, the University of Science and Technology of China, and BGI Shenzhen, are all either top ten universities or, in the case of the Chinese Academy of Sciences and BGI Shenzhen, primary government research institutions (Tables 1-4).
In 2001, China had Highly Cited Researchers in only four ESI fields. Four of the seven Highly Cited Researchers were in Materials Science, with the remaining three in Geosciences, Engineering, and Physics (Table 5).
From 2015 to 2016, China lost its only Highly Cited Researcher in Microbiology. Again, the fields of Chemistry and Materials Science grew as rapidly as in the timespan from 2014 to 2015. Computer Science increased significantly, from four to ten. However, Engineering increased by only one, Geosciences by two, and Mathematics by one. Physics decreased by one, and little change was observed in other fields (Table 8).
Government general funding data from the NSF provided two significant results (Table 9). The national efficiency index, calculated as described in the methodology, revealed that the Chemical Sciences Division (1.481) and the Engineering and Materials Science Division (1.096) had relatively high efficiencies compared to the Management Sciences Division (0).

Discussion
China significantly increased its highly cited research activity from 2001 to 2016. Compared to 2001, by 2016 China had increased the number of active research projects 3.8-fold, its government funding 12.8-fold, and its GDP 8.5-fold, which signified economic growth along with increased interest in research [12][13][14]. Comparing China's percentage of the world's Highly Cited Researchers with its GDP for the four years shows that highly cited research increased nearly 4.5-fold for a corresponding twofold increase in GDP. As reflected by the Highly Cited database, China has seen the greatest improvement in highly cited research prominence on the global scale from 2001 to 2016.
A peculiar observation is the yearly decrease in the number of Highly Cited Researchers from the Chinese Academy of Sciences. This phenomenon is primarily driven by university rankings. Numerous university rankings, such as the Academic Ranking of World Universities (Shanghai Ranking) and the U.S. News & World Report rankings, use the Highly Cited database to evaluate the performance of an institution [16,20]. With the Highly Cited Researcher list now updated more frequently, and with increased pressure on universities to climb the rankings, universities are beginning to pay researchers large sums of money to be listed as their primary institution. Notable examples are King Saud University and King Abdulaziz University in Saudi Arabia, both of which are known to pay salaries upwards of three hundred thousand USD for Highly Cited Researchers to list them as their primary institution, or three dollars per citation to be listed as a secondary affiliation [21]. Thus, although many of the researchers are primarily affiliated with the Chinese Academy of Sciences, some of them credit another institution as their primary affiliation. Therefore, the number of Highly Cited Researchers that list the Chinese Academy of Sciences as their primary affiliation has decreased, leading to the decline of the Chinese Academy of Sciences in rankings.
China has seen the majority of its growth in the Chemical Sciences Division and the Engineering and Materials Science Division, which correspond to the ESI fields of Chemistry, Engineering, and Materials Science. These three fields consistently increased in Highly Cited Researchers and constituted about two thirds of the Highly Cited Researcher frequency from institutions in China. In 2001, five of the seven Highly Cited Researchers from China were in Materials Science (4) and Engineering (1). In 2014, 2015, and 2016, the three fields accounted for 59%, 72%, and 70%, respectively, of the Highly Cited Researcher frequency for institutions in China. China has historically been an industrial giant, with much of the country's economy reliant upon manufacturing, export-based production, and chemical synthesis and processing businesses. According to the CIA World Factbook, 40.7% of China's GDP originated from industry [22]. Thus, it is expected that much research and government funding has gone to benefit the economy, in accordance with the Ricardo principle (Tables 1-8).
From the government general funding data obtained from the NSF, we found consistent increases in the number of active research projects and the amount of general funding provided to each of the divisions. Indeed, a significant portion, approximately 27% of general research funding, went toward the Chemical Sciences Division (10.38%) and the Engineering and Materials Science Division (16.8%). However, two other divisions were each also allocated more than 15% of the government general fund: the Health Sciences Division and the Life Sciences Division, which together accounted for 38% of the funds. Prior to 2010, only the Life Sciences Division existed.
The Life Sciences Division was allocated approximately 36% of the funding prior to 2010, the year that the Health Sciences branch was established as a separate division. After 2010, the Life Sciences Division allocation was reduced to 16%, and the Health Sciences Division funding was set to 22.4%, summing to about 38% of the annual total general fund (Table 9).
For China, the number of projects is directly proportional to the amount of spending. The correlation coefficient between a division's percentage of total funding and its percentage of total projects is 0.989, with the two percentages being almost equal. Although spending does not have a direct relation to the total number of Highly Cited Researchers, it does have a direct relation to the number of projects. This close relation between percentage of total funding and percentage of total projects results from the funding policy of the NSF. For most types of projects, the grant money provided by the NSF begins at a predetermined fixed amount. Publications are expected to be produced with no further funding unless the NSF deems the research to be a national priority. Thus, more funding allows for proportionally more projects, which should correspond to more highly cited publications, and more highly cited publications generally lead to more Highly Cited Researchers [23]. Because the percentage of projects and the percentage of funding are approximately equal, the following discussion of efficiency by amount of funding applies likewise to efficiency by project number.
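The text does not state which correlation coefficient was used. Assuming it is the ordinary Pearson coefficient, a minimal Python sketch of the calculation is given below. The function name pearson_r is our own, and no values from Table 9 are reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: this is the kind of coefficient reported in the text,
    which does not name the method explicitly.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Here xs would hold each division's percentage of total funding and ys its percentage of total projects. A coefficient close to 1 indicates that the two percentages rise and fall together across divisions.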
In the national efficiency by funding evaluation, we found that the most efficient division was the Chemical Sciences Division (1.481), followed by the Engineering and Materials Science Division (1.096). The two divisions were significantly more efficient than the Life Sciences Division (0.018), the Health Sciences Division (0.007), and the Management Sciences Division (0). Efficiency corresponds to productivity: the most efficient divisions are also the most productive, and the least efficient are also the least productive. Efficiency was inversely correlated with percentage of funding and project number, but only weakly (r = -0.3), indicating that division size does not affect efficiency significantly (Table 10).
The Mathematical and Physical Sciences Division, Life Sciences Division, Earth Sciences Division, and Information Sciences Division all saw some degree of growth, but not nearly as much as the Chemical Sciences Division and the Engineering and Materials Science Division. The three divisions with the lowest efficiency were the Management Sciences Division (0), the Health Sciences Division (0.007), and the Life Sciences Division (0.018). We evaluated the Life Sciences Division with the Health Sciences Division included, as the two are closely related and commonly overlap in operation. Although they were formally separated, the two divisions still function as if they were integrated.
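The exact formula for the efficiency index is not given in the text. One plausible reading of the methodology, offered here purely as an assumption, is the ratio of a division's percentage of the global Highly Cited Researcher frequency to its percentage of government funding. The sketch below implements that assumed form, with a hypothetical function name and hypothetical inputs.

```python
def efficiency_index(china_entries, global_entries, funding_share_pct):
    """Assumed efficiency index for one division.

    Assumption (not confirmed by the source): the index is the
    division's percentage of the global Highly Cited Researcher
    frequency divided by its percentage of government funding.
    """
    if global_entries <= 0 or funding_share_pct <= 0:
        raise ValueError("global entries and funding share must be positive")
    # Share of global entries held by the country in this division's fields.
    hcr_share_pct = 100 * china_entries / global_entries
    return hcr_share_pct / funding_share_pct
```

Under this reading, an index above 1 would mean that a division's share of the global Highly Cited Researcher frequency exceeds its share of funding, and a division with no entries would score 0.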
China, in the past 16 years, has invested more in the Life Sciences Division (16%) and the Health Sciences Division (22.4%) combined than in any other division. The percentage of government funding invested in the two divisions (38%) is more than double the percentage allocated to the Engineering and Materials Science Division (16.8%), the division with the next highest funding percentage and the most Highly Cited Researchers. The efficiency of the Life Sciences Division (0.018), and especially of the Health Sciences Division (0.007) that branched off from it, was far too low to be practical, and the money and projects devoted to advancing high volume, highly cited research were disproportionate to the results. Very few organizations are productive at outputting highly cited research in the two divisions. For example, BGI Shenzhen is consistently responsible for the large majority of China's Highly Cited Researchers in the fields of Molecular Biology & Genetics and Biology & Biochemistry. Also, all of the Highly Cited Researchers had a primary institution that is either a top university in China, a private company, or a government research facility closely affiliated with the Chinese Academy of Sciences (Tables 1-8 and 10).
Further investigation reveals that China is attempting to encourage research output in the Life Sciences and Health Sciences Divisions through increased investment in the preexisting medical department based research system, and by promoting a new translational medical research system. In the current situation, it would benefit the country to separate the funding for researchers and physicians, as they are not involved in the same field, nor are they equivalent. Full-time physicians in most hospitals in China have neither the time nor the resources to conduct high quality research. Previous efforts, such as those described in Wang et al. (2011) and Leng (2012) [24,25], have resulted in excessive awarding of grants to full-time physicians, along with systematic misconduct due to the lack of a proper management structure [26]. Not only did this lead to inefficient and inappropriate usage of funding, it also put unnecessary pressure on full-time physicians, as well as hospital departments, which resulted in incidents of corruption and academic dishonesty. The restructuring of medical research in China has failed, yet the broken system remains unfixed due to the scale of the issue. Thus, China should focus on improving the performance and efficiency of the Life Sciences Division and the Health Sciences Division by reallocating funds to push for high quality instead of high quantity research, and by learning from the success of the Chemical Sciences Division and the Engineering and Materials Science Division (Table 10).
The primary purpose of this study is to identify the fields in which China is and is not performing well relative to expenditure, and to reveal the fields in which China should invest more heavily to obtain the greatest impact, along with the fields that should undergo organizational and institutional restructuring to optimize their research efficiency and maximize results. Although the efficiency metric serves as a general guideline of research efficiency in terms of impact per unit of expenditure, and should ideally be maximized on a per field basis, the value of the index should not be compared exactly across fields, because citations are not the only indicator of research performance. In future studies, other metrics, especially patent application data, should be included and considered as a factor in evaluating scientific advancement.
Also, although the Highly Cited database is the largest, most stable, and most comprehensive, there are other smaller rankings and metrics, such as the Google Scholar ranking and the Nature Index. Using multiple metrics to evaluate the research performance of individual fields of nations allows for more comprehensive results. However, it is very difficult to incorporate their data for comparison in this study, due to differences in methodology and scope, and doing so would make the paper unnecessarily long for its purpose. Therefore, analyzing and comparing the different databases is best addressed in a future continuation study.

Conclusion
Overall, China's performance in the 21st century is excellent, especially in the ESI fields of Chemistry, Materials Science, and Engineering. Over the previous 15 years, China became a major contributor to the global pool of highly cited research. In 2001, China had seven Highly Cited Researchers. By 2014, China had become, and as of 2017 remained, one of the top five contributors to the Highly Cited database in the world. Through strategic increases in expenditure, China produced numerous prolific research institutions. Although the growth is disproportionate across the different fields of research, the development is still significant, and demonstrates the capability for high quality, highly cited research output of the world's second most economically powerful nation as measured by GDP [13]. However, it would greatly benefit the nation to improve the efficiency, organizational and operational structure, and project management style of low efficiency divisions such as the Life Sciences Division and the Health Sciences Division by learning from the Chemical Sciences Division and the Engineering and Materials Science Division.