Fig 1.
The flow diagram has been adapted to show the combination of human and machine coding.
Table 1.
This table summarizes the performance of each classifier model, expressed as precision, recall, F1-score, and accuracy.
Fig 2.
Publications per year and word frequencies.
The top plot shows the number of publications per year for the period January 2008 to July 2023. The bottom plots show word clouds of the 500 most frequent words (left) and word pairs (right). The size of each word or word pair indicates its relative weight in the whole corpus, obtained from the TF-IDF of the abstracts lemmatized using spaCy.
Fig 3.
Proportion of publications and citations.
Top 20 journals ranked by the number of citations per title (top) and the number of papers per title (bottom). For clarity, only the top 20 journals are shown.
Fig 4.
Proportion of words per topic.
This figure shows the 10 most prevalent words associated with each topic. The topics are shown in descending order from top left to bottom right, following the ordering in the top panel of Fig 5.
Fig 5.
Proportion of papers per topic and topic trends over time.
The top figure shows the result of applying “argmax” to determine the most likely topic assignment for each paper. The “argmax” operation selects the topic with the maximum probability within a given paper abstract. The bottom figure shows the number of documents per topic over time between 2008 and mid-2023.
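The “argmax” assignment described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the `doc_topic` matrix and its values are hypothetical, standing in for the per-abstract topic probabilities produced by a topic model.

```python
import numpy as np

# Hypothetical document-topic matrix (assumption, not data from the paper):
# each row is one abstract, each column is one topic, rows sum to 1.
doc_topic = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
    [0.25, 0.60, 0.15],
    [0.55, 0.30, 0.15],
])

# argmax over the topic axis picks the most probable topic for each abstract.
dominant_topic = doc_topic.argmax(axis=1)

# Count how many abstracts fall into each topic, then convert to proportions.
papers_per_topic = np.bincount(dominant_topic, minlength=doc_topic.shape[1])
proportion_per_topic = papers_per_topic / papers_per_topic.sum()

print(dominant_topic)
print(proportion_per_topic)
```

Each abstract is counted once, under its single most probable topic, so the proportions sum to 1 even when an abstract has substantial weight on a second topic.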
Fig 6.
This figure shows the condensed topic summaries for the abstracts falling within a given topic category.
Fig 7.
Each dot corresponds to a paper in a two-dimensional space. Colors indicate the different topics.
Fig 8.
The heatmap shows the Spearman correlation among the ten topics. The heatmap is generated using seaborn; the color and the annotated number within each cell indicate the strength of the correlation [65].
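A topic-by-topic Spearman correlation matrix of this kind could be computed as in the sketch below. It assumes the correlation is taken across per-document topic probabilities, which the caption does not state. The `doc_topic` data are randomly generated placeholders, and `spearman_matrix` is a hypothetical helper that assumes no tied values within a column.

```python
import numpy as np

def spearman_matrix(values):
    """Spearman correlation between columns: Pearson correlation of ranks.

    Assumes no ties within a column (double argsort gives arbitrary order
    to tied values); scipy.stats.spearmanr handles ties properly.
    """
    ranks = values.argsort(axis=0).argsort(axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

# Placeholder document-topic probabilities (assumption): 200 abstracts, 10 topics.
rng = np.random.default_rng(0)
doc_topic = rng.dirichlet(np.ones(10), size=200)

corr = spearman_matrix(doc_topic)  # shape (10, 10), symmetric, ones on the diagonal

# The matrix could then be rendered with, for example,
# seaborn.heatmap(corr, annot=True) to annotate each cell with its coefficient.
```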
Fig 9.
Geographical distribution of studies.
The figure shows the geographical distribution of studies based on ISO country codes.
Fig 10.
Association of countries and clusters.
The figure shows the number of papers per country for each of the more prevalent topics. The darker the color, the higher the number of papers.