Table 1.
Statistics of citation networks of scientific publications in Web of Science.
We consider three scientific fields and the entire Web of Science. See text for the definitions of the statistics and the details of the data collection procedure.
Table 2.
Graph partitioning and community detection methods.
We consider a large number of methods divided into different classes. See text for the details of the method implementations and parameter settings.
Fig 1.
Pair-wise distances between the clusterings obtained by the considered methods.
Panel A shows the heatmaps of clustering distances for the Scientometrics citation network, where the methods are clustered into 5 and 11 classes (left- and right-hand side, respectively). Note that the method classes determine only the ordering of the rows and columns. Insets on the right show the method silhouette coefficients. Panel B shows the same for the Library & Information Science citation network. See Methods for the definition of the clustering distance and text for the details of the method clustering procedure.
Fig 2.
Size distributions of the clusterings obtained by representative methods.
Panels A and B show cluster size distributions P(s) for the Library & Information Science and Physics citation networks, respectively. Wherever plausible, power laws $s^{-\gamma}$ are fitted to the tails of the distributions by maximum likelihood estimation, $\gamma = 1 + n \left( \sum_i \log \frac{s_i}{s_{\min}} \right)^{-1}$ for $s_{\min} > 1$.
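The maximum likelihood estimate quoted in the Fig 2 caption can be sketched in a few lines. This is only an illustration of the stated formula, not the authors' fitting code: the function name `fit_power_law_exponent` is hypothetical, and it assumes that n counts only the clusters in the fitted tail (sizes of at least s_min), which the caption does not state explicitly.

```python
import math


def fit_power_law_exponent(sizes, s_min):
    """Estimate gamma = 1 + n / sum_i log(s_i / s_min) over the tail.

    Assumption (not stated in the caption): only clusters with
    size >= s_min enter the sum, and n is the number of such clusters.
    """
    tail = [s for s in sizes if s >= s_min]
    n = len(tail)
    log_sum = sum(math.log(s / s_min) for s in tail)
    if n == 0 or log_sum == 0.0:
        # No tail, or every tail value equals s_min: the estimate is undefined.
        raise ValueError("cannot estimate the exponent from this tail")
    return 1.0 + n / log_sum


# Hypothetical cluster sizes; the cluster of size 1 lies below s_min and is ignored.
gamma = fit_power_law_exponent([1, 2, 4, 8], s_min=2)
print(round(gamma, 3))
```

The choice of s_min strongly affects the estimate, so in practice it would be selected per distribution rather than fixed in advance.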
Table 3.
Structural statistics of the clusterings obtained by representative methods.
The methods are applied to the Library & Information Science citation network. See Methods for the definitions of the statistics and text for the interpretation.
Fig 3.
Robustness of the clusterings obtained by representative methods.
Panels A and B show clustering robustness plots V(α) for the Scientometrics and Library & Information Science citation networks, respectively. These show the distances between the clusterings obtained after randomly rewiring α links. See Methods for the definitions of clustering distance and robustness.
Fig 4.
Degeneracy of the clusterings obtained by representative methods.
Panels A and B show clustering degeneracy diagrams D for the Library & Information Science and Physics citation networks, respectively. These display the non-degenerate ranges of the clusterings, while the percentages show the fraction of nodes in tiny clusters $\sum_{s_i < s_{\mathrm{tiny}}} s_i / n$ and in the largest cluster $s_L / n$ (left- and right-hand side, respectively). See text for the definition of clustering degeneracy.
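The two percentages reported in the Fig 4 caption follow directly from the list of cluster sizes. The sketch below is a minimal illustration with a hypothetical helper name, `degeneracy_fractions`. It assumes that clusters do not overlap and that every node belongs to a cluster, so that n equals the sum of the cluster sizes.

```python
def degeneracy_fractions(sizes, s_tiny):
    """Return (fraction of nodes in tiny clusters, fraction in largest cluster).

    Assumption: n is taken as the sum of all cluster sizes, i.e. the
    clustering is a partition of the network's nodes.
    """
    n = sum(sizes)
    if n == 0:
        raise ValueError("clustering contains no nodes")
    tiny_fraction = sum(s for s in sizes if s < s_tiny) / n
    largest_fraction = max(sizes) / n
    return tiny_fraction, largest_fraction


# Hypothetical cluster sizes, with the threshold s_tiny = 15 used in Fig 6.
tiny, largest = degeneracy_fractions([1, 2, 3, 14, 80], s_tiny=15)
print(f"tiny: {tiny:.0%}, largest: {largest:.0%}")
```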
Table 4.
Bibliometric statistics of the clusterings obtained by representative methods.
The methods are applied to the Library & Information Science citation network. See Methods for the definitions of the statistics and text for the interpretation.
Table 5.
Statistics of the clusterings obtained by the map equation methods Metimap and Infomap.
The methods are applied to the Library & Information Science citation network and the largest scientometric clusters with s ≥ 50 are shown. See Fig 5 for a comparison of the clusterings and text for the interpretation.
Fig 5.
Alluvial diagram of the clusterings obtained by the map equation methods Metimap and Infomap.
The diagram shows the overlap between the largest scientometric clusters returned by Metimap and Infomap on the Library & Information Science citation network (left and right, respectively). ‘Remaining publications’ are included in one of the clusters in the Metimap (Infomap) clustering but not included in any of the clusters in the Infomap (Metimap) clustering. See Table 5 for details of the clusterings.
Table 6.
Bibliometric statistics of the clusterings obtained by selected methods.
The methods are applied to the Physics citation network and bibliometric statistics of the clusterings with and without post-processing are shown. See Methods for the definitions of the statistics and the details of the clustering post-processing approach.
Fig 6.
Size distributions and degeneracy of the clusterings obtained by the selected methods.
The methods with and without post-processing are applied to the Physics citation network. Panels A and B show cluster size distributions P(s) and clustering degeneracy diagrams D, respectively. Vertical lines in panel A represent the threshold size $s_{\mathrm{tiny}} = 15$. See text for the definition of clustering degeneracy and Methods for the details of the clustering post-processing approach.
Table 7.
Statistics of the clusterings obtained by the selected methods.
The methods are applied to the All Fields citation network and different statistics of the clusterings with and without post-processing are shown. See Methods for the definitions of the statistics and the details of the clustering post-processing approach.
Fig 7.
Sizes and coverage of the largest clusters obtained by the selected methods.
The methods with and without post-processing are applied to the All Fields citation network. Panels A and B show the sizes s and coverage K/k of the largest 50 clusters, respectively. Horizontal lines in panel A represent the threshold size $s_{\mathrm{giant}} = 10^4$. See text for the definition of cluster coverage.