Fig 1.
Literature of Matrix factorization and MOO.
The figure shows the co-occurrences of keywords in 497 publications retrieved from Scopus. The search expressions/keywords were matrix factorization and ranking, multi-objective optimization, decision support and multi-objective decision-making. Unique keywords, denoted by the nodes, are connected if they co-occur at least four times. Only publications up to the end of December 2021 are included.
Fig 2.
Flowchart of the non-negative matrix factorization-based many-objective optimization method.
The workflow takes as input a dataset containing N solutions (x) and n objectives. The sparse NMF reduces the number of objectives to p (p ≪ n) and yields information on both the solutions and the objectives. The objectives are compared, and the Pareto front is determined for the solutions in the low-rank representation of the objective space (the reduced representation of x is denoted by w). A solution’s rank is established in the lower-dimensional space as r(w). The Pareto front is evaluated using entropy and hypervolume (e.g., high saturation and a large distance from the ideal front may indicate that the Pareto front is underperforming). The analysis workflow also aims to uncover the relationships between the objectives: the information obtained from the sparse NMF is verified by the Sum of Ranking Differences.
Fig 3.
Pareto fronts can be convex (A), concave (B), or a mixture of both. Additionally, the set of solutions can have multiple Pareto fronts, which can be identified by ignoring previously determined non-dominated solutions and repeating the search for Pareto fronts. Consider an onion: peeling off the first layer reveals the second, which then becomes the outermost layer. The Pareto front can also be clustered [34, 35].
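The repeated search for Pareto fronts described in this caption can be sketched as follows. This is a minimal illustration that assumes all objectives are maximized; the function names `dominates` and `pareto_layers` and the sample points are hypothetical and do not come from the original work.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: every objective is to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_layers(points):
    """Peel off successive Pareto fronts; each front is a list of indices into points."""
    remaining = list(range(len(points)))
    layers = []
    while remaining:
        # A point belongs to the current front if no other remaining point dominates it.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        layers.append(front)
        # Ignore the solutions already assigned and repeat the search.
        remaining = [i for i in remaining if i not in front]
    return layers


# Illustrative data: five solutions evaluated on two objectives.
points = [(4, 1), (3, 3), (1, 4), (2, 2), (1, 1)]
layers = pareto_layers(points)
print(layers)
```

The index of the layer in which a solution appears can serve as its rank, with the first layer holding the non-dominated solutions.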
Fig 4.
Graphical illustration of PCA(A), NMF(B), and sparse NMF(C).
The matrix factorizations reduce the data from three to two dimensions. PCA covers the solutions with a hyperplane, NMF constrains them between two vectors, and sparse NMF forces the solutions into the vicinity of the vectors.
Fig 5.
Illustration of non-negative matrix factorization in two dimensions.
When W (N×p) is multiplied by H (p×n), an approximation of X (N×n) is obtained. Moreover, the technique has inherent clustering properties, with which the objectives are automatically clustered in W, as are the solutions and the Pareto front. h_k denotes the coefficients of the objectives in relation to the components, which define the relationships between the objectives.
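The factorization in this caption can be illustrated with a short NumPy sketch. It uses the classical multiplicative update rules for plain NMF under a Frobenius-norm loss, which is an assumption made here for illustration; it is not the sparse variant and not necessarily the implementation used in the original work. The names `nmf`, `n_iter`, `eps`, `seed`, and `X_hat`, as well as the example data, are hypothetical.

```python
import numpy as np


def nmf(X, p, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative X (N x n) by W (N x p) @ H (p x n).

    Assumption: plain multiplicative updates, no sparsity penalty.
    """
    rng = np.random.default_rng(seed)
    N, n = X.shape
    W = rng.random((N, p)) + eps
    H = rng.random((p, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Illustrative data: 6 solutions, 4 objectives, built from two hidden components.
W0 = np.array([[1, 0], [2, 0], [0, 1], [0, 3], [1, 1], [2, 1]], dtype=float)
H0 = np.array([[5.0, 4.0, 0.5, 0.2], [0.3, 0.6, 4.0, 6.0]])
X = W0 @ H0

W, H = nmf(X, p=2)
X_hat = W @ H  # low-rank approximation of X
print(np.linalg.norm(X - X_hat))
```

Each row of `H` holds the coefficients of the objectives for one component, and each row of `W` gives the coordinates of one solution in the reduced space.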
Fig 6.
The hypervolume indicator measures the volume between the Pareto front and the chosen/ideal reference. The hypervolume is approximated by a Monte Carlo simulation because this is fast. The red dashed line illustrates the Pareto front, while the grey area is the calculated hypervolume. If both the reference and the Pareto front consist of maximum values, then a hypervolume close to zero is preferable. The illustration shows the calculation of the hypervolume in 2D space.
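One way to read this caption is sketched below: points are sampled uniformly in the box between the origin and an ideal reference point, and the share of samples not covered by any member of the Pareto front estimates the volume between the front and the reference. The sampling box, the maximization setting, and the name `mc_gap_hypervolume` are assumptions made for illustration only.

```python
import random


def mc_gap_hypervolume(front, ideal, n_samples=20000, seed=0):
    """Monte Carlo estimate of the volume between a Pareto front and an ideal point.

    Assumptions: objectives are maximized, all values are non-negative,
    and the sampling box spans from the origin to `ideal`.
    """
    rng = random.Random(seed)
    box_volume = 1.0
    for upper in ideal:
        box_volume *= upper
    uncovered = 0
    for _ in range(n_samples):
        z = [rng.uniform(0.0, upper) for upper in ideal]
        # z is covered if some front member is at least as good in every objective.
        covered = any(all(f_k >= z_k for f_k, z_k in zip(f, z)) for f in front)
        if not covered:
            uncovered += 1
    return box_volume * uncovered / n_samples


# Illustrative 2D front; the exact gap for this front is 0.25.
front = [(1.0, 0.5), (0.5, 1.0)]
estimate = mc_gap_hypervolume(front, ideal=(1.0, 1.0))
print(estimate)
```

Under this reading, a front that reaches the ideal point yields an estimate of zero, which matches the statement that a value close to zero is preferable.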
Fig 7.
Pareto Front of the two objectives with the highest entropy.
The Pareto fronts only consider those non-dominated solutions that are optimal from the viewpoints of the ‘total number of open-access publications’ (O2) and the ‘number of green access publications’ (O6). Coincidentally, the chosen objectives are one of the most strongly correlated pairs. The other 44 objectives have no input to the ranking. Should another two dominant objectives be taken into consideration, different non-dominated solutions would be obtained.
Fig 8.
In the first PC, objectives that focus on quantity are dominant, while the second PC favors a high proportion of publications that perform well in terms of scientific indicators, open access, and collaborations. Rockefeller University is a small private university but has outstanding proportional objective values compared to other, more robust universities. At the other end is Harvard University, where resources are available to produce an immense number of publications. Note that only universities from the Anglosphere are on the first two Pareto fronts.
Fig 9.
Relationship between objectives by PCA.
The biplot depicts both scores and loadings; however, the scores are scaled differently in Fig 8. The axes split the objectives; the vertical axis distinguishes between the effects of the objectives that either impair or benefit the solutions, in this case, universities. In the fourth quadrant, the objectives are quantity indicators, while in the other quadrants, the proportional indicators can be identified. Six clusters can be observed: three gender proportional indicators, the proportional variables of collaboration and open-access indicators, the proportional variables of scientific indicators, the quantity indicators, and the two drawback clusters. There is an outlier objective (S8). When a university improves its indicators in the drawback clusters, its performance decreases, moving it farther from the Pareto front.
Fig 10.
Representation of the objectives by SRD.
The SRD range between 0 and 8 is empty, i.e., the total number of authorships (G1) is close to the ideal reference. The range between the lines XX1 (5%) and XX19 (95%) shows the random ranking. Objectives to the right-hand side of the line of XX19 are ranked in reverse order (G7-G9, O13). The solid black line on the second vertical axis indicates the cumulative random frequencies of SRD vectors (CrFr% of SRD(RndV)).
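The basic SRD value behind such a plot can be sketched as follows: the solutions are ranked by one objective and by a reference, and the absolute rank differences are summed. This is a simplified illustration; the choice of reference, the average-rank handling of ties, and the names `ranks`, `srd`, and `max_srd` are assumptions, and the comparison with random rankings shown in the figure is not included.

```python
def ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def srd(column, reference):
    """Sum of absolute differences between the ranks of a column and of the reference."""
    return sum(abs(a - b) for a, b in zip(ranks(column), ranks(reference)))


def max_srd(n):
    """Largest possible SRD for n tie-free solutions (reached by a fully reversed ranking)."""
    return (n * n) // 2


# Illustrative data: four solutions, one reference and two objectives.
reference = [1.0, 2.0, 3.0, 4.0]
same_order = [10.0, 20.0, 30.0, 40.0]
reverse_order = [40.0, 30.0, 20.0, 10.0]
print(srd(same_order, reference), srd(reverse_order, reference), max_srd(4))
```

Dividing an SRD value by `max_srd(n)` gives a normalized score, so that objectives ranked in reverse order relative to the reference approach 100%.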
Table 1.
The worst objectives, in terms of positive ranking, are proportional indicators: short-distance collaborations (C10); male (G7) and female (G8) authorships in proportion to the total number of authorships; and female authorships (G9) in proportion to the authors who disclosed their gender.
Fig 11.
NMF-based Pareto front with scaled data as an input.
The indicators in the components are positively correlated with each other. Therefore, there are no reverse objectives, only perpendicular ones. For this reason, the Pareto front is determined by maximizing the objective functions. The NMF is applied to the scaled data before the Pareto front ranks the institutions. This method clusters the institutions into Eastern and Western cultures with respect to education. The Chinese and Taiwanese universities are placed into a cluster on the left-hand side, while the European, American, and the remaining academies can be found on the right-hand side. South Korean and Iranian universities connect the two dominant scientific circles.
Fig 12.
Coefficients of the ranking by NMF with ranked data as an input.
The non-negativity constraint forces the objectives into the positive quarter of the Cartesian coordinate system. By providing ranked input to NMF, an insight was gained into the relationship between the objectives. The ones close to the axes are almost identical to the SRD result of best/worst objectives, but the randomly located ones are in between. Furthermore, clusters of indicators appear as well. Cluster A) describes the alignment of proportional collaboration variables (C9–10).
Fig 13.
Heatmap of the coefficients in sparse NMF.
Sparse NMF forces the coefficients of the objectives to take on the form of basis vectors and sorts them into quantity (H1) and proportional (H2) objectives.
Fig 14.
Pareto front of the sparse NMF.
The parameters α = 0.4 and β = 0.4 are selected so that a fair number of members are included in the Pareto front and most of the objectives are incorporated into a component.
Fig 15.
Relative entropies of the methods.
The smaller the range of the y-axis (solutions with ranks), the larger the range of the x-axis (ranks), and vice versa. Data with high REs have more ranks and fewer solutions with the same rank. The histograms represent the saturation of the Pareto fronts. From the viewpoint of entropy, PCA and sparse NMF perform more efficiently than NMF. As the evaluation based on PCA consists of more ranks, the evaluation of the sparse NMF clarifies a more specific set of optimal solutions, which SRD validates. Furthermore, bimodal distributions can be observed (or suspected), except in the top left figure.
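The link between the number of ranks and the relative entropy can be illustrated with a small sketch. It assumes that the relative entropy is the Shannon entropy of the rank histogram divided by its maximum, log(N), which is reached when every one of the N solutions has its own rank; this normalization and the name `relative_entropy` are assumptions, not definitions taken from the original work.

```python
import math
from collections import Counter


def relative_entropy(rank_labels):
    """Shannon entropy of the rank histogram, scaled to the range [0, 1].

    Assumption: the maximum entropy is log(N) for N solutions.
    """
    n = len(rank_labels)
    if n < 2:
        return 0.0
    counts = Counter(rank_labels)
    # Each rank contributes p * log(1 / p), where p is the share of solutions with that rank.
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    return entropy / math.log(n)


# Illustrative rankings of eight solutions.
saturated = [1, 1, 1, 1, 1, 1, 2, 2]   # few ranks, many solutions per rank
spread = [1, 2, 3, 4, 5, 6, 7, 8]      # every solution has its own rank
print(relative_entropy(saturated), relative_entropy(spread))
```

A saturated front, where many solutions share the first rank, gives a low value, while a ranking that separates the solutions gives a value close to one.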
Table 2.
The ideal solution has a high relative entropy but minimal hypervolume, as indicated by the arrows: (↑) means that a higher value is generally better, and (↓) means that a lower value is preferable. The aim of dimensionality reduction is to compress the high number of objectives while retaining as much information for ranking as possible. This results in a solution similar to that of the highest-entropy pair, but it takes all other objectives into consideration.