
The authors have declared that no competing interests exist.

Conceived and designed the experiments: TQ ZKZ GC. Performed the experiments: TQ GC. Analyzed the data: TQ ZKZ. Wrote the paper: TQ ZKZ.

Finding a universal description of algorithm optimization is one of the key challenges in personalized recommendation. In this article, we introduce, for the first time, a scaling-based algorithm (SCL) that is independent of the recommendation list length, built on a hybrid algorithm of heat conduction and mass diffusion, by finding the scaling function relating the tunable parameter to the object average degree. The optimal value of the tunable parameter can be extracted from the scaling function and is heterogeneous across individual objects. Experimental results obtained from three real datasets,

Favored by ever-increasing information, people can enjoy an abundant life. However, they are also caught in a quandary of deciding what they actually prefer: for example, how to select a satisfactory dress from numerous brands, or pick an interesting book out of a sea of titles. As a powerful tool, the recommendation engine has emerged to help people out of the information overload

A great many algorithms have been proposed and have led to considerable progress, such as the collaborative filtering (CF) algorithms

Among these numerous physical-concept-based recommendation algorithms, a representative work is a hybrid algorithm of heat conduction and mass diffusion (HHP)

However, for a number of different algorithms, performance is usually controlled by some ‘tunable parameter’. A challenge these algorithms share is how to find the optimal value of that parameter. So far, most algorithms adopt a one-evaluator parameter selection, namely, choosing the optimal value of the tunable parameter according to the recommendation performance under a single evaluator

Motivated by the explicit dilemma of choosing a proper reference for algorithm optimization, in the present paper we introduce, for the first time, a scaling-based algorithm (SCL) independent of the recommendation list length, based on the hybrid method of heat conduction and mass diffusion (HHP). By testing our algorithm on three real datasets,

A single curve independent of the recommendation list length is obtained by rescaling the tunable parameter and the object average degree, and we describe it by a scaling function. The optimal value of the tunable parameter can be extracted from the scaling function and is heterogeneous across individual objects.

The present algorithm achieves high recommendation accuracy. More importantly, it greatly improves personalized recommendation in three other challenging aspects: resolving the accuracy-diversity dilemma, delivering high novelty, and alleviating the cold start problem.

The remainder of this paper is organized as follows. In the next section, we describe the bipartite network and the investigated recommendation algorithms. Popular indicators for evaluating recommendation performance are introduced in the metrics section, followed by a description of the datasets in the data section. In the results and discussion section, we then compare the present algorithm against a highly accurate mass-diffusion algorithm, the original hybrid method that is both highly accurate and diverse, and an improved version of the hybrid method that well resolves the cold start problem. Finally, we conclude.

A recommendation system can be described by a bipartite network composed of a user set and an object set. The user set includes

In the following algorithms, a so-called “resource” is introduced to objects. At first, objects are assigned an initial resource

The mass-diffusion-based algorithm, i.e., the algorithm based on the probability spreading (PBS) process, is reported to be a highly accurate method
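As an illustration of the resource reallocation underlying the PBS, the following sketch builds the standard mass-diffusion transfer matrix on a toy bipartite network; the adjacency matrix and the target user are illustrative, not taken from the paper's datasets:

```python
import numpy as np

# Toy user-object bipartite network: a[l, alpha] = 1 if user l
# has collected object alpha (illustrative values).
a = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]], dtype=float)

k_user = a.sum(axis=1)   # user degrees k_l
k_obj = a.sum(axis=0)    # object degrees k_alpha

# PBS transfer matrix:
#   W[alpha, beta] = (1 / k_beta) * sum_l a[l, alpha] * a[l, beta] / k_l
W = a.T @ (a / k_user[:, None]) / k_obj[None, :]

# Initial resource: one unit on each object collected by the target user.
f = a[0].copy()
f_final = W @ f   # redistributed resource = recommendation scores
```

Because the columns of W sum to one, the total resource is conserved by the diffusion step, which is the defining property of the mass-diffusion process.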

(a) for the PBS method, and (b) for the HTS method.

By incorporating a heat-conduction-like process, the heat conduction (HTS) method has been proposed, with an illustration of how resources are reallocated shown in

To achieve a high accuracy and diversity of recommendation, a hybrid method (HHP) is proposed
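The hybrid step can be sketched as follows, using the standard HHP transfer matrix with tunable parameter λ (here `lam`); the toy adjacency matrix is illustrative. Setting λ = 1 recovers the pure mass diffusion (PBS) and λ = 0 the pure heat conduction (HTS):

```python
import numpy as np

def hhp_scores(a, lam, user):
    """Hybrid heat-conduction/mass-diffusion (HHP) resource reallocation.

    Following the standard HHP formulation, the transfer matrix is
      W[alpha, beta] = 1 / (k_alpha**(1 - lam) * k_beta**lam)
                       * sum_l a[l, alpha] * a[l, beta] / k_l,
    so lam = 1 recovers the PBS and lam = 0 recovers the HTS.
    """
    k_user = a.sum(axis=1)
    k_obj = a.sum(axis=0)
    overlap = a.T @ (a / k_user[:, None])   # sum_l a[l,alpha] a[l,beta] / k_l
    W = overlap / (k_obj[:, None] ** (1.0 - lam) * k_obj[None, :] ** lam)
    return W @ a[user]   # resource spread from the user's collected objects

# Toy user-object adjacency matrix (illustrative values).
a = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)

scores = hhp_scores(a, lam=0.5, user=0)
```

In practice objects already collected by the target user are excluded before ranking; the sketch omits that filtering step for brevity.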

Based on the HHP method, an improved object-oriented hybrid method (OHHP) is proposed

The question common to most algorithms is how to find the optimal value of the tunable parameter. For example, the optimal value obtained by using the ranking score as the reference usually differs from that obtained by using the diversity as the reference. Moreover, the diversity performance varies with the recommendation list length. We show the tunable parameter

The black, red, green, blue and dark yellow lines are for the recommendation list lengths of

In order to obtain an

Inspired by the above theoretical analysis, we propose the Scaling-based (SCL) algorithm, making use of the formula in

The black, red, green, blue and dark yellow lines are for the recommendation list lengths of

Make the polynomial fit

Having the coefficients

Having the optimal value of the tunable parameter
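The fitting procedure described above can be sketched with a standard least-squares polynomial fit; the data points below are purely illustrative placeholders for the rescaled (object average degree, optimal parameter) pairs, and the polynomial degree is an assumption:

```python
import numpy as np

# Hypothetical rescaled data points (illustrative values only): x is the
# rescaled object average degree, y the corresponding optimal value of
# the tunable parameter read off the collapsed single curve.
x = np.array([0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3])
y = np.array([0.82, 0.74, 0.69, 0.66, 0.64, 0.63, 0.62])

# Polynomial fit of the scaling function.
coeffs = np.polyfit(x, y, deg=3)
scaling = np.poly1d(coeffs)

# With the fitted coefficients, an object-specific optimal parameter is
# obtained by evaluating the scaling function at that object's rescaled degree.
lam_opt = float(scaling(0.4))
```

The fitted curve then replaces any single-evaluator search: each object's optimal parameter is read directly off the scaling function at its own rescaled degree.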

Recommendation accuracy is without doubt one of the most important indicators for evaluating the performance of an algorithm. As adjuncts to accuracy, recommendation diversity and novelty are recognized as important evaluators of personalized recommendation. In our study, we take the ranking score, precision and recall to quantify recommendation accuracy, the object average degree to quantify novelty, and the inter-diversity and inner-diversity to quantify recommendation diversity. Moreover, to specifically investigate the recommendation accuracy of cold objects, we further study an object-degree-dependent ranking score, precision, and recall.

The ranking score
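Using the definition of the ranking score that is standard in this literature (the position of each deleted object in the user's descending-score list of uncollected objects, divided by the length of that list, averaged over deleted links), a minimal per-user sketch is given below; the scores and indices are illustrative:

```python
import numpy as np

def ranking_score(scores, collected, test_items):
    """Mean ranking score <r> for one user: the position of each deleted
    (test) object in the descending-score list of uncollected objects,
    divided by the number of uncollected objects.  Smaller is better."""
    candidates = [o for o in range(len(scores)) if o not in collected]
    order = sorted(candidates, key=lambda o: -scores[o])
    n = len(order)
    return sum((order.index(o) + 1) / n for o in test_items) / len(test_items)

scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2])
# Object 3 ranks first among the 4 uncollected objects, so <r> = 1/4.
r = ranking_score(scores, collected={1}, test_items=[3])
```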

To focus on the recommendation accuracy of cold objects, we define an object-degree dependent ranking score

The recommendation precision P_{i}

Similarly, to better understand the recommendation accuracy of the cold objects, we define an object-degree dependent precision by,

The recall R_{i}

The object-degree dependent recall is analogously defined as,
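The precision and recall metrics just described can be sketched for a single user as follows, with d_i the number of the user's deleted objects recovered in a length-L list and D_i the number of deleted objects; the recommendation list and test items below are illustrative:

```python
def precision_recall(recommended, test_items, L):
    """Precision d_i / L and recall d_i / D_i for one user, where d_i is
    the number of the user's deleted (test) objects that appear in the
    length-L recommendation list and D_i is the number of test objects."""
    hits = len(set(recommended[:L]) & set(test_items))
    return hits / L, hits / len(test_items)

# Two of the three test objects appear in the length-5 list.
p, r = precision_recall([7, 3, 9, 1, 5], test_items=[3, 5, 8], L=5)
```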

The average degree of objects in the recommendation list is widely used to identify the novelty of a recommender system, which is defined by,
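A minimal sketch of this novelty measure, with illustrative object degrees:

```python
import numpy as np

# Illustrative object degrees in the training set.
k_obj = np.array([40, 3, 17, 250, 8, 92])

def mean_degree(rec_list, k_obj):
    """Average degree <k> of the objects in a recommendation list; a
    smaller <k> means less popular, i.e. more novel, recommendations."""
    return float(np.mean(k_obj[rec_list]))

novelty = mean_degree([1, 4, 2], k_obj)  # (3 + 8 + 17) / 3
```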

We test the algorithm performance on three datasets, the

We divide each dataset into two subsets: a training set and a test set. We randomly delete
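Such a random division of the user-object links can be sketched as follows; the 10% test fraction and the fixed seed are assumptions for illustration:

```python
import random

def split_links(links, p_test=0.1, seed=0):
    """Randomly move a fraction p_test of the user-object links into the
    test set; the remaining links form the training set."""
    rng = random.Random(seed)
    links = list(links)
    rng.shuffle(links)
    n_test = int(round(p_test * len(links)))
    return links[n_test:], links[:n_test]   # (training, test)

# Illustrative toy link list: 5 users x 4 objects, 20 links in total.
links = [(u, o) for u in range(5) for o in range(4)]
train, test = split_links(links, p_test=0.1)
```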

To provide a solid investigation of the performance of the SCL algorithm, we compare the SCL with three representative, high-performing algorithms: the PBS, the HHP, and the OHHP. The PBS is highly accurate, the HHP well resolves the great challenge of the accuracy-diversity dilemma, and the OHHP further outperforms the HHP in resolving the cold start problem. A summary of the performance of the PBS, the HHP, the OHHP and the SCL is presented in

| Dataset | Algorithm | ⟨r⟩ | ⟨r⟩ (k ≤ 10) | Precision | Precision (k ≤ 10) | Recall | Recall (k ≤ 10) | ⟨k⟩ | Inter-diversity | Inner-diversity |
|---|---|---|---|---|---|---|---|---|---|---|
| 1st | PBS | 0.051 | 0.484 | 0.054 | 0.0000 | 0.420 | 0.0003 | 2336.0 | 0.637 | 0.423 |
| 1st | HHP | 0.045 | 0.417 | 0.062 | 0.0006 | 0.470 | 0.0176 | 1843.7 | 0.720 | 0.672 |
| 1st | OHHP | 0.044 | 0.350 | 0.058 | 0.0009 | 0.437 | 0.0255 | 2048.3 | 0.691 | 0.575 |
| 1st | SCL | 0.046 | | 0.060 | | 0.426 | | | | |
| 2nd | PBS | 0.105 | 0.562 | 0.074 | 0.0000 | 0.477 | 0.0000 | 233.5 | 0.645 | 0.616 |
| 2nd | HHP | 0.083 | 0.408 | 0.085 | 0.0011 | 0.527 | 0.0441 | 157.2 | 0.717 | 0.839 |
| 2nd | OHHP | 0.083 | 0.364 | 0.084 | 0.0015 | 0.528 | 0.0527 | 170.6 | 0.707 | 0.818 |
| 2nd | SCL | 0.087 | | 0.080 | | 0.469 | | | | |
| 3rd | PBS | 0.069 | 0.480 | 0.042 | 0.0002 | 0.497 | 0.0080 | 465.7 | 0.829 | 0.874 |
| 3rd | HHP | 0.048 | 0.250 | 0.050 | 0.0024 | 0.557 | 0.0924 | 329.7 | 0.850 | 0.940 |
| 3rd | OHHP | 0.050 | 0.189 | 0.047 | 0.0048 | 0.542 | 0.1578 | 374.8 | 0.849 | 0.919 |
| 3rd | SCL | 0.050 | | 0.048 | | 0.539 | | | | |

To quantify how much the SCL outperforms the other three algorithms, we define an improvement percentage
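One natural form of such a percentage is the relative change δ = (m_SCL − m_ref) / m_ref, which reproduces, e.g., the −3.2% precision entry relative to the HHP on the first dataset; this exact form is an assumption reconstructed from the reported values:

```python
def improvement(m_scl, m_ref):
    """Relative change of an SCL metric with respect to a reference
    algorithm: delta = (m_SCL - m_ref) / m_ref.  Whether a negative
    delta is a loss depends on the metric (e.g. a smaller ranking
    score is better)."""
    return (m_scl - m_ref) / m_ref

# Precision of the SCL (0.060) vs the HHP (0.062) on the first dataset.
delta = improvement(0.060, 0.062)  # about -3.2%
```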

| Dataset | Reference | ⟨r⟩ | ⟨r⟩ (k ≤ 10) | Precision | Precision (k ≤ 10) | Recall | Recall (k ≤ 10) | ⟨k⟩ | Inter-diversity | Inner-diversity |
|---|---|---|---|---|---|---|---|---|---|---|
| 1st | vs. PBS | | | | | | | | | |
| 1st | vs. HHP | −2.2% | | −3.2% | | −9.4% | | | | |
| 1st | vs. OHHP | −4.5% | | | −2.0% | −2.5% | | | | |
| 2nd | vs. PBS | | | | | −1.7% | | | | |
| 2nd | vs. HHP | −4.8% | | −5.9% | | −11.0% | | | | |
| 2nd | vs. OHHP | −4.8% | | −4.8% | | −11.2% | | | | |
| 3rd | vs. PBS | | | | | | | | | |
| 3rd | vs. HHP | −4.2% | | −4.0% | | −3.2% | | | | |
| 3rd | vs. OHHP | 0.0% | | | | −0.6% | | | | |

From

For recommendation accuracy, we focus on the overall accuracy and the accuracy for cold objects. Compared with the highly accurate PBS method, the SCL outperforms the PBS on almost all metrics. Taking the

The HHP is excellent in both accuracy and diversity at the optimal value of the tunable parameter. Compared with the HHP at the optimal parameter value evaluated by the ranking score, the SCL presents a slightly lower overall recommendation accuracy, but a much greater advantage in the recommendation accuracy of cold objects. Moreover, the SCL outperforms the HHP in the novelty

The OHHP method has been reported to be more advantageous in the cold start problem than the HHP. Compared with the OHHP at the optimal value of the tunable parameter defined by the ranking score, the SCL method further improves the recommendation accuracy of the cold objects. Also, the SCL outperforms the OHHP in the novelty, the inter-diversity and the inner-diversity for all the three datasets.

The cold start problem is a long-standing challenge in traditional recommendation systems, since it is difficult for users to become aware of cold objects owing to the lack of sufficient auxiliary information

To further understand the cold start efficiency of the four algorithms, we investigate the object-degree-dependent ranking score

The black, red, green and blue lines are for the HHP, PBS, OHHP and SCL methods, respectively.

We then study the degree distribution

The black, red, green and blue lines are for the HHP, PBS, OHHP and SCL methods, respectively.

Besides the cold start problem, diversity and novelty are also significant marks of the vitality of personalized recommendation. Recommendation accuracy and diversity have been recognized as a dilemma pair, as have accuracy and novelty. Typical examples are the PBS and HTS algorithms: the PBS is more accurate but less diverse and novel, whereas the HTS is more diverse and novel but less accurate.

Intuitively, improving the recommendation accuracy of cold objects should simultaneously raise the recommendation novelty and diversity. However, comparing the OHHP with the original HHP, we find that the HHP outperforms the OHHP in novelty, inter-diversity and inner-diversity on all three datasets, even though the OHHP greatly improves the recommendation accuracy of cold objects. To better understand this phenomenon, we show the optimal value of the tunable parameter versus the object average degree for the OHHP and the SCL in

The black and red lines are for the SCL and OHHP methods, respectively.

To show how the novelty evolves with the recommendation list length, we then study the novelty

The black, red, green and blue lines are for the HHP, PBS, OHHP and SCL methods, respectively.

Further investigation of the inter-diversity

The black, red, green and blue lines are for the HHP, PBS, OHHP and SCL methods, respectively.

A similar advantage of the SCL is also found for the inner-diversity

The black, red, green and blue lines are for the HHP, PBS, OHHP and SCL methods, respectively.

Taken together, rather than searching for the optimal value of the tunable parameter according to any particular evaluator, the SCL extracts it from the scaling function, and thereby remarkably outperforms the PBS, the HHP, and the OHHP in the recommendation accuracy of cold objects, as well as in recommendation novelty and diversity, while keeping a high overall recommendation accuracy.

In conclusion, we have proposed a scaling-based (SCL) recommendation algorithm, in which the optimal value of the tunable parameter is extracted from a scaling function that is independent of the recommendation list length, obtained via a rescaling procedure. Based on three real datasets,

The dilemma most commonly shared by many algorithms is how to find the proper value of the tunable parameter for different recommendation focuses, e.g., accuracy, diversity, or the cold start problem. Recommendation accuracy is without doubt one of the most important evaluators of algorithm performance. However, even when recommendation accuracy is used as the reference for selecting the optimal value of the tunable parameter, the result may differ across accuracy evaluators. By finding a scaling function independent of the recommendation list length based on empirical data, we resolve the explicit dilemma of selecting the optimal tunable parameter amid the complex contradictions among different recommendation focuses.