
A fast combination method in DSmT and its application to recommender system

  • Yilin Dong ,

    Contributed equally to this work with: Yilin Dong, Xinde Li

    Roles Investigation, Methodology, Validation, Visualization, Writing – original draft

    Affiliation Key Laboratory of Measurement and Control of CSE, Ministry of Education, School of Automation, Southeast University, Nanjing, Jiangsu Province, China

  • Xinde Li ,

    Contributed equally to this work with: Yilin Dong, Xinde Li

    Roles Methodology, Supervision

    xindeli@seu.edu.cn

    Affiliation Key Laboratory of Measurement and Control of CSE, Ministry of Education, School of Automation, Southeast University, Nanjing, Jiangsu Province, China

  • Yihai Liu

    Roles Supervision

    Affiliation Jiangsu Automation Research Institute, Lianyungang, Jiangsu Province, China


Abstract

In many applications involving epistemic uncertainties usually modeled by belief functions, it is often necessary to approximate general (non-Bayesian) basic belief assignments (BBAs) to subjective probabilities (called Bayesian BBAs). This necessity occurs if one needs to embed the fusion result in a system based on the probabilistic framework and Bayesian inference (e.g. tracking systems), or if one needs to make a decision in the decision making problems. In this paper, we present a new fast combination method, called modified rigid coarsening (MRC), to obtain the final Bayesian BBAs based on hierarchical decomposition (coarsening) of the frame of discernment. Regarding this method, focal elements with probabilities are coarsened efficiently to reduce computational complexity in the process of combination by using disagreement vector and a simple dichotomous approach. In order to prove the practicality of our approach, this new approach is applied to combine users’ soft preferences in recommender systems (RSs). Additionally, in order to make a comprehensive performance comparison, the proportional conflict redistribution rule #6 (PCR6) is regarded as a baseline in a range of experiments. According to the results of experiments, MRC is more effective in accuracy of recommendations compared to original Rigid Coarsening (RC) method and comparable in computational time.

Introduction

The theory of belief functions, known as Dempster-Shafer Theory (DST), was developed by Shafer [1] in 1976 from Dempster’s works [2]. Belief functions allow one to model epistemic uncertainty [3] and have been used in many applications since the 1990s [4], mainly those relevant to expert systems, decision-making support and information fusion. To palliate some limitations of DST (such as high computational complexity), Dezert and Smarandache proposed an extended mathematical framework of belief functions with new efficient quantitative and qualitative rules of combination, called DSmT (Dezert-Smarandache Theory) in the literature [5, 6], with applications listed in [7]. One of the major drawbacks of DST and DSmT is their high computational complexity when the fusion space (i.e. the frame of discernment, FoD) and the number of sources to combine are large. DSmT is more complex than DST, and the Proportional Conflict Redistribution rule #6 (PCR6 rule) becomes computationally intractable in the worst case as soon as the cardinality of the FoD is greater than six.

To reduce the computational cost of operations with belief functions when the number of focal elements is very large, several approaches have been proposed by different authors. Basically, the existing approaches rely either on efficient implementations of the computations, as proposed for instance in [8, 9], or on approximation techniques applied to the original Basic Belief Assignments (BBAs) to combine [10–14], or both. From a fusion standpoint, two approaches are usually adopted: 1) one can first approximate each BBA by a subjective probability and use the Bayes fusion rule to get the final Bayesian BBA [11, 12], or 2) one can fuse all the BBAs with a fusion rule, typically Dempster-Shafer’s rule or the proportional conflict redistribution rule #6 (PCR6), which is very costly in computations, and then convert the combined BBA into a subjective probability measure [10, 14]. The former method is the simplest, but it loses much of the information contained in the original BBAs, whereas the latter is intractable for high-dimensional problems.

This paper presents a new combination method, called modified rigid coarsening (MRC), to get the final Bayesian BBAs based on a hierarchical decomposition (coarsening) of the frame of discernment, which can be seen as an intermediary approach between the two aforementioned methods. The hierarchical structure is a bintree decomposition of the FoD, and the masses of the coarsened FoD are attached to it. To show the practicality of the proposed method, MRC is applied to combine users’ preferences so as to provide suitable recommendations in RSs. Preliminary work on the original rigid coarsening (RC) has been published in our recent work [15] (the present paper is an extended version of the paper presented at the 20th IEEE International Conference on Information Fusion, Xi’an, China). In this paper, more detailed analyses of this new combination method are provided. More importantly, the method is also applied to a real application. These are the added values (contributions) of this paper.

The main contributions of this paper are:

  1. the presentation of the FoD bintree decomposition on which the BBA approximations are carried out;
  2. user preferences in Recommender Systems (RSs) are modeled by DSmT-Modeling Function.

In order to measure the efficiency and effectiveness of MRC, it is integrated into RSs based on DSmT and compared to traditional methods in the experiments. The results show that, regarding the accuracy of recommendations, MRC is extremely close to classical PCR6, while its computational time is clearly shorter than that of PCR6.

The remainder of this paper is organized as follows. In section 2, we review relevant prior work on DST and DSmT first. In section 3, MRC is presented. In section 4, a recommendation system based on DSmT, that employs MRC to combine users’ preferences, is shown. In section 5, we evaluate our proposed algorithm based on two public datasets: Movielens and Flixster. Finally, we conclude and discuss future work.

Mathematical background

This section provides a brief reminder of the basics of DST and DSmT, which is necessary for the presentation and understanding of the more general MRC of Section 3.

In the DST framework, the frame of discernment $\Theta \triangleq \{\theta_1, \ldots, \theta_n\}$ ($n \geq 2$; here, the symbol $\triangleq$ means "equals by definition") is a set of exhaustive and exclusive elements (hypotheses) which represents the possible solutions of the problem under consideration, and thus Shafer’s model assumes $\theta_i \cap \theta_j = \emptyset$ for $i \neq j$ in $\{1, \ldots, n\}$. A basic belief assignment (BBA) $m(\cdot)$ is defined by the mapping $2^\Theta \mapsto [0, 1]$, verifying $m(\emptyset) = 0$ and $\sum_{A \in 2^\Theta} m(A) = 1$. In DSmT, one can abandon Shafer’s model (if Shafer’s model does not fit the problem) and refute the principle of the third excluded middle. The third excluded middle principle assumes the existence of the complement for any element/proposition belonging to the power set $2^\Theta$. Instead of defining the BBAs on the power set of the FoD, the BBAs are defined on the so-called hyper-power set (or Dedekind’s lattice), denoted $D^\Theta$, whose cardinality follows Dedekind’s number sequence; see [6], Vol. 1 for details and examples. A (generalized) BBA, called a mass function, $m(\cdot)$ is defined by the mapping $D^\Theta \mapsto [0, 1]$, verifying $m(\emptyset) = 0$ and $\sum_{A \in D^\Theta} m(A) = 1$. The DSmT framework encompasses the DST framework because $2^\Theta \subseteq D^\Theta$. In DSmT, we can also take into account a set of integrity constraints on the FoD (if known), by specifying all the pairs of elements which are really disjoint. Stated otherwise, Shafer’s model is a specific DSm model where all elements are deemed to be disjoint. $A \in D^\Theta$ is called a focal element of $m(\cdot)$ if $m(A) > 0$. A BBA is called a Bayesian BBA if all of its focal elements are singletons and Shafer’s model is assumed; otherwise it is called non-Bayesian [1]. A full ignorance source is represented by the vacuous BBA $m_v(\Theta) = 1$. The belief (or credibility) and plausibility functions are respectively defined by $Bel(X) \triangleq \sum_{Y \in D^\Theta,\, Y \subseteq X} m(Y)$ and $Pl(X) \triangleq \sum_{Y \in D^\Theta,\, Y \cap X \neq \emptyset} m(Y)$. $BI(X) \triangleq [Bel(X), Pl(X)]$ is called the belief interval of X. Its length measures the degree of uncertainty of X.

In 1976, Shafer proposed Dempster’s rule to combine BBAs in the DST framework (we use the DS index to refer to Dempster-Shafer’s rule (DS rule) because Shafer strongly promoted Dempster’s rule in his milestone book [1]). The DS rule is defined by $m_{DS}(\emptyset) = 0$ and, $\forall A \in 2^\Theta \setminus \{\emptyset\}$,

$$m_{DS}(A) = \frac{\sum_{B, C \in 2^\Theta \mid B \cap C = A} m_1(B)\, m_2(C)}{1 - \sum_{B, C \in 2^\Theta \mid B \cap C = \emptyset} m_1(B)\, m_2(C)} \qquad (1)$$

The DS rule is commutative and associative and can be easily extended to the fusion of S > 2 BBAs. Unfortunately, the DS rule has been highly disputed during the last decades by many authors because of its counter-intuitive behavior in high or even low conflict situations, which is why many rules of combination have been proposed in the literature to combine BBAs [16]. To palliate the drawbacks of the DS rule, the PCR6 rule was proposed in DSmT and it is usually adopted in recent applications of DSmT (PCR6 coincides with PCR5 when combining only two BBAs [6]). The fusion of two BBAs $m_1(\cdot)$ and $m_2(\cdot)$ by the PCR6 rule is obtained by $m_{PCR6}(\emptyset) = 0$ and, $\forall A \in D^\Theta \setminus \{\emptyset\}$,

$$m_{PCR6}(A) = m_{12}(A) + \sum_{B \in D^\Theta \setminus \{A\} \mid A \cap B = \emptyset} \left[ \frac{m_1(A)^2\, m_2(B)}{m_1(A) + m_2(B)} + \frac{m_2(A)^2\, m_1(B)}{m_2(A) + m_1(B)} \right] \qquad (2)$$

where $m_{12}(A) = \sum_{B, C \in D^\Theta \mid B \cap C = A} m_1(B)\, m_2(C)$ is the conjunctive operator, and the elements A and B are expressed in their disjunctive normal form. If the denominator involved in a fraction is zero, then this fraction is discarded. The general PCR6 formula for combining more than two BBAs altogether is given in [6], Vol. 3. We adopt the generic notation $m_{PCR6}(\cdot) = PCR6(m_1(\cdot), m_2(\cdot))$ to denote the fusion of $m_1(\cdot)$ and $m_2(\cdot)$ by the PCR6 rule. PCR6 is not associative. It can also be applied in the DST framework (with Shafer’s model of the FoD) by replacing $D^\Theta$ by $2^\Theta$ in Eq (2).
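As a rough illustration of Eqs (1) and (2), the sketch below combines two BBAs under Shafer's model, with focal elements stored as Python `frozenset` objects. The function names (`conjunctive`, `dempster`, `pcr6_two`) and the dictionary representation are assumptions made for this example, not part of the original method description, and the sketch covers the two-source case only.

```python
from itertools import product


def conjunctive(m1, m2):
    """Conjunctive operator: m12(A) = sum of m1(B) * m2(C) over B & C == A.

    The empty set may appear as a key; its mass is the total conflict.
    """
    out = {}
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        a = b & c
        out[a] = out.get(a, 0.0) + mb * mc
    return out


def dempster(m1, m2):
    """Dempster-Shafer rule, Eq (1): conjunctive masses renormalized by 1 - K."""
    m12 = conjunctive(m1, m2)
    conflict = m12.pop(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("total conflict: the DS rule is undefined")
    return {a: v / (1.0 - conflict) for a, v in m12.items()}


def pcr6_two(m1, m2):
    """PCR6 (identical to PCR5 for two sources), Eq (2), under Shafer's model.

    Each conflicting product m1(B) * m2(C) with empty intersection is sent back
    to B and C in proportion to m1(B) and m2(C).
    """
    out = {}
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        a = b & c
        if a:
            out[a] = out.get(a, 0.0) + mb * mc
        elif mb + mc > 0:  # a fraction with a zero denominator is discarded
            out[b] = out.get(b, 0.0) + mb ** 2 * mc / (mb + mc)
            out[c] = out.get(c, 0.0) + mc ** 2 * mb / (mb + mc)
    return out


if __name__ == "__main__":
    A, B = frozenset({"a"}), frozenset({"b"})
    m1 = {A: 0.6, B: 0.4}
    m2 = {A: 0.2, B: 0.8}
    print(dempster(m1, m2))
    print(pcr6_two(m1, m2))
```

Unlike the DS rule, the PCR6 result needs no renormalization, because every conflicting product is redistributed in full to the two elements involved.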

Modified rigid coarsening for fusion of Bayesian BBAs

Here, we introduce the principle of MRC of the FoD to reduce the computational complexity of the PCR6 combination of original Bayesian BBAs. In the case of non-Bayesian BBAs, all non-singleton focal elements must be decoupled in advance by using DSmP, as explained in Section 4.
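To make the decoupling step concrete, here is a minimal sketch of DSmP under Shafer's model, assuming the usual singleton form of the transformation: each singleton keeps its own mass and receives a share of every non-singleton focal element containing it, in proportion to its mass plus a small ε. The function name `dsmp`, the dictionary representation and the default ε are assumptions of this example.

```python
def dsmp(m, frame, eps=0.001):
    """Approximate a BBA by a Bayesian BBA with DSmP (Shafer's model assumed).

    m     : dict mapping frozenset focal elements to masses
    frame : iterable of the singleton labels of the FoD
    eps   : small tuning parameter; eps > 0 avoids zero denominators
    """
    prob = {}
    for theta in frame:
        m_theta = m.get(frozenset([theta]), 0.0)
        total = m_theta
        for x, mx in m.items():
            if len(x) >= 2 and theta in x:
                # Sum of singleton masses inside X, plus eps per element of X
                denom = sum(m.get(frozenset([y]), 0.0) for y in x) + eps * len(x)
                total += (m_theta + eps) * mx / denom
        prob[theta] = total
    return prob


if __name__ == "__main__":
    bba = {
        frozenset({"a"}): 0.3,
        frozenset({"b"}): 0.1,
        frozenset({"a", "b"}): 0.2,
        frozenset({"a", "b", "c"}): 0.4,
    }
    print(dsmp(bba, ["a", "b", "c"]))
```

With ε = 0 the mass of a compound element goes only to the singletons that already carry mass, so the transformation is undefined when none of them does; a small positive ε keeps the result well defined.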

Rigid coarsening

This proposal was initially called rigid coarsening (RC) in our previous works [17–19] and was improved in our recent work [15]. The goal of this coarsening is to replace the original (refined) FoD Θ by a set of coarsened ones to make the computation of the PCR6 rule tractable. Because we consider here only Bayesian BBAs to combine, their focal elements are only singletons of the FoD $\Theta \triangleq \{\theta_1, \ldots, \theta_n\}$, with n ≥ 2, and we assume Shafer’s model of the FoD Θ. A coarsening of the FoD Θ means replacing it with another, less specific FoD of smaller dimension Ω = {ω1, …, ωk} with k < n, built from the elements of Θ. This can be done in many ways depending on the problem under consideration. Generally, the elements of Ω are singletons of Θ and disjunctions of elements of Θ. For example, if Θ = {θ1, θ2, θ3, θ4}, then a possible coarsened frame built from Θ could be, for instance, Ω = {ω1 = θ1, ω2 = θ2, ω3 = θ3 ∪ θ4}, or Ω = {ω1 = θ1 ∪ θ2, ω2 = θ3 ∪ θ4}, etc.

Definition 1: When dealing with Bayesian BBAs, the projection $m^\Omega(\cdot)$ of the original BBA $m^\Theta(\cdot)$ (for clarity and convenience, the upper index explicitly indicates the FoD to which the belief mass refers) is simply obtained by taking

$$m^\Omega(\omega_i) = \sum_{\theta_j \subseteq \omega_i} m^\Theta(\theta_j) \qquad (3)$$

The rigid coarsening process is a simple dichotomous approach of coarsening obtained as follows:

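  • If n = |Θ| is an even number:
    The disjunction of the first n/2 elements $\theta_1$ to $\theta_{n/2}$ of Θ defines the element ω1 of Ω, and the last n/2 elements $\theta_{n/2+1}$ to $\theta_n$ of Θ define the element ω2 of Ω, that is, $\Omega \triangleq \{\omega_1 = \theta_1 \cup \ldots \cup \theta_{n/2},\ \omega_2 = \theta_{n/2+1} \cup \ldots \cup \theta_n\}$, and based on Eq (3), one has $m^\Omega(\omega_1) = \sum_{j=1}^{n/2} m^\Theta(\theta_j)$ (4) and $m^\Omega(\omega_2) = \sum_{j=n/2+1}^{n} m^\Theta(\theta_j)$ (5). For example, if Θ = {θ1, θ2, θ3, θ4}, and one considers the Bayesian BBA mΘ(θ1) = 0.1, mΘ(θ2) = 0.2, mΘ(θ3) = 0.3 and mΘ(θ4) = 0.4, then Ω = {ω1 = θ1 ∪ θ2, ω2 = θ3 ∪ θ4} and mΩ(ω1) = 0.1 + 0.2 = 0.3 and mΩ(ω2) = 0.3 + 0.4 = 0.7.
  • If n = |Θ| is an odd number:
    In this case, the element ω1 of the coarsened frame Ω is the disjunction of the first [n/2 + 1] elements of Θ (the notation [x] means the integer part of x), and the element ω2 is the disjunction of the other elements of Θ. That is, $\Omega \triangleq \{\omega_1 = \theta_1 \cup \ldots \cup \theta_{[n/2+1]},\ \omega_2 = \theta_{[n/2+1]+1} \cup \ldots \cup \theta_n\}$, and based on Eq (3), one has $m^\Omega(\omega_1) = \sum_{j=1}^{[n/2+1]} m^\Theta(\theta_j)$ (6) and $m^\Omega(\omega_2) = \sum_{j=[n/2+1]+1}^{n} m^\Theta(\theta_j)$ (7). For example, if Θ = {θ1, θ2, θ3, θ4, θ5}, and one considers the Bayesian BBA mΘ(θ1) = 0.1, mΘ(θ2) = 0.2, mΘ(θ3) = 0.3, mΘ(θ4) = 0.3 and mΘ(θ5) = 0.1, then Ω = {ω1 = θ1 ∪ θ2 ∪ θ3, ω2 = θ4 ∪ θ5} and mΩ(ω1) = 0.1 + 0.2 + 0.3 = 0.6 and mΩ(ω2) = 0.3 + 0.1 = 0.4.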
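The two cases above can be sketched in a few lines. The helper name `rigid_split` and the list representation (singleton masses in frame order) are assumptions of this example; the inputs are the two worked examples from the text.

```python
def rigid_split(masses):
    """Dichotomous coarsening of a Bayesian BBA given as a list of singleton masses.

    Returns the two groups of masses and the coarsened masses of omega_1 and omega_2.
    For an odd n, omega_1 takes the first [n/2 + 1] elements.
    """
    n = len(masses)
    cut = n // 2 if n % 2 == 0 else n // 2 + 1
    left, right = masses[:cut], masses[cut:]
    return (left, right), (sum(left), sum(right))


if __name__ == "__main__":
    # Even case from the text: n = 4
    print(rigid_split([0.1, 0.2, 0.3, 0.4])[1])
    # Odd case from the text: n = 5
    print(rigid_split([0.1, 0.2, 0.3, 0.3, 0.1])[1])
```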

Of course, the same coarsening strategy applies to all original BBAs $m_s^\Theta(\cdot)$, s = 1, …, S of the S > 1 sources of evidence, yielding the less specific BBAs $m_s^\Omega(\cdot)$, s = 1, …, S. The less specific BBAs (called coarsened BBAs by abuse of language) can then be combined with the PCR6 rule of combination according to Eq (2). This dichotomous coarsening method is repeated iteratively l times, as schematically represented by a bintree. Here, we consider a bintree only for simplicity, which means that the coarsened frame Ω consists of two elements only. Of course, a similar method can be used with a tri-tree, quad-tree, etc. The last step of this hierarchical process is to calculate the combined (Bayesian) BBA of all focal elements according to the connection weights of the bintree structure, where the number of layers l of the tree depends on the cardinality |Θ| of the original FoD Θ. Specifically, the mass of each focal element is updated depending on the connection weights of the link paths from the root to the terminal nodes. This principle is illustrated in detail in the following example.

Example 1: Let’s consider Θ = {θ1, θ2, θ3, θ4, θ5}, and the following three Bayesian BBAs can be seen in Table 1:

The rigid coarsening and fusion of BBAs is deduced from the following steps:

Step 1: We define the bintree structure based on iterative half split of FoD as shown in Fig 1.

The connecting weights are denoted as λ1, …, λ8. The elements of the frames Ωl are defined as follows:

  • At layer l = 1:
  • At layer l = 2:
  • At layer l = 3:

Step 2: The BBAs of elements of the (sub-) frames Ωl are obtained as follows:

  • At layer l = 1, we use Eqs (6) and (7) because |Θ| = 5 is an odd number. Therefore, we get the BBAs in Table 2:
  • At layer l = 2: We work with the two subframes and of Ω2 with the BBAs in Tables 3 and 4:
    These mass values are obtained by the proportional redistribution of the mass of each focal element with respect to the mass of its parent focal element in the bin tree. For example, is derived by taking Other masses of coarsening focal elements are computed similarly using this proportional redistribution method.
  • At layer l = 3: We use again the proportional redistribution method which gives us the BBAs of the sub-frames Ω3 in Table 5:

Step 3: The connection weights λi are computed from the assignments of coarsening elements. In each layer l, we fuse the three BBAs sequentially using the PCR6 formula Eq (2). Because PCR6 fusion is not associative, the general PCR6 formula should be applied to get the best results; here we use sequential fusion to reduce the computational complexity, even though the fusion result is then approximate. More precisely, we first fuse the coarsened BBAs of the first two sources with PCR6 and then fuse this result with the coarsened BBA of the third source. Hence, we obtain the following connecting weights in the bintree:

  • At layer l = 1:
  • At layer l = 2:
  • At layer l = 3:

Step 4: The final assignments of elements in original FoD Θ are calculated using the product of the connection weights of link paths from root (top) node to terminal nodes (leaves). We eventually get the combined and normalized Bayesian BBA:
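Steps 1 to 4 can be sketched as a short recursion. This is only an illustration under stated assumptions: the BBAs are Bayesian and given as lists of singleton masses in frame order, sources are fused sequentially at each node, a source whose parent mass is zero is skipped at that node, and the names `pcr6_binary` and `rc_fuse` as well as the input values are hypothetical (they are not the values of Table 1).

```python
def pcr6_binary(p, q):
    """PCR6 (= PCR5) of two Bayesian BBAs on a two-element frame {w1, w2}."""
    a1, a2 = p
    b1, b2 = q
    w1, w2 = a1 * b1, a2 * b2
    if a1 + b2 > 0:  # conflict a1 * b2, shared between w1 and w2
        w1 += a1 ** 2 * b2 / (a1 + b2)
        w2 += b2 ** 2 * a1 / (a1 + b2)
    if a2 + b1 > 0:  # conflict a2 * b1, shared between w2 and w1
        w2 += a2 ** 2 * b1 / (a2 + b1)
        w1 += b1 ** 2 * a2 / (a2 + b1)
    return w1, w2


def rc_fuse(bbas):
    """Rigid-coarsening fusion: returns one combined mass per singleton."""
    n = len(bbas[0])
    if n == 1:
        return [1.0]  # a leaf: nothing left to split
    cut = n // 2 if n % 2 == 0 else n // 2 + 1
    coarse = []
    for m in bbas:
        total = sum(m)
        if total > 0:  # proportional redistribution w.r.t. the parent mass
            left = sum(m[:cut])
            coarse.append((left / total, (total - left) / total))
    if coarse:
        fused = coarse[0]
        for c in coarse[1:]:
            fused = pcr6_binary(fused, c)  # sequential, hence approximate
    else:
        fused = (0.5, 0.5)  # arbitrary: the weight of this branch is zero anyway
    lam_left, lam_right = fused
    left_leaves = rc_fuse([m[:cut] for m in bbas])
    right_leaves = rc_fuse([m[cut:] for m in bbas])
    # Leaf mass = product of the connection weights along the path from the root
    return [lam_left * w for w in left_leaves] + [lam_right * w for w in right_leaves]


if __name__ == "__main__":
    sources = [
        [0.1, 0.2, 0.3, 0.3, 0.1],
        [0.4, 0.2, 0.1, 0.1, 0.2],
        [0.7, 0.1, 0.1, 0.05, 0.05],
    ]
    print(rc_fuse(sources))
```

Because the two connection weights at every node sum to one, the leaf masses already sum to one in this sketch, so no separate normalization step is included.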

Modified rigid coarsening

One of the issues with RC described in the previous section is that no extra self-information of the focal elements is embedded into the coarsening process. In this paper, the elements θi selected to belong to the same group are determined using the consensus information drawn from the BBAs provided by the sources. Specifically, the degrees of disagreement between the provided sources on the decisions (θ1, θ2, ⋯, θn) are first calculated using the belief-interval based distance dBI [20] to obtain the disagreement vector. Then all focal elements in the FoD are sorted in ascending order of disagreement. Finally, the simple dichotomous approach is used to hierarchically coarsen the re-sorted focal elements.

Calculating the disagreement vector.

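Let us consider several BBAs $m_s^\Theta(\cdot)$, (s = 1, …, S) defined on the same FoD Θ of cardinality |Θ| = n. The specific BBAs $m_{\theta_i}(\cdot)$, i = 1, …, n, entirely focused on θi are defined by $m_{\theta_i}(\theta_i) = 1$, and $m_{\theta_i}(X) = 0$ for $X \neq \theta_i$.

Definition 2: The disagreement of opinions of two sources about θi is defined as the L1-distance between the dBI distances of the BBAs $m_s^\Theta(\cdot)$, s = 1, 2 to $m_{\theta_i}(\cdot)$, which is expressed by

$$D_{1-2}(\theta_i) \triangleq \left| d_{BI}(m_1^\Theta(\cdot), m_{\theta_i}(\cdot)) - d_{BI}(m_2^\Theta(\cdot), m_{\theta_i}(\cdot)) \right| \qquad (8)$$

Definition 3: The disagreement of opinions of S ≥ 3 sources about θi, is defined as (9) where the dBI distance is defined in [20] and the proof of Definition 3 is given in S1 Appendix. For simplicity, we assume Shafer’s model so that $|2^\Theta| = 2^n$; otherwise the number of elements in the summation of Eq (10) should be $|D^\Theta| - 1$ with another normalization constant nc.

$$d_{BI}(m_1, m_2) \triangleq \sqrt{n_c \sum_{i=1}^{2^n - 1} \left[ d^I(BI_1(\theta_i), BI_2(\theta_i)) \right]^2} \qquad (10)$$

Here, $n_c = 1/2^{n-1}$ is the normalization constant and $d^I([a, b], [c, d])$ is Wasserstein’s distance defined by $d^I([a, b], [c, d]) = \sqrt{\left[\frac{a+b}{2} - \frac{c+d}{2}\right]^2 + \frac{1}{3}\left[\frac{b-a}{2} - \frac{d-c}{2}\right]^2}$, and $BI(\theta_i) = [Bel(\theta_i), Pl(\theta_i)]$.

The disagreement vector $D_{1-S}$ is defined by

$$D_{1-S} \triangleq [D_{1-S}(\theta_1), \ldots, D_{1-S}(\theta_n)] \qquad (11)$$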
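The following sketch computes dBI and a disagreement vector for a small frame under Shafer's model. Two points are assumptions of this example rather than statements from the text: the summation in dBI runs over all non-empty subsets of the frame, and for more than two sources the disagreement is taken as half the sum of all pairwise absolute differences, which reduces to Definition 2 when S = 2. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def nonempty_subsets(frame):
    """All 2^n - 1 non-empty subsets of the frame, as frozensets."""
    items = sorted(frame)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def belief_interval(m, x):
    """[Bel(x), Pl(x)] for a BBA m given as {frozenset: mass}."""
    bel = sum(v for a, v in m.items() if a <= x)
    pl = sum(v for a, v in m.items() if a & x)
    return bel, pl


def wasserstein(i1, i2):
    """Wasserstein distance between two intervals [a, b] and [c, d]."""
    (a, b), (c, d) = i1, i2
    return sqrt(((a + b) / 2 - (c + d) / 2) ** 2 + ((b - a) / 2 - (d - c) / 2) ** 2 / 3)


def d_bi(m1, m2, frame):
    """Belief-interval distance with normalization n_c = 1 / 2^(n - 1)."""
    nc = 1.0 / 2 ** (len(frame) - 1)
    total = sum(
        wasserstein(belief_interval(m1, x), belief_interval(m2, x)) ** 2
        for x in nonempty_subsets(frame)
    )
    return sqrt(nc * total)


def disagreement_vector(bbas, frame):
    """One disagreement value per singleton, in sorted label order (assumed form)."""
    out = []
    for theta in sorted(frame):
        focused = {frozenset([theta]): 1.0}  # BBA entirely focused on theta
        d = [d_bi(m, focused, frame) for m in bbas]
        out.append(0.5 * sum(abs(di - dj) for di in d for dj in d))
    return out


if __name__ == "__main__":
    frame = frozenset({"a", "b", "c"})
    m1 = {frozenset({"a"}): 0.7, frozenset({"b"}): 0.2, frozenset({"c"}): 0.1}
    m2 = {frozenset({"a"}): 0.1, frozenset({"b"}): 0.3, frozenset({"c"}): 0.6}
    m3 = {frozenset({"a"}): 0.4, frozenset({"b"}): 0.4, frozenset({"c"}): 0.2}
    print(disagreement_vector([m1, m2, m3], frame))
```

The enumeration of subsets is exponential in n, so this direct form is only practical for small frames.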

Modified rigid coarsening by using the disagreement vector.

Once D1−S is derived, all focal elements {θ1, θ2, ⋯, θn} are sorted according to their corresponding values in D1−S.
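A minimal sketch of this reordering step, followed by the first dichotomous split, is given below. The disagreement values are hypothetical placeholders chosen for illustration (they are not the values of Example 1), and the helper name `reorder_by_disagreement` is an assumption.

```python
def reorder_by_disagreement(labels, disagreement):
    """Sort the singleton labels by ascending disagreement value."""
    order = sorted(range(len(labels)), key=lambda i: disagreement[i])
    return [labels[i] for i in order]


def first_split(ordered):
    """First layer of the bintree: omega_1 takes the first [n/2 + 1] items if n is odd."""
    n = len(ordered)
    cut = n // 2 if n % 2 == 0 else n // 2 + 1
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    labels = ["theta1", "theta2", "theta3", "theta4", "theta5"]
    hypothetical_d = [0.40, 0.10, 0.25, 0.12, 0.42]  # placeholder values
    ordered = reorder_by_disagreement(labels, hypothetical_d)
    print(ordered)
    print(first_split(ordered))
```

Elements with similar disagreement values end up adjacent in the sorted list, so the dichotomous split tends to place them in the same group.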

Let us revisit example 1 presented in the previous section. It can be verified in applying formula Eq (9) that the disagreement vector D1−3 for this example is equal to The derivation of D1−3(θ1) is given below for convenience.

Based on the disagreement vector, a new bintree structure is obtained, as shown in Fig 2. Compared with Fig 1, the elements of the FoD Θ are grouped more reasonably. In vector D1−3, θ1 and θ5 have similar degrees of disagreement, so they are put in the same group; the same holds for θ2 and θ4. Element θ3, however, stands apart from the others and is placed alone during the coarsening. Once this new bintree decomposition is obtained, the remaining steps are identical to those of rigid coarsening described in the previous section and yield the final combined BBA.

Step 1: According to Fig 2, the elements of the frames Ωl are defined as follows:

  • At layer l = 1:
  • At layer l = 2:
  • At layer l = 3:

Step 2: The BBAs of elements of the (sub-) frames Ωl are obtained as follows:

  • At layer l = 1, we use Eqs (6) and (7) and we get (Table 6)
  • At layer l = 2: We use again the proportional redistribution method which gives us Tables 7 and 8. Here, masses of ω21, ω22 in are not considered because the mass of their parent focal element () in bintree is 0.
  • At layer l = 3: We work with the two subframes of Ω3 with the BBAs in Table 9:

Table 6. The BBAs of elements of the sub-frames Ω1 Using MRC for Example 1.

https://doi.org/10.1371/journal.pone.0189703.t006

Table 7. The BBAs of elements of the sub-frames Ω21 Using MRC for Example 1.

https://doi.org/10.1371/journal.pone.0189703.t007

Table 8. The BBAs of elements of the sub-frames Ω22 Using MRC for Example 1.

https://doi.org/10.1371/journal.pone.0189703.t008

Table 9. The BBAs of elements of the sub-frames Ω3 Using MRC for Example 1.

https://doi.org/10.1371/journal.pone.0189703.t009

Step 3: The connection weights λi are computed from the assignments of coarsening elements. Hence, we obtain the following connecting weights in the bintree:

  • At layer l = 1:
  • At layer l = 2:
  • At layer l = 3:

Step 4: We finally get the following combined and normalized Bayesian BBA

Summary of the proposed method

The fusion method that produces a combined Bayesian BBA from the hierarchical decomposition of the FoD consists of the steps listed in Algorithm 1 below and illustrated in Fig 3. It is worth noting that when the given BBAs are not Bayesian, the first step is to use an existing Probabilistic Transformation (PT) to transform them into Bayesian BBAs. In order to use the proposed combination method in RSs, modified rigid coarsening is mathematically denoted as ⨁ in the following sections.

Algorithm 1: Modified Rigid Coarsening Method

Input: All original BBAs m_s^Θ(⋅), s = 1, 2, ⋯, S

Output: The final combined BBA mΘ(⋅)

if there are compound focal elements in Θ (θi ∩ θj ≠ ∅ or θi ∪ θj ≠ ∅) then
  Probabilistic transformation: convert each m_s^Θ(⋅) into a Bayesian BBA
end
for θi ∈ Θ do
  for s ∈ {1, ⋯, S} do
    Calculate D1−S(θi)
  end
end
Sort the elements θi in ascending order of D1−S(θi)
while |Θ| ≥ 2 do
  if n is an even number then
    mΩ(ω1) = sum of mΘ(θj) for j = 1, ⋯, n/2;
    mΩ(ω2) = sum of mΘ(θj) for j = n/2 + 1, ⋯, n;
  else
    mΩ(ω1) = sum of mΘ(θj) for j = 1, ⋯, [n/2 + 1];
    mΩ(ω2) = sum of mΘ(θj) for j = [n/2 + 1] + 1, ⋯, n;
  end
  Calculate the connection weights λ: PCR6(mΩ(ω1), mΩ(ω2))
end
foreach focal element θi, i ∈ 1, ⋯, n do
  mΘ(θi) equals the product of the path link weights from the root to the terminal node θi
end

Simulation considering accuracy and computational efficiency

  • Accuracy:
    Assuming that the FoD is Θ = {θ1, θ2, …, θ20}, 1000 BBAs are randomly generated and fused with three methods: modified rigid coarsening, rigid coarsening and PCR6. The distances between the fusion results are then computed using dBI for two pairs: modified rigid coarsening versus PCR6, and rigid coarsening versus PCR6. The comparisons in Fig 4 show the superiority of the new approach proposed in this paper (the average similarity to PCR6 is 97.5% for modified rigid coarsening and 94.5% for original rigid coarsening). Here, similarity represents the degree to which the fusion results of a hierarchical approximate method (rigid or modified rigid coarsening) approximate those of PCR6.
  • Computational efficiency:
    As mentioned before, another advantage of the hierarchical combination method is its computational efficiency. Here, two experiments are conducted (all experiments are run on a PC with an I3 CPU, integrated graphics chipset and 4G DDR): 1) the number of singletons is fixed while the number of BBAs to be fused increases; 2) the number of BBAs is fixed while the number of singletons in the FoD increases. The results are illustrated in Figs 5 and 6. In experiment 1, all three methods (classical PCR6, rigid coarsening and modified rigid coarsening) run quickly (less than 1.2s) even as the number of BBAs increases from 100 to 1000. However, the situation deteriorates when the number of focal elements increases. In Fig 6, when the number of focal elements increases to 500, the time consumption of the three combinations is: PCR6: 20.6857s; modified rigid coarsening: 7.3320s; rigid coarsening: 5.9748s. This supports the idea of mapping the original FoD to a coarsened FoD in order to reduce the number of focal elements at the time of fusion. In every case, rigid coarsening and modified rigid coarsening are computationally cheaper than PCR6. On the other hand, modified rigid coarsening improves accuracy over rigid coarsening at the expense of part of the computational efficiency.

Fig 5. Efficiency comparisons between MRC, RC and PCR6 (With the number of BBAs increasing).

https://doi.org/10.1371/journal.pone.0189703.g005

Fig 6. Efficiency comparisons between MRC, RC and PCR6 (With the number of focal elements increasing).

https://doi.org/10.1371/journal.pone.0189703.g006

A recommender system integrating with hierarchical coarsening combination method

In today’s e-commerce, online providers often recommend suitable goods or services to each consumer based on their personal opinions or preferences [21], [22]. However, providing appropriate recommendations is a hard task that faces several difficulties. One difficulty is that users’ preferences are usually uncertain, imprecise or incomplete [23], [24], and so cannot be used directly in RSs. Besides, the more information about user preferences is available, the more accurate the predictions of RSs will be [25], [26]. The question is then which method should be adopted to integrate multi-source uncertain information.

As a general framework for information fusion, DST can not only model uncertain information, but also provide an efficient way to combine multi-source information. These features give the theory a wide range of applications [27–29], especially in RSs [23, 25, 30–32]. According to DST, users’ comments on products in RSs are described by mass functions, and rules of combination are frequently used in order to provide appropriate recommendations.

As mentioned in previous sections, the combination rules in both DST and DSmT suffer from high computational complexity, which is ignored in [23, 25]. Thus, in this paper, the modified rigid coarsening method is applied to combine the imprecise users’ preferences in RSs. First, we introduce the relevant background on RSs; almost all characteristics of RSs have been introduced in [23, 25, 30–32].

First, we give the corresponding representation of the mathematical notation in RSs based on DSmT. RSs usually contain two objects: {Users, Items}. A set of M users and a set containing N items is respectively denoted by U = {U1, U2, ⋯, UM} and I = {I1, I2, ⋯, IN}. Besides, we assume that users can give the corresponding ratings to the items, which include L rating levels (Θ = {θ1, θ2, ⋯, θL}.). Here, L preference levels means multi-level evaluation results. For example, four-levels of user evaluation on the product are {Excellent, Good, Fair, Poor}. ri,k means a rating of user Ui on item Ik and a rating matrix R = {ri,k} comprises all the ratings of users on items. It should be noted that ri,k is originally modeled as a mass function mi,k: DΘ → [0, 1]. Additionally, let and denote the set of items rated by user Ui and the set of users having rated item Ik, respectively.

Contextual information can often be summarized into several genres that significantly affect a user’s rating of items. Normally, we represent contextual information by a set containing P genres, denoted by S = {S1, S2, ⋯, SP}. Each genre Sp, with 1 ≤ p ≤ P, contains at most Q groups, denoted by Sp = {gp,1, gp,2, ⋯, gp,q, ⋯, gp,Q}, 1 ≤ q ≤ Q. For a genre Sp ∈ S, a user Ui ∈ U can be interested in several groups and an item Ii ∈ I can belong to one or several groups of this genre, as shown in Fig 7.

Definition 4: In order to facilitate such expression, two functions κ(⋅) and φ(⋅) are defined to determine the groups in which user Ui is interested and the groups to which item Ik belongs, respectively: (12) (13)

Generally, the main steps of a recommendation system are illustrated in Fig 8 and presented in detail as follows:

  1. DSmT-Modeling Function
    Regarding the DS-partial probability models proposed in [23], the existing ratings ri,k, of user Ui on item Ik, are modeled by DSmT-modeling function M(⋅) in order to transform such hard ratings into the corresponding soft ratings represented as mi,k as below:
    Definition 5: (14) with where αi,k ∈ [0, 1] and σi,k are a trust factor and a dispersion factor, respectively [23].
    Referring to the partial probability model analysis in [23], we also give the corresponding user profiles which can be seen in Fig 9. Compared to [23], the difference is that we not only consider the union (black and gray rectangle), but also consider the intersection (red rectangle) of the hard ratings, which is also the distinction between DS theory and DSmT theory.
    Lemma 1: Referring to Definition 5, we can also generate the relative refined BBA in the framework of DS theory: (15) with where αi,k ∈ [0, 1] and σi,k are a trust factor and a dispersion factor, respectively [23].
    After soft ratings are generated, DSmP [33] is applied to decouple non-Bayesian mi,k, since the hierarchical fusion algorithm is currently just available for Bayesian BBAs.
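    Definition 6: DSmP is a new generalized pignistic transformation defined by DSmPε(∅) = 0 and for any singleton θi ∈ Θ by $$DSmP_\varepsilon(\theta_i) = m(\theta_i) + (m(\theta_i) + \varepsilon) \sum_{X \in 2^\Theta,\ X \supset \theta_i,\ |X| \geq 2} \frac{m(X)}{\sum_{Y \in 2^\Theta,\ Y \subset X,\ |Y| = 1} m(Y) + \varepsilon |X|} \qquad (16)$$ As shown in [33], DSmP makes a remarkable improvement compared with BetP and CuzzP, since a more judicious redistribution of the ignorance masses to the singletons has been adopted by DSmP.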
  2. Predicting unrated items:
    Users who are keen on similar groups are assumed to have common preferences. In this RS, it is necessary to predict the unrated items first. Considering a group gp,q ∈ Sp with gp,q ∈ φ(Ik), every soft rating mi,k on item Ik of a user Ui who is keen on group gp,q is regarded as a block of common preference for group gp,q. Thus, Gmp,q,k: DΘ → [0, 1], which represents all users’ group preferences on item Ik regarding group gp,q, is computed as follows (17) Supposing that item Ik has not been rated by user Ui, the unprovided rating ri,k of user Ui is generated in three steps, shown below
    • Step one: Considering a concept Sp, for each group gp,q ∈ κp(Ui) ∩ φp(Ik), it is assumed that all users’ group preferences on item Ik regarding group gp,q imply the common preference of Ui on Ik regarding group gp,q. Furthermore, this group preference is regarded as a piece of user Ui’s concept preference on item Ik regarding concept Sp. Therefore, the concept preference of user Ui on item Ik regarding concept Sp, denoted by the mass function Smp,q,k: DΘ → [0, 1], can be computed as below (18)
    • Step two: If there exists at least one common group in concept Sp which item Ik belongs to and also user Ui is interested in, then Ui’s concept preference on item Ik regarding concept Sp is regarded as a piece of context preference. Therefore, this user’s contextual preference on item Ik, denoted by mass function Smi,k: DΘ → [0, 1], is achieved as follows (19)
    • Step three: Context preference of Ui on item Ik is assigned to unprovided rating as below (20)

    So far, all unprovided ratings are predicted in this RS. Subsequently, user-user similarities are computed depending on both provided and predicted ratings in the following steps.
  3. Computing user-user similarities:
    Here, we use the distance measure proposed in [34] to calculate the distance between two users Ui and Uj with i ≠ j, which is defined as below (21) where mi,k and mj,k are the soft ratings of user Ui and user Uj on item Ik respectively. Afterwards, the degree of similarity between Ui and Uj, denoted by si,j, is calculated as follows (22) Obviously, if the value of si,j is high, the users Ui and Uj are very close, and vice versa. Eventually, a matrix S = {si,j | Ui, Uj ∈ U, i ≠ j} is employed to represent the similarities among all users.
  4. Selecting neighbors based on user-user similarities:
    Taking into account an active user Ui, for each unrated item Ik by user Ui, a set containing K nearest neighborhoods, denoted by , is chosen by using the method proposed in [35]. Two simple steps of this method are shown below
    • Step one: the process of such selection depends on two criteria: 1. Those users who rated Ik and 2. The corresponding user-user similarities with user Ui are equal or greater than the threshold τ. denotes the selected set, which is acquired as follows: (23)
    • Step two: all of members in is descending sorted by si,j and top K members are selected as the neighborhood set .
  5. Estimating ratings according to neighborhoods:
    Supposing that item Ik has not been rated by user Ui. The predicted rating of Ui on item Ik is denoted as . Thus, is calculated according to the ratings of user Ui’s nearest users. Mathematically, is given as below (24) where is the mass regarding the neighborhoods’ whole preference in the set Eq (23) on item Ik. Considering user , and supposing that si,j is the similarity between user Ui and user Uj. We use a discount rate 1 − si,j to discount the rating of user Uj on item Ik. Therefore, is: (25)
  6. Generating recommendations:
    To generate recommendations for the candidate user Ui, the predicted ratings of Ui on all unrated items are sorted, and the recommendations are then drawn from the sorted list.
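
Steps 3 to 5 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the distance of [34] takes the Chan-Darwiche form (log of the largest probability ratio minus log of the smallest), that this distance is summed over items, that similarity is exp(−γ × distance), and that discounting follows the classical Shafer form. The function names and data layouts are hypothetical, and all masses passed to the distance function are assumed to be strictly positive.

```python
import math

def chan_darwiche_distance(p, q):
    """Assumed form of the distance in [34]: ln max(q/p) - ln min(q/p).
    p, q: dict rating level -> strictly positive probability."""
    ratios = [q[level] / p[level] for level in p]
    return math.log(max(ratios)) - math.log(min(ratios))

def similarity(ratings_i, ratings_j, gamma=1e-4):
    """Assumed similarity exp(-gamma * D), with D summed over items.
    ratings_*: dict item -> dict rating level -> probability."""
    d = sum(chan_darwiche_distance(ratings_i[k], ratings_j[k])
            for k in ratings_i)
    return math.exp(-gamma * d)

def select_neighbors(sim_to_active, raters, tau, k):
    """Step one: keep users who rated the item and whose similarity
    to the active user is >= tau. Step two: keep the top k by similarity."""
    candidates = [u for u in raters if sim_to_active.get(u, 0.0) >= tau]
    candidates.sort(key=lambda u: sim_to_active[u], reverse=True)
    return candidates[:k]

def discount(mass, s, frame):
    """Classical Shafer discounting at rate 1 - s.
    mass: dict frozenset -> mass; frame: frozenset of all rating levels."""
    out = {focal: s * m for focal, m in mass.items() if focal != frame}
    out[frame] = s * mass.get(frame, 0.0) + (1.0 - s)
    return out
```

The discounted masses of the selected neighbors would then be combined with the chosen rule (PCR6, rigid coarsening, or modified rigid coarsening); that combination step is not shown here.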

Experiments

To evaluate modified rigid coarsening in terms of recommendation precision and computational time, the original rigid coarsening method and the classical PCR6 combination rule are selected as baselines. In addition, we use DS-MAE [23] to measure the precision of recommendations.

Definition 7: DS-MAE is mathematically given as follows (26) where Dj is the testing set identifying the user-item pairs whose true rating is θj ∈ Θ.
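
As a rough illustration of how such a measure can be computed, the sketch below assumes a probability-weighted form of the mean absolute error: for every test pair whose true rating is θj, the absolute difference between θj and each rating level is weighted by the predicted probability of that level, and the result is averaged over Dj. This form is an assumption made for illustration and may differ in detail from the definition in [23]; the function name and input layout are hypothetical.

```python
def ds_mae(true_level, predictions):
    """Assumed probability-weighted MAE for one true rating level.
    predictions: one dict (rating level -> predicted probability)
    per user-item pair in D_j."""
    if not predictions:
        raise ValueError("D_j is empty")
    total = sum(prob * abs(true_level - level)
                for pred in predictions
                for level, prob in pred.items())
    return total / len(predictions)
```

For example, a prediction that places all its mass on the true rating contributes zero error, while mass placed on distant rating levels is penalized in proportion to the distance.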

Information about which genres each user is interested in is not available. We therefore adopt the rule that if a user has rated an item, then this user is interested in all genres to which the item belongs.

  1. Experiment One:
    Movielens (http://grouplens.org/datasets/movielens) is a movie recommendation dataset widely used for benchmarking. It contains nearly 100,000 hard ratings on 19 different types of movies (Action, Comedy and so on). The rating domain in Movielens includes 5 levels, denoted as Θ = {1, 2, 3, 4, 5}. Each user has rated at least 20 movies, which ensures adequate rating information.
    The relevant parameters used in RSs are set as follows: γ = 10^−4 and ∀(i, k){αi,k, σi,k} = {0.9, 2/3}. However, setting parameter τ to a fixed value is unreasonable, because the similarity between two users differs considerably when different combination methods are used. Hence, in this paper, the value of τ is not set in advance. Instead, it is determined from the similarities in matrix S. Specifically, the highest value of the top 30% in S is selected for τ.
    Additionally, we adopt 10-fold cross validation, a robust strategy that is widely applied in experimental verification. The specific steps are as follows: the original ratings in Movielens are first randomly divided into 10 folds, and the experiment is carried out 10 times. In each sub-experiment, nine tenths of the ratings are used as training data and the remaining ratings are used as testing data. All results reported in the following experiments are averages over the 10 runs.
    Fig 10 shows the overall DS-MAE as the neighborhood size K varies; smaller DS-MAE values indicate better performance. As can be seen in Fig 10, for K ≤ 70 the performances of the three methods improve sharply and are nearly identical. For K ≥ 70, the performances become stable. In particular, the performance of the modified rigid coarsening method is very close to that of the classical PCR6 rule, whereas the original rigid coarsening is slightly worse than the other two algorithms.
    Fig 11 depicts the computational time as the neighborhood size K varies. In this figure, the hierarchical coarsening combination methods (both rigid coarsening and modified rigid coarsening) take considerably less time than classical PCR6. In addition, modified rigid coarsening is somewhat slower than the original rigid coarsening. These results illustrate that the modified rigid coarsening method sacrifices some computational efficiency in exchange for higher approximation accuracy.
  2. Experiment Two:
    Flixster (http://www.cs.ubc.ca/jamalim/datasets/) is a classical recommendation dataset that contains nearly 535,013 hard ratings on 19 different types of movies (Drama, Comedy and so on). The rating domain in Flixster includes 10 levels, denoted as Θ = {0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0}. Each user has rated at least 15 movies, which ensures adequate rating information. The relevant parameters used in RSs are set as follows: γ = 10^−4 and ∀(i, k){αi,k, σi,k} = {0.9, 2/3}. As in Experiment One, setting parameter τ to a fixed value is unreasonable, because the similarity between two users differs considerably when different combination methods are used. Hence, the value of τ is not set in advance. Instead, it is determined from the similarities in matrix S. Specifically, the highest value of the top 50% in S is selected for τ.
    Fig 12 shows the overall DS-MAE as the neighborhood size K varies; smaller DS-MAE values indicate better performance. As can be seen in Fig 12, the result is similar to that obtained on the previous data set (Movielens). In particular, the performance of the modified rigid coarsening method lies between those of the other two methods, while the original rigid coarsening is worse than the other two algorithms. Fig 13 depicts the computational time as the neighborhood size K varies. This figure leads to the same conclusion: the hierarchical coarsening combination methods (both rigid coarsening and modified rigid coarsening) take considerably less time than classical PCR6.
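
The 10-fold cross-validation protocol described in Experiment One can be sketched as follows. This is a generic illustration, not the authors' experimental code: the function names, the fixed random seed, and the `evaluate` callback (which would train on one part and return a score such as DS-MAE on the other) are all hypothetical.

```python
import random

def k_fold_splits(ratings, k=10, seed=0):
    """Yield (train, test) pairs; every rating is tested exactly once."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled ratings into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test

def cross_validated_mean(ratings, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train, test) over k runs."""
    scores = [evaluate(train, test)
              for train, test in k_fold_splits(ratings, k, seed)]
    return sum(scores) / len(scores)
```

Fixing the seed keeps the folds identical across the compared combination methods, so that differences in the averaged score are not caused by different splits.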

Fig 11. Overall computational time of the three combination methods (Movielens).

https://doi.org/10.1371/journal.pone.0189703.g011

Fig 13. Overall computational time of the three combination methods (Flixster).

https://doi.org/10.1371/journal.pone.0189703.g013

Conclusion

In this paper, we propose a new combination method, called the modified rigid coarsening method. During combination, this method maps the original refined FoD to a new coarsened FoD. Compared to the traditional fusion rule PCR6 in DSmT, this approach reduces computational complexity while maintaining high approximation accuracy. In addition, to verify the practicality of our approach, we apply it to fuse soft ratings in RSs. Specifically, user preferences are first transformed by the DSmT-partial probability model to accurately represent uncertain information. Then, information about user preferences from different sources can be easily combined. In future work, more helpful information will be mined to discern focal elements in the FoD so as to improve the accuracy of approximation, and more data sets will be used.

Supporting information

S1 File. The compressed file package of all datasets used in this paper.

https://doi.org/10.1371/journal.pone.0189703.s001

(RAR)

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61573097 and 91748106, in part by the Key Laboratory of Integrated Automation of Process Industry (PAL-N201704), in part by the Qing Lan Project and Six Major Top-talent Plan, and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions. The authors thank the reviewers and editors for their valuable comments, which were very helpful for improving this manuscript.

References

  1. Shafer G. A mathematical theory of evidence. Princeton Univ. Press, 1976.
  2. Dempster A. Upper and lower probabilities induced by a multivalued mapping. Annals of Mathematical Statistics. 1967;38:325–339.
  3. Jiang W, Wang S, Liu X, Zheng H, Wei B. Evidence conflict measure based on OWA operator in open world. PLoS ONE. 2017;12(5):e0177828. http://doi.org/10.1371/journal.pone.0177828. pmid:28542271
  4. Smets P. Practical uses of belief functions. In K.B. Laskey and H. Prade Editors, 15th Conf. on Uncertainty in Artificial Intelligence, pp. 612–621, Stockholm, Sweden, 1999.
  5. Dezert J. Foundations for a new theory of plausible and paradoxical reasoning. Information & Security: An Int. Journal. 2002:9.
  6. Smarandache F, Dezert J (Editors). Advances and applications of DSmT for information fusion. American Research Press, Rehoboth, NM, U.S.A., Vol. 1–4, 2004–2015. Available at webpage 2 of [7].
  7. http://www.onera.fr/staff/jean-dezert.
  8. Kennes R. Computational aspects of the Möbius transform of graphs. IEEE Trans. on SMC. 1992;22:201–223.
  9. Shafer G, Logan R. Implementing Dempster’s rule for hierarchical evidence. Artificial Intelligence. 1987;33:271–298.
  10. Yang Y, Liu YL. Iterative approximation of basic belief assignment based on distance of evidence. PLoS ONE. 2016;11(2):e0147799. pmid:26829403
  11. Denœux T. Inner and outer approximation of belief structures using a hierarchical clustering approach. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems. 2001;9(4):437–460.
  12. Yang Y, Han DQ, Han CZ, Cao F. A novel approximation of basic probability assignment based on rank-level fusion. Chinese Journal of Aeronautics. 2013;26(4):993–999.
  13. Han DQ, Yang Y, Dezert J. Two novel methods of BBA approximation based on focal element redundancy. Proc. of Fusion 2015, Washington, D.C., USA, July 2015.
  14. Li MZ, Zhang Q, Deng Y. A New Probability Transformation Based on the Ordered Visibility Graph. International Journal of Intelligent Systems. 2016;31(1):44–67.
  15. Dong YL, Li XD, Dezert J. A Hierarchical Flexible Coarsening Method to Combine BBAs in Probabilities, accepted in 20th International Conference on Information Fusion (Fusion 2017), Xi’an, China, July 10-13, 2017.
  16. Smets P. Analyzing the combination of conflicting belief functions. Information Fusion. 2006;8:387–412.
  17. Li XD, Dezert J, Huang XH, Meng ZD, Wu XJ. A fast approximate reasoning method in hierarchical DSmT (A). Acta Electronica Sinica. 2010;38(11):2567–2572.
  18. Li XD, Yang WD, Wu XJ, Dezert J. A fast approximate reasoning method in hierarchical DSmT (B). Acta Electronica Sinica. 2011;39(3A):32–36.
  19. Li XD, Yang WD, Dezert J. A fast approximate reasoning method in hierarchical DSmT (C). J. Huazhong Univ. of Sci. and Tech. (Natural Science Edition). 2011;39:151–156.
  20. Han DQ, Dezert J, Yang Y. Belief Interval-Based Distance Measures in the Theory of Belief Functions. IEEE Transactions on Systems, Man and Cybernetics: Systems. 2016:1–18.
  21. Bobadilla J, Ortega F, Hernando A, Gutierrez A. Recommender systems survey. Knowledge-Based Systems. 2013;46:109–132.
  22. Chen T. Ubiquitous Hotel Recommendation Using a Fuzzy-Weighted-Average and Backpropagation-Network Approach. International Journal of Intelligent Systems. 2017;32(4):00–00.
  23. Wickramarathne TL, Premaratne K, Kubat M, Jayaweera DT. CoFiDS: a belief-theoretic approach for automated collaborative filtering. IEEE Transactions on Knowledge and Data Engineering. 2011;23(2):175–189.
  24. Ladyzynski P, Grzegorzewski P. Vague preferences in recommender systems. Expert Systems With Applications. 2015;42(24):9402–9411.
  25. Nguyen VD, Huynh VN. Two-probabilities focused combination in recommender systems. International Journal of Approximate Reasoning. 2017;80:225–238.
  26. Bagher RC, Hassanpour H, Mashayekhi H. User trends modeling for a content-based recommender system. Expert Systems With Applications. 2017;87:209–219.
  27. Denoeux T. Maximum likelihood estimation from uncertain data in the belief function framework. IEEE Transactions on Knowledge and Data Engineering. 2013;25(1):119–130.
  28. Kanjanatarakul O, Sriboonchitta S, Denoeux T. Forecasting using belief functions: an application to marketing econometrics. International Journal of Approximate Reasoning. 2014;55(5):1113–1128.
  29. Masson M, Denoeux T. Ensemble clustering in the belief functions framework. International Journal of Approximate Reasoning. 2011;52(1):92–109.
  30. Troiano L, Rodriguez-Muniz LJ, Diaz J. Discovering user preferences using Dempster-Shafer Theory. Fuzzy Sets and Systems. 2015;278:98–117.
  31. Nguyen VD, Huynh VN. A reliably weighted collaborative filtering system. ECSQARU 2015:429–439.
  32. Iglesias J, Bernardos AM, Casar JR. An evidential and context-aware recommendation strategy to enhance interactions with smart spaces. HAIS 2013:242–251.
  33. Dezert J, Smarandache F. A new probabilistic transformation of belief mass assignment. In Proc. of 11th Int. Conf. on Information Fusion, Cologne, Germany, pp. 1-8, June-July 2008.
  34. Chan H, Darwiche A. A distance measure for bounding probabilistic belief change. International Journal of Approximate Reasoning. 2005;38(2):149–174.
  35. Herlocker JL, Konstan JA, Borchers A, Riedl J. An algorithmic framework for performing collaborative filtering. SIGIR’99, ACM, 1999:230–237.