Iterative Approximation of Basic Belief Assignment Based on Distance of Evidence

In the theory of belief functions, the approximation of a basic belief assignment (BBA) is for reducing the high computational cost especially when large number of focal elements are available. In traditional BBA approximation approaches, a focal element’s own characteristics such as the mass assignment and the cardinality, are usually used separately or jointly as criteria for the removal of focal elements. Besides the computational cost, the distance between the original BBA and the approximated one is also concerned, which represents the loss of information in BBA approximation. In this paper, an iterative approximation approach is proposed based on maximizing the closeness, i.e., minimizing the distance between the approximated BBA in current iteration and the BBA obtained in the previous iteration, where one focal element is removed in each iteration. The iteration stops when the desired number of focal elements is reached. The performance evaluation approaches for BBA approximations are also discussed and used to compare and evaluate traditional BBA approximations and the newly proposed one in this paper, which include traditional time-based way, closeness-based way and new proposed ones. Experimental results and related analyses are provided to show the rationality and efficiency of our proposed new BBA approximation.


Introduction
The theory of belief functions [1], also called Dempster-Shafer theory (DST), has many advantages in uncertainty modeling and reasoning [2,3]; however, it has also been argued due to its limitations [4,5]. One of the limitation is the high computational cost encountered in the procedure such as the evidence combination, conditioning, marginalization, and belief and plausibility degrees evaluation [1,6], especially when large amount of focal elements are available. This will confine the use of DST in practical applications.
The approaches of BBA approximation were then proposed to deal with the high computational cost encountered. The BBA approximation aims to obtain a simpler BBA by removing some focal elements. In existing works, such a removal of focal elements was implemented according to three different criteria related to the focal element's own characteristics. The first criterion is the mass assignment of a focal element, where the focal elements with smaller mass assignments are deemed unimportant, which are first removed. k − l − x [7], Summarization [8] and D1 [9] are representatives for this criterion. The second criterion is the cardinality of a focal element, where the focal elements with larger cardinalities, which might cause more computational cost, are first removed. k-additive approximation [10] is a representative of the 2nd criterion. The third criterion is to jointly use the mass assignments and cardinality of focal elements to determine which focal element should be first removed. Denoeux's inner and outer approximations [11], our proposed rank-level fusion based approximations [12], and our proposed non-redundancy based BBA approximations [13] follow the 3rd criterion.
Although all the available work referred above are rational and make sense to some extent, they still have their limitations. For example, the 1st and 2nd criteria only emphasize one aspect of the characteristic of focal elements. The 3rd criterion is the right direction, which jointly uses the mass assignments and cardinalities. This will make the related approximations more comprehensive, which means that they consider multiple aspects and thus avoid to be onesided. The available works based on the 3rd criteria are good attempts; however, the joint use is still an open problem, whose theoretical strictness and the generalization capability need further verification. To approximate the BBA more effectively, in this paper, we propose another approximation according to the 3rd criterion (i.e., the joint use of the 1st and 2nd) from a different angle. We directly start from the performance evaluation criterion of BBA approximations to design new approaches. Besides the computational cost, the distance between the approximated BBA and the original one, which represents the loss of information, is an important criterion to evaluate BBA approximations. An iterative approximation approach is proposed based on minimizing the distance, i.e., maximizing the closeness between the approximated BBA in current iteration and the BBA obtained in the previous iteration. The iteration stops when the desired number of focal elements is reached. Furthermore, as aforementioned, the available performance evaluations of BBA approximations often use the computational time and the closeness between the original BBA and the approximated one to judge the goodness of BBA approximation approaches. How to evaluate the performance of BBA approximations is still an open problem. Therefore, in this paper, besides the new BBA approximation approach, the evaluations of BBA approximations are briefly reviewed. Two new approaches of performance evaluation for BBA approximations are also proposed from new perspectives including the preservation of uncertainty order, the preservation of probabilistic decision making. Experimental results based on the comparisons with other approaches and related analyses show that our new BBA approximation is rational and effective.
The rest of this paper is organized as follows. Section 2 introduces the basics of the theory of belief functions including the problem of high computational cost encountered, especially in evidence combination. Section 3 provides a brief review of major existing works on BBA approximations. In Section 4, a new iterative BBA approximation approach is proposed based on the distance of evidence. Illustrative examples are provided. The tradeoff between tractability and the optimality is discussed and analyzed based on some experiment in Section 4. Section 5 further discuss on the performance evaluation for the BBA approximations, based on which, the experiments, simulations and related analyses are provided in Section 7. Section 7 concludes this paper.

Basics of The Theory of Belief Functions
The theory of belief function was proposed by Dempster and then further developed by Shafer [1], therefore, it is also called Dempster-Shafer theory (DST) or evidence theory, which is attractive in the fields related to uncertainty modeling and reasoning. The frame of discernment (FOD) Θ = {θ 1 , . . ., θ n } is a basic concept in DST, where the elements are mutually exclusive and exhaustive. A basic belief assignment (BBA) over an FOD Θ is defined as X If m(A)>0 holds, A is called a focal element. Based on a BBA, the corresponding belief function and plausibility function are defined respectively as is called the belief interval of A representing the degree of imprecision for the focal element A.
Dempster's rule of combination is for combining distinct pieces of evidence, which is defined as follows. 8A 2 2 Θ : where A i represents a focal element of m 1 ; B j represents a focal element of m 2 , and is called the coefficient of conflict, which represents the total degree of conflict between pieces of evidence. Many alternative rules [5] were proposed to redistribute the conflict. Besides the evidence combination, there are also other operations based on BBAs in DST such as marginalization, conditioning, etc [1].
To describe the degree of closeness between two pieces of evidence, the definitions on distance of evidence are required. Jousselme's distance d J [14] is one of the most commonly used one, which is defined as Here, Jac is Jaccard's weighting matrix and its elements Jac(A, B) for focal elements A and B are defined as Jousselme's distance has been proved to be a strict distance metric [15]. DST has been argued due to its limitations [5]. One of the limitations is the high computational cost encountered in all kinds of operations including the evidence combination, conditioning, marginalization, and belief and plausibility degrees calculation in DST [6], especially when large amount of focal elements are available. In the next subsection, we illustrate the high computational cost problem by the example of evidence combination using the classical Dempster's rule of combination.

Computational cost analysis for evidence combination
Suppose that an FOD |Θ| = n. A BBA m is defined on Θ. Therefore, there are at most 2 n − 1 focal elements for m. Here we analyze the computational cost for the combination between m and itself according to Dempster's rule in Eq (3) in terms of the multiplication operation included. There are at most (2 n − 1) × (2 n − 1) times of multiplication. Suppose that the actual number of focal elements (with non-zero mass assignments) for the BBA m is s. Then, there are s Á s = s 2 times of multiplication, because it is meaningless for those 2 n − 1 − s focal elements with zero mass assignments to attend the combination. Therefore, when there are too many focal elements, the computational cost will be high, which is harmful for the practical use of Dempster's rule of combination [6]. If we can reduce the value of s, that is, to reduce the number of focal elements, the computational cost could be reduced. Therefore, various BBA approximation approaches [7][8][9][10][11][12][13] were proposed, which approximate the original BBA with a simpler one (having less focal elements), thus to reduce the computational cost of evidence combination.
Note that some researchers design efficient algorithms for evidence combination. The representatives of this type of approaches include [16,17], and [18]. Besides the design of efficient combination algorithms, the BBA approximation is another way to reduce the computational cost, which is more intuitive for human to catch the meaning [19]. In this paper, we focus on the BBA approximations.

Brief Review of BBA Approximations
The available BBA approximation approaches can be categorized into two types from the viewpoint of the technical implementation, i.e., to preset the number of remaining focal elements or to preset the maximum allowed cardinality of the remaining focal elements.
They can also be categorized into three major types depending on the criterion used for removing focal elements as shown in Table 1.
Major available BBA approximations are briefly reviewed below.
Existing BBA Approximations: Type I The first type of BBA approximation approaches is according to the criterion of the focal element's mass assignment, where the focal elements with smaller mass assignments are deemed unimportant, which are first removed. In this type of approaches, there is usually a presetting of the number of remaining focal elements k. The removal of focal elements is continued until the number of remaining focal elements reaches k. k − l − x [7], Summarization [8] and D 1 [9] are representatives for this type. k − l − x method [7]. This classical approach was proposed by [7], which has three parameters. The approximated BBA is obtained by 1. keeping no less than k focal elements; 2. keeping no more than l focal elements; 3. deleting the masses which are no larger than x.
In k − l − x approach, all focal elements in the original BBA are sorted according to their mass assignments in a decreasing order. Then, the top p focal elements are selected so that k p l and so that the sum of mass assignments of these top p focal elements is no less than 1 − x. The removed mass assignments are then redistributed to those remaining focal elements by a classical normalization procedure. In fact, in k − l − x method, the focal elements with smaller mass assignments are regarded as "unimportant" ones and thus removed first. Summarization method [8]. This method also keeps focal elements having largest mass values as in k − l − x method. The mass assignments of removed focal elements are accumulated and assigned to their union set. Suppose that m is the original BBA and k is the desired number of remaining focal elements in the approximated BBA m S . Let M denotes the set of k − 1 focal elements having the largest mass assignments in m(Á). Then m S (Á) is obtained from m(Á) by Note that the number of remaining focal elements could be k D1 method [9]. Suppose that m is the original BBA and m S denotes the approximated BBA. The desired number of remaining focal elements is k. Let M be the set of k − 1 focal elements with the largest mass assignments in m and M − be the set which includes all the other focal elements of the original BBA m. D1 method is to keep all members of M as the focal elements of m S and to re-assign the mass assignments of the focal elements in M − among those focal elements in M according to the following procedure.
For a focal element A 2 M − , in M, find out all the supersets of A to construct a collection M A . If M A is non-empty, A's mass assignment is uniformly assigned among those focal elements having smallest cardinality in M A . When M A is empty, then construct M 0 A as Then, if M 0 A is non-empty, m(A) is assigned among those focal elements having smallest cardinality in M 0 A . The value assigned to a focal element B depends on the cardinality of B\A, i.e., | B\A|. The above procedure is iteratively executed until all m(A) have been re-assigned to those focal elements in collection M.
If M 0 A is empty, we have two possible cases: 1. If the univeral set Θ 2 M, the sum of mass assignments of those focal elements in M − will be added to Θ; 2. If Θ = 2 M, then let Θ be a focal element of m S (Á) and assign the sum of mass assignments of those focal elements in M − to m S (Θ).
Note that the number of remaining focal elements is k − 1, if Θ 2 M. See [9] for more details of D1 method.

Existing BBA Approximations: Type II
The second type of BBA approximation approaches are based on the criterion of the focal elements' cardinality. The focal elements with larger cardinalities, which might cause more computational cost, are first removed. k-additive approximation [10], hierarchical proportional redistribution approach [20] are representatives of the 2nd type, where the maximum allowed cardinality of the remaining focal element is preset. See related reference for details.

Existing BBA Approximations: Type III
The third type of BBA approximation approaches are based on the criterion of joint using the mass assignments and cardinality of focal elements to determine which focal element should be first removed. Denoeux's inner and outer approximations [11], and our previous works including the rank-level fusion based approximation [12], and the non-redundancy based BBA approximations [13] belong to this type. In these approaches, the number k of remaining focal elements is preset.
Denoeux's inner and outer approximations [11]. In Denoeux's inner and outer approximations, two different distances between focal elements (intersection-based δ \ and unionbased δ [ ) are used, which are defined as Such distances between focal elements consider both the information of the cardinality and mass assignment of the focal element. "Similar" focal elements defined based on δ \ are aggregated by the intersection operation to reduce the number of focal elements, which is called the inner approximation. "Similar" focal elements defined based on δ [ are aggregated by the union operation to reduce the number of focal elements, which is called the outer approximation. The focal elements are pairwise aggregated till the desired focal elements number is reached. Note that the focal element obtained using aggregation (no matter union or intersection) might be a focal element of the original BBA, therefore, the number of remaining focal elements cannot be estimated precisely. Furthermore, inner approximation might bring the empty set ; using intersection operation, which is not allowed in the classical "closed-world" DST.
See [11] for more details and illustrative examples. Rank-level fusion based approximation [12]. In rank-level fusion based approximations, two ranks of all focal elements in the original BBA are obtained based on each focal element's cardinality and mass assignment, respectively. Then two ranks are fused to generate a new rank, which can describe both the information of mass assignment and the cardinality. The focal elements with smaller rank position will be removed at first, which represents that they are with smaller mass assignments and with bigger cardinality size at the same time. The procedure is briefly introduced below.
1. Sort all the focal elements of an original BBA (with L focal elements) according to the mass assignment values (in ascending order which is due to the assumption that the focal element with small mass should be deleted as early as possible). The rank vector obtained is where r m (i) denotes the rank position of the ith focal elements (i = 1, . . ., L) in the original BBA according to the mass assignment.
2. Sort all the focal elements of the original BBA according to the cardinalities (in descending order, this is due to the assumption that the focal element with big cardinality should be deleted as early as possible). The rank vector can be obtained as r c ¼ ½r c ð1Þ; r c ð2Þ; :::; r c ðLÞ ð13Þ where r c (i) denotes the rank position of the ith focal elements in the original BBA according to the cardinality.
3. According to the rank-level fusion, we can obtain a fused rank as and α 2 [0, 1] is used to show the preference of two different criteria. Such a fused rank can be regarded as a more comprehensive criterion containing both the information of mass assignments and cardinality.
4. Sort f f in ascending order and find the focal element with the smallest r f value, i.e., r f ðjÞ ¼ min i rf ðiÞ. Then remove the jth focal element in the original BBA.
5. Repeat steps 1) -5) till only k focal elements remain. Do re-normalization of the remaining k focal elements. Finally, output the approximated BBA.
See [12] for more details and illustrative examples.
Non-redundancy based approximations [13]. In non-redundancy based approximations, the degrees of non-redundancy are defined based on the distances of focal elements as shown in Eq (11). First, calculate the distance matrix for all the focal elements of m(Á) as It should be noted that Hence, it is not necessary to calculate all the elements in Mat FE because the matrix is symmetric. We define the degree of non-redundancy of the focal element A i by As we can see that the definitions on degree of redundancy also use both the information of the cardinality and mass assignment of the focal element. The larger nRd(A i ) value, the larger nonredundancy (less redundancy) for A i . The less nRd(A i ) value, the less non-redundancy (larger redundancy) for A i . Those more redundant focal elements should be removed at first. See [13] for more details and illustrative examples. The categorization, pros and cons of the above BBA approximations are illustrated in Table 2 below.
All the available works referred above are rational and make sense to some extent. The approaches in the first and second types only emphasize one aspect of the information in a focal element. The approaches in the third type are more rational due to the joint use of the mass assignments and cardinalities. Although the available works in the third type are in better direction, the current joint use of mass assignments and cardinalities still have some limitations.
For example, in the rank-level based approximation [12], although it is simple, but it is at the price of loss of information due to rank-level fusion when compared with the data-level fusion. Furthermore, there exists the problem of parameter (the weights of mass assignment and cardinality) selection.
In Denoeux's inner and outer approximations [11], the strictness of the distances between focal elements are not seriously checked till now. Furthermore, when using inner approximation, the empty set ; can be generated as a focal element, which is not allowed in the classical DST based on the closed-world assumption.
Also in non-redundancy degree based approximations [13], the rationality and strictness of the definitions on non-redundancy for focal elements still need further inspections.
In summary, the current joint use of mass assignments and cardinalities for the BBA approximation is still an open problem. Therefore, in this paper, we propose another approximation according to the joint use of the mass assignment and the cardinality from a different angle, which is introduced in the next section.

A Novel BBA Approximation Using Distance of Evidence
To be more direct and to use more information, in this paper, we start from the performance evaluation of BBA approximations for designing new BBA approximation approaches. The basic idea is as follows.
In the available related literatures, the performance evaluation of BBA approximations often includes the evaluation in terms of the computational cost and the evaluation in terms of the information loss.
Intuitively, a better BBA approximation should output a BBA having less computational cost for the operations in DST, e.g., the evidence combination, and being more similar to the original BBA (less loss of information). The similarity can be described using the distance of evidence in Eq (5). So far as we remove some focal elements, the computational cost will be decreased more or less. Here, we focus on the closeness between the approximated BBA and the original one.
In all possible approximated BBAs, the one having smaller distance from the original one is preferred. If one want to obtain such a BBA denoted by m opt , one can pick the one minimizing the chosen distance with respect to the original BBA among all the possible BBAs having k focal elements according to ( However, such an optimal way is time-consuming and might cause the BBA approximation intractable, especially when the number of focal elements is very large (we provide a detailed analysis on this at the end of this section). Therefore, we try to make a trade-off between the optimality and the tractability by introducing an iterative implementation. We design an iterative BBA approximation approach based on minimizing the distance of evidence between the approximated BBA in current iteration and the BBA obtained in the previous iteration.
Given a BBA m with L focal elements A 1 , . . ., A L . Suppose that the desired number of remaining focal elements is k. The specific implementation is as follows.
Step 1: Remove a focal element A i from m. Normalize the mass of the remaining A j , where j 6 ¼ Step 5: Assign m = m s . If the desired number of remaining focal elements is not reached, i.e., L > K, then goto Step 1. If the desired number of remaining focal elements is reached, then output m as the final approximated BBA.
The whole procedure is illustrated in Fig 1. Here we provide an illustrative example to show how the BBA approximation using distance of evidence works.  Table 3.
Here the number k of the remaining focal elements is set to 3. Now, we apply the iteration BBA approximation based on distance of evidence to m.
In the first iteration, when we remove A 1 = {θ 1 , θ 2 }, the BBA obtained is as follows.   Here, we also provide the approximation results of the available BBA approximations referred in the previous section for comparisons.
Using k − l − x method [7]. Here k and l are set to 3.
x is set to 0.1. The focal elements A 4 = {θ 3 , θ 4 } and A 5 = {θ 4 , θ 5 } are removed without violating the constraints in k − l − x. The remaining total mass value is 1 − 0.05 − 0.05 = 0.9. Then, all the remaining focal elements' mass values are divided by 0.9 to accomplish the normalization. The approximated BBA m klx S ðÁÞ obtained by k − l − x method is listed in Table 4, where A 0 i , i = 1, 2, 3 are the focal elements of m klx S ðÁÞ. Using summarization method [8]. Here k is set to 3. According to the summarization method, the focal elements  Table 5 below. Using D1 method [9]. Here k is still 3. It can be obtained that A 1 , A 2 belong to M, and A 3 , A 4 , A 5 belong to M − . The focal element A 1 = {θ 1 , θ 2 } has empty intersection with the focal elements in M − , therefore its value will be unchanged. In M, A 2 is the unique superset of A 3 and A 4 , therefore, m(A 3 ) + m(A 4 ) = 0.10 + 0.05 is added to its original mass value. A 2 also covers half of A 5 , therefore, m(A 5 )/2 = 0.025 is further added to the mass of A 2 . Finally, the rest mass value is assigned to the total set Θ. The approximated BBA m D1 S is listed in Table 6. Using Rank-level fusion based approximation [12]. The number of remaining focal elements is 3. The approximate BBA is as shown in Table 7. It should be noted that although for Example 1, m Rank S ðÁÞ ¼ m klx S ðÁÞ, they are two different approaches. Using Denoeux inner and outer approximation [11]. With the inner approximation method, the focal elements pair with smallest Denouex's inner distance are removed, and then their intersection is set as the supplemented focal element whose mass value is the sum of the removed two focal elements' mass values. Such a procedure is repeated until the desired number of focal elements is reached. The results at each step are listed in Table 8.
As we can see in Table 8, it generates the empty set as a focal element, which is not allowed in the classical Dempster-Shafer evidence theory under close-world assumption.
With the outer approximation method, the focal elements pair with smallest Denouex's outer distance are removed, and then their union is set as the supplemented focal element whose mass value is the sum of the removed two focal elements' mass values. Such a procedure is repeated until the desired number of focal elements is reached. The results at each step are listed in Table 9.
Using the redundancy-based batch approximation method [13]. The desired remaining focal element is set to k = 3. We first calculate the distance matrix Mat FE , based on which, the degree of non-redundancy for each focal elements of m(Á) can be obtained. It is listed in Table 10.
Since A 3 and A 4 at the bottom have the two least nRd values, they correspond the two focal elements with the lowest non-redundancy, i.e., the highest redundancy. Therefore, they are Further discussion on the iterative approximation based on distance of evidence As aforementioned, the BBA obtained using the above iterative approximation is usually not the BBA which is closest to the original BBA given a desired number k of remaining focal elements, however it can significantly reduce the computational cost caused by the optimization, thus, is more tractable. It can also obtain relative smaller distance when compared with other BBA approximations as verified in the experiment below. If we are too greedy for obtaining the minimum distance, the price is the significantly increased computational cost in BBA approximation. Given the number of remaining focal elements k, there are totally C k 2 n À1 different BBAs. The approximated BBA with minimum distance to the original one should be selected out of the C k n different BBAs. Using our iterative approximation, there are totally BBAs. When the |Θ| = 5, the size of searching space for the one finding the minimum one and our iterative approximation are compared at different k as shown in Table 12.
The distance of evidence between the approximated BBA and the original one for the one finding the minimum one and our iterative approximation are compared at different k as shown in Fig 2. As shown in Fig 2 and Table 12, the approach to finding the minimum distance does provide the minimum loss of information, however, it is at the price of extremely large computational cost for the approximation, which makes the approximation intractable, especially when the |Θ| is very large. Our new proposed iterative approximation can make a good trade-off between the precision (with less loss of information) and the tractability (without so large computational cost).
To well justify our newly proposed BBA approximation, we give more detailed analyses on the evaluation of BBA approximations at first in the next section.

Evaluation criteria of BBA approximations
The evaluation criteria are very crucial for evaluate different BBA approximations, and also for design new BBA approximations. The available BBA approximation evaluation approaches are as follows.
1) Reduction of the computational cost after approximation After the BBA approximation, the computational operations like evidence combination, conditioning, marginalization, belief and plausibility degrees evaluation will be reduced. The BBA approximation with larger degree of reduction is preferred.
2) Closeness between the approximated BBA and the original one After the BBA approximation, the BBA obtained m 0 will depart from the original one. Larger degree of such a departure, i.e., less closeness between the approximated BBA m 0 and the original one m, means larger loss of information, which is not preferred. Such a degree of departure (or degree of closeness) can be described using the distance of evidence defined in Eq (5), i.e., d J (m, m 0 ).
3) Degree of ordering preservation between plausibilities of events In a BBA, each focal element corresponds to an event. Based on a given BBA m, we can calculate the corresponding plausibility function Pl according to Eq (2). Then, the ordering or ranking of the plausibilities Pl(A i ), A i 2 2 Θ of different focal elements (events) can be obtained, which is noted by Λ Pl . After the BBA approximation, we obtain the approximated BBA m 0 and can calculate its plausibilities Pl 0 (B i ), B i 2 2 Θ and its ordering denoted by Λ Pl 0 . We can calculate Iterative BBA Approximation Based on Distance of Evidence the distance between Λ Pl and Λ Pl 0 using Spearman's distance [21]: where smaller distance representing less loss of information or less distortion brought by the approximation is preferred.

4) Preservation of inclusion relation
Suppose that two BBAs m 1 and m 2 satisfy the inclusion relation (e.g., s-inclusion), which is defined as follows [22].
Suppose that m 1 's focal elements are {A 1 , . . ., A q } and m 2 's focal elements are {B 1 , . . ., B p }. If and only if there exists a non-negative matrix G = [g i,j ] such that for j = 1, . . ., p, P q i¼1 g ij ¼ 1; g ij > 0 ) A i B j , and for i = 1, . . ., q, P p j¼1 m 2 ðB j Þg i;j ¼ 1, where g ij is the proportion of B j that "flows down" to A i . That is, m 1 is s-included in m 2 (m 1 v s m 2 ) if the mass of any focal element B j of m 2 can be redistributed among subsets of B j in m 1 . This means that m 1 is less informative than m 2 .
After the approximation, if their approximated BBAs m 0 1 and m 0 2 still satisfy the inclusion relation, such a BBA approximation is preferred. This represents the informative relation between two BBAs is not affected by the approximation.
It should be noted that preservation of inclusion relation is a very strict relation. We have checked that all the BBA approximations introduced in this paper including our new approach cannot satisfy it.
Besides the above criteria, in this paper, we propose two new evaluation criteria for BBA approximations as follows.
New criterion I: Order preservation in terms of uncertainty degrees Given t different BBAs (according to Algorithm 1 [23] in Table 2): m 1 , . . ., m t and calculate their corresponding degree of uncertainty, e.g., the aggregated uncertainty (AU) [24,25] as defined below.
where the maximum is taken over all probability distributions being consistent with the given BBA. P m consists of all probability distributions hp θ |θ2Θi satisfying: For the t BBAs, their corresponding AU values are AU(m 1 ), . . ., AU(m t ). Then, sort AU values in an ascending order to obtain a ranking Λ o . Apply a BBA approximation approach S i to all the t BBAs, then t approximated BBAs can be obtained as m i 1 ; :::; m i t . Calculate and sort the AU values also in an ascending order to obtain a ranking Λ i . Calculate the distance between Λ o and Λ i .
If two rankings before and after the approximation S i are closer to each other, then S i is preferred for such an capability of order preservation, which represents a less loss of information or less distortion of the relation between BBAs.
Here, we can use the Spearman's distance [21] to measure the rankings' difference as follows.
New criterion II: Preservation of probabilistic decision After we apply the Pignisitc Probability Transformation (PPT) [26] to a BBA m according to the decision can be made by selecting the θ i with the maximum value in BetP(θ j ), j = 1, . . ., |Θ|.
If the probabilistic decision for the approximated BBA m 0 (using S i ) is the same as the probabilistic decision obtained based on the original BBA m, then the approximation S i is desired.
Such a criterion describes the probabilistic decision consistency before and after the BBA approximation.
One can use Monte Carlo method based on all the above criteria to evaluate BBA approximations. See details of implementation in the simulations in the next section.

Experiments and Simulations
In this section, we compare all the BBA approximation approaches aforementioned with the preset of remaining focal elements' number k including k − l − x (denoted by S 1 ), Summarization (S 2 ), D1 (S 3 ), rank-level fusion based approximation (S 4 ), the non-redundancy based approximation (S 5 ), and Denoeux's outer approximations (S 6 ) (we do not compare inner approximation method which might bring troubles for making the comparisons because Jousselme's distance cannot be computed if one allows to assign positive mass on empty set due to |;| = 0), with our newly proposed BBA approximation approach (S 7 ).
Simulation I-Comparisons in terms of computational time and the closeness. In Simulation I, we use the computational time of the evidence combination and the distance of evidence between the approximated BBA and the original one in average as performance measures. Our comparative analysis is based on a Monte Carlo simulation using M = 200 random runs. The size of the FOD is |Θ| = 6. In the jth simulation run, a BBA m j (Á) is randomly generated according to Algorithm 1 [23] in Table 13. In each random generation, there are 2 6 − 1 = 63 focal elements in the BBA m j (Á) to approximate. The number of remaining focal elements k for all the approaches used here are set to from 62 to 2. Then, different approximation  As we can see, when compared with the original computational time, the computation time of all the BBA approximation approaches can be reduced. This is intuitive, because the number Iterative BBA Approximation Based on Distance of Evidence of focal elements are reduced. At the same time, our newly proposed approximation usually has the smallest distance of evidence, i.e., the largest closeness, which represents the least loss of information when compared with other approximations concerned here.
Simulation II-Comparisons in terms of order-preservation for uncertainty degree. In Simulation II, we compare the information preservation capability of different BBA approximations from another aspect. Suppose that the size of FOD is 5.
2. Sort AU values in an ascending order to obtain a rank Λ o .
3. Apply a BBA approximation approach S i to all the five BBAs, then five approximated BBAs can be obtained as m i 1 ; :::; m i 5 . Iterative BBA Approximation Based on Distance of Evidence 4. Calculate and sort the AU values also in an ascending order to obtain a rank Λ i .

5.
Calculate the distance between Λ o and Λ i .
If two rankings before and after the approximation S i are closer to each other, then S i is preferred for such an capability of order preservation, which represents a less loss of information from another aspect.
The above simulation procedure is repeated 500 times (in each run, 5 BBAs are re-generated randomly), the different approximation S i 's averaged distances of ordering at different k are listed in Fig 5. The average distance of ordering over all k for different approximations are listed in Table 14.  Table 14 that our newly proposed approach usually has the smallest average distance between two orderings, which represents the high capability in orderpreservation, i.e., our new approach only causes the relatively small change to the order of the uncertainty degree. As aforementioned, this also represents the less loss of information and less distortion of relation caused by the approximation.
Simulation III-Comparisons in terms of probabilistic decision preservation. In this Simulation III, we compare all the BBA approximations in terms of the probabilistic decision preservation. It is hard to make the decision based on the original BBA and the decision based on the approximated one always have the same results, therefore, we calculate the percentage of the probabilistic preservation for S i as follows.
1. Randomly generate 1000 BBAs. One can also select 1000 BBAs out of the 10000 generated BBAs in the data set (S1 File).
2. Make probabilistic decision for the 1000 different original BBAs.
3. Apply the BBA approximation S i to the 1000 original BBAs.
4. Make probabilistic decision for the 1000 different approximated BBAs.
5. Count the number N i of cases where the probabilistic decision results for the original BBA and its corresponding approximated BBA using S i are the same.
The random generation of BBAs is according to Algorithm 1 in Table 13. A BBA approximation S i with higher Pd(i) is more preferred. In this simulation, the FOD Θ is with cardinality of 5. The simulation results (i.e., the percentage of probabilistic decision preservation for various BBA approximations with preset different remaining focal elements' number k) are shown in Fig 6. The average consistency rate over all k for different approximations are listed in Table 15.
As we can see in Fig 6 and Table 15 that the percentage of the probabilistic decision results preservation of the rank-level fusion based approximation and our newly proposed approach is usually highest (or the 2nd highest). It means that our new approach has higher possibility to keep the probabilistic decision results before and after the approximation unchanged given preset number of remaining focal elements, which represents the less loss of information from a different angle and the less of distortion of the relation caused by the approximation.
Simulation IV-Comparisons in terms of plausibility preservation. In this simulation IV, we compare all the BBA approximations in terms of the probabilistic decision preservation.
1. Randomly generate 1000 BBAs according to Algorithm 1. 2. For each BBA m j , j = 1, . . ., 1000, calculate its corresponding plausibilities and generate the ordering L i Pl . 3. After applying a BBA approximation S i , calculate the corresponding plausibilities and generate the ordering L j Pl 0 .
4. Calculate the distance between L j Pl and L j Pl 0 denoted by d j i , j = 1, . . ., 1000.
5. Output the average distance d i ¼ P 1000 j¼1 d j i =1000. The S i with smaller d i value is more preferred.
The average distance of ordering over all k for different approximations are listed in Table 16. the different approximation S i 's averaged distances of ordering at different k are listed in Fig 7. As we can see in the Fig 7 and Table 16 that the rank-level fusion based approximation and our newly proposed approach usually has the smallest average distance between two orderings, which represents the high capability in order-preservation for the plausibilities, i.e., our new approach only causes the relatively small change to the order of the plausibilities of events. It represents the less distortion caused by the BBA approximation.

Conclusions
A novel iterative BBA approximation approach based on the distance of evidence and two new evaluation approaches for BBA approximations are proposed in this paper. The new approximation can effectively simplify the BBA, and is with less loss of information. It also makes a good balance between the precision and the tractability of the approximation. Two new performance evaluation approaches for BBA approximations are related to the uncertainty orderpreservation and the consistency of probabilistic decision, respectively. Simulations and experiments are provided to illustrate and support our new BBA approximation and related evaluation approaches.
In this paper, we use the distance of evidence (closeness) to measure the loss of information in the BBA approximation. The closeness between two BBAs has strong correlation with the difference of information in two BBAs, thus, the closeness can be used to represent the loss of information. However, to be more strict, they are two different concepts. In our future work, we will try to directly use the difference between the degrees of uncertainty for BBAs (not the difference between the orders of uncertainty degree as we proposed in the uncertainty preservation) to represent the loss of information. The problem is how to design or select comprehensive uncertainty measures for BBAs, because the current total uncertainty measures in evidence theory including AU used in this paper and the ambiguity measure (AM) [27] are designed by generalizing the uncertainty measures in the probabilistic framework. They all have limitations [27,28] and cannot always comprehensively describe the uncertainty in a BBA. Furthermore, although we propose two evaluation approaches for BBA approximations, how to evaluate the BBA approximation is still an opening and challenging problem. More solid, especially quantitative performance evaluation approaches are our research focus in future. Iterative BBA Approximation Based on Distance of Evidence