Visual tracking in high-dimensional particle filter

In this paper, we propose a novel object tracking algorithm that uses a high-dimensional particle filter with combined features. Firstly, refined two-dimensional principal component analysis (2DPCA) and the tendency are combined to represent an object. Secondly, we present a framework based on a high-order Monte Carlo Markov chain, which considers more information and is more discriminative and efficient on moving objects than traditional first-order particle filtering. Finally, an advanced sequential importance resampling scheme is applied to estimate the posterior density and obtain high-quality particles. To further improve the samples, K-means clustering is used to select more typical particles, which reduces the computational cost. Both qualitative and quantitative evaluations on challenging image sequences demonstrate that the proposed algorithm outperforms state-of-the-art methods.


Introduction
Visual object tracking is a fundamental research topic in computer vision and plays a critical role in applications such as human-computer interaction, driverless cars, surveillance, and human motion analysis. Although considerable progress has been made after several decades of research, it remains very challenging to develop a tracker that handles all scenarios, such as partial occlusion, illumination change, fast motion, camera motion, background clutter and viewpoint change [1,2]. Current visual tracking algorithms are classified as either generative or discriminative models. Both require filters to obtain the object's candidates. The particle filter [3], also known as the sequential Monte Carlo filter, has been studied actively for object tracking because it can cope with difficult nonlinear and non-Gaussian dynamic problems. Particle selection is of prime importance in the particle filter, since it directly affects the tracking results. Theoretically, a large number of high-quality particles can achieve the optimal approximation of the probability distribution [4]. However, a large number of particles means high computation costs, and the way to obtain high-quality particles is often complicated. Thus the tradeoff between speed and discriminative ability becomes a challenging problem. The computation in particle tracking depends on the complexity of feature extraction and the number of sampled particles. Two kinds of methods are often used to reduce the computation costs. One is to create simpler but more effective features. For example, Perez introduced multi-part color modeling and a background color model to handle color clutter in the background and complete occlusion of the tracked entities over a few frames [5].
Han presented an on-line appearance modeling technique based on sequential density approximation, which adapts to variations of lighting condition, pose, scale and viewpoint over time [6]. Wang represented an object using 2DPCA, which combined 2D basis matrices with an additional sparse error matrix [7]. Kong proposed a feature selection method that chose low-dimensional but more discriminative features [8]. The alternative is to reduce the quantity of samples. Ridge regression was employed in [9] to decrease the computational costs, since it could exclude the outlying particles; the product sparse coding used there also kept the calculation low. Li reduced the number of particles considerably through an effective sampling strategy [10]. A joint tracking algorithm combined mean shift and the particle filter with model updating, which sampled fewer particles and performed better under similar color appearance and cluttered backgrounds [11]. Shan improved the sampling efficiency using mean shift optimization, and the resulting real-time hand tracker ran fast because it used only color and motion cues [12]. Motivated by the aforementioned discussions, this paper proposes a supplementary-knowledge-based high-order particle filtering tracking algorithm. The contributions of this work are threefold: (1) We represent the tracked object using a combined feature that includes an improved 2DPCA and the supplementary information. 2DPCA is a simple but effective feature that can achieve performance comparable to PCA with less computation. The supplementary information, such as tendency, enriches the representation of the object. (2) The high-order particle filter is used to increase the algorithm's accuracy, because more information can be considered and a more accurate and reliable model of the moving object can be supported.
The traditional first-order Markov model is sensitive to the loss of particle information from the previous time instant. For this reason, second-order object motion is widely used in tracking with Bayesian networks. However, it still cannot fully characterize the dynamics of moving objects. (3) Moreover, k-means clustering, a simple but valid algorithm, is adopted to select the sample particles with highly probable appearance, which further reduces the number of samples and thus the computation costs.
The rest of the paper is organized as follows. In Section 2, we introduce our high-order particle filtering with combined features. Then, we present the summary of our proposed tracking algorithm in Section 3. Section 4 explains experimental results and analysis on tracking. Finally, Section 5 gives the conclusions.

High-order particle filtering with combined features
We begin with a concise review of particle filtering and then introduce our high-order particle filter tracking framework.

Review of particle filtering
The particle filter is a filtering method that has been shown to offer improved performance in non-linear or non-Gaussian environments. The traditional particle filter is derived from a first-order Markov chain. Fig 1 shows the basic network structure of the particle filter (PF). The state parameter vector of a target and its observation at time $t$ are denoted as $x_t$ and $z_t$, respectively. The history of observations from time 1 to $t$ is denoted as $Z_t = \{z_1, \cdots, z_t\}$. The state-space model is a first-order Markov chain, so the current state $x_t$ depends only on the previous state $x_{t-1}$. However, particle filters using the first-order Markov model cannot track fast moving objects efficiently. The situation is worse if the object is lost at the previous time instant. Therefore, a high-order Markov chain, referred to as an m-th-order Markov chain, is required to model moving objects with high-order dynamics. In contrast to the state-space model of first-order particle filters, the current state $x_t$ of an m-th-order Markov chain depends on the past $m$ states, and we have

$$P(x_t \mid x_{1:t-1}) = P(x_t \mid x_{t-m:t-1}).$$

Probability propagation of high-order particle filtering
In particle filtering, the tracking uses only sequential state probability propagation information. The information derived from the states and the objects' prior is also useful for guiding the sampling strategy and the tracking. In this work, we assume that a target's appearance is modeled by a subspace model and that the state space is augmented with the corresponding supplementary information, so that the state $x_t = (c_t, s_t)$ consists of two parts: the state $c_t$, which models the estimated position, and the supplementary information $s_t$, which contains the moving tendency and similar cues. The posterior distribution is now described as $P(c_t, s_t \mid Z_t)$. Unlike the probability $P(x_t \mid Z_t, A)$ in Appearance-Guided Particle Filtering [13], in which the tracking employs a further prior on object appearance, we use the assist knowledge derived between sequential states. Fig 2 shows the Bayesian network structure of our first-order framework, and the network becomes: The posterior over the current state is influenced by integrating the target state and the assist knowledge at the previous time step. Once we integrate out the assist part and approximate the filter using a hybrid Monte Carlo method, the Bayes filter reduces to the Rao-Blackwellized particle filter (RBPF) [14], and the filter would be: In a Rao-Blackwellized particle filter, it is necessary to assume that the motion model for the state does not depend on the assist information, and the marginal Bayes filter is obtained as follows: The approximation to this Bayesian network model needs the assumption that the motion model for the location at time $t$ does not depend on the knowledge from the previous time step, but it uses the same importance sampling scheme as the particle filter.
However, if the assist knowledge is assumed to be independent of the state at the previous time step and to influence only the Bayesian bootstrap or the results of sampling importance resampling, the aforementioned model degenerates to a BN structure for particle filtering. Hence, we investigate the probability $P(x_t \mid Z_t)$ as follows: where a first-order Markov chain of the states is considered. In order to better track fast moving objects, a high-order Markov model is adopted, and its posterior density function $P(C_t \mid Z_t)$ is shown in formula 6.

nodes denote the states of the object and the observations, respectively.

Weight updating of the high-order particle filtering
As aforementioned, the basic Sequential Importance Resampling (SIR) algorithm starts from a random measure with equal weight on each of the $N$ sample values and samples $N$ times independently from the set, with the corresponding probabilities, to obtain an equally weighted random measure. Exploiting the SIR method, the posterior density of the high-order particle filter can be estimated as: The weight update equation is given by: where $P(z_t \mid c_t^i)$, $P(c_t^i \mid c_{t-m:t-1}^i)$ and $Q(c_t^i \mid c_{t-m:t-1}^i, z_t)$ are the likelihood, the transition probability and the importance density, respectively. Hence, the posterior filtered density $P(c_{t-m+1:t}^i \mid Z_t)$ can be shown as: For more details of the derivation, we refer readers to [1]. However, this SIR filter is vulnerable to sample impoverishment, so the particle distribution may not give an accurate approximation of the required PDF. Researchers usually use large numbers of particles in realistic applications, which requires more computation. To reduce the computational complexity of particle filters, an unequal importance weight measure before resampling is employed to refine the SIR strategy. The importance density $Q(c_t^i \mid c_{t-m:t-1}^i, z_t)$ is then changed to $\omega_{t-1}^i Q(c_t^i \mid c_{t-m:t-1}^i, z_t)$. The obvious distinction between the refined density and the original density is the added factor $\omega_{t-1}^i$. Weighted sample points carry more information than an equal number of unweighted points. We apply $\omega_{t-1}^i$ to our high-order particle filter because of its superiority to the SIR filter both in combating sample impoverishment and in computational cost. Readers are referred to [15] for a more comprehensive understanding of weighted sample points. Combining the preceding ideas, the proposed high-order particle filter algorithm is presented in Algorithm 1.
End for t

Transition probability $P(c_t^i \mid c_{t-m:t-1}^i)$ and importance density $Q(c_t^i \mid c_{t-m:t-1}^i, z_t)$
Compared with the object dynamics of the traditional first-order particle filter, given as $x_t = a x_{t-1} + b v_t$, the dynamics of our tracked object are assumed as where $v_k$ is modeled as a Gaussian distribution $\mathcal{N}(0, S_1)$ and $S_1$ is a diagonal covariance matrix. The coefficients are calculated by maximum-likelihood estimation. The transition probability is given as follows: where $N_c(\mu, S) = \frac{1}{2\pi|S|}\exp\left(-\frac{1}{2}(c - \mu)^T S^{-1}(c - \mu)\right)$. After the coefficients are obtained, the new samples can be generated directly from the previous samples, and the notation is as follows: Note that the value of $H$ is the variance of the importance density function. Similar to the transition probability, the importance density could be given as:
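To make the high-order dynamics and the weight update of this section more concrete, the following Python sketch may help. It rests on assumptions that are not stated in the text: a linear m-th-order autoregressive model $c_t = \sum_k a_k c_{t-k} + v_t$ with hypothetical coefficients `coeffs`, and the standard importance-weight form built from the likelihood, transition probability and importance density named above. The function names are illustrative only.

```python
import numpy as np


def propagate(history, coeffs, noise_std, rng):
    """Draw c_t for every particle from an assumed m-th-order linear model.

    history:   array (M, m, d), last m states of each particle, oldest first.
    coeffs:    array (m,), hypothetical coefficients; coeffs[0] multiplies c_{t-1}.
    noise_std: array (d,), square roots of the diagonal of S_1.
    """
    # Reverse the time axis so that coeffs[0] pairs with the newest state.
    newest_first = history[:, ::-1, :]
    mean = np.einsum("k,ikd->id", coeffs, newest_first)
    # Additive zero-mean Gaussian noise with diagonal covariance.
    noise = rng.normal(0.0, noise_std, size=mean.shape)
    return mean + noise


def update_weights(prev_weights, likelihood, transition, proposal):
    """Assumed standard update: w_t ~ w_{t-1} * P(z|c) * P(c|past) / Q(c|past, z)."""
    w = prev_weights * likelihood * transition / proposal
    return w / w.sum()  # normalise so the weights sum to one
```

With `coeffs = [2, -1]` the model reduces to a constant-velocity predictor, which is one simple second-order choice. When the importance density equals the transition probability, the update collapses to multiplying the previous weight by the likelihood.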

"Extra Step": Analytical update
We have laid the theoretical foundation for the high-order particle filter and derived the probability propagation related to the BN shown in Fig 3. However, the supplementary information has not yet been used, although this valuable information can provide more evidence for generating more precise particles. How best to apply the supplementary information is an open question. It is widely accepted that a large number of reasonably distributed particles can improve tracker performance. Chang explored a mixture distribution that generated two sets of samples, from $P(x_t \mid x_{t-1})$ and $P(x_t \mid A)$, respectively [14]. This method combines particles carrying dynamic-model-driven information with particles carrying appearance information, and it is effective for visual tracking problems such as articulated hand tracking and lip-contour tracking. The combined method uses more particles than the dynamic model alone, which implies a high computational cost. However, our original intention is to reduce the computation cost of particle filters, which means fewer particles. Huang used ridge regression to delete the outlying particles and thus obtain fewer particles [9]. Combining the two methods above, we use $P(c_t, s_t \mid Z_t)$ to reselect more suitable particles. As shown above, in order to simplify the model, $M$ particles have been generated using the advanced SIR algorithm, and they cover only the information of the state $c_{k-m+1:k}^i$. Moreover, the supplementary information $s_t$ is an important resource for resampling the particles. The mixture distribution of $P(c_t, s_t \mid Z_t)$ is set according to the following equation: Note that the probabilities $P(c_t \mid Z_t)$ and $P(s_t \mid Z_t)$ arise from the state transition and the supplementary information transition, respectively. When $\alpha$ is set to one, it degenerates to the original particle filtering, in which only the dynamical information is used.
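One form consistent with the remark that $\alpha = 1$ recovers the original particle filter is a convex combination of the two terms. The expression below is an assumed reading offered for orientation, not necessarily the exact equation intended here:

```latex
P(c_t, s_t \mid Z_t) \approx \alpha \, P(c_t \mid Z_t) + (1 - \alpha) \, P(s_t \mid Z_t),
\qquad 0 \le \alpha \le 1
```

Under this reading, $\alpha$ controls how much of the sample set is driven by the state transition and how much by the supplementary information.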
From Algorithm 1, we have $M$ particles generated using the state transition. In the same way, $M_s$ particles are generated based on $P(s_t \mid Z_t)$. Here we use the direction of motion between frames as the object's supplementary information. The two sets of particles are then combined into a complete sample set that is larger than either of them. Since the number of particles is large, a method is needed to reduce the computational cost in line with our original intention. K-means clustering, which is simple but useful, is applied in our method to reduce the number of sample particles. K-means clustering is popular for cluster analysis and aims to partition $n$ observations into $k$ clusters in which each observation belongs to the cluster with the nearest mean. In mathematical terms, the objective of K-means clustering is to find

$$\arg\min_{S} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2 .$$

Note that each observation in the set $(x_1, x_2, \cdots, x_n)$ is a d-dimensional real vector, the observations are partitioned into $k$ ($\le n$) sets $S = (S_1, S_2, \cdots, S_k)$, and $\mu_i$ is the mean of the points in $S_i$. One of the most popular heuristics for solving the k-means problem is the generalized Lloyd's algorithm. To implement Lloyd's algorithm simply and efficiently, the filtering algorithm, which stores the multidimensional data points in a kd-tree, is applied. A kd-tree represents a hierarchical subdivision of the point set using axis-aligned splitting hyperplanes. Given $n$ observations, this produces a tree with $O(n)$ nodes and $O(\log n)$ depth. After the filtering algorithm is run, the observation set is partitioned into $k$ clusters, each with its cluster center. For each cluster, we select the half of the observations that are nearest to the center. Besides selecting samples by their distance from the center, random sampling is another practicable way. A mixed scheme that combines distance-based and random selection is also possible. Here, the points near the center are selected for the following steps because we do not want to cut the number of sample particles too aggressively. Therefore, the number of particles is reduced to half of the original.
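The particle reduction step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses plain Lloyd iterations with NumPy rather than the kd-tree filtering algorithm, and the function names, the fixed iteration count and the rounding rule for odd-sized clusters are choices made for the example.

```python
import numpy as np


def lloyd_kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm (a stand-in for the kd-tree filtering variant)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=k, replace=False)
    centers = points[start].astype(float)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; keep empty clusters fixed.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


def keep_nearest_half(points, labels, centers):
    """Return indices of the half of each cluster that lies closest to its center."""
    keep = []
    for j in range(len(centers)):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue
        d = np.linalg.norm(points[idx] - centers[j], axis=1)
        n_keep = (idx.size + 1) // 2  # round up so small clusters keep one point
        keep.extend(idx[np.argsort(d)[:n_keep]])
    return np.sort(np.array(keep, dtype=int))
```

Calling `keep_nearest_half(points, *lloyd_kmeans(points, k))` on the combined particle set returns roughly half of the original indices. Swapping the `argsort` selection for a random draw would give the random variant mentioned in the text.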

Proposed tracking algorithm summary
This tracking algorithm is built within the framework of the high-order particle filter, which can be viewed as a Bayesian inference task in a Markov model with hidden state variables. In this paper, feature extraction is not the most important part of the whole tracking system. Principal component analysis (PCA) is a classical dimension reduction method that has been used to extract features in many areas. Its extension, two-dimensional principal component analysis (2DPCA), attracts more attention because of its better performance and lower computational cost. Motivated by the advantages of 2DPCA, we represent an object using 2D matrices. For a series of image matrices $z = (z_1, z_2, \cdots, z_k)$, an orthogonal left-projection matrix $U \in \mathbb{R}^{d_l \times k_l}$, an orthogonal right-projection matrix $V \in \mathbb{R}^{d_r \times k_r}$, and projection coefficients $Co = (Co_1, Co_2, \cdots, Co_k)$ are obtained by solving the objective function: The optimizations of the orthogonal left-projection matrix $U$ and the orthogonal right-projection matrix $V$ are computed according to the algorithms in [16] and [17], respectively. Then $Co_i$ can be approximated using $U^T z_i V$. After the projection coefficient is calculated, the likelihood can be obtained from the reconstruction error: We summarize the algorithm in Algorithm 2.
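As a rough illustration of how a reconstruction-error likelihood could be computed from the projection matrices, consider the Python sketch below. It assumes that $U$ and $V$ have orthonormal columns, that the error is the squared Frobenius norm of the residual, and that the likelihood is the exponential of the negative scaled error. The `scale` parameter and the function name are hypothetical and do not come from the text.

```python
import numpy as np


def reconstruction_likelihood(z, U, V, scale=1.0):
    """Score one candidate patch by its 2DPCA reconstruction error.

    z: (d_l, d_r) image patch
    U: (d_l, k_l) left-projection matrix with orthonormal columns
    V: (d_r, k_r) right-projection matrix with orthonormal columns
    """
    coeff = U.T @ z @ V      # projection coefficient, Co = U^T z V
    recon = U @ coeff @ V.T  # back-projection into image space
    err = np.linalg.norm(z - recon, ord="fro") ** 2
    # Assumed likelihood form: smaller error gives a value closer to 1.
    return np.exp(-scale * err), err
```

A candidate that lies entirely in the learned subspace has zero error and a likelihood of one, and the candidate with the highest likelihood would be chosen as the current state.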

Algorithm 2. Proposed Tracking Algorithm
Initialize: an observation matrix $Z$, left- and right-projection matrices $U$ and $V$.
Given a random measure with $M$ support points $\{c_0^i, \omega_0^i\}_{i=1}^{M}$, the following steps are performed to construct a new set of samples.
1. Select $M$ particles using the proposed advanced high-order particle filtering algorithm.
2. Select $N$ particles using the supplementary information.
3. The distribution of the $M+N$ particles follows $P(c_t, s_t \mid Z_t)$.
4. Use k-means clustering to select the more suitable particles.
5. Measure each particle using $P(z^i \mid c^i) = \exp(-\ldots)$.
6. Choose the best candidate as the current state.

Experiments and analysis
The proposed algorithm is implemented in Matlab (R2013a) on a personal computer with an Intel(R) Core(TM) i5-4300U 1.90 GHz CPU and 4 GB RAM. The object is initialized manually, and the proposed method runs at about 10 frames/s. To evaluate the performance of the proposed tracking framework, 13 image sequences are tested in the experiments. These sequences are publicly available at http://www.visual-tracking.net. The videos cover most of the challenging conditions in visual tracking: scale variation, occlusion, rotation, motion blur, background clutter, illumination variation and fast motion.

Ablation study
In order to demonstrate the feasibility of the proposed high-dimensional method, we summarize in Table 1 the performance of our proposed tracker and of variants with individual components removed. The 2DPCA tracker is the base tracker, and its MATLAB source code can be downloaded from http://ice.dlut.edu.cn/lu/publications.html. The tracker with combined features is a variant of the 2DPCA tracker in which the 2DPCA feature is replaced by the combined features, which comprise 2DPCA and tendency. The third tracker, named 2DPCA_HDPF, is built by replacing the l1 regularization of the 2DPCA tracker with the high-order particle filter. The combined features_HDPF tracker is obtained by removing the k-means clustering from the proposed tracker, so it can also be written as the ours_without cluster tracker. From the comparison of the 5 trackers, we conclude that the proposed tracker performs best and that every part of the proposed tracker helps to improve the performance. Note that ours performs better than ours_without cluster by only about 0.4, because the only difference between the two trackers is the k-means clustering step. The proposed algorithm also runs faster than the combined features_HDPF tracker.
To further illustrate the efficiency of the proposed method, we examine tracking accuracy versus the number of particles. Three trackers are compared: the 2DPCA tracker, the combined features_HDPF tracker and ours. Fig 4 shows the mean of the average center error for different numbers of sampled particles for each tracker; the general trend is a decrease as the number of particles increases. However, the mean center error stops dropping and remains almost constant once the number of particles exceeds 800. The proposed method achieves acceptable performance with 400 particles. In this paper, we therefore set the number of particles to 400, which balances computation and performance.

Comparison with state-of-the-art trackers
We evaluate our tracker against 13 state-of-the-art algorithms: IVT tracking [18], MIL tracking [19], DFT tracking [20], L1APG tracking [21], ASLA tracking [22], DLT tracking [23], SCM tracking [24], 2DPCA tracking [15], SCT tracking [8], TLD tracking [25], VTD tracking [26], Struck tracking [27] and SPC tracking [28]. The source code for all the evaluated trackers can be downloaded from the internet. For fair comparison, all the evaluated trackers are initialized with the same parameters. To assess the performance of the proposed tracker, we conduct quantitative comparisons between the proposed method and the other algorithms using the PASCAL VOC criterion score [29]. Table 2 shows that the proposed method achieves excellent tracking results in most sequences in terms of both the average and the standard deviation of the center error. The proposed algorithm improves on the second-best tracker by about 5.8 and 6.7 pixels. In addition, Table 3 shows the success rate of our proposed tracker and the other approaches on the 13 sequences. The proposed tracker shows the best or second-best performance in almost all the sequences and obtains a mean success rate of 90.19%. However, our method runs slower than SCT. Although ours is not both the most accurate and the fastest tracker, it performs best at an acceptable speed.
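The two evaluation measures used above, center error and an overlap-based success rate, could be computed along the following lines. This Python sketch assumes bounding boxes in `(x, y, width, height)` form with `(x, y)` the top-left corner, and it assumes the common convention that a frame counts as a success when the overlap ratio exceeds 0.5. Neither detail is stated in the text, and the function names are illustrative.

```python
import numpy as np


def center_error(box_a, box_b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ca = np.array([box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0])
    cb = np.array([box_b[0] + box_b[2] / 2.0, box_b[1] + box_b[3] / 2.0])
    return float(np.linalg.norm(ca - cb))


def overlap_ratio(box_a, box_b):
    """Intersection area divided by union area of two boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    y2 = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def success_rate(tracked, truth, threshold=0.5):
    """Fraction of frames whose overlap ratio exceeds the assumed threshold."""
    hits = [overlap_ratio(t, g) > threshold for t, g in zip(tracked, truth)]
    return sum(hits) / len(hits)
```

Averaging `center_error` over the frames of a sequence gives a per-sequence center error, and `success_rate` gives the corresponding success figure for that sequence.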

Conclusions
In this paper, we propose a high-order particle filter, together with methods to reduce the number of particles, for robust visual tracking. We represent the tracked object by 2DPCA and the tendency of the object, and we model tracking within the framework of the high-order particle filter. To reduce the computational cost, K-means clustering is used to select the more suitable particles. The reconstruction error is then used to select the best candidate. Experiments on challenging video clips demonstrate the robustness of the proposed algorithm.