Feature Selection Tracking Algorithm Based on Sparse Representation

To enhance the robustness of visual tracking in complex environments, a novel visual tracking algorithm based on multifeature selection and sparse representation is proposed. In the particle filter framework, particles with low similarity to the target are first filtered out by a fast algorithm; then, based on the principle of sparsely reconstructing the sample labels, only the features that discriminate strongly between the target and the background are involved in the computation, which reduces the disturbance of occlusion and noise. Finally, candidate targets are linearly reconstructed via sparse representation, and the sparse equation is solved with the Accelerated Proximal Gradient (APG) method to obtain the state of the target. Four comparative experiments demonstrate that the proposed algorithm effectively improves the robustness of target tracking.


Introduction
Visual target tracking has attracted increasing attention from researchers because of its wide application in fields such as robotic visual control, human-machine interaction, intelligent driving assistance, and video surveillance. However, in complex environments, changes in illumination and in the appearance of the target object, object occlusion, and noise disturbance make the design of an accurate, real-time visual tracker a challenging problem [1].
In recent years, an appearance modeling technique called sparse representation has been widely used for information compression and pattern recognition. Agarwal and Roth [2] achieved good results in object recognition via sparse representation. Studies [3][4][5] demonstrate that a high recognition rate can be obtained in face recognition via sparse representation. Mei and Ling [6,7] first introduced sparse representation theory into visual tracking and developed a sparse representation tracking algorithm based on the particle filter. The algorithm uses a template dictionary to linearly reconstruct the candidate targets and imposes sparsity constraints on the reconstruction coefficients. Apart from the target templates, the algorithm uses trivial templates to construct the template dictionary, which gives good robustness to occlusion; however, it requires a large amount of calculation because it adopts the LASSO method to solve the sparse equation. Bai and Li [8] developed a structural sparse representation model, which divides the sample into blocks and uses the Block Orthogonal Matching Pursuit (BOMP) algorithm to solve the sparse equation. It effectively improves the calculation speed, but the algorithm is not robust to illumination change. Hou et al. [9] proposed a block sparse representation tracking algorithm in which each block is given a corresponding weight to improve the robustness to occlusion, but drift may occur under large noise. A target tracking algorithm based on structural sparse representation is proposed in [10]; it constructs a vector pool of sparse representation coefficients by dividing the target sample into blocks and identifies the target state through the similarity information in the vector pool. However, this algorithm fails to use the residual error information effectively, so its robustness needs further improvement. Based on this, Hou et al. [11] developed a ranking-based sparse representation tracking algorithm. The algorithm gives full consideration to both the sparse representation coefficients and the residual error information while locating the target, which improves the robustness of the tracker. In the framework of structural sparse representation, the sparse coefficients of the candidate targets are classified in [12] by training a Naive Bayes classifier, which strengthens the algorithm's ability to differentiate between the target and its background. However, the algorithm does not extract discriminative features, so its robustness needs further improvement.

Wang et al. [13] propose a least soft-threshold squares tracking algorithm based on sparse representation, which models the reconstruction residual with a Gaussian-Laplacian distribution and finds the optimal solution of the objective equation by iterative soft-thresholding. Bao et al. [14] impose constraints on the target template coefficients and the trivial template coefficients with the $\ell_1$-norm and the $\ell_2$-norm, respectively, and adopt the Accelerated Proximal Gradient (APG) method to solve the objective equation with sparsity constraints. As a result, the accuracy of the tracker and the speed of calculation are significantly improved. Zhuang et al. [15] propose a multitask concept similar to [14]; the algorithm also uses the APG method to solve the objective equation iteratively, but it requires a fair amount of calculation. In the framework of sparse representation, Lan et al. [16] propose an objective equation based on adaptive multifeature selection and solve it with the APG method. The multifeature selection method effectively improves the robustness of the tracking algorithm, but it requires a large amount of calculation.
To improve the robustness of visual tracking, this paper proposes a novel tracking algorithm based on multifeature fusion and sparse representation. Using only the image intensity, the proposed algorithm first applies a simple algorithm to filter out the candidate targets that are largely dissimilar to the target. Then, the discriminative features are selected from the multiple features by sparsely reconstructing the sample labels. Finally, the algorithm uses the $\ell_2$-norm to linearly reconstruct the candidate targets and obtains the state of the target.

Overview of the Tracking Algorithm
The principle of the proposed tracking algorithm based on multifeature fusion and sparse representation is shown in Figure 1. Compared with other tracking algorithms, it makes the following contributions:
(1) This paper proposes a fast algorithm that quickly filters out the particles with low similarity to the target template, improving the calculation speed of the algorithm.
(2) Based on sample features constructed from multiple features, the proposed algorithm extracts features by sparsely reconstructing the labels of the positive and negative samples, which makes the extracted features more discriminative.
(3) Conventional tracking algorithms based on sparse representation use the $\ell_1$-norm to impose constraints on the linear reconstruction function, which requires a large amount of calculation, whereas the proposed algorithm uses the APG method to solve the sparse function with nonnegative constraints, greatly improving the calculation speed of the algorithm.

Fast Particle Filter Algorithm
In the particle filter framework, most particles bear little resemblance to the target, so they can be filtered out by a simple algorithm. Based on this idea, a fast algorithm is proposed to filter out the particles with low similarity to the target.
Let $Y = [y_1, y_2, \ldots, y_n]$ denote the sample set of the candidate targets and $T \in \mathbb{R}^{d}$ denote the target template, where $y_i \in \mathbb{R}^{d}$ is one of the candidate target samples, $d$ is the dimension of the sample, and $n$ is the number of particles. To reduce the amount of calculation, only the gray image information of the sample is involved in the calculation. Normalizing the candidate target $y_i$ and the target template $T$ gives $y'_i = y_i / \sum_{j=1}^{d} y_{i,j}$ and $T' = T / \sum_{j=1}^{d} T_j$, where $y_{i,j}$ is the $j$th feature of $y_i$, $T_j$ is the $j$th feature of $T$, and $u_i$ denotes the similarity measure between $y_i$ and $T$. If $y_i$ bears much resemblance to $T$, there will be no remarkable difference among the values of $u_i$. So we can filter out the particles according to the fluctuation of the values of $u_i$, measured by $v_i$ in (1), where $u_{i,j}$ is the $j$th element of $u_i$ and $\mathrm{mean}(u_i)$ is the mean of all elements of $u_i$. The value of $v_i$ reflects the fluctuation of the values of $u_i$.

If the target is disturbed by occlusion or noise, the real target may be filtered out when the selection is too strict. To solve this problem, only the 50% of the elements with the smaller variance values are involved in the calculation: sort the elements in descending order of their values and eliminate the top half to obtain the vector $\tilde{u}_i$. Let $v_i = \sum_{j=1}^{d/2} \tilde{u}_{i,j}$, where $\tilde{u}_{i,j}$ is the $j$th element of $\tilde{u}_i$; the particles are then filtered according to the value of $v_i$. Based on experiments, we retain the $n_0 = n/3$ particles with the smallest values of $v_i$ for the later operations. In this way, the two-thirds of the particles with low similarity to the object are eliminated quickly, which effectively improves the calculation speed of the tracking algorithm.
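The prefiltering step can be illustrated with a minimal Python sketch. Two points are assumptions made only for illustration, because the corresponding formulas are not recoverable from the text: the similarity vector $u_i$ is taken as the element-wise ratio of the normalized candidate to the normalized template, and the fluctuation of each element is taken as its squared deviation from the mean. The function names `fluctuation_score` and `prefilter` are hypothetical.

```python
def fluctuation_score(candidate, template):
    """Trimmed fluctuation score of one candidate against the template."""
    cand_sum = sum(candidate)
    tmpl_sum = sum(template)
    # Normalise both intensity vectors so that each sums to 1.
    cand = [c / cand_sum for c in candidate]
    tmpl = [t / tmpl_sum for t in template]
    # ASSUMPTION: u is the element-wise ratio of candidate to template.
    eps = 1e-12  # guards against division by zero
    u = [c / (t + eps) for c, t in zip(cand, tmpl)]
    mean_u = sum(u) / len(u)
    # ASSUMPTION: fluctuation of each element = squared deviation from the mean.
    deviations = sorted(((x - mean_u) ** 2 for x in u), reverse=True)
    # Drop the top half (likely occlusion or noise) and sum the remainder.
    kept = deviations[len(deviations) // 2:]
    return sum(kept)


def prefilter(candidates, template, keep_fraction=1 / 3):
    """Return the indices of the candidates with the smallest scores."""
    scores = [fluctuation_score(c, template) for c in candidates]
    n_keep = max(1, round(len(candidates) * keep_fraction))
    order = sorted(range(len(candidates)), key=lambda i: scores[i])
    return sorted(order[:n_keep])
```

For example, with a template `[10, 20, 30, 40]`, a candidate that is a scaled copy of the template has a score close to zero and is the one retained.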

Feature Selection
To improve the robustness of the tracking algorithm, we use the image intensity and the LBP feature to construct the sample features. Denote the gray features of the sample by $g \in \mathbb{R}^{d_1}$ and the LBP features of the sample by $l \in \mathbb{R}^{d_2}$. The discriminative features are selected by sparsely reconstructing the sample labels, as in (2), where $\|\cdot\|$ is the norm operator and $p$ represents the label vector of the positive and negative samples; the labels of the positive and negative samples are set to 1 and −1, respectively. $\lambda$ is the sparse adjustment coefficient and $s \in \mathbb{R}^{d}$ represents the reconstruction vector, $s = [s_1, s_2, \ldots, s_d]$. Note that (2) uses the 2-norm to impose the constraint on the sparsity of $s$, which effectively reduces the amount of calculation needed to solve the sparse function. Set a threshold $\eta$; when $s_j > \eta$, the corresponding feature has a strong ability to differentiate between the target and its background.
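Construct the mapping vector of feature selection $w = [w_1, w_2, \ldots, w_d]^{T}$, $w \in \mathbb{R}^{d}$, with

$$w_j = \begin{cases} 0, & s_j \le \eta, \\ 1, & s_j > \eta, \end{cases} \qquad (3)$$

where $s_j$ is the $j$th element of the reconstruction vector and $\eta$ is the threshold. Let $f$ be the feature vector before feature selection. After feature selection, the feature vector $f'$ is defined as

$$f' = w \otimes f, \qquad (4)$$

where $\otimes$ denotes the multiplication of the corresponding elements of the two vectors.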
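The thresholding in (3) and the masking in (4) can be sketched in a few lines of Python. The function name `select_features` is hypothetical, and the extra compact vector, which drops the masked-out entries, is an assumption about how the reduced dimension after feature selection is obtained.

```python
def select_features(features, weights, threshold):
    """Apply the binary selection mask of (3) to a feature vector as in (4)."""
    # w_j = 1 when the reconstruction coefficient exceeds the threshold.
    mask = [1 if s > threshold else 0 for s in weights]
    # Element-wise product: unselected features are set to zero.
    masked = [w * f for w, f in zip(mask, features)]
    # ASSUMPTION: the reduced vector simply drops the unselected entries.
    compact = [f for w, f in zip(mask, features) if w == 1]
    return mask, masked, compact
```

With `features = [0.2, 0.5, 0.9, 0.1]`, `weights = [0.05, 0.3, 0.01, 0.4]`, and `threshold = 0.1`, only the second and fourth features are kept.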

Solution to the Sparse Equation with APG Approach
After feature selection, we get the feature vector set of the candidate targets $X = [x_1, x_2, \ldots, x_{n_0}]$, $x \in \mathbb{R}^{d_0}$, where $d_0$ denotes the dimension of each sample after feature selection. Let $D = [T_0, I, -I]$ denote the template dictionary, where $T_0 \in \mathbb{R}^{d_0 \times m}$ is the target sample set and $I \in \mathbb{R}^{d_0 \times d_0}$ is the identity matrix used to reduce the disturbance of occlusion and noise. Using the template dictionary $D$ to make a sparse linear reconstruction of the candidate targets, we obtain the minimization problem (5), where $c = [c_T, c_I]^{T}$ represents the sparse coefficients, $c_T$ is the vector of sparse coefficients corresponding to the target templates, and $c_I$ is the vector of sparse coefficients corresponding to $I$ and $-I$.
Nonnegative constraints are imposed on the elements of $c_T$ to improve the robustness of the tracker [7].
After adding a penalty term, (5) with nonnegative constraints can be rewritten as (6), where $\psi(\cdot)$ is the penalty term that enforces the nonnegative constraints. Solving (6) is equivalent to optimizing a convex function. This paper uses the Accelerated Proximal Gradient (APG) approach to find the optimal solution to (6). The objective in (6) is split into two parts, $F(c)$ and $G(c)$, where $F(c)$ is a differentiable convex function and $G(c)$ is a nonsmooth convex function. The specific algorithm is shown in Algorithm 1.
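To illustrate how an APG iteration handles a differentiable term plus a nonsmooth term with nonnegative constraints, the following Python sketch solves a simplified stand-in problem, $\min_{c \ge 0} \tfrac{1}{2}\|x - Dc\|_2^2 + \lambda \sum_j c_j$. This objective is an assumption chosen for illustration and is not claimed to be the exact form of (6); the function name `apg_nonneg` and all parameter values are hypothetical, and every coefficient is treated uniformly, without the split into $c_T$ and $c_I$.

```python
import math


def apg_nonneg(D, x, lam=0.1, iters=200):
    """Minimise 0.5*||x - D c||^2 + lam*sum(c) subject to c >= 0 with APG."""
    rows, cols = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the gradient.
    lipschitz = sum(v * v for row in D for v in row) or 1.0
    c_prev = [0.0] * cols
    c = [0.0] * cols
    t_prev, t = 1.0, 1.0
    for _ in range(iters):
        # Extrapolation (momentum) point built from the last two iterates.
        beta = (t_prev - 1.0) / t
        v = [c[j] + beta * (c[j] - c_prev[j]) for j in range(cols)]
        # Gradient of the smooth part: D^T (D v - x).
        r = [sum(D[i][j] * v[j] for j in range(cols)) - x[i] for i in range(rows)]
        grad = [sum(D[i][j] * r[i] for i in range(rows)) for j in range(cols)]
        # Proximal step: shift by lam / L and project onto c >= 0.
        c_prev = c
        c = [max(0.0, v[j] - (grad[j] + lam) / lipschitz) for j in range(cols)]
        t_prev, t = t, (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
    return c
```

With an identity dictionary, `x = [1, -1]`, and `lam = 0.1`, the iterates approach `[0.9, 0.0]`: the positive entry is shrunk by `lam` and the negative entry is clipped to zero by the nonnegative constraint.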

Object Tracking
The proposed algorithm is implemented in the particle filter framework. In the first frame of the video, the initial state of the target is selected with the mouse or obtained through target recognition. Let $\{z_1, z_2, \ldots, z_t\}$ denote the observations from the first frame to the $t$th frame of the video, and let $x_t^i$ be the state of the $i$th particle in the $t$th frame. The target state in the $t$th frame is estimated from the posterior probability $p(x_t^i \mid z_{1:t})$, which is obtained from (13) and (14). In (13), $p(x_t \mid x_{t-1})$ is the state transfer function. The state of the sample is defined by the six-dimensional affine vector $[a_1, a_2, a_3, a_4, a_5, a_6]$, whose elements represent the $x$ coordinate, $y$ coordinate, length-width ratio, rotation angle, torsion angle, and scale, respectively. Supposing that the six parameters are mutually independent, the state transfer function can be represented as $p(x_t \mid x_{t-1}) = N(x_t; x_{t-1}, \Sigma)$, where $\Sigma$ is the diagonal matrix constituted by the variances of the six affine parameters. The similarity measure $p(z_t \mid x_t^i)$ in (14) is determined by the residual error $\varepsilon$ of the sparse representation. The proposed algorithm is summarized in Algorithm 2.
Algorithm 2. The proposed tracking algorithm.
Input. Video and the target state in the first frame.
(1) Obtain the target dictionary of the initial target and the positive and negative samples
(2) for $t = 1 : N$ ($N$ is the number of frames of the video)
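The particle propagation and state selection described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: each affine parameter is perturbed independently with Gaussian noise, the likelihood is taken as `exp(-alpha * residual)`, which is only one possible monotone relation between the residual error and the similarity measure, and the state is chosen as the particle with the highest likelihood. The names `propagate`, `likelihood`, `select_state`, and the parameter `alpha` are hypothetical.

```python
import math
import random


def propagate(particles, sigmas, rng):
    """Sample each new state from a Gaussian centred on the previous state."""
    # The six affine parameters are perturbed independently (diagonal covariance).
    return [[rng.gauss(p, s) for p, s in zip(state, sigmas)] for state in particles]


def likelihood(residual, alpha=1.0):
    # ASSUMPTION: the similarity decays exponentially with the residual error.
    return math.exp(-alpha * residual)


def select_state(particles, residuals, alpha=1.0):
    """Return the particle with the highest likelihood and that likelihood."""
    weights = [likelihood(r, alpha) for r in residuals]
    best = max(range(len(particles)), key=lambda i: weights[i])
    return particles[best], weights[best]
```

A typical call would be `propagate(particles, sigmas, random.Random(0))`, followed by computing one reconstruction residual per particle and passing them to `select_state`.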

Experiments and Discussion
The proposed algorithm is implemented in Matlab 2009b. To make the tracking results more convincing, the experiment selects three other representative tracking algorithms for comparison: the $\ell_1$ tracking algorithm [7], the IVT tracking algorithm [17], and the ASLSAM tracking algorithm. The four algorithms are tested on four widely used tracking test videos: "Panda," "Woman Square," "Trellis," and "ThreePassShop2Cor." Tracking these targets is challenging because the videos contain factors such as partial occlusion and variations in illumination, pose, and scale.

The first test video is Panda, with a cartoon panda as the tracked target. For this video, the difficulty in tracking the target lies in the partial occlusion and the target rotation. Figure 2 shows the test results of the four algorithms in the 4th, 55th, 108th, 176th, 201st, and 241st frames. The test results of the proposed algorithm, the ASLSAM algorithm, the $\ell_1$ algorithm, and the IVT algorithm are marked by the blue, red, gray, and purple boxes, respectively. As shown in Figure 2, the proposed algorithm tracks the target very well before the 201st frame but drifts from the target in the 201st and 241st frames because of the occlusion. However, it never loses the target throughout the tracking process, whereas the other three algorithms lose the target when occlusion and target rotation occur.
The second test video is Trellis, with a face as the target. Tracking the target is difficult because of the drastic illumination changes and pose variation in the video. As shown in Figure 3, the proposed algorithm tracks the target faithfully throughout the tracking process without being affected by the pose and illumination changes, whereas the $\ell_1$ algorithm and the IVT algorithm lose the target in the 276th frame and the ASLSAM algorithm loses the target in the 532nd frame.
The third test video is Woman Square, with a pedestrian as the target. The main challenge in tracking the target comes from the partial occlusion. As shown in Figure 4, only the proposed algorithm tracks the target successfully without being affected by the partial occlusion, while the other three algorithms lose the target in the 143rd frame as a result of the partial occlusion.
The fourth test video is ThreePassShop2Cor. The difficulty of tracking the target lies in the occlusion, the disturbance of a similar object, and the scale variation. The tracking results of the four algorithms are shown in Figure 5. Because of the disturbance of occlusion and similar objects, the proposed tracker drifts from the target in the 125th frame but relocates the target later. In contrast, the ASLSAM and IVT algorithms lock onto the similar object instead of the tracked target in the 125th frame, and the $\ell_1$ algorithm loses the target in the 303rd frame.
To compare the tracking results quantitatively, Table 1 lists the maximum, mean, and standard deviation of the tracking error of the four algorithms on the four image sequences. The tracking error is the Euclidean distance between the center point of the target estimated by the tracking algorithm and the center point of the actual target. The results on all four videos show that the proposed algorithm achieves more favorable performance than the other three.
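The tracking error statistics described above can be computed as in the following Python sketch. The function names are hypothetical, centers are assumed to be given as `(x, y)` pairs, and the use of the population standard deviation rather than the sample standard deviation is an assumption.

```python
import math
import statistics


def center_errors(tracked, truth):
    """Per-frame Euclidean distance between tracked and actual centers."""
    return [math.hypot(tx - gx, ty - gy)
            for (tx, ty), (gx, gy) in zip(tracked, truth)]


def summarize(errors):
    """Return the maximum, mean, and standard deviation of the errors."""
    # ASSUMPTION: population standard deviation over all frames.
    return max(errors), statistics.fmean(errors), statistics.pstdev(errors)
```

For instance, `summarize(center_errors(tracked_centers, true_centers))` yields the three values reported per algorithm and sequence.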
In addition to the tracking error, to reflect the relation between the tracking results and the actual appearance of the target, Table 2 lists the success rate of the four tracking algorithms based on the PASCAL VOC standard [18]. The success rate of the proposed algorithm is remarkably higher than that of the other three.
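A success rate based on the PASCAL VOC overlap criterion can be sketched as follows: a frame counts as a success when the ratio of the intersection area to the union area of the tracked box and the ground-truth box exceeds 0.5. The sketch assumes axis-aligned boxes given as `(x, y, width, height)`, which is a simplification of the affine boxes produced by the tracker; the function names are hypothetical.

```python
def overlap_ratio(a, b):
    """Intersection over union of two axis-aligned boxes (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(tracked, truth, threshold=0.5):
    """Fraction of frames whose overlap ratio exceeds the threshold."""
    hits = sum(1 for a, b in zip(tracked, truth)
               if overlap_ratio(a, b) > threshold)
    return hits / len(truth)
```

Two identical boxes give an overlap ratio of 1, while two 2 by 2 boxes shifted by one pixel horizontally give 1/3 and therefore count as a failure.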

Conclusion
A new sparse representation-based tracking algorithm in the particle filter framework is proposed in this paper.
The new method first uses a fast algorithm to filter out

Figure 5: Tracking results on sequence "ThreePassShop2Cor."

Algorithm 2 (continued).
(3) Update particles with normal distribution
(4) Extract the intensity of the particles and filter out particles
(5) Extract LBP feature of the sample and get
Output. The state of target in each frame.

Table 1: Maximum, mean, and standard deviation of the tracking error.