Visual Object Tracking Based on 2DPCA and ML

We present a novel visual object tracking algorithm based on two-dimensional principal component analysis (2DPCA) and maximum likelihood estimation (MLE). Firstly, we introduce regularization into the 2DPCA reconstruction and develop an iterative algorithm to represent an object by 2DPCA bases. Secondly, the model of sparsity constrained MLE is established. Abnormal pixels in the samples are assigned low weights to reduce their effects on the tracking algorithm. The object tracking results are obtained by using Bayesian maximum a posteriori (MAP) probability estimation. Finally, to further reduce tracking drift, we employ a template update strategy which combines incremental subspace learning and the error matrix. This strategy adapts the template to the appearance change of the target and reduces the influence of the occluded target template as well. Compared with other popular methods, our method reduces the computational complexity and is very robust to abnormal changes. Both qualitative and quantitative evaluations on challenging image sequences demonstrate that the proposed tracking algorithm achieves more favorable performance than several state-of-the-art methods.


Introduction
As one of the fundamental problems of computer vision, visual tracking plays a critical role in advanced vision-based applications (e.g., visual surveillance, human-computer interaction, augmented reality, intelligent transportation, and context-based video compression) [1][2][3]. However, building a robust model-free tracker is still a challenging issue due to the appearance variability of an object of interest, which includes intrinsic appearance variability (e.g., pose variation and shape deformation) and extrinsic factors (illumination changes, camera motion, occlusions, etc.).
Typically, a complete tracking system can be divided into three main components: (1) an appearance observation model, which evaluates the likelihood of a candidate state belonging to the object model, (2) a motion model, which aims to model the states of an object over time (such as Kalman filtering and particle filtering), and (3) a search strategy for finding the most likely states in the current frame (e.g., mean shift and sliding window). In this paper, we focus on developing a robust appearance model.
Due to the power of subspace representation, subspace-based trackers (e.g., [4,5]) are robust to in-plane rotation, scale change, illumination variation, and pose change. However, they are sensitive to partial occlusion because of their underlying assumption that the error term is Gaussian distributed with small variances. This assumption does not hold for object representation when partial occlusion occurs, as the noise term cannot be modeled with small variances.
An effective tracking algorithm (called the L1 tracker) based on sparse representation within a particle filter framework is developed in [6]. The L1 tracker represents the tracked target by using a set of target templates and trivial templates. The target templates depict a subspace on the tracked object and the trivial templates aim to model the occlusion effectively. However, the use of trivial templates increases the number of templates significantly, which makes the computational complexity of the L1 tracker too high for real applications.
In [7], the authors also presented a sparse coding-based tracker by combining sparse coding and Kalman filtering and fusing the color and gradient features. To account for the variations of the tracked object during the tracking process, they use a template update strategy that replaces a random template of the original template library with the last tracking result. However, this simple update scheme can easily introduce tracking errors when abnormal changes occur, which may cause tracking drift.
Motivated by the aforementioned discussions, we propose an object tracking algorithm based on 2DPCA and MLE. Firstly, we introduce regularization into the 2DPCA reconstruction and develop an iterative algorithm to represent an object by 2DPCA bases. Secondly, the model of sparsity constrained MLE is established. Abnormal pixels in the samples are assigned low weights to reduce their effects on the tracking algorithm. The object tracking results are obtained by using Bayesian maximum a posteriori (MAP) probability estimation. Finally, to further reduce tracking drift, we employ a template update strategy which combines incremental subspace learning and the error matrix. This strategy adapts the template to the appearance change of the target and reduces the influence of the occluded target template as well. The experimental results show that our algorithm can achieve stable and robust performance, especially when occlusion, rotation, scaling, or illumination variation occurs.

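Visual Object Tracking Model Based on 2DPCA and MLE
The Theory of 2DPCA.
Principal component analysis (PCA) is a classical technique for dimensionality reduction and data representation [8]. It finds the projection directions along which the reconstruction error to the original data is minimum and projects the original data into a lower dimensional space spanned by those directions corresponding to the top eigenvalues. Recent studies demonstrate that two-dimensional principal component analysis (2DPCA) could achieve performance comparable to PCA with less computational cost [9,10]. Given a series of image matrices $\mathbf{Y} = [\mathbf{Y}_1\ \mathbf{Y}_2\ \cdots\ \mathbf{Y}_K]$, 2DPCA aims to obtain an orthogonal left-projection matrix $\mathbf{U}$, an orthogonal right-projection matrix $\mathbf{V}$, and the projection coefficients $\mathbf{A} = [\mathbf{A}_1\ \mathbf{A}_2\ \cdots\ \mathbf{A}_K]$ by minimizing the reconstruction error
$$\min_{\mathbf{U},\mathbf{V},\mathbf{A}} \sum_{i=1}^{K} \left\| \mathbf{Y}_i - \mathbf{U}\mathbf{A}_i\mathbf{V}^{T} \right\|_F^2. \tag{1}$$
Then the coefficient $\mathbf{A}_i$ can be approximated by $\mathbf{A}_i \approx \mathbf{U}^{T}\mathbf{Y}_i\mathbf{V}$. We note that the underlying assumption of (1) is that the error term is Gaussian distributed with small variances. This assumption cannot deal with partial occlusion, as the error term cannot be modeled with small variances when occlusion occurs. In this paper, we propose an object tracking algorithm that uses 2DPCA basis matrices and an additional MLE error matrix: $\mathbf{Y} \approx \mathbf{U}\mathbf{A}\mathbf{V}^{T} + \mathbf{e}$.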
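Let the objective function be $L(\mathbf{A}, \mathbf{e})$; the problem is $\min_{\mathbf{A},\mathbf{e}} L(\mathbf{A}, \mathbf{e})$, where $\mathbf{Y}$ denotes an observation matrix, $\mathbf{A}$ indicates its corresponding projection coefficient, $\lambda$ is a regularization parameter, and $\mathbf{e}$ describes the error matrix.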
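The representation $\mathbf{Y} \approx \mathbf{U}\mathbf{A}\mathbf{V}^{T} + \mathbf{e}$ can be illustrated with a minimal NumPy sketch. It rests on two assumptions that the text does not state: the bases are estimated in one pass from the eigenvectors of the two scatter matrices, and the regularizer on $\mathbf{e}$ is an L1 penalty, so the objective is taken to be $\tfrac{1}{2}\|\mathbf{Y} - \mathbf{U}\mathbf{A}\mathbf{V}^{T} - \mathbf{e}\|_F^2 + \lambda\|\mathbf{e}\|_1$. The function names `fit_2dpca`, `soft_threshold`, and `represent` are hypothetical.

```python
import numpy as np

def fit_2dpca(images, k_left, k_right):
    """One-pass estimate of orthonormal 2DPCA bases from images of shape (K, h, w)."""
    # Scatter matrices: sum_i Y_i Y_i^T (h x h) and sum_i Y_i^T Y_i (w x w).
    left_scatter = np.einsum("kij,klj->il", images, images)
    right_scatter = np.einsum("kji,kjl->il", images, images)
    # eigh sorts eigenvalues in ascending order, so reverse the columns
    # to keep the eigenvectors of the largest eigenvalues.
    U = np.linalg.eigh(left_scatter)[1][:, ::-1][:, :k_left]
    V = np.linalg.eigh(right_scatter)[1][:, ::-1][:, :k_right]
    return U, V

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def represent(Y, U, V, lam=0.5, n_iter=20):
    """Alternate between A and e for the ASSUMED objective
    0.5 * ||Y - U A V^T - e||_F^2 + lam * ||e||_1."""
    e = np.zeros_like(Y)
    for _ in range(n_iter):
        # With orthonormal U and V, the least-squares coefficient is a projection.
        A = U.T @ (Y - e) @ V
        # Given A, the optimal e shrinks the residual toward zero.
        e = soft_threshold(Y - U @ A @ V.T, lam)
    return A, e
```

Under these assumptions, pixels that the subspace cannot explain (for example, an occluding patch) end up in `e`, while `A` stays close to the coefficient of the unoccluded object.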

MLE Model.
The basic idea of sparse coding is to use the templates in a given dictionary $\mathbf{T}$ to represent a testing sample $\mathbf{y}$ (as $\mathbf{y} \approx \mathbf{T}\boldsymbol{\alpha}$), where $\boldsymbol{\alpha}$ is the sparse coding coefficient vector. Traditionally, sparsity is measured by the L0-norm, but L0-norm minimization is an NP-hard problem.
Fortunately, [11] proves that when the solution is sparse enough, L0-norm minimization is equivalent to the L1-norm minimization.
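Therefore, the sparse coding problem can be defined as [12,13]
$$\min_{\boldsymbol{\alpha}} \|\boldsymbol{\alpha}\|_1 \quad \text{s.t.} \quad \|\mathbf{y} - \mathbf{T}\boldsymbol{\alpha}\|_2^2 \le \varepsilon, \tag{3}$$
where $\varepsilon > 0$ is a very small constant. This model shows two constraints in sparse coding: one is that $\min_{\boldsymbol{\alpha}} \|\boldsymbol{\alpha}\|_1$ constrains the sparsity of the represented signal; the other is that $\|\mathbf{y} - \mathbf{T}\boldsymbol{\alpha}\|_2^2 \le \varepsilon$ constrains the accuracy of the represented signal [14][15][16][17].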
The two constraint terms can be analyzed as follows. For object tracking, the accuracy constraint is more important than the sparsity constraint, especially when occlusion, rotation, scaling, or illumination variation happens to the object. In that case, whether the model can accurately describe the object under such abnormal changes directly determines the success or failure of the tracking algorithm. Most current algorithms assume that the sparse coding residual $\mathbf{e} = \mathbf{y} - \mathbf{T}\boldsymbol{\alpha}$ follows the Gaussian distribution. In practice, however, this assumption breaks down when abnormal changes happen, which inevitably leads to the failure of the tracking algorithm.
As for the sparsity constraint, although L1-norm minimization is more efficient than L0-norm minimization, L1-norm minimization programming is still very time consuming. Object tracking differs from face recognition in that face recognition algorithms do not demand fast processing in the sample training process, whereas in object tracking slow processing directly affects the practical value of the algorithm. In that case, introducing L1-norm minimization into object tracking would greatly reduce the performance of tracking algorithms.
We note that tracking accuracy and speed are two important aspects for evaluating the performance of object tracking algorithms. Therefore, in this paper, we develop an MLE-based model that improves the traditional sparse coding model in both aspects and then apply it to achieve an effective and efficient tracker.
In the field of object tracking, accuracy is the most important issue. Hence, we first need to improve the accuracy constraint term in the traditional sparse coding model.
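When the reconstruction error $\mathbf{e} = \mathbf{y} - \mathbf{T}\boldsymbol{\alpha}$ follows the Gaussian distribution, the traditional sparse coding solution can be written as
$$\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \left\{ \|\mathbf{y} - \mathbf{T}\boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{\alpha}\|_1 \right\},$$
where $\lambda$ is a regularization parameter. For object tracking, the dictionary $\mathbf{T} = [\,\ldots\,]$ ... When $\mathbf{e}$ follows the Gaussian distribution, the solution of (5) is the maximum likelihood estimate.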
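To make the L1-regularized least-squares form concrete, the following is a minimal sketch of one standard solver, the iterative shrinkage-thresholding algorithm (ISTA). It is not the solver used in the paper; the function name `ista` and the iteration count are illustrative assumptions.

```python
import numpy as np

def ista(T, y, lam, n_iter=500):
    """Minimize ||y - T a||_2^2 + lam * ||a||_1 by iterative soft-thresholding."""
    # Step size 1/L, where L = 2 * sigma_max(T)^2 is the Lipschitz constant
    # of the gradient of the quadratic term.
    step = 1.0 / (2.0 * np.linalg.norm(T, 2) ** 2)
    a = np.zeros(T.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * T.T @ (T @ a - y)      # gradient of ||y - T a||_2^2
        z = a - step * grad                 # gradient step
        # Proximal step for the L1 term: shrink every coefficient toward zero.
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a
```

Each iteration costs two matrix-vector products with $\mathbf{T}$, and many iterations may be needed per candidate, which gives a sense of why L1 minimization is considered expensive for tracking.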
However, in practical applications, when the object suffers from occlusion, rotation change, scale change, or illumination variation, the reconstruction errors $\mathbf{e}$ of abnormal pixels will not follow the Gaussian distribution. In that case, these algorithms may not track the object accurately. Therefore, we need to build a more adaptive object representation model.
Taking into account the sparsity constraint of $\boldsymbol{\alpha}$, the MLE of $\boldsymbol{\alpha}$ can be formulated as the minimization (6). According to [6], formula (6) can be converted into the weighted sparse coding problem (7), in which $\mathbf{W}$ is a diagonal matrix whose $i$th diagonal element $W_{i,i}$, defined in (8), also stands for the $i$th pixel's weight value; the two parameters of (8) are positive constants. If we set $W_{i,i} = 2$, the model reduces to the traditional sparse coding problem. Hence, we can see that formula (7) is more adaptive than (3).
In this study, we choose the weight function (9), whose scale factor is set to 10 in our experiments. The physical meaning of $W_{i,i}$ is to allocate smaller weights to pixels with larger residuals (probably abnormal pixels) and larger weights to pixels with smaller residuals. By setting a reasonable weight threshold, we can discard the abnormal pixels whose weights fall below the threshold and perform further sparse coding on the remaining pixels. In that case, we can effectively reduce the effect of abnormal pixels and therefore achieve good performance during the tracking process. From (9), we can see that the weight value $W_{i,i}$ is bounded between 0 and 1, which ensures that even pixels with very small residuals do not receive overly large weights. This guarantees the stability of the algorithm.
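The reweighting idea can be sketched as follows. Because the exact weight function is not reproduced here, the sketch assumes a logistic weight $w(e_i) = 1 / (1 + \exp(\mu (e_i^2 - \delta)))$, a form common in the robust sparse coding literature; it is bounded between 0 and 1 and decreases with the residual magnitude, matching the properties described above. The parameter values, the weight floor, and the function names are hypothetical, and the sparsity term is omitted for brevity, so the loop is plain iteratively reweighted least squares.

```python
import numpy as np

def pixel_weights(residual, mu=2.0, delta=4.0):
    """ASSUMED logistic weight in (0, 1); larger residuals get smaller weights."""
    # Clip the exponent to avoid overflow for very large residuals.
    z = np.clip(mu * (residual ** 2 - delta), -50.0, 50.0)
    return 1.0 / (1.0 + np.exp(z))

def weighted_coding(T, y, n_iter=5, weight_floor=0.05):
    """Iteratively reweighted least squares (sparsity term omitted)."""
    alpha = np.linalg.lstsq(T, y, rcond=None)[0]      # unweighted starting point
    for _ in range(n_iter):
        w = pixel_weights(y - T @ alpha)
        w[w < weight_floor] = 0.0                     # discard likely abnormal pixels
        sw = np.sqrt(w)
        # Solve min_alpha || W^(1/2) (y - T alpha) ||_2^2 with W = diag(w).
        alpha = np.linalg.lstsq(sw[:, None] * T, sw * y, rcond=None)[0]
    return alpha, w
```

With such a scheme, a few grossly corrupted pixels stop influencing the coefficient estimate after the first reweighting pass, which is the behavior the weighted model is designed to provide.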

Bayesian MAP Estimation
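We can regard object tracking as a Bayesian MAP estimation problem for the hidden state variables of a hidden Markov model; that is, given a set of observed samples $\mathbf{Y}_t = \{\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_t\}$, we can estimate the hidden state variable $\mathbf{x}_t$ using Bayesian MAP theory.
According to the Bayesian theory,
$$p(\mathbf{x}_t \mid \mathbf{Y}_t) \propto p(\mathbf{y}_t \mid \mathbf{x}_t) \int p(\mathbf{x}_t \mid \mathbf{x}_{t-1})\, p(\mathbf{x}_{t-1} \mid \mathbf{Y}_{t-1})\, d\mathbf{x}_{t-1},$$
where $p(\mathbf{x}_t \mid \mathbf{x}_{t-1})$ stands for a state transition model for two consecutive frames and $p(\mathbf{y}_t \mid \mathbf{x}_t)$ stands for an observation likelihood model. We can obtain the object's best state in the $t$th frame through maximum a posteriori probability estimation; that is,
$$\hat{\mathbf{x}}_t = \arg\max_{\mathbf{x}_t^i} p(\mathbf{x}_t^i \mid \mathbf{Y}_t), \quad i = 1, 2, \ldots, N,$$
where $\mathbf{x}_t^i$ stands for the $i$th sample of the state variable $\mathbf{x}_t$ in the $t$th frame. In this paper, we choose $N = 400$.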
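The sample-based MAP selection can be sketched with a toy one-dimensional state. The sketch assumes the candidates are drawn around the previous state, so that the best state is simply the candidate with the highest score; the observation model (a Gaussian centered at 2.3) and the helper name `map_state` are hypothetical and chosen only for illustration.

```python
import numpy as np

def map_state(samples, log_posterior):
    """Return the sample with the largest (unnormalized) log posterior."""
    scores = np.array([log_posterior(x) for x in samples])
    return samples[int(np.argmax(scores))]

rng = np.random.default_rng(0)
previous_state = 2.0
# N = 400 candidate states scattered around the previous state (toy 1-D state).
samples = previous_state + 0.5 * rng.standard_normal(400)
# Hypothetical observation model: the evidence peaks at the state value 2.3.
best = map_state(samples, lambda x: -0.5 * ((x - 2.3) / 0.1) ** 2)
```

The estimate can only be as accurate as the nearest candidate, which is one reason the number of samples matters in practice.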

State Transition Model.
We choose the object's affine motion parameters as a six-parameter state variable $\mathbf{x}_t$, whose components $x_t$ and $y_t$, respectively, represent the $x$-direction and $y$-direction translation of the object in the $t$th frame. We assume that the state transition model follows the Gaussian distribution; that is, $p(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = N(\mathbf{x}_t; \mathbf{x}_{t-1}, \Psi)$, where $\Psi$ is a diagonal matrix whose diagonal elements are the variances of the affine motion parameters. Here $N(\cdot)$ denotes the Gaussian distribution, and $\mu$ and $\sigma^2$, respectively, represent its mean and variance; the remaining quantities are the number of pixels of an object template and the reconstruction error of the $i$th pixel of the object templates in the $t$th frame.
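A minimal sketch of the Gaussian state transition and of a likelihood built from reconstruction errors is given below. The diagonal covariance follows the description of $\Psi$; the likelihood, however, is an assumption (independent zero-mean Gaussian errors per pixel), and the function names and variance values are hypothetical.

```python
import numpy as np

def sample_states(prev_state, variances, n_samples, rng):
    """Draw candidates from N(prev_state, Psi) with Psi = diag(variances)."""
    prev_state = np.asarray(prev_state, dtype=float)
    std = np.sqrt(np.asarray(variances, dtype=float))
    # A diagonal covariance means each affine parameter is perturbed independently.
    return prev_state + std * rng.standard_normal((n_samples, prev_state.size))

def log_likelihood(errors, sigma2=1.0):
    """ASSUMED Gaussian log-likelihood (up to a constant) of per-pixel errors."""
    errors = np.asarray(errors, dtype=float)
    return -0.5 * np.sum(errors ** 2) / sigma2
```

For example, `sample_states(prev, [4, 4, 0.01, 0.01, 0, 0], 400, np.random.default_rng(0))` would yield 400 six-parameter candidates in which the last two parameters stay fixed; each candidate can then be scored with `log_likelihood` applied to its per-pixel reconstruction errors.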

Templates Updating.
Because the appearance of the target may change during the tracking process, it is necessary to dynamically update the template library.
In this paper, we use a method named "Half Updating Strategy" to update the templates. We take the tracking results

Experimental Results and Analysis
To evaluate the performance of our tracker, we conduct experiments on three challenging image sequences (Table 1 and Figures 1, 2, and 3). These sequences cover most of the challenging situations in object tracking: occlusion, motion blur, in-plane and out-of-plane rotation, large illumination change, scale variation, and complex background.
For comparison, we run six state-of-the-art algorithms with the same initial position of the target. These algorithms are the Frag tracking [18], IVT tracking [19], MIL tracking [20], L1 tracking [6], PN tracking [21], and VTD tracking [22] methods. We present some representative results in this section.

Conclusions/Outlook
This paper presents a robust tracking algorithm via 2DPCA and MLE. In this work, we represent the tracked object by using 2DPCA bases and an MLE error matrix. With the proposed model, we can remove the abnormal pixels and thus reduce their effect on the tracking algorithm. We incorporate the object's reconstruction error into the Bayesian maximum a posteriori probability estimation framework and design a stable and robust tracker. We also explicitly take partial occlusion and misalignment into account for appearance model update and object tracking. Experiments on challenging video clips show that our tracking algorithm performs better than several state-of-the-art algorithms. Our future work will generalize the representation model to other related fields.

Table 1: The description of test videos.