Robust Object Tracking Based on Simplified Codebook Masked Camshift Algorithm

Moving target detection and tracking is an important and basic issue in the field of intelligent video surveillance. The classical Codebook algorithm is simplified in this paper by introducing the average intensity into the Codebook model instead of the original minimal and maximal intensities. A hierarchical matching method between the current pixel and codeword is also proposed according to the average intensity in the high and low intensity areas, respectively. Based on the simplified Codebook algorithm, this paper then proposes a robust object tracking algorithm called the Simplified Codebook Masked Camshift algorithm (SCMC algorithm), which combines the simplified Codebook algorithm and the Camshift algorithm. It is designed to overcome the sensitivity of the traditional Camshift algorithm to background color interference. It uses the simplified Codebook to detect moving objects, whose result is employed to mask the color probability distribution image, based on which we then use Camshift to predict the centroid and size of these objects. Experimental results show that the proposed simplified Codebook algorithm simultaneously improves the detection accuracy and computational efficiency. They also show that the SCMC algorithm can significantly reduce the possibility of false convergence and result in a higher correct tracking rate, as compared with the traditional Camshift algorithm.


Introduction
Moving object detection and tracking is the basis of object recognition and behavior understanding and has very broad application and research prospects. There are three main categories of object detection algorithms: interframe difference methods [1], optical flow methods [2], and background subtraction methods. Background subtraction methods are the most popular in practice because of their high detection accuracy and moderate computational complexity. Classical background subtraction algorithms include kernel density estimation [3], Gaussian Mixture Background Modeling [4], and Codebook background modeling [5].
The Codebook algorithm was first proposed in 2004 by Kim et al. [5], and it has been one of the most advanced motion detection methods because of its high memory utilization, high computation efficiency, and strong robustness. Many improvements have been made based on the Codebook algorithm. For example, Wu and Peng [6] proposed a modified Codebook algorithm based on spatiotemporal context, which improves the detection accuracy by adding the correlation of the spatiotemporal pixels. However, the computational complexity of the whole algorithm is increased at the same time. Tu et al. [7] accelerated the computation by introducing a box-based Codebook model in RGB space to represent the matching field of the codewords. However, these simplifications decreased the detection accuracy. Most of the improvements to Codebook improve either the detection accuracy or the computational efficiency, but not both.
The Camshift algorithm is a classical object tracking algorithm. It evolved from the Mean Shift algorithm. It performs tracking according to the color information of

Simplified Codebook Algorithm
Compared with the original Codebook algorithm [5], our simplified Codebook algorithm has two improvements. First, the maximum and minimum brightness in the codeword model are replaced by the average brightness, so the codeword model is simplified and the computation speed is increased. Second, different processing methods are applied in the high and low brightness regions when matching the current pixel against a codeword, so detection accuracy is improved and the probability of false detection is reduced in the low brightness region. The simplified Codebook algorithm in this paper is called the hierarchical matching 5-tuple-based Codebook algorithm. This section presents how to detect moving objects with the proposed simplified Codebook algorithm. First, we show the process of building a codebook for a specific pixel. All the other pixels then repeat the same process to complete the detection for a whole image.
Initialization.
We build a codebook $C$ ($C = \{c_i \mid 1 \le i \le L\}$) containing several codewords for every pixel, where $L$ is the number of codewords. The $i$th codeword $c_i$ includes two parts: an RGB vector $V_i = (R_i, G_i, B_i)$ and a 5-tuple $\mathrm{aux}_i = \langle \bar{I}_i, f_i, \lambda_i, p_i, q_i \rangle$. The 5-tuple is composed of the average brightness $\bar{I}_i$, the codeword accessed frequency $f_i$, the maximal nonrepeatable time interval $\lambda_i$, the initial codeword accessed time $p_i$, and the eventual codeword accessed time $q_i$. Except for the maximum and minimum brightness being replaced by the average brightness $\bar{I}_i$, all the other elements remain the same as in the original Codebook algorithm.
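The codeword structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the field names, the `brightness` helper (taken here as the Euclidean norm of the RGB vector), and the initial values assigned to a new codeword are not specified in this section and are chosen for readability.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Codeword:
    """One codeword: an RGB vector plus the 5-tuple of auxiliary values."""
    rgb: Tuple[float, float, float]  # RGB vector of the codeword
    avg_brightness: float            # average brightness (replaces min/max)
    freq: int                        # codeword accessed frequency
    max_gap: int                     # maximal nonrepeatable time interval
    first: int                       # initial codeword accessed time
    last: int                        # eventual (latest) codeword accessed time


def brightness(rgb):
    """Assumed brightness measure: Euclidean norm of the RGB vector."""
    r, g, b = rgb
    return math.sqrt(r * r + g * g + b * b)


def new_codeword(rgb, t):
    """Create a codeword from the pixel seen at frame t (initial values assumed)."""
    return Codeword(rgb=tuple(rgb), avg_brightness=brightness(rgb),
                    freq=1, max_gap=t - 1, first=t, last=t)


# A per-pixel codebook starts empty (no codewords) and grows during training.
codebook: List[Codeword] = []
codebook.append(new_codeword((30.0, 40.0, 0.0), t=1))
print(codebook[0])
```

Storing a single average brightness per codeword, instead of a minimum and a maximum, is what shortens the tuple from six to five elements.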

Training Background Model.
Assume the first $N$ frames of the video are used to train the background model. For a particular pixel, the sequence of pixel values for training is $X = \{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_N\}$, where each element $\vec{x}_t$ is the RGB vector extracted from the $t$th image frame. Now, we take this pixel as an example to explain the codebook training process:
(1) Build a codebook: we build a codebook $C$ for the pixel and initialize it with an empty set (let $L = 0$).
(2) Train the codebook: the following steps are executed in a loop while $t$ changes from 1 to $N$. The current pixel $\vec{x}_t$ is matched against the existing codewords using a color distortion threshold $\varepsilon_i$ and a brightness matching scope $[I_{low}, I_{hi}]$, where $\bar{I}_i$ is the average brightness of the $i$th codeword and $I_{hi}$ and $I_{low}$ are the upper and lower bounds of the brightness matching scope. $\varepsilon_i$, $I_{hi}$, and $I_{low}$ are calculated from the following parameters. $T$ is the threshold that determines whether the current pixel belongs to the low brightness region. $\varepsilon$ is a variable used to calculate the threshold of color distortion matching, whose value is between 0 and 1. $\varepsilon_0$ is a constant threshold of color distortion matching in the low brightness region. $\alpha$ is a variable used to calculate the ratio of the upper bound of brightness matching to the average brightness in the high brightness region. $\beta$ is a variable used to calculate the ratio of the lower bound of brightness matching to the average brightness in the high brightness region. $I_0$ is half of the brightness matching range when the brightness of the current pixel is lower than $T$. $\varepsilon_0$ and $I_0$ jointly guarantee that the ranges of color distortion and brightness matching are not too small in the low brightness area, which avoids false detection. If $L = 0$ or there is no matching codeword, then let $L = L + 1$, create a new codeword, and update the 5-tuple.
(3) Regulate the maximal nonrepeatable time interval $\lambda_i$ for every codeword $c_i$.
(4) Delete the nonbackground codewords. Assume the probability of background occurrence is bigger than 50%. Let $M$ denote the background model, which is the codebook after the temporal filtering step. The operation can be expressed as $M = \{c_m \mid c_m \in C,\ \lambda_m \le T_M\}$; generally, $T_M = N/2$. $M$ is the codeword set describing the background, where $c_m$ is the $m$th codeword in $C$ and $\lambda_m$ is the maximal nonrepeatable time interval of the 5-tuple in $c_m$.
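The training procedure for one pixel can be illustrated with the following sketch. It is not the exact formulation of this paper: the threshold rule in `thresholds` (relative bounds in the high brightness region, fixed bounds in the low brightness region), the color distortion measure, the running-mean update, the wrap-around regulation of the maximal nonrepeatable time interval, and all numeric parameter values are assumptions, chosen to be consistent with the parameter descriptions above and with the usual Codebook formulation [5].

```python
import math

# Illustrative parameter values only (assumptions, not taken from the paper).
T, EPS, EPS0, ALPHA, BETA, I0 = 80.0, 0.1, 10.0, 1.3, 0.6, 15.0


def brightness(x):
    """Assumed brightness measure: Euclidean norm of the RGB vector."""
    return math.sqrt(sum(c * c for c in x))


def color_dist(x, v):
    """Distance of pixel x from the line through the origin and codeword v."""
    xx = sum(c * c for c in x)
    vv = sum(c * c for c in v)
    if vv == 0:
        return math.sqrt(xx)
    xv = sum(a * b for a, b in zip(x, v))
    return math.sqrt(max(xx - xv * xv / vv, 0.0))


def thresholds(avg):
    """Assumed hierarchical rule: returns (color threshold, lower, upper bound)."""
    if avg >= T:                      # high brightness: bounds scale with avg
        return EPS * avg, BETA * avg, ALPHA * avg
    return EPS0, avg - I0, avg + I0   # low brightness: fixed-width bounds


def find_match(codebook, x):
    """Return the first codeword matching pixel x, or None."""
    i_x = brightness(x)
    for cw in codebook:
        eps_i, low, high = thresholds(cw["avg"])
        if color_dist(x, cw["rgb"]) <= eps_i and low <= i_x <= high:
            return cw
    return None


def update(cw, x, t):
    """Running-mean update of a matched codeword (assumed update rule)."""
    f = cw["f"]
    cw["rgb"] = tuple((f * v + c) / (f + 1) for v, c in zip(cw["rgb"], x))
    cw["avg"] = (f * cw["avg"] + brightness(x)) / (f + 1)
    cw["lam"] = max(cw["lam"], t - cw["q"])
    cw["f"], cw["q"] = f + 1, t


def train(pixels):
    """Build the background model for one pixel from its training sequence."""
    codebook, n = [], len(pixels)
    for t, x in enumerate(pixels, start=1):
        cw = find_match(codebook, x)
        if cw is None:                # no match: create a new codeword
            codebook.append({"rgb": tuple(x), "avg": brightness(x),
                             "f": 1, "lam": t - 1, "p": t, "q": t})
        else:
            update(cw, x, t)
    for cw in codebook:               # regulate the interval (wrap-around)
        cw["lam"] = max(cw["lam"], n - cw["q"] + cw["p"] - 1)
    return [cw for cw in codebook if cw["lam"] <= n / 2]  # temporal filter


pixels = [(100, 100, 100)] * 4 + [(200, 20, 20)] + [(100, 100, 100)] * 5
model = train(pixels)
print(len(model), model[0]["rgb"])
```

In this toy sequence the single red frame creates a second codeword, but its maximal nonrepeatable time interval exceeds half of the training length, so the temporal filter removes it. At detection time, the same `find_match` call would label a pixel as background when it returns a codeword and as foreground otherwise.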

Foreground Detection.
We match the current pixel with the codewords by the same method as in codebook training. If a match exists, we update the codeword and take the current pixel as background. Otherwise, we take it as foreground.

Classical Camshift Algorithm
The original Camshift algorithm [11] takes the color histogram of an object as its characteristic model based on its color information. Video frame images are then changed into color probability distribution images. The centroid of the object is searched and the size of the object box is predicted on these images. The implementation of the classical Camshift algorithm can be depicted as follows:
(1) Initialize the position of the centroid and the size of the bounding box of the object.
(2) Compute the color histogram of the bounding box.
(3) Compute the color probability distribution image for the current frame.
(4) Predict the position of the centroid with Mean Shift algorithm.
(5) Predict the size of the bounding box.
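Steps (4) and (5) above can be sketched on a color probability distribution image stored as a list of rows with values in 0..255. This is a minimal illustration, not the implementation used in the experiments: the function names are hypothetical, the window is kept square, and the resizing rule (side length of twice the square root of the zeroth moment divided by 256) is an assumption that follows the commonly cited Camshift rule.

```python
import math


def window_moments(prob, x0, y0, w, h):
    """Zeroth and first moments of prob inside the window (clipped to image)."""
    rows, cols = len(prob), len(prob[0])
    m00 = m10 = m01 = 0.0
    for y in range(max(y0, 0), min(y0 + h, rows)):
        for x in range(max(x0, 0), min(x0 + w, cols)):
            p = prob[y][x]
            m00 += p
            m10 += (x + 0.5) * p      # use pixel centers to avoid drift
            m01 += (y + 0.5) * p
    return m00, m10, m01


def camshift_step(prob, window, max_iter=20):
    """Mean Shift to the local centroid, then resize from the zeroth moment."""
    x0, y0, w, h = window
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(prob, x0, y0, w, h)
        if m00 == 0:
            return window, None       # nothing to track inside the window
        cx, cy = m10 / m00, m01 / m00
        nx, ny = int(round(cx - w / 2)), int(round(cy - h / 2))
        if (nx, ny) == (x0, y0):      # window no longer moves: converged
            break
        x0, y0 = nx, ny
    side = max(1, int(round(2 * math.sqrt(m00 / 256))))
    new_window = (int(round(cx - side / 2)), int(round(cy - side / 2)),
                  side, side)
    return new_window, (cx, cy)


# Toy probability image: a 4x4 object blob on an empty background.
prob = [[0] * 20 for _ in range(20)]
for y in range(10, 14):
    for x in range(12, 16):
        prob[y][x] = 255

window, centroid = camshift_step(prob, (8, 6, 8, 8))
print(window, centroid)
```

Because the centroid is a probability-weighted mean over the window, any background pixel with a high probability value pulls the window toward it, which is the color sensitivity discussed next.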
The original Camshift algorithm can easily converge the bounding box to the object position when there are significant differences between the object color and the background color. Under such circumstances, the pixel values of the object area are much higher than those of the background on the color probability distribution image. However, when the object color is similar to the background color, the pixel values of the object area are no longer distinct from those of the background on the color probability distribution image. Because the algorithm is color sensitive, it cannot guarantee that the bounding box converges to the object position. These two cases are shown in Figures 1 and 2, respectively.
The video image sequences in Figures 1 and 2 are downloaded from the ITEA CANDELA project [24]. Figures 1(a

Simplified Codebook Masked Camshift Algorithm
To overcome the background interference problem of the Camshift algorithm, this paper proposes the SCMC algorithm, which is short for Simplified Codebook Masked Camshift algorithm. The SCMC algorithm combines the simplified Codebook algorithm and the Camshift algorithm. The detection result of the simplified Codebook is used to mask the color probability distribution images. The main purpose is to filter out the background interference on tracking.
The implementation process of the SCMC algorithm can be summarized as follows:
(1) Detect moving objects with the simplified Codebook algorithm.
(2) Perform median filtering on the detection results to filter out noise and make the object connected.
(3) Compute the color probability distribution image for the current frame.
(4) Mask the color probability distribution image with the processed foreground image from step (2). Pixel values in the background area are set to 0, and the pixel values in the foreground area remain unchanged, to guarantee that the Camshift algorithm converges only to the area of the moving object. The masking procedure of color probability distribution images is shown in Figure 3.
(5) Search for the centroid of the object with the Camshift algorithm and predict the size of the bounding box.
The whole process of the proposed SCMC algorithm is depicted in the flowchart of Figure 4.
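Steps (2) and (4) of the procedure above can be sketched as follows, assuming a binary foreground mask (1 for foreground, 0 for background) and a probability image of the same size, both stored as lists of rows. The 3x3 window size of the median filter and the function names are assumptions made for illustration; the filter size used in the experiments is not stated here.

```python
def median3x3(mask):
    """3x3 median filter on a binary mask (borders use available neighbours)."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            vals = sorted(mask[j][i]
                          for j in range(max(y - 1, 0), min(y + 2, rows))
                          for i in range(max(x - 1, 0), min(x + 2, cols)))
            out[y][x] = vals[len(vals) // 2]
    return out


def mask_probability(prob, fg_mask):
    """Set background pixels to 0 and keep foreground pixels unchanged."""
    return [[p if m else 0 for p, m in zip(prow, mrow)]
            for prow, mrow in zip(prob, fg_mask)]


# Toy example: uniform probability image, a 3x3 moving object, one noise pixel.
prob = [[200] * 7 for _ in range(7)]
fg = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        fg[y][x] = 1
fg[0][6] = 1                          # isolated false detection

cleaned = median3x3(fg)
masked = mask_probability(prob, cleaned)
for row in masked:
    print(row)
```

In this toy case every pixel has the same color probability, so an unmasked Camshift search would have nothing to converge to; after masking, only pixels inside the detected object keep a nonzero value. Note that a 3x3 median also erodes the corners of such a small blob.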

Experimental Results and Analysis
The experiments were carried out on an ordinary PC with an Intel(R) Core(TM) i3 processor, 3.0 GB RAM, and the 64-bit Windows 7 operating system. The programming environment is Microsoft Visual Studio 2010 and OpenCV 2.4.4.

Results of Simplified Codebook Algorithm.
We use five different videos to illustrate the detection performance of the proposed simplified Codebook algorithm. Video #1 was captured by us. Video #2 is the well-known Waving Tree video [25]. Video #3 is chosen from the PETS2000 project [26]. Videos #4 and #5 are selected from the ITEA CANDELA project [24].
The detection results of Video #1 and Video #2 using the proposed simplified Codebook algorithm are depicted in Figure 5. In order to show the superiority of the proposed algorithm, we also performed comparisons with other motion detection algorithms, namely, the Gaussian Mixture Model and the original Codebook model. The real foreground extracted by hand is also depicted in Figure 5.
From Figure 5, we can see that the detection results of the proposed simplified Codebook algorithm are better than those of the Gaussian Mixture Model and the original Codebook model. The original Codebook model may lead to false detection in regions with low brightness. The simplified Codebook algorithm reduces the influence of low brightness areas by using hierarchical matching.
The computational difference between the simplified Codebook algorithm and the original Codebook model mainly lies in the process of calculating direct parameters from indirect parameters. The original Codebook model includes 2 multiplications, 1 division, and 1 subtraction, while the simplified Codebook algorithm includes 1 division and 3 multiplications, or 2 divisions and 1 addition. The simplified Codebook algorithm is therefore more likely to save operations, especially when the average brightness is lower than the threshold of the low brightness region. The detection speed of the simplified Codebook algorithm on Video #1 is 47 ms/frame, while that of the original Codebook model is 62 ms/frame. More experimental results of different detection algorithms on Video #3, Video #4, and Video #5 are given in Figure 6. From these figures, we can also see the superiority of the proposed simplified Codebook algorithm.
In order to evaluate the performance of the proposed simplified Codebook algorithm, we draw the ROC curves of three different algorithms in Figure 7. In this figure, the false positive rate is defined as the ratio of the number of background pixels falsely detected as foreground pixels to the total number of background pixels, and the true positive rate is defined as the ratio of the number of foreground pixels correctly detected as foreground pixels to the total number of foreground pixels. The larger the area under the ROC curve, the better the performance.
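The two rates defined above can be computed from a detected mask and a hand-labeled ground-truth mask as in the following sketch. The function names and the trapezoidal area estimate, which adds the end points (0, 0) and (1, 1), are assumptions made for illustration and are not necessarily how the curves in Figure 7 were produced.

```python
def roc_point(detected, truth):
    """Return (false positive rate, true positive rate) for two binary masks."""
    tp = fp = fg = bg = 0
    for drow, trow in zip(detected, truth):
        for d, t in zip(drow, trow):
            if t:                     # ground-truth foreground pixel
                fg += 1
                tp += 1 if d else 0
            else:                     # ground-truth background pixel
                bg += 1
                fp += 1 if d else 0
    fpr = fp / bg if bg else 0.0
    tpr = tp / fg if fg else 0.0
    return fpr, tpr


def area_under_curve(points):
    """Trapezoidal area under ROC points, with (0, 0) and (1, 1) added."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


truth = [[1, 1, 0, 0],
         [0, 0, 0, 0]]
detected = [[1, 0, 1, 0],
            [0, 0, 0, 0]]
fpr, tpr = roc_point(detected, truth)
print(fpr, tpr, area_under_curve([(fpr, tpr)]))
```

Each setting of the detection thresholds yields one such point; sweeping the thresholds traces the full curve.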

Results of SCMC Algorithm.
Here, we also use several videos to test the tracking performance of the proposed SCMC algorithm. Video #3, Video #4, Video #5, and Video #6 [27] are used in this part. First, we show the tracking results of a white car in Video #4 using the original Camshift algorithm in Figure 8. From Figure 8, we can easily see that serious tracking errors occur with the original Camshift algorithm when the background color and the object color are even slightly similar.
Then we show the tracking results of the same object in Video #4 using the proposed SCMC algorithm in Figure 9. Figure 9(a) shows the color probability distribution images masked by the moving object detection result of the simplified Codebook algorithm. Figure 9(b) shows the corresponding tracking result images of the proposed SCMC algorithm. In this figure, most of the background information is filtered out of the masked color probability distribution image. The probability of false convergence is therefore greatly reduced when using the SCMC algorithm, and higher tracking accuracy is obtained.
More comparisons of the tracking performance between the original Camshift algorithm and the proposed SCMC algorithm are shown in Figures 10, 11, and 12. We also give the tracking results of the original Codebook with Camshift algorithm and the Compressive Tracking algorithm [28] in these figures. From these figures, we can see that the tracking performance is easily influenced by the background color when using the original Camshift algorithm. With the SCMC algorithm, however, most of the background information is filtered out, which largely reduces the possibility of false convergence and yields better tracking performance. The tracking results are also better than those of the original Codebook with Camshift algorithm and the Compressive Tracking algorithm.
In Figure 13, we show the tracking results of a person using 4 different algorithms. As can be seen in this figure, the tracking performance of the proposed SCMC algorithm is also better than that of the original Camshift algorithm and the original Codebook with Camshift algorithm. In Figure 13(c), one of the targets is lost by the method of incremental learning for robust visual tracking. However, the person can be correctly tracked by the proposed SCMC algorithm. In order to illustrate the superiority of the proposed SCMC algorithm, we compare it with several state-of-the-art object tracking algorithms, namely, the original Camshift algorithm, the CT (Compressive Tracking) algorithm [28], and the TLD algorithm [29], in Table 1. The number of correctly tracked frames for some cars in Video #3, Video #4, and Video #5 is summarized in this table. We consider a frame correctly tracked if the center of the bounding box falls into the object area.
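The correctness criterion above can be made concrete with a short sketch. It assumes that the object area of each frame is given as an axis-aligned ground-truth box `(x, y, w, h)`; the paper does not state how the object area is represented, so both this representation and the function names are hypothetical.

```python
def is_correct(track_box, object_box):
    """True if the center of the tracked box lies inside the object area."""
    x, y, w, h = track_box
    cx, cy = x + w / 2, y + h / 2
    ox, oy, ow, oh = object_box
    return ox <= cx <= ox + ow and oy <= cy <= oy + oh


def correct_tracking_rate(tracks, objects):
    """Fraction of frames in which the tracked box is counted as correct."""
    if not tracks:
        return 0.0
    hits = sum(is_correct(t, o) for t, o in zip(tracks, objects))
    return hits / len(tracks)


# Two frames: the first is tracked correctly, the second has drifted away.
tracks = [(10, 10, 20, 20), (50, 50, 10, 10)]
objects = [(15, 15, 20, 20), (0, 0, 10, 10)]
print(correct_tracking_rate(tracks, objects))
```

This criterion only checks the box center, so a bounding box that is much larger or smaller than the object still counts as correct as long as its center lies on the object.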

Mathematical Problems in Engineering
From Table 1, we find that the proposed SCMC algorithm can correctly track most objects in Video #3, Video #4, and Video #5, and its tracking performance is significantly superior to that of the original Camshift algorithm. Moreover, the correct tracking rate of SCMC is 28.2% higher than that of TLD and 19.4% higher than that of CT. The SCMC algorithm is therefore superior to the TLD and CT algorithms when the camera is stable.
Notice that the tracking performance of the proposed SCMC algorithm for the gray car in Video #3 is worse than that of all the other algorithms in Table 1. This special case is also shown in Figure 14. The reason is that a black car approaches the object and then leaves it during tracking, which causes the false tracking of the SCMC algorithm. By contrast, the gray road information near the gray car is not filtered out by the Camshift algorithm, which helps the centroid converge on the gray car. One possible disadvantage of the proposed SCMC algorithm is therefore that the tracking performance may be affected if there is another moving object with a similar color in the video, even if the interference of the background has already been suppressed.

Conclusion
This paper first proposes a simplified Codebook algorithm called the hierarchical matching 5-tuple-based Codebook algorithm. The average intensity is introduced as a variable into the Codebook model instead of the minimal and maximal intensities. Different matching methods between the current pixel and codeword are adopted according to the average intensity in the high and low intensity areas, respectively. Based on the simplified Codebook algorithm, the proposed SCMC algorithm masks color probability distribution images with moving object detection results, in which the pixel values in the background are set to 0 to filter out the interference of the background on Camshift tracking. The probability of false convergence is therefore greatly reduced. The algorithm has a higher correct tracking rate than the classical Camshift algorithm. Moreover, its correct tracking rate is superior to that of TLD and CT if the position of the camera is stable. However, it also has some disadvantages: (1) it may fail to track if the background changes rapidly and (2) it may track a false object if two objects have quite similar colors. To address the first disadvantage, a future improvement may be the construction of a more robust background model for foreground detection under rapidly changing backgrounds. For the second disadvantage, texture information can be included in the model to increase the distinction between different objects by enriching the object features.

Figure 1: Tracking results of Camshift when obvious difference exists in object color and background color.

Figure 2: Tracking results of Camshift when no obvious difference exists in object color and background color.

Figure 3: Illustration of the masking procedure of a color probability distribution image.
Figure 8(a) shows original color probability distribution images, and Figure 8(b) shows the corresponding tracking result images of the original Camshift algorithm.
(a) Color probability distribution images of masked simplified Codebook algorithm (b) Tracking result images of the proposed SCMC algorithm

Figure 9: Tracking results of a red car in the 165th, 173rd, 181st, and 189th frames in Video #4 using the proposed SCMC algorithm.
(a) Original Camshift algorithm (b) The original Codebook + Camshift algorithm (c) Compressive Tracking algorithm (d) The proposed SCMC algorithm

Figure 10: Tracking results of a red car in the 70th, 85th, 100th, and 115th frames in Video #3.
(a) Original Camshift algorithm (b) The original Codebook + Camshift algorithm (c) Compressive Tracking algorithm (d) The proposed SCMC algorithm

Figure 11: Tracking results of a black car in the 205th, 210th, 215th, and 220th frames in Video #3.

Figure 12: Tracking results of a crimson car in the 50th, 60th, 70th, and 80th frames in Video #4.

Figure 14: Tracking results of a gray car in Video #3 using SCMC algorithm; the 144th, 149th, 154th, and frames are shown here.
(1) A new pixel value is read from the sequence: $\vec{x}_t = (R_t, G_t, B_t)$.

Table 1: Comparison of tracking results for different tracking algorithms.