Object tracking using Mean Shift (MS) has attracted considerable attention recently. In this paper, we address one of its shortcomings. Mean Shift is designed to find local maxima when tracking objects. Consequently, when the target moves a large distance between two consecutive frames, the local and global modes no longer coincide as they did in previous frames, and the Mean Shift tracker may fail to localize the global mode and thus lose the desired object. To overcome this problem, a multi-bandwidth procedure is proposed that helps the conventional MS tracker reach the global mode of the density function from any starting point. This gradual smoothing procedure, called Multi-Bandwidth Mean Shift (MBMS), automatically smooths the kernel function through a multiple kernel-based sampling procedure. Since low computational complexity is important for real-time applications, we also try to reduce the number of iterations needed to reach the global mode. Based on our results, the proposed version of MS tracks an object from the same initial point much faster than the conventional MS tracker.

Methods that use a kernel function as a density estimator have drawn much attention in image processing. Object tracking [

The conventional and original version is a nonparametric kernel density estimator

According to (

In this paper, we propose using multiple bandwidths in a conventional mean shift tracker, where a broad bandwidth plays the central role in tracking larger motions. Because a large bandwidth smooths the surface, the fixed-point iteration converges faster and can keep track of the target. We argue that the bandwidths can be obtained automatically. The overall algorithm and the effect of the bandwidth choice are shown and explained in detail below. Since we are dealing with automatic bandwidth selection, the optimal bandwidth is the final goal after smoothing the likelihood surface (i.e., the cost function). The optimal bandwidth is the one at which the global mode and the other local modes are clear enough to seek (i.e., the initial state of the likelihood surface); at this same bandwidth, MBMS (Multi-Bandwidth Mean Shift) performs the final search for the global mode.

In fact, there are a variety of ways to select the optimal bandwidth in an automatic bandwidth selection procedure. In this procedure, we first calculate the optimal bandwidth according to [

This large bandwidth is meant to create a unimodal likelihood surface, that is, a surface consisting of a single mode. The Mean Shift procedure uses this mode in its first run; it does not matter whether this mode is the global one. Once Mean Shift has found the location of this mode, the MB procedure continues with MS iterations down to the optimal bandwidth (i.e., the last selected bandwidth), where it finally reaches the global mode.
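The smoothing effect of a large bandwidth can be illustrated with a minimal sketch that counts the local maxima of a one-dimensional Gaussian kernel density estimate for several bandwidths. The data, the grid, the bandwidth values, and the helper names `kde` and `count_modes` are assumptions chosen for illustration; they are not the data or code used in this paper.

```python
import numpy as np

def kde(grid, data, h):
    """Gaussian kernel density estimate evaluated on `grid` with bandwidth `h`."""
    diff = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * diff**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def count_modes(density):
    """Count strict interior local maxima of a sampled 1D curve."""
    inner = density[1:-1]
    return int(np.sum((inner > density[:-2]) & (inner > density[2:])))

rng = np.random.default_rng(0)
# Hypothetical data: three well-separated clusters (not the paper's data).
data = np.concatenate([rng.normal(m, 0.5, 200) for m in (-6.0, 0.0, 7.0)])
grid = np.linspace(-15.0, 15.0, 1201)

# Number of modes of the estimated surface for each (assumed) bandwidth.
modes_by_bandwidth = {
    h: count_modes(kde(grid, data, h)) for h in (0.5, 1.0, 2.0, 4.0, 8.0)
}
print(modes_by_bandwidth)
```

With a small bandwidth the three clusters appear as separate modes, whereas the largest bandwidth in this sketch merges them into a single mode, which is the situation the first Mean Shift run relies on.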

From the above equations, we obtain the optimal bandwidth used at the last stage of MBMS, at which all modes, especially the global mode, are clear. If we use three or four times the optimal bandwidth, as can be seen in the 1D and 2D likelihood surfaces (i.e., cost functions) in Figures

One-dimensional Gaussian mixture surface changes with the bandwidth: 500, 1000, 1500, 2500, 3500, 5500, 7500, and 15500.

Two-dimensional Gaussian mixture surface changes with the bandwidth: [0.13 0;0 0.13], [1.3 0;0 1.3], [2.6 0;0 2.6], and [6.5 0;0 6.5].

The monotonic decrease of the bandwidth ends at h0, the experimental optimal bandwidth. Figure

There is also a method to find the minima instead of maxima [

MBMS algorithm:

selecting the sequence of bandwidth

a starting location for first MB (multi-bandwidth) procedure and converge using

we run MS for each
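The steps above can be sketched as follows for one-dimensional data. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the Gaussian kernel, the bandwidth sequence `(8, 4, 2, 1)`, the synthetic data, and the function names `mean_shift` and `multi_bandwidth_mean_shift` are all hypothetical choices.

```python
import numpy as np

def mean_shift(x, data, h, tol=1e-6, max_iter=500):
    """Gaussian-kernel mean shift from start `x` with bandwidth `h`.

    Returns the converged location and the number of iterations used.
    """
    for it in range(1, max_iter + 1):
        w = np.exp(-0.5 * ((data - x) / h) ** 2)  # kernel weight of each sample
        x_new = np.sum(w * data) / np.sum(w)      # weighted mean = next estimate
        if abs(x_new - x) < tol:
            return x_new, it
        x = x_new
    return x, max_iter

def multi_bandwidth_mean_shift(x0, data, bandwidths):
    """Run mean shift once per bandwidth, largest first, warm-starting each run."""
    x, total_iterations = x0, 0
    for h in sorted(bandwidths, reverse=True):
        x, it = mean_shift(x, data, h)
        total_iterations += it
    return x, total_iterations

rng = np.random.default_rng(1)
# Hypothetical data: a small cluster near 0 (local mode) and a dominant
# cluster near 10 (global mode).
data = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(10.0, 1.0, 600)])

x_ms, _ = mean_shift(-1.0, data, h=1.0)
x_mbms, _ = multi_bandwidth_mean_shift(-1.0, data, bandwidths=(8.0, 4.0, 2.0, 1.0))
print(f"fixed bandwidth: {x_ms:.2f}, multi-bandwidth: {x_mbms:.2f}")
```

Started near the small cluster, the fixed-bandwidth run stays at the local mode, while the multi-bandwidth run is first drawn toward the dominant cluster on the smoothed surface and then refined at the smaller bandwidths. The sketch illustrates the escape from a local mode only; it makes no claim about iteration counts.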

Selecting and using multiple bandwidths costs some delay and computational burden. MS has been proved to be a quadratic bound optimization according to [

We run the two algorithms on the same data sets and report the results in Table

The step size iterations between frame 88 and 89.

Algorithm | Step-size | Step-size | Center in Frame 88 | Center in Frame 89 |
---|---|---|---|---|
Standard MS | 0.0040 | −0.0006 | 106 | 106 |
Multi-Bandwidth | 0.0027 | −0.007 | 58 | 58 |

Step sizes of the iterations between frames 89 and 90. Both the standard method and the proposed one take seven iterations, but with different step sizes.

Algorithm | Step-size | | Center position of square 90 | |
---|---|---|---|---|
Standard | −0.1353 | 0.0006 | 104.5116 | 58.0067 |
Multi-bandwidth | −0.0889 | −0.0192 | 105.0216 | 57.7888 |
Standard | −0.0881 | −0.0002 | 103.5420 | 58.0048 |
Multi-bandwidth | −0.0640 | −0.0199 | 104.3180 | 57.5700 |
Standard | −0.0515 | 0.0200 | 102.9758 | 58.2248 |
Multi-bandwidth | −0.0449 | −0.0037 | 103.8240 | 57.5298 |
Standard | −0.0754 | 0.0185 | 102.1470 | 58.4286 |
Multi-bandwidth | −0.0613 | −0.0138 | 103.1495 | 57.3778 |
Standard | −0.0082 | 0.0169 | 102.0563 | 58.6143 |
Multi-bandwidth | −0.0278 | −0.0121 | 102.8439 | 57.2442 |
Standard | −0.0197 | 0.0023 | 101.8392 | 58.6395 |
Multi-bandwidth | −0.0265 | −0.0193 | 102.5524 | 57.0324 |
Standard | 0.0124 | 0.0181 | 101.8392 | 58.6395 |
Multi-bandwidth | −0.0101 | −0.0107 | 102.5524 | 57.0324 |

The algorithm is implemented and evaluated on an object represented by a normalized rectangular region, the target region. We choose color as the feature in the target model

Using the Taylor expansion, the first-order linear approximation helps us solve the optimization problem efficiently via MS iteration at the initial point

Because of its fixed bandwidth, conventional MS cannot find the global mode in the presence of local modes created by rapid motion, illumination changes, clutter, and occlusion, as shown in Figure


Top: cost function surface between frames 91 and 92. The black clutter in the background (bottom images) creates some local modes in the surface

Frames 88, 89, 90, 91, and 92 show us a tracking failure due to the black-colored clutter in background

Multi-bandwidth tracker did not fail to track in this cluttered sequence (multimodal surface).

The object tracking methods proposed so far have each addressed some of the weak points in this field, but they share some common problems. All of them have difficulty finding the object location when it moves a large distance between two successive frames. For all of them, the starting point is very important for tracking the object correctly. Background problems such as clutter, occlusion, and illumination changes can completely alter the tracking path and cause failure, exactly as illustrated in Figure

The conventional MS has all of the problems described above as well. In our work, we make the tracker robust to different initial points in an image by exploiting the bandwidth variation of the kernel function through adaptive step-size iteration. In effect, we incorporate an object detector into the localization procedure to recover from any failure when it occurs. Using a detector was also previously proposed for particle-filter-based tracking [

The multi-bandwidth tracker starts with 3 or 4 bandwidth-shifting iterations in an MS procedure. It is worth noting that, because color is used as the feature, some unwanted modes may be created merely by the difference between the color values of two points [

The most important limitation of the proposed method is that the bandwidth series is selected manually. An automatic bandwidth selector based on some features could be developed, but in this paper we use a manually chosen multi-bandwidth series to track correctly, as illustrated in Figure

Figure

Top row: MS, Bottom Row: MBMS.

Left plot: error from the GT (ground truth, the zero state). Right plot: similarity with the target (the GT has 100% similarity). Compared with the GT, we can observe that MBMS successfully tracks the circle through the entire sequence.

Figure

Frames: 1, 52, 79, 82, 90, 110. Left column: MS; right column: MBMS.

Left plot: error from the GT (ground truth, the zero state). Right plot: similarity with the target (the GT has 100% similarity). Compared with the GT, we can observe that MBMS successfully tracks the bus through the entire sequence.

Figure

Frames: 1, 140, 260, 290.

Plot labels: MS, MBMS, GT.

Top plot: Iteration number for MS while tracking the hand in Figure

Plot labels: MS, MBMS.

We can observe that MBMS tracks the hand better than MS through the entire sequence, and with a lower number of iterations.

At Table

Data set no. 1 (1D synthetic data). A total of 1000 data points are drawn with equal probability from four normals:

Data set no. 2 (2D synthetic data). A total of 1050 bivariate data points are drawn with equal probability from three normals

Comparison of Number of Iterations for Convergence for 1D and 2D data set.

Data set | Initial point | Number of iterations (MBMS) | Number of iterations (MS) |
---|---|---|---|
data set #1 | 0 | 33 | 51 |
data set #1 | 1 | 36 | 77 |
data set #1 | 3 | 21 | 33 |
data set #2 | (−5,20) | 22 | 34 |
data set #2 | (−10,16) | 21 | 29 |
data set #2 | (20,10) | 23 | 35 |
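To make the comparison in the table easier to read, the relative reduction in iterations can be computed directly from the reported counts. The short sketch below only performs this arithmetic; the iteration counts are copied from the table, while the variable names and the percentage measure are our own illustrative choices.

```python
# (data set, initial point) -> (MBMS iterations, MS iterations), copied from the table.
iterations = {
    ("data set #1", "0"): (33, 51),
    ("data set #1", "1"): (36, 77),
    ("data set #1", "3"): (21, 33),
    ("data set #2", "(-5,20)"): (22, 34),
    ("data set #2", "(-10,16)"): (21, 29),
    ("data set #2", "(20,10)"): (23, 35),
}

# Percentage of MS iterations saved by MBMS for each starting point.
reductions = {
    key: 100.0 * (ms - mbms) / ms for key, (mbms, ms) in iterations.items()
}

for (name, start), pct in reductions.items():
    print(f"{name}, start {start}: {pct:.1f}% fewer iterations with MBMS")

mean_reduction = sum(reductions.values()) / len(reductions)
print(f"mean reduction: {mean_reduction:.1f}%")
```

For these six starting points, the reduction ranges from roughly 28% to 53% of the MS iteration count.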

A new kernel-based object tracking framework is proposed. The main contribution is the use of a large prior bandwidth for a priori tracking, followed by the estimated tracking. This framework is robust to noise and clutter, so it can escape from many local maxima. The tracking algorithm (i.e., MBMS) can converge faster than conventional kernel-based object tracking (i.e., MS). However, some problems and weaknesses remain to be clarified in later work, and many of the results can still be analyzed theoretically. This paper can serve as a helpful reference for later extensions of this work. The experimental results above illustrate the performance of this approach. As the data sets above show, rapid motion of an object produces a large displacement between two adjacent frames, which leads MS to fail in tracking the object. With the multi-bandwidth proposal, we improve the ability of MS to recover from such failures by incorporating a detector, the multi-bandwidth kernel functionality, into the localization process. In comparison with conventional MS and other techniques like [