In this paper, we propose an evolutionary correlation filtering approach for solving pose estimation in noncontinuous video sequences. The proposed algorithm computes the linear correlation between the input scene, which contains a target in an unknown environment, and a bank of matched filters constructed from multiple views of the target and from estimates of the statistical parameters of the scene. An evolutionary approach is implemented to find the optimal filter, that is, the one that produces the highest matching score in the correlator. The parameters of the filter bank evolve through generations to refine the quality of pose estimation. The obtained results demonstrate the robustness of the proposed algorithm under challenging image conditions such as noise, cluttered background, abrupt pose changes, and motion blur. The proposed algorithm achieves high accuracy, measured with objective metrics, for pose estimation in noncontinuous video sequences.

Pose estimation is an important task widely used in three-dimensional (3D) imaging applications to obtain descriptors such as location, orientation, scaling, depth, or geometric visualization of a target [

Correlation filtering is a pattern recognition technique that presents high accuracy in location estimation of a target [

Conventionally, the design of correlation filters requires an explicit knowledge of the appearance and shape of the target [

Template matching based on correlation filters can be used to solve 3D pose estimation of rigid objects [

In this work, we are motivated to find the optimal solution to the 3D pose estimation problem when the target can present abrupt pose changes in the scene, i.e., when the search space of location, orientation, and scaling parameters of the target is very large. This challenge needs to be treated as a combinatorial optimization problem that can become numerically intractable as the number of feasible poses of the target increases [

In this paper, we propose an evolutionary correlation filtering approach to solve the pose estimation problem efficiently. The proposed evolutionary correlation filtering approach can be defined as a hybrid metaheuristic that combines correlation filters [

The three-dimensional pose parameters of a target can be estimated with a single monocular camera.

The proposed algorithm solves the pose estimation problem through evolutionary correlation filtering, a hybrid of a genetic algorithm and correlation filters.

An adaptive filter bank evolves dynamically to track the pose of a target, achieving high estimation accuracy in terms of location, orientation, and scaling.

Robust pose estimation is performed under noncontinuous video sequences, with image conditions such as additive noise, cluttered background, and motion blur.

The paper is organized as follows. Section

In this section, a brief description of the main components of the proposed evolutionary correlation filtering approach is presented. This approach combines correlation filters and a metaheuristic based on a genetic algorithm as the global optimization method to solve the 3D pose estimation problem.

Assume that a 3D object is observed with a monocular camera, as shown in Figure

Image model representation.

Correlation filtering quantifies the level of linear correspondence between two signals. This filtering is widely used in pattern recognition applications to detect a target

Note that the filter given in (

In the filter bank, a set of synthetic references are generated from a digital 3D model using computer graphics to establish a specific pose state. Each reference template represents a single view of the target

Block diagram of template matching using correlation filters.
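As a rough illustration of template matching with a filter bank, the following sketch correlates a scene with each template of a small bank and keeps the highest peak. It is a simplification under stated assumptions: it uses plain zero-mean template correlation in the spatial domain, not the matched filters with scene statistics described in this paper, and the names `correlate2d` and `best_match` are hypothetical.

```python
def correlate2d(scene, template):
    """Linear cross-correlation of a zero-mean template at every valid offset."""
    sh, sw = len(scene), len(scene[0])
    th, tw = len(template), len(template[0])
    mean_t = sum(map(sum, template)) / (th * tw)
    plane = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            row.append(sum((template[j][i] - mean_t) * scene[y + j][x + i]
                           for j in range(th) for i in range(tw)))
        plane.append(row)
    return plane


def best_match(scene, bank):
    """Return (peak, filter_index, (y, x)) of the highest peak over the bank."""
    best = None
    for k, template in enumerate(bank):
        plane = correlate2d(scene, template)
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                if best is None or value > best[0]:
                    best = (value, k, (y, x))
    return best


# Toy 6x6 scene containing the first template of the bank at row 2, column 3.
scene = [[0.0] * 6 for _ in range(6)]
scene[2][3] = scene[3][3] = scene[3][4] = 1.0
bank = [
    [[1.0, 0.0], [1.0, 1.0]],  # view 0 (hypothetical reference template)
    [[0.0, 1.0], [1.0, 1.0]],  # view 1 (hypothetical reference template)
]
peak, index, location = best_match(scene, bank)
print(index, location)  # filter 0 at offset (2, 3)
```

In practice the correlation would normally be computed in the frequency domain with a fast Fourier transform; the direct double loop is used here only to keep the sketch dependency-free.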

Evolutionary computation is a family of population-based algorithms with a metaheuristic or stochastic optimization character for global optimization. It is inspired by biological evolution, where genetic algorithms are the most prominent example [

Genetic algorithms became popular through the work of John Holland, particularly his book Adaptation in Natural and Artificial Systems [

Because of the above-mentioned advantages, we choose the genetic algorithm as the search tool for the pose estimation problem. In this work, the genetic algorithm is used to search for the optimal parameters of the target’s pose, consisting of scaling and orientation values.

A genetic algorithm is an adaptive heuristic search algorithm designed to simulate the evolution processes existing in natural systems, where, in the most basic form, a genetic algorithm can be modeled for computer simulation using the difference equation

Canonical genetic algorithms use a binary representation of chromosomes as fixed-length strings over the alphabet

In case of continuous parameter optimization problems, genetic algorithms typically represent a real-valued vector

In this section, we present the proposed algorithm for pose estimation of a target in noncontinuous video sequences. The main feature of this proposal is the robustness of the evolutionary correlation filtering when processing noisy and cluttered scenes. One of the most difficult challenges in target tracking is to track an object whose appearance changes over time. The proposed algorithm is able to estimate and track the pose of the target accurately, even under abrupt pose changes within a video sequence. The algorithm is based on correlation filters combined with an evolutionary computation approach. A genetic algorithm searches for the pose parameters used to compute the filter bank so that the bank produces a high matching score; candidate solutions are found by evolving the pose parameters of the target. The bank of correlation filters is able to estimate the pose of the target under difficult image conditions, such as the presence of noise, motion blur, and cluttered background. Figure

Proposed algorithm for pose estimation using evolutionary correlation filtering.

In the observation process, a video sequence enters into the system assuming the optical setup shown in Figure

Bit-string representation of an individual.
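The bit-string representation can be sketched as a fixed-point encoding of each pose parameter. The example below is only an illustration under assumed settings: the number of bits per parameter, the parameter names, and their ranges are hypothetical and are not taken from this paper.

```python
BITS = 8  # bits per pose parameter (illustrative choice)

# (name, minimum, maximum) of each encoded parameter; names and ranges are assumptions.
PARAMS = [
    ("roll", 0.0, 360.0),
    ("pitch", 0.0, 360.0),
    ("yaw", 0.0, 360.0),
    ("scale", 0.5, 1.5),
]


def encode(pose):
    """Quantize every pose parameter to BITS bits and concatenate the bit fields."""
    top = 2 ** BITS - 1
    bits = ""
    for name, lo, hi in PARAMS:
        level = round((pose[name] - lo) / (hi - lo) * top)
        level = max(0, min(top, level))  # clamp values outside the range
        bits += format(level, f"0{BITS}b")
    return bits


def decode(bits):
    """Map a chromosome back to real-valued pose parameters."""
    top = 2 ** BITS - 1
    pose = {}
    for k, (name, lo, hi) in enumerate(PARAMS):
        level = int(bits[k * BITS:(k + 1) * BITS], 2)
        pose[name] = lo + level / top * (hi - lo)
    return pose


chromosome = encode({"roll": 0.0, "pitch": 360.0, "yaw": 180.0, "scale": 1.0})
print(chromosome)
print(decode(chromosome))
```

With this kind of encoding, the decoded value differs from the original by at most one quantization step, so the number of bits controls the resolution of the pose search.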

For the consecutive frames

Template matching is employed to detect the target in the scene and estimate its location coordinates within the current frame. As its main feature, the algorithm yields high accuracy under challenging image conditions such as noise, motion blur, and background clutter. In the proposed methodology, template matching is implemented in three stages: template generation, correlation filtering, and evaluation.

In the

Once the templates are generated, the next stage is the computation of

In the next subsystem, the DC is utilized as an objective function to quantify the quality (fitness value) of the candidate solutions contained in

Figure

Genetic operators performed by the filter bank evolution subsystem.
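One generation of selection, crossover, and mutation over bit-string individuals could look like the sketch below. It is a generic genetic-algorithm step, not the exact operators of this paper: the truncation selection, single-point crossover, and single-bit mutation applied per child are assumptions, and `evolve` is a hypothetical name. The default rates match the selection and mutation rates reported in the experiments.

```python
import random


def evolve(population, fitness, selection_rate=0.5, mutation_rate=0.2, rng=random):
    """Return the next generation of equal-length bit-string chromosomes."""
    # Selection: keep the fittest fraction of the population as parents.
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:max(2, int(len(ranked) * selection_rate))]

    children = []
    while len(parents) + len(children) < len(population):
        # Crossover: single cut point between two distinct parents.
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        # Mutation: flip one random bit with probability mutation_rate.
        if rng.random() < mutation_rate:
            pos = rng.randrange(len(child))
            flipped = "1" if child[pos] == "0" else "0"
            child = child[:pos] + flipped + child[pos + 1:]
        children.append(child)
    return parents + children


# Toy run: the number of ones stands in for the matching score of a filter.
rng = random.Random(0)
population = ["".join(rng.choice("01") for _ in range(16)) for _ in range(32)]
for _ in range(20):
    population = evolve(population, lambda c: c.count("1"), rng=rng)
print(max(c.count("1") for c in population))
```

Because the parents survive unchanged into the next generation, the best fitness in the population never decreases from one generation to the next.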

The

Next, the

Then, the

The filter bank evolution subsystem iterates until the maximum number of generations

In this section, we present the experimental results obtained with the proposed algorithm for pose estimation in noncontinuous video sequences. We evaluate the convergence of the genetic algorithm in terms of the detection efficiency of the evolutionary correlation filters. Also, we quantify the performance of pose estimation in terms of location and orientation errors. The performance of the proposed algorithm is evaluated and discussed by processing noisy input scenes. Then, candidate solutions given by the genetic algorithm are analyzed in terms of quality of pose estimation. We measure the performance of the proposed approach under abrupt pose changes of the target. Finally, we show that the proposed algorithm performs well in noncontinuous video sequences in a real environment.

The proposed evolutionary correlation filtering algorithm for pose estimation was implemented in the C/C++ programming language. Computer graphics are rendered with the OpenGL library. We used the Stanford Bunny [

Table

Parameters used in the experiments.

Parameters | Values |
---|---|
Population size | 16, 32, 64, 128 |
Number of generations | 1, 10, 20, 50, 100, 200, 500 |
Selection rate | 0.5 |
Mutation rate | 0.2 |

We tested the performance of the algorithm in terms of discrimination capability (DC) yielded by the correlation filtering through generations of the evolutionary algorithm, as shown in Figure

Performance of target detection of the evolutionary correlation filtering algorithm in terms of DC. The solid and error bars represent the mean value and the standard deviation, respectively, for 400 frames in 30 separate sample runs.
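For readers unfamiliar with the metric, the sketch below computes a discrimination capability from a correlation plane. It assumes the definition commonly used in the correlation filtering literature, DC = 1 - |c_b|^2 / |c_t|^2, where c_t is the correlation peak in the target region and c_b is the highest value in the background; the exact definition used in this paper is not reproduced here, and the function name and `radius` parameter are hypothetical.

```python
def discrimination_capability(plane, target_yx, radius=1):
    """DC = 1 - |c_b|^2 / |c_t|^2 (assumed definition).

    c_t: largest magnitude within `radius` pixels of the target location.
    c_b: largest magnitude in the rest of the correlation plane.
    """
    ty, tx = target_yx
    c_t = 0.0
    c_b = 0.0
    for y, row in enumerate(plane):
        for x, value in enumerate(row):
            if abs(y - ty) <= radius and abs(x - tx) <= radius:
                c_t = max(c_t, abs(value))
            else:
                c_b = max(c_b, abs(value))
    if c_t == 0.0:
        raise ValueError("no correlation response in the target region")
    return 1.0 - (c_b ** 2) / (c_t ** 2)


plane = [
    [0.1, 0.2, 0.1],
    [0.2, 1.0, 0.3],
    [0.1, 0.2, 0.5],
]
dc = discrimination_capability(plane, target_yx=(1, 1), radius=0)
print(dc)  # 0.75
```

Under this definition, values close to one indicate that the target peak clearly dominates the background, while negative values mean that a background sidelobe exceeds the target peak.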

Also, in this experiment we tested the performance of the proposed algorithm in terms of accuracy of the location estimation of the target. For this we measure the location error (LE) between the real

Performance of location estimation of the proposed evolutionary correlation filtering algorithm in terms of LE. The solid and error bars represent the mean and the standard deviation, respectively, for 400 frames in 30 separate sample runs.

Performance of pose estimation of the proposed evolutionary correlation filtering algorithm in terms of OE. The solid and error bars represent the mean and the standard deviation, respectively, for 400 frames in 30 separate sample runs.
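The two error metrics can be illustrated as follows. This sketch assumes that LE is the Euclidean distance in pixels between the true and estimated target locations, and that OE is the Euclidean distance between the three true and estimated orientation angles in degrees; the wrap-around handling of angles is an added assumption, and the function names are hypothetical.

```python
import math


def location_error(true_xy, est_xy):
    """Euclidean distance in pixels between true and estimated locations."""
    return math.hypot(est_xy[0] - true_xy[0], est_xy[1] - true_xy[1])


def orientation_error(true_angles, est_angles):
    """Euclidean distance in degrees over the three orientation angles.

    Each difference is wrapped to [-180, 180) so that 350 and 10 degrees
    are treated as 20 degrees apart (an assumption of this sketch).
    """
    diffs = [((e - t + 180.0) % 360.0) - 180.0
             for t, e in zip(true_angles, est_angles)]
    return math.sqrt(sum(d * d for d in diffs))


print(location_error((10, 20), (13, 24)))                         # 5.0
print(orientation_error((350.0, 0.0, 0.0), (10.0, 0.0, 0.0)))     # 20.0
```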

The performance of the proposed algorithm is evaluated when processing scenes degraded with additive noise. We tested the noise robustness of the algorithm by considering additive noise with signal-to-noise ratio (SNR) values of 50 dB, 10 dB, 5 dB, and 2 dB with respect to the number of generations of the evolutionary correlation filtering algorithm. For this experiment, we use a fixed population size of 32 individuals and a mutation rate of 20%.
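To make the noise levels concrete, the sketch below degrades an image with additive noise at a prescribed SNR. It assumes zero-mean white Gaussian noise and defines the SNR as the ratio of the image variance to the noise variance, expressed in decibels; the paper's exact noise model and SNR definition may differ, and `add_noise` is a hypothetical name.

```python
import math
import random


def add_noise(image, snr_db, rng):
    """Add zero-mean Gaussian noise so that var(image) / var(noise) = 10^(snr_db / 10)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    signal_power = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    sigma = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]


# Toy 64x64 checkerboard image with intensity values 0 and 1.
image = [[float((x + y) % 2) for x in range(64)] for y in range(64)]
for snr_db in (50, 10, 5, 2):
    noisy = add_noise(image, snr_db, random.Random(0))
    noise = [n - p for nr, pr in zip(noisy, image) for n, p in zip(nr, pr)]
    noise_power = sum(v * v for v in noise) / len(noise)
    print(snr_db, round(10 * math.log10(0.25 / noise_power), 2))
```

The printed value is the SNR measured on the generated noise sample, which should stay close to the requested level for an image of this size.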

Figure

Performance of target detection of the proposed algorithm for noisy scenes in terms of the DC. The solid and error bars represent the mean value and the standard deviation, respectively, for 400 frames in 30 separate sample runs.

Figure

Performance of location estimation of the proposed algorithm for noisy scenes in terms of the LE. The solid and error bars represent the mean value and the standard deviation, respectively, for 400 frames in 30 separate sample runs.

The performance of the proposed algorithm in terms of orientation estimation of the target is shown in Figure

Performance of pose estimation of the proposed algorithm for noisy scenes in terms of the OE. The solid and error bars represent the mean value and the standard deviation, respectively, for 400 frames in 30 separate sample runs.

Figure

Candidate solutions for 1, 50, and 100 generations, and the best solution obtained by the proposed evolutionary correlation filtering approach.

We test the robustness of the proposed algorithm in frame sequences which contain noncontinuous pose changes of a target. For these experiments, we process a synthetic video composed of abrupt pose changes of a target varying from 5 to 50 degrees between two consecutive frames. The pose changes are calculated in terms of the degrees of difference given by the Euclidean distance for the three coordinate angles of the current

Figure

Results of target detection of the proposed algorithm in noncontinuous video sequences in terms of DC. The error bars represent the standard deviation of measurements for 400 frames in 30 separate sample runs.

Figures

Results of location estimation of the proposed algorithm in noncontinuous video sequences in terms of LE. The error bars represent the standard deviation of measurements for 400 frames in 30 separate sample runs.

Results of orientation estimation of the proposed algorithm in noncontinuous video sequences in terms of OE. The error bars represent the standard deviation of measurements for 400 frames in 30 separate sample runs.

We evaluate the performance of the proposed algorithm for pose estimation in real scenes. In this experiment, we processed a video sequence of 400 frames with 256

Results of pose estimation obtained with the proposed evolutionary correlation filtering approach in a real video sequence. (a) The algorithm evolves into optimal estimates in the subsequent frames. (b) The algorithm supports abrupt pose changes in noncontinuous target displacement. (Video

In this paper, an evolutionary correlation filtering approach for pose estimation in noncontinuous video sequences was presented. The proposed algorithm proved robust for 3D pose estimation and tracking of a target under challenging image conditions such as additive noise, cluttered background, and motion blur caused by imprecise video capture. The performance of the proposed algorithm was tested by processing video sequences containing noncontinuous segments and abrupt pose changes of the target. According to the obtained results, the proposed algorithm was able to adapt the correlation filtering process using an evolutionary approach, and it successfully refined the final pose estimate given by the correlator. For pose tracking, the algorithm used the pose estimated in previous frames, assuming that the object moves continuously within the scene. Additionally, the evolutionary approach handled a wide range of candidate pose solutions, which yields high diversity in the exploration of the solution landscape when searching for the best pose parameters. Hence, the evolutionary correlation filtering remained robust under abrupt pose changes. Based on the experimental results, the proposed algorithm yielded high pose tracking accuracy for noncontinuous video sequences, measured in terms of orientation and location errors. The results also showed the versatility of correlation filtering for three-dimensional pose estimation applications. Thanks to the evolutionary approach, the filter bank adapts to the best solution found (suboptimal, or optimal in the best case). The experiments also showed accurate pose estimates for real-world scenes with the proposed evolutionary correlation filtering approach.

The data used to support the findings of this study are included within the article.

The authors declare that they have no conflicts of interest.

This research was supported by Consejo Nacional de Ciencia y Tecnología (CONACYT).

Video 1 (MPEG, 1.3 MB) shows the performance of the evolutionary correlation filtering for real images. The video sequence contains 400 frames of 256×256 pixels with three RGB color channels. In Video 1, it can be observed how the proposed algorithm successfully estimates the pose of the object despite the presence of challenging factors. The observation point undergoes rough displacements in the scene, yielding abrupt pose changes of the object between consecutive frames.