In many digital video applications, video sequences suffer from jerky movements between successive frames. This paper proposes an integrated general-purpose stabilization method that extracts motion information from successive frames and removes the translational and rotational motions that cause undesirable effects. The proposed scheme first computes the optical flow between consecutive video frames using the Horn-Schunck algorithm; an affine motion model is then adopted in conjunction with the resulting optical flow field to estimate object or camera motion. The estimated motion vectors are then used by a model-fitting filter to stabilize and smooth the video sequence. Experimental results demonstrate that the proposed scheme is efficient owing to its simplicity and provides good visual quality in terms of the global transformation fidelity, measured by the peak signal-to-noise ratio (PSNR).
Video captured by cameras often suffers from unwanted jittering motions. This problem is usually dealt with by compensating for image motion. Most video stabilization algorithms presented in the recent literature try to remove the image motion by totally or partially compensating for all motions caused by camera rotations or vibrations [
In this paper, an integrated video stabilization scheme is proposed with two primary objectives. First, rather than developing novel and complicated individual algorithms, it aims to simplify the stabilization process by integrating well-researched techniques, such as motion estimation, motion modeling, and motion compensation, into a single new framework that is modular in nature and reduces implementation complexity, particularly in hardware. Second, the scheme aims to provide better performance than existing methods in terms of the global transformation fidelity (a typical measure of stabilization performance). This is achieved by combining optical flow estimation with motion models to increase the accuracy of estimation. The scheme is based on estimating the motion field between consecutive frames using the Horn-Schunck algorithm [
The rest of this paper is organized as follows. In the next section, we present an overview of the proposed video stabilization scheme. Sections
The flowchart of the proposed stabilization scheme is shown in Figure
Flowchart of the video stabilization scheme.
The accuracy of the stabilization scheme mainly depends on the motion vectors produced during the interframe motion estimation. Here, a coarse-to-fine technique is used to perform block correlation, initially at a coarse scale, and then to interpolate the resulting estimates before they pass through iterations of Horn and Schunck’s optical flow algorithm. Optical flow is an approximation of the local image motion based upon local derivatives in a given sequence of images. That is, in two dimensions, it specifies how much each image pixel moves between adjacent images, while, in three dimensions, it specifies how much each volume voxel moves between adjacent volumes.
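As a rough illustration of the iterative part of Horn and Schunck's method referred to above, the following Python sketch estimates a dense flow field between two grayscale frames. It is a minimal sketch under stated assumptions, not the implementation used in this work: the coarse-to-fine block-correlation initialization is omitted (the flow starts from zero), and the function names, the regularization weight `alpha`, and the iteration count `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np


def _neighbour_average(f):
    """Weighted average of the 8 neighbours, with edge replication at borders."""
    p = np.pad(f, 1, mode="edge")
    direct = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    diagonal = p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]
    return direct / 6.0 + diagonal / 12.0


def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Return (u, v): horizontal and vertical flow from frame1 to frame2.

    alpha (smoothness weight) and n_iter are illustrative assumptions.
    """
    f1 = frame1.astype(np.float64)
    f2 = frame2.astype(np.float64)

    # Spatial derivatives of the averaged frames and the temporal derivative.
    iy, ix = np.gradient(0.5 * (f1 + f2))  # axis 0 is y (rows), axis 1 is x
    it = f2 - f1

    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    denom = alpha ** 2 + ix ** 2 + iy ** 2
    for _ in range(n_iter):
        u_bar = _neighbour_average(u)
        v_bar = _neighbour_average(v)
        # Shared update term derived from the brightness-constancy residual.
        t = (ix * u_bar + iy * v_bar + it) / denom
        u = u_bar - ix * t
        v = v_bar - iy * t
    return u, v
```

With this sign convention, image content that shifts one pixel to the right between the two frames yields a flow field whose horizontal component `u` is close to +1.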
To estimate optical flow of any pixel
Equation (
In motion estimation, the motion vectors produced occasionally fall outside the normal range of values. A motion vector whose value exceeds a certain threshold is characterized as an outlier. The above method is very sensitive to outliers; that is, it is prone to producing outliers or unexpected data. Therefore, a substitute value is needed to replace these outliers. Here, the median of the motion vectors is adopted because, among the geometric mean, harmonic mean, standard deviation, median, and trimmed mean, all of which have been applied and tested, the median and trimmed mean were found to be the most robust, that is, the most resistant to outliers.
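The outlier substitution described above can be sketched as follows. This is an illustrative sketch only: the text does not specify the threshold or whether it applies to the vector magnitude or to each component, so the sketch assumes a magnitude threshold (the hypothetical parameter `threshold`) and takes the median over all vectors in the field.

```python
import numpy as np


def replace_outliers_with_median(u, v, threshold):
    """Replace flow vectors whose magnitude exceeds `threshold` by the median.

    The magnitude test and the value of `threshold` are assumptions made
    for illustration; they are not taken from the paper.
    """
    magnitude = np.hypot(u, v)
    outliers = magnitude > threshold

    u_clean = u.astype(np.float64)  # astype returns a copy, inputs are kept
    v_clean = v.astype(np.float64)
    u_clean[outliers] = np.median(u)
    v_clean[outliers] = np.median(v)
    return u_clean, v_clean, outliers
```

Because the median is insensitive to a small number of extreme values, computing it over the full field (outliers included) gives a substitute close to the typical motion.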
A camera projects a three-dimensional world point onto a two-dimensional image point. The motion of the camera may be regarded as a single motion, such as rotation, translation, or zoom, or as a combination of any two or all three of these motions. Such camera motion can be well characterized by a set of parameters. In our case, the first frame of a video sequence is used to define the reference coordinate system, and a two-dimensional affine model is used to estimate a parametric form describing the displacement of the video content between consecutive frames by identifying the correspondence between local invariant features. The affine model was employed because it is more resilient to noisy data and can represent all the basic camera motions that often occur in video applications. If we denote a pixel position in the first frame by
Assuming that we have
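To illustrate how a two-dimensional affine model can be estimated from a set of point correspondences, the following sketch solves for the six parameters by linear least squares. The parameterization (a 2x3 matrix acting on homogeneous pixel coordinates) and the function name `fit_affine` are assumptions made for illustration and may differ from the notation used in this paper.

```python
import numpy as np


def fit_affine(points, moved):
    """Least-squares fit of a 2x3 affine matrix M with moved ~ M @ [x, y, 1].

    points, moved: arrays of shape (n, 2) holding (x, y) positions in the
    reference frame and in the next frame; at least 3 non-collinear points
    are needed for a unique solution.
    """
    n = points.shape[0]
    design = np.hstack([points, np.ones((n, 1))])  # rows are [x, y, 1]
    params, *_ = np.linalg.lstsq(design, moved, rcond=None)  # shape (3, 2)
    return params.T  # rows: [a1, a2, tx] and [a3, a4, ty]
```

When the correspondences come from a dense optical flow field, `moved` can be formed as the pixel positions plus their flow vectors, so every pixel contributes one pair of equations.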
In order to produce high-quality stabilized video sequences, the motion parameters obtained need to be smoothed. This can be achieved by space-domain filtering. Different types of filters have been applied and tested. These include recursive Kalman filtering, which removes camera vibrations; the moving average filter, which smooths data by replacing each data point with the average of the neighboring data points within a span; and locally weighted scatterplot smoothing, which uses weighted linear regression to smooth data. In our scheme, the Savitzky-Golay filter [
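For readers unfamiliar with Savitzky-Golay smoothing, the sketch below applies it to a one-dimensional sequence such as one affine parameter tracked over the frames. It is a minimal illustration rather than the filter configuration used in this work: the window length, the polynomial order, and the edge-replication padding at the sequence ends are all assumptions.

```python
import numpy as np


def savgol_smooth(x, window=7, order=2):
    """Savitzky-Golay smoothing: local least-squares polynomial fit.

    window must be odd and larger than order; both defaults are
    illustrative assumptions. Ends are handled by edge replication.
    """
    if window % 2 == 0 or window <= order:
        raise ValueError("window must be odd and greater than order")
    half = window // 2
    offsets = np.arange(-half, half + 1)
    vander = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the weights that evaluate the
    # fitted polynomial at the window centre.
    weights = np.linalg.pinv(vander)[0]
    padded = np.pad(np.asarray(x, dtype=np.float64), half, mode="edge")
    return np.convolve(padded, weights[::-1], mode="valid")
```

A useful property for checking such a filter is that, away from the ends, it leaves any polynomial of degree up to `order` unchanged while attenuating higher-frequency jitter.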
Motion compensation is performed frame by frame using previously stabilized frames (apart from the first frame) and their corresponding smoothed global parameters. That is, the first stabilized frame is obtained by compensating the first original frame with its corresponding smoothed affine motion parameters, the second stabilized frame is obtained by compensating the first stabilized frame with its corresponding smoothed affine motion parameters, and so forth. The block diagram of this compensation process is shown in Figure
Compensation process to stabilize video sequence.
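The elementary operation in such a compensation step is warping a frame with a set of affine parameters. The sketch below shows one possible way to do this, using inverse mapping with nearest-neighbour sampling. The 2x3 matrix layout, the interpolation method, and the zero fill for pixels that map outside the frame are assumptions made for illustration, not details taken from this paper.

```python
import numpy as np


def warp_affine(frame, matrix):
    """Warp a grayscale frame with a 2x3 affine matrix (x, y) -> (x', y').

    Uses inverse mapping with nearest-neighbour sampling; pixels that map
    outside the source frame are set to zero (an illustrative choice).
    """
    h, w = frame.shape
    linear = matrix[:, :2]
    shift = matrix[:, 2]
    inverse = np.linalg.inv(linear)

    ys, xs = np.mgrid[0:h, 0:w]
    dst = np.stack([xs.ravel(), ys.ravel()]).astype(np.float64)  # 2 x N
    src = inverse @ (dst - shift[:, None])  # source position of each pixel
    sx = np.rint(src[0]).astype(int)
    sy = np.rint(src[1]).astype(int)

    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros(h * w, dtype=frame.dtype)
    out[valid] = frame[sy[valid], sx[valid]]
    return out.reshape(h, w)
```

Inverse mapping is chosen here because it assigns exactly one source sample to every output pixel, which avoids the holes that forward mapping can leave.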
In order to evaluate the effectiveness and performance of the proposed stabilization scheme, simulations were carried out using a range of captured video sequences in QCIF format (176 pixels by 144 lines). Figure
Example of stabilization results from the stabilization scheme proposed: (a) original frame no. 14 of the video sequence “My Office”; (b) dense optical flow field-estimated motion vectors between frames 14 and 15; (c) difference between original frames 14 and 15; (d) difference between stabilized frames 14 and 15.
Experimental results from the video sequence “Jerky”: (a) original frame no. 14; (b) dense optical flow field-estimated motion vectors between frames 14 and 15; (c) difference between original frames 14 and 15; (d) difference between stabilized frames 14 and 15.
Since dynamic processes, such as stabilization, cannot be illustrated with still images, we present and compare in Figure
Original and smoothed motion parameters of the video “My Office.”
In order to objectively and quantitatively evaluate the performance of the scheme proposed, we use global transformation fidelity (GTF) [
GTF of the video sequence “My Office,” measured by PSNR.
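For reference, the PSNR between two frames can be computed as in the following sketch. It assumes 8-bit grayscale frames (peak value 255) and shows only the PSNR measure itself; which pairs of frames are compared when evaluating the GTF is not specified here and is left to the caller.

```python
import numpy as np


def psnr(frame_a, frame_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    peak=255 assumes 8-bit intensity values. Returns infinity for
    identical frames, where the mean squared error is zero.
    """
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher PSNR between consecutive stabilized frames indicates that less residual motion remains between them.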
This paper presents a general-purpose video stabilization scheme, aiming at a simple and effective solution for a wide range of video-based applications. The scheme integrates motion estimation based on optical flow and a motion model, space-domain filtering, and motion compensation, thus offering an efficient computation method for video stabilization. It has been successfully implemented in MATLAB. The simulation results show that the scheme is effective for a broad range of real-time applications. Compared to other video stabilization methods, it has the advantages of simplicity and robustness while maintaining better or comparable performance in terms of the global transformation fidelity measured by PSNR.