The temporal complexity of a video sequence can be characterized by a motion vector map, which consists of the motion vectors of each macroblock (MB). To obtain the optimal initial quantization parameter (QP) for video sequences with different spatial and temporal complexities, this paper proposes a simple, high-performance method, based on the motion vector map and temporal complexity, for determining the initial QP at a given target bit rate. The proposed algorithm produces reconstructed video sequences with high and stable quality. For any video sequence, the initial QP can be easily read from matrices indexed by the target bit rate and the mapped spatial complexity, which is obtained with the proposed mapping method. Experimental results show that the proposed algorithm achieves better objective and subjective performance than conventional methods.

In the last decade, multimedia data has been applied to communication, security, entertainment, and military applications. Because multimedia data is very large, it is difficult to store and transmit. Video coding can effectively solve this problem. With the development of terminal equipment and communication networks, video coding standards have been continually established, such as MPEG-1 [

In multimedia communication and transmission, the rate control (RC) algorithm plays a crucial role. H.264/AVC includes an RC [

To solve the existing problems, we propose a simple, high-performance method to determine the initial QP at a given target bit rate. To obtain the initial QP for any video sequence, it is very important to measure the spatial and temporal complexities of the sequence. In H.264/AVC [

In this section, we review some RC initialization methods used to decide the initial QP in the recent literature. The method of JVT-G012, which automatically determines the initial QP for the IDR frame, is the most traditional and coarse method. Its advantages are easy implementation and low computational complexity. JVT-G012 is widely used when the terminal hardware has poor performance. Various versions of the reference software of H.264 [

In the method of Wang, the value of

Consider

Relationship between the best initial QP and BPP according to News, Foreman, and Mobile.
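The BPP (bits per pixel) measure in the caption above is the target bit rate divided by the number of pixels coded per second. The sketch below computes it and then picks a QP from ascending BPP thresholds, in the style of threshold-based initialization. The CIF resolution, the threshold values, and the QP values are illustrative assumptions and are not taken from this paper; the function names are hypothetical.

```python
def bits_per_pixel(bit_rate_bps, frame_rate, width, height):
    """Average number of bits available per luma pixel."""
    return bit_rate_bps / (frame_rate * width * height)


def threshold_initial_qp(bpp, thresholds, qps):
    """Pick a QP from ascending BPP thresholds.

    thresholds has n entries and qps has n + 1 entries. Both are
    caller-supplied assumptions, not values taken from the paper.
    """
    for limit, qp in zip(thresholds, qps):
        if bpp <= limit:
            return qp
    return qps[-1]


# Assumed example: CIF (352x288) at 30 frames/s and 0.4 Mbps.
bpp = bits_per_pixel(400_000, 30, 352, 288)

# Illustrative thresholds and QPs only (hypothetical values).
qp = threshold_initial_qp(bpp, thresholds=(0.15, 0.45, 0.9), qps=(40, 30, 20, 10))
print(round(bpp, 4), qp)
```

Because such a rule depends only on BPP, two sequences with the same resolution and bit rate receive the same initial QP regardless of their content, which is the limitation that complexity-aware methods try to address.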

In the scheme of Wu,

The parameters used in the methods of Wang and Wu are calculated from three types of test video sequences. Neither Wang nor Wu explains how the sample video sequences were selected, so it is difficult to say that they represent video sequences of varied spatial and temporal complexity. Moreover, Wang and Wu do not take into account the quality consistency of the reconstructed video sequences. Furthermore, they do not explain how the best initial QP is determined.

In a video sequence, the content of adjacent frames does not differ significantly. To save bits, only the difference is encoded. To find the difference, most video coding standards support a method named motion estimation (ME) [

The process of ME

Motion vector map.

The temporal complexity of a video sequence can be measured and predicted from the number and magnitude of its motion vectors. In H.264/AVC, most frames are encoded using inter mode [

Figure

Motion vectors obtained from ten video sequences.
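As a rough illustration of measuring temporal complexity from the number and magnitude of motion vectors, the sketch below summarizes a motion vector map by two figures: the fraction of MBs with nonzero motion and the mean vector magnitude. The function name and the choice of these two statistics are assumptions made for illustration; the paper does not give a closed-form measure.

```python
import math


def temporal_complexity(motion_vectors):
    """Summarize a motion vector map.

    motion_vectors: list of (dx, dy) pairs, one per macroblock.
    Returns (fraction of MBs with nonzero motion, mean vector magnitude).
    """
    if not motion_vectors:
        return 0.0, 0.0
    magnitudes = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    moving = sum(1 for m in magnitudes if m > 0)
    return moving / len(magnitudes), sum(magnitudes) / len(magnitudes)


# Toy map with four macroblocks (hypothetical values).
mv_map = [(0, 0), (3, 4), (0, 0), (-6, 8)]
moving_ratio, mean_magnitude = temporal_complexity(mv_map)
print(moving_ratio, mean_magnitude)
```

A sequence with many long vectors scores high on both figures, while a static scene scores close to zero on both.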

In the proposed RC initialization algorithm, the initial QP is computed from the spatial and temporal complexities at the given target bit rate. Once the sample video sequences are selected, a method for calculating their spatial complexity must be provided.

In H.264, the smallest encoding unit is the MB. An MB supports two INTRA prediction modes, which are INTRA

INTRA prediction modes.

Generally, the INTRA

Type of MBs.

Since the variance of an MB, which contains 256 pixels, is equivalent to the total information of the DC and AC coefficients of the MB, the spatial complexity of the MB can be estimated using the variance [

Consider
_{var} and _{complex} is the rate of the number of complex MBs in the IDR. MB_{Complex} is the number of the complex MBs of IDR and MB_{Frame} is the total number of the MBs of IDR.

Classification process of MB.

The Frame_{complex} can be a measure of the spatial complexity.
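The ratio described above, MB_{Complex} divided by MB_{Frame}, can be sketched as follows. Each 16×16 MB is classified as complex when its pixel variance exceeds a threshold; the threshold value used in the example and the function names are hypothetical, since the classification rule of the paper is not reproduced here.

```python
def mb_variance(frame, top, left, size=16):
    """Population variance of the pixels in one size x size macroblock."""
    pixels = [frame[y][x]
              for y in range(top, top + size)
              for x in range(left, left + size)]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def frame_complexity(frame, var_threshold, size=16):
    """Fraction of macroblocks whose variance exceeds var_threshold.

    frame is a list of rows of luma samples; partial macroblocks at the
    right and bottom edges are ignored in this sketch.
    """
    rows = len(frame) // size
    cols = len(frame[0]) // size
    complex_mbs = 0
    for r in range(rows):
        for c in range(cols):
            if mb_variance(frame, r * size, c * size, size) > var_threshold:
                complex_mbs += 1
    return complex_mbs / (rows * cols)


# Toy 16x32 frame: a flat MB on the left, a checkerboard MB on the right.
frame = [[128] * 16 + [255 * ((x + y) % 2) for x in range(16)]
         for y in range(16)]
ratio = frame_complexity(frame, var_threshold=100.0)  # hypothetical threshold
print(ratio)
```

The flat MB has zero variance and the checkerboard MB has a very large one, so exactly one of the two MBs is counted as complex.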

For a given target bit rate, the initial QP directly affects the encoding performance of H.264/AVC. Encoding performance can be evaluated by the quality of the reconstructed video sequence and the total number of bits. In other words, the objective is to meet the target bit rate while ensuring the best quality of the reconstructed video sequence. The stability of the reconstructed video sequence is also a very important quality measure in multimedia broadcasting and transmission. Thus the optimal initial QP has the following properties: (1) it maximizes the PSNR (peak signal-to-noise ratio), which indicates the best quality of the reconstructed video sequence; (2) it maximizes stability, which is defined in terms of the differences of QPs; and (3) it minimizes the total actual bits while satisfying the target bit rate.
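The three properties can be combined into a selection rule in several ways. The sketch below is only one possible reading, not the criterion of the paper: it discards candidates that exceed the target bits, then prefers higher PSNR, then a smaller QP spread, then fewer bits. The record fields, the lexicographic ordering, and the sample numbers are all assumptions.

```python
def pick_initial_qp(candidates, target_bits):
    """Choose an initial QP from per-candidate encoding results.

    candidates: list of dicts with keys "qp", "psnr", "qp_spread", "bits".
    The ordering (PSNR, then QP spread, then bits) is an assumption.
    Returns None when no candidate meets the target.
    """
    feasible = [c for c in candidates if c["bits"] <= target_bits]
    if not feasible:
        return None
    best = min(feasible,
               key=lambda c: (-c["psnr"], c["qp_spread"], c["bits"]))
    return best["qp"]


# Hypothetical results for three trial initial QPs.
trials = [
    {"qp": 28, "psnr": 30.1, "qp_spread": 12, "bits": 410_000},  # over target
    {"qp": 32, "psnr": 29.8, "qp_spread": 6, "bits": 398_000},
    {"qp": 36, "psnr": 29.8, "qp_spread": 3, "bits": 395_000},
]
chosen = pick_initial_qp(trials, target_bits=400_000)
print(chosen)
```

In this toy data the first trial is rejected for exceeding the target, and the remaining two tie on PSNR, so the one with the smaller QP spread is chosen.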

To find the optimal initial QP, all 52 possible initial QPs are evaluated. For each initial QP_{Initial QP}, BIT_{Initial QP}, and

For video sequences of similar complexity, the initial QPs should be similar. A mapping method for spatial complexity is proposed for any test video sequence. The spatial complexities of the selected sample video sequences can be calculated using (

Let

In Table

Lookup table for proposed initial QP.

Bit rate (Mbps) | MSC 32% | MSC 44% | MSC 53% | MSC 90% |
---|---|---|---|---|
0.4 | 30 | 38 | 42 | 45 |
0.5 | 33 | 32 | 39 | 42 |
0.6 | 30 | 30 | 36 | 40 |
0.7 | 28 | 27 | 35 | 36 |
0.8 | 28 | 25 | 37 | 37 |
0.9 | 25 | 24 | 34 | 33 |
1.0 | 25 | 23 | 35 | 32 |
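A lookup against the table above can be sketched as follows. The table values are copied from the paper, and MSC is read here as the mapped spatial complexity. The mapping rule in this sketch, which picks the nearest tabulated bit rate and the nearest MSC column, is an assumption, and the function name is hypothetical.

```python
# MSC column values (percent) and initial QPs per target bit rate (Mbps),
# copied from the lookup table.
MSC_COLUMNS = (32, 44, 53, 90)
LOOKUP = {
    0.4: (30, 38, 42, 45),
    0.5: (33, 32, 39, 42),
    0.6: (30, 30, 36, 40),
    0.7: (28, 27, 35, 36),
    0.8: (28, 25, 37, 37),
    0.9: (25, 24, 34, 33),
    1.0: (25, 23, 35, 32),
}


def lookup_initial_qp(bit_rate_mbps, spatial_complexity_pct):
    """Return the table QP for the nearest bit rate row and MSC column.

    Nearest-neighbor matching is an assumption of this sketch.
    """
    rate = min(LOOKUP, key=lambda r: abs(r - bit_rate_mbps))
    col = min(range(len(MSC_COLUMNS)),
              key=lambda i: abs(MSC_COLUMNS[i] - spatial_complexity_pct))
    return LOOKUP[rate][col]


print(lookup_initial_qp(0.4, 50))   # 50% is closest to the 53% column
print(lookup_initial_qp(0.72, 95))  # closest row is 0.7, closest column is 90%
```

Once the table is fixed, determining the initial QP for a new sequence costs only one spatial complexity measurement and one table access.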

The proposed algorithm and existing methods, which are JVT-G012 and Wu, are implemented on JM9.3 [

The system platform is an Intel(R) Core(TM)2 Duo CPU E7400 at 2.80 GHz with 2.00 GB of RAM, and the OS is Microsoft Windows XP Professional 2002 Service Pack 3.

JM 9.3 is built with Visual Studio 6.0.

The baseline profile is used; one GOP has 15 frames, of which the first is intra coded and the others are inter coded; B-pictures are not used. The “Rate Control Enable” item is enabled, the “Initial QP” item is set to 0, and the target bit rates range from 0.4 to 1.0 Mbps.

The proposed initial QP is determined using Table

For the standard video sequences, the number of frames is 60 and the frame rate is 30 frames per second.

The three methods, namely the proposed algorithm, JVT-G012, and Wu, are compared in terms of PSNR and the difference of real bits. These performance indicators can be quantified as follows:

Consider

Table _{Average} and

Comparison of coding performance.

Video sequence | JVT-G012 | Wu | Proposed |
---|---|---|---|
Waterfall | 36.04 | 35.83 | 36.12 |
Foreman | 37.78 | 37.68 | 37.81 |
Flower | 29.11 | 29.14 | 29.16 |
Mobile | 28.12 | 28.28 | 28.37 |
Bus | 30.12 | 30.18 | 30.24 |
City | 34.72 | 34.63 | 34.75 |
Stefan | 31.48 | 31.46 | 31.59 |
Average | 32.48 | 32.46 | 32.58 |

Video sequence | JVT-G012 | Wu | Proposed |
---|---|---|---|
Waterfall | 0 | −2288 | 14048 |
Foreman | 0 | −2272 | 2152 |
Flower | 0 | 62336 | 69072 |
Mobile | 0 | 128888 | 154184 |
Bus | 0 | 5944 | 5456 |
City | 0 | −1576 | −1904 |
Stefan | 0 | −1096 | 24 |
Total | 0 | 189936 | 243032 |

In Table _{Average} is similar. This illustrates that the reconstructed video of the proposed method has the minimum actual total bits at the same or similar quality in almost all simulations except City.

One important quality measure of a video is that the quality of each frame should be uniform. The existing methods do not address this issue. The proposed method addresses it by selecting the initial QP according to the highest and most stable quality as well as the lowest actual bits in (

Reconstructed video with extremely changing quality.

Frame number | Bits/frame | QP | Frame PSNR (dB) |
---|---|---|---|
0 | 291952 | 25 | 38.084 |
1 | 95904 | 25 | 36.900 |
2 | 42176 | 31 | 34.305 |
3 | 24272 | 32 | 32.891 |
4 | 17208 | 34 | 32.058 |
5 | 12504 | 36 | 30.613 |
6 | 10280 | 38 | 30.614 |
7 | 8848 | 40 | 29.980 |
8 | 7776 | 42 | 29.338 |
9 | 7184 | 44 | 28.591 |
10 | 6408 | 46 | 27.687 |
11 | 5720 | 48 | 26.682 |
12 | 4976 | 50 | 25.367 |
13 | 4856 | 51 | 24.405 |
14 | 5432 | 51 | 24.615 |
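The instability in the table above can be summarized with a few simple figures. The sketch below uses the values copied from the table and computes the QP range within the GOP, the PSNR range, and the share of bits spent on the first frame. These three summary statistics are chosen for illustration and are not necessarily the stability measure of the paper.

```python
# (bits, QP, PSNR in dB) for frames 0..14, copied from the table.
frames = [
    (291952, 25, 38.084), (95904, 25, 36.900), (42176, 31, 34.305),
    (24272, 32, 32.891), (17208, 34, 32.058), (12504, 36, 30.613),
    (10280, 38, 30.614), (8848, 40, 29.980), (7776, 42, 29.338),
    (7184, 44, 28.591), (6408, 46, 27.687), (5720, 48, 26.682),
    (4976, 50, 25.367), (4856, 51, 24.405), (5432, 51, 24.615),
]

bits = [f[0] for f in frames]
qps = [f[1] for f in frames]
psnrs = [f[2] for f in frames]

qp_range = max(qps) - min(qps)          # spread of QP inside the GOP
psnr_range = max(psnrs) - min(psnrs)    # quality drop across the GOP, in dB
first_frame_share = bits[0] / sum(bits) # fraction of bits used by frame 0

print(qp_range, round(psnr_range, 3), round(first_frame_share, 3))
```

The first frame alone consumes more than half of the bits of the GOP, which forces the QP of the later frames up to 51 and pulls the PSNR down by more than 13 dB.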

Stability performances (panels: Flower, Bus, Foreman, Mobile, Stefan, Waterfall).

PSNR can objectively and effectively assess the quality of a single frame. However, it is not a perfect measure of the quality of a video sequence with multiple frames. In Table

The quality of video sequences with high complexity is very sensitive to the initial QP at low bit rates. For a given target bit rate, a low initial QP assigns a large number of bits to the first frame and leaves insufficient bits for the other frames in the GOP to maintain quality. Therefore, the stability of a reconstructed video sequence is a very important quality measure. The proposed optimal initial QP is determined with stability taken into account in (

General video users evaluate video sequences simply by viewing them, not by calculating PSNR or stability, which makes subjective evaluation important. The relationship between objective and subjective evaluation results is not known. Therefore, it is not easy to convert a difference in objective evaluation results into a difference in subjective evaluation.

In this part, the three methods (the proposed method, JVT-G012, and Wu) are subjectively evaluated using the objective evaluation results, that is, the maximum and minimum QPs at a 0.4 Mbps target bit rate. The Mobile sample video sequence is tested. Figure

Result of subjective evaluation (panels: minimum and maximum QP of the GOP for JVT-G012, Wu, and the proposed method).

The average PSNRs of the proposed method, JVT-G012, and Wu are very similar in the objective assessment: 25.22, 25.03, and 24.83 dB, respectively.

Although Figures

The subjective evaluation also shows the importance of stability, as seen from the gaps in Figures

To obtain the optimal initial QP for video sequences with different spatial and temporal complexities, we propose a simple, high-performance method, based on the motion vector map and temporal complexity, for determining the initial QP at a given target bit rate. Four sample video sequences are selected according to their temporal complexity, which is measured using the motion vector map, and their spatial complexities are computed with the proposed method. For any video sequence, the initial QP can be easily read from matrices indexed by the target bit rate and the mapped spatial complexity, which is obtained with the proposed mapping method. Experimental results show that the proposed algorithm achieves better objective and subjective performance than conventional methods. One area for future research is the development of a quantitative measure of temporal complexity. Such a study may help to explain or resolve the exceptional cases in H.264/AVC coding.

The authors declare that they have no financial or personal relationships with other people or organizations that could inappropriately influence their work; there are no professional or other personal interests of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the paper entitled.