INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND AUDIO
MPEG2008/M15580
July 2008, Hannover, Germany
Title: Technical considerations for Ad Hoc Group on New Challenges in Video Coding Standardization
Purpose: Informative
Authors: Kyohyuk Lee,
Elena Alshina,
Jeonghoon Park,
Woojin Han,
Junghye Min
Source: Samsung Electronics Co., Ltd.
Summary
In this contribution, we investigated several video coding technologies with new applications in mind, especially those involving high resolution video content. With high resolution video sequences, we obtained over 20% bit savings compared to the state-of-the-art standard video codec, MPEG-4 AVC High Profile. For one test sequence, the bit rate reduction exceeds 30%. Our results imply that there is still considerable room for improving the latest standard video codec for high resolution video applications.
Many companies have already developed devices for video applications beyond HD resolution. A new video codec standard targeting these applications could accelerate the growth of their market. We therefore propose starting an investigation into new video coding standardization as soon as possible, with high resolution video applications in mind.
1. Introduction
In the past decade, high resolution video content has spread widely, from the internet to broadcasting. For example, DVD applications with SD resolution already dominate the video rental market. DTV technologies with HD resolution not only provide broadcasting service but are also expanding towards the mobile market. In addition, the resolution and size of display devices keep increasing in the consumer electronics market. As we saw at CES 2008, most major display providers have already introduced 80" LCD and 150" PDP devices. Furthermore, capturing devices for 4Kx2K video content are already in commercial use, meeting the growing demand of the digital cinema market, and commercial projectors already cover 4Kx2K resolution. Even though the current mainstream of media content is HD resolution video, it is quite clear that the resolution of video content will expand beyond HD in the future.
Also, owing to the rapid development of transmission and storage technologies, the quality of video content is constantly improving. For example, in packaged media applications, Blu-ray exceeds DVD in data capacity. Beyond packaged media, we expect all video content services to move towards high quality video.
For this reason, we investigated several aspects of video coding tools for future video applications.
2. Consideration of video coding for new application domain
2.1. Expanded Prediction Block Unit
It is well known that, in terms of coding efficiency, larger motion block sizes become more efficient as image resolution increases. The largest motion block size of a state-of-the-art video codec such as MPEG-4 AVC is 16x16: both the motion compensation block and the intra prediction block are limited to 16x16.
We investigated the benefit of expanding the prediction block size beyond 16x16 to cope with high resolution image compression. As in a conventional video codec, each partitioned block has its own displacement vector for inter prediction.
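The effect of block size on the number of displacement vectors can be illustrated with simple arithmetic. The sketch below counts how many blocks tile a frame when the whole frame uses one block size. The sizes 32 and 64 and the function name `blocks_per_frame` are illustrative assumptions; this contribution does not state which expanded sizes were tested, and a real encoder chooses the block size adaptively per region.

```python
def blocks_per_frame(width, height, block_size):
    """Number of block_size x block_size blocks needed to tile a frame.

    Partial blocks at the right and bottom edges count as whole blocks.
    """
    blocks_wide = -(-width // block_size)   # integer ceiling division
    blocks_high = -(-height // block_size)
    return blocks_wide * blocks_high


# Luma frame sizes of the resolutions used in the experiments.
RESOLUTIONS = {
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "HD 720p": (1280, 720),
    "HD 1080p": (1920, 1080),
}

if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        # 32 and 64 are hypothetical expanded sizes, for illustration only.
        counts = {size: blocks_per_frame(w, h, size) for size in (16, 32, 64)}
        print(name, counts)
```

Under these assumptions, an HD 1080p frame needs 8160 blocks at 16x16 but only 510 at 64x64, so in homogeneous regions far fewer displacement vectors and mode flags have to be signalled.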
For intra prediction, most parts of a high resolution image can be described as homogeneous regions with variations such as gradation and directional pattern extension. As the prediction block size is expanded, these spatial variations within one prediction block become more dynamic and diverse. The conventional 9 directions for intra prediction defined in MPEG-4 AVC may therefore not be enough to express the varied texture statistics of an enlarged prediction block. We thus tested directional intra prediction with more, finer-grained directions than the conventional set, expecting a further performance improvement.
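As a rough illustration of why finer directions can help, the sketch below predicts a block from the reconstructed row above it along an arbitrary direction, with linear interpolation between reference samples. It is a simplified stand-in and not the scheme tested in this contribution: the parameter `dx` (horizontal shift per row), the restriction to the top reference row, and all function names are assumptions made for the example.

```python
def predict_directional(top_ref, n, dx):
    """Predict an n x n block from the row of samples above it.

    top_ref: reference samples above the block (ideally 2n of them:
             top and top-right neighbours).
    dx:      horizontal shift per row in samples; 0 means vertical
             prediction. Only dx >= 0 is handled in this sketch.
    """
    if dx < 0:
        raise ValueError("this sketch only supports dx >= 0")
    last = len(top_ref) - 1
    block = []
    for r in range(n):
        row = []
        for c in range(n):
            pos = c + (r + 1) * dx          # position on the reference row
            i = min(int(pos), last)         # left neighbour (clamped)
            j = min(i + 1, last)            # right neighbour (clamped)
            frac = pos - int(pos)
            row.append((1.0 - frac) * top_ref[i] + frac * top_ref[j])
        block.append(row)
    return block


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_direction(top_ref, original, candidates):
    """Return the candidate dx whose prediction has the lowest SAD."""
    n = len(original)
    return min(candidates,
               key=lambda dx: sad(predict_directional(top_ref, n, dx), original))
```

With a coarse candidate set such as `[0.0, 1.0]`, a texture that actually runs at `dx = 0.5` leaves a non-zero residual; adding the intermediate direction removes it. The same reasoning motivates a denser direction set for larger blocks.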
2.2. Transform size corresponding to Expanded Prediction Block Unit
To maximize the benefit of expanded prediction block units, the transform should be designed accordingly. A larger transform has several advantages, such as better energy compaction and less quantization error. In high resolution images, most image patterns within a prediction block represent a small part of an object or background, which can be described as a homogeneous texture pattern with little variation. Hence, we applied transforms larger than 8x8 for residual coding.
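The energy compaction argument can be checked numerically. The sketch below uses a floating-point orthonormal DCT-II as a stand-in transform (an assumption: the contribution does not specify the large transform, and MPEG-4 AVC itself uses integer approximations). It counts how many coefficients exceed a threshold when a 16x16 block is coded with one 16x16 transform versus four 8x8 transforms.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def count_significant(block, size, threshold):
    """Tile `block` into size x size transforms and count the
    coefficients whose magnitude exceeds `threshold`."""
    n = len(block)
    count = 0
    for r0 in range(0, n, size):
        for c0 in range(0, n, size):
            tile = [row[c0:c0 + size] for row in block[r0:r0 + size]]
            coeffs = dct_2d(tile)
            count += sum(1 for row in coeffs for c in row if abs(c) > threshold)
    return count


if __name__ == "__main__":
    flat = [[10.0] * 16 for _ in range(16)]   # perfectly homogeneous block
    print(count_significant(flat, 16, 1e-6))  # one 16x16 transform
    print(count_significant(flat, 8, 1e-6))   # four 8x8 transforms
```

For a perfectly flat 16x16 block, the single large transform leaves one non-zero (DC) coefficient, while four 8x8 transforms leave four. Real residuals are not flat, but the same tendency holds for slowly varying content.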
2.3. Adaptive interpolation filter
In recent years, numerous attempts have been made to overcome the coding efficiency limit of state-of-the-art video coding. For example, many experts are looking for new evidence for future video coding through the KTA activity in VCEG [3]. According to the analysis report from Tandberg, the adaptive interpolation filter seems to give the most significant gain in coding efficiency [4]. The adaptive interpolation filter adapts the interpolation filter used for motion compensation with fractional-pel displacement vectors to the correlation of the image content [5].
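The principle can be sketched in one dimension. MPEG-4 AVC interpolates half-pel luma samples with a fixed 6-tap filter (1, -5, 20, 20, -5, 1)/32; an adaptive filter instead estimates the coefficients from the content by least squares. The example below is a deliberately reduced illustration, not the two-dimensional non-separable filter of [5]: it assumes a symmetric 4-tap filter (b, a, a, b), and all function names and sample values are hypothetical.

```python
def interpolate_half_pel(ref, a, b):
    """Half-pel samples between ref[i] and ref[i+1] using the symmetric
    4-tap filter (b, a, a, b). Border samples are replicated."""
    n = len(ref)
    out = []
    for i in range(n - 1):
        inner = ref[i] + ref[i + 1]
        outer = ref[max(i - 1, 0)] + ref[min(i + 2, n - 1)]
        out.append(a * inner + b * outer)
    return out


def estimate_symmetric_taps(ref, target):
    """Least-squares estimate of (a, b) such that
    target[i] ~ a*(ref[i] + ref[i+1]) + b*(ref[i-1] + ref[i+2]).
    Only interior positions (no border replication) are used."""
    suu = suv = svv = sut = svt = 0.0
    for i in range(1, len(ref) - 2):
        u = ref[i] + ref[i + 1]        # inner sample pair
        v = ref[i - 1] + ref[i + 2]    # outer sample pair
        t = target[i]
        suu += u * u
        suv += u * v
        svv += v * v
        sut += u * t
        svt += v * t
    det = suu * svv - suv * suv        # determinant of the normal equations
    if abs(det) < 1e-12:
        raise ValueError("degenerate signal: taps cannot be estimated")
    a = (sut * svv - svt * suv) / det
    b = (suu * svt - suv * sut) / det
    return a, b


if __name__ == "__main__":
    ref = [3, 8, 1, 9, 4, 7, 2, 6, 5, 10, 0, 11]     # hypothetical samples
    # Hypothetical "true" half-pel signal generated with taps (-1, 5, 5, -1)/8.
    target = interpolate_half_pel(ref, 0.625, -0.125)
    print(estimate_symmetric_taps(ref, target))
```

In an actual codec the target would be the current frame's samples at the positions addressed by fractional-pel displacement vectors, and the estimated coefficients would be transmitted as side information.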
3. Experimental Results
3.1. Experimental Results without adaptive interpolation filter
We tested coding efficiency with the methodologies introduced above, except for the adaptive interpolation filter.
The experimental conditions for estimating coding efficiency follow the recent recommendation of ITU-T VCEG [1], and all numerical results on coding efficiency are based on BD-PSNR [2]. The coding efficiency of this contribution is compared with the state-of-the-art video codec, MPEG-4 AVC. For a fair comparison between the two systems, we used a well-optimized MPEG-4 AVC encoder with relative Qp control under a hierarchical B prediction structure, iterative bi-prediction motion estimation, and so on.
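For readers unfamiliar with the measure in [2], the sketch below shows one common way to compute the average bit rate difference between two RD curves: fit a third-order polynomial to log-rate as a function of PSNR for each codec, integrate both over the common PSNR range, and convert the mean difference to a percentage. With exactly four points per curve the fit is an exact interpolation. The function names and the rate/PSNR values are hypothetical, and the exact procedure used for the figures in this contribution may differ in detail.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial interpolating (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, a, b):
    """Simpson's rule on one panel; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bit rate difference in percent (negative = bit saving).

    Each codec is described by four (rate, PSNR) points, one per Qp.
    """
    log_ref = [math.log10(r) for r in rates_ref]
    log_test = [math.log10(r) for r in rates_test]
    lo = max(min(psnr_ref), min(psnr_test))   # common PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    if hi <= lo:
        raise ValueError("RD curves do not overlap in PSNR")
    int_ref = simpson(lambda p: lagrange_eval(psnr_ref, log_ref, p), lo, hi)
    int_test = simpson(lambda p: lagrange_eval(psnr_test, log_test, p), lo, hi)
    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical data: the test codec needs 20% fewer bits at equal PSNR.
    psnr = [32.0, 35.0, 38.0, 40.5]
    rates_ref = [1000.0, 1800.0, 3200.0, 6000.0]
    rates_test = [0.8 * r for r in rates_ref]
    print(bd_rate(rates_ref, psnr, rates_test, psnr))
```

A curve that needs 20% fewer bits at every PSNR yields a value of about -20, which corresponds to what this contribution reports as a 20% bit saving.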
Experiments were done with 17 test sequences consisting of various resolutions and video motion characteristics. Test video sequences are listed in Table 1.
Table 1 Test sequences, resolution and test frame numbers
Resolution / Video Sequence Name
CIF / Foreman, Mobile, Tempete, Paris
4CIF / Harbour, Soccer, Stockholm, Shield
HD 720p / BigShips, City, Crew, ShuttleStart
HD 1080p / Rolling Tomatoes, RushHour, SunFlower, PlayingCards, TableSetting
Test configurations are as follows:
- Hierarchical B prediction structure (GOP size 8)
- One I slice with the first frame
- 4 Qp points for key picture: 22, 26, 30, 34
- CABAC entropy coding
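The hierarchical B structure with GOP size 8 listed above can be sketched as follows. The code assigns each picture in a GOP to a temporal level (the key picture is level 0) and derives one valid coding order, level by level. The function names are illustrative assumptions; the encoder used in the experiments may order pictures differently within the same dyadic hierarchy.

```python
def temporal_level(pos_in_gop, gop_size=8):
    """Temporal level of a picture in a dyadic hierarchical-B GOP.

    pos_in_gop runs from 1 to gop_size in display order; the key
    picture sits at pos_in_gop == gop_size and has level 0.
    gop_size is assumed to be a power of two.
    """
    if not 1 <= pos_in_gop <= gop_size:
        raise ValueError("pos_in_gop must be in 1..gop_size")
    level = 0
    step = gop_size
    while pos_in_gop % step != 0:
        step //= 2          # move one level deeper in the hierarchy
        level += 1
    return level


def coding_order(gop_size=8):
    """One valid coding order: lower temporal levels first, so that every
    B picture is coded after both of its reference pictures."""
    return sorted(range(1, gop_size + 1),
                  key=lambda pos: (temporal_level(pos, gop_size), pos))


if __name__ == "__main__":
    print([temporal_level(p) for p in range(1, 9)])
    print(coding_order())
```

For GOP size 8 this gives the coding order 8, 4, 2, 6, 1, 3, 5, 7. The four Qp points apply to the key pictures at level 0.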
Figure 1 shows the average bit saving percentage over the 4 Qp points for each sequence. The total average bit saving over all 17 test sequences and 4 Qp points is 13%. Furthermore, the average bit saving clearly grows towards higher resolution sequences such as HD 1080p.
Figure 1 Average bit saving for each sequence
As Figure 1 indicates, non-standard video coding tools such as the expanded prediction block unit and the large transform are well suited to high resolution image compression. With these tools, coding efficiency improves further as the resolution rises, a tendency that is favorable for future high resolution video applications. One test sequence in Figure 1, Rolling Tomatoes, showed over 30% average bit saving.
Figure 2 shows the average bit saving for each resolution. For HD 1080p, we reached 22.36% average bit saving over the 4 Qp points, and larger bit savings are expected for applications that require video above HD resolution.
Figure 2 Average bit saving for each resolution.
The following figures show RD curves for several HD resolution test sequences. These curves show that significant bit savings can be achieved with the non-standard technologies described in the previous sections.
Figure 3 RD curve examples of HD resolution test sequences
3.2. Experimental Results with adaptive interpolation filter
We also tested coding efficiency with the methodologies introduced above, including the adaptive interpolation filter.
Figure 4 shows the average bit saving percentage of each test sequence over the 4 Qp points, and an example RD curve for HD 1080p resolution. The total average bit saving over all 17 test sequences and 4 Qp points is 16.94%. As in the previous experiment, the average bit saving grows towards higher resolution sequences such as HD 1080p. For HD 1080p, we reached 26.32% average bit saving; the additional bit saving from the adaptive interpolation filter is about 4 percentage points (26.32% versus 22.36%). The adaptive interpolation filter seems to work especially well in the high bit rate range.
These results imply that much higher coding efficiency can be reached by carefully integrating already known new video coding tools.
Figure 4 Average bit saving with all introduced techniques
4. Conclusion
Our experimental results show that the coding efficiency of the latest video codec standard can be improved further for applications requiring high resolution video services. We would like to emphasize that it would be very worthwhile to launch new video coding standardization for future technical and commercial environments, especially with high resolution video applications in mind. It could accelerate the growth of the market for high definition video services, for which many major companies have already prepared.
References
[1] TK Tan, G. Sullivan, T. Wedi, “Recommended Simulation Conditions for Coding Efficiency Experiments Revision 2”, ITU-T SG16/Q6 Document, ITU-T VCEG-AH010r3, Antalya, January 2008.
[2] Gisle Bjontegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T SG16/Q6 Document, ITU-T VCEG-M33, March 2001.
[3] http://www.tnt.uni-hannover.de/~vatis/kta/
[4] Tandberg, “A coding efficiency-computational complexity analysis of KTA 1.8 coding tools”, ITU-T SG16/Q6 Document, COM 16 – C 409 – E, April 2008.
[5] Yuri Vatis, Bernd Edler, Dieu Thanh Nguyen and Jörn Ostermann, “Two-dimensional non-separable Adaptive Wiener Interpolation Filter for H.264/AVC”, ITU-T SG16/Q6 Document, ITU-T VCEG-Z17, Busan, April 2005.