Technical Report number RUCS/2005/TR/01/002/A

Linear Algorithm Predicted Two-Pass Hexagonal Algorithm for Video Compression

Yunsong Wu,

Graham Megson

Parallel, Emergent and Distributed Architecture Laboratory

School of Systems Engineering,

Reading University



A Novel Linear Predicted Two-Pass Hexagonal Search Algorithm for Motion Estimation

ABSTRACT

This paper presents a novel two-pass algorithm constituted by the Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for block-based motion compensation. Building on previous algorithms, especially a leading-edge motion estimation algorithm called hexagonal search (HEXBS), we propose the LHMEA and the Two-Pass Algorithm (TPA). We introduce the hashtable concept into video compression. In this paper we employ the LHMEA for the first-pass search over all Macroblocks (MB) in the picture. Motion Vectors (MV) generated in the first pass are then used as predictors for the second-pass HEXBS motion estimation, which searches only a small number of MBs. The evaluation of the algorithm considers three important metrics: time, compression rate, and PSNR. The performance of the algorithm is evaluated on standard video sequences and the results are compared to current algorithms. Experimental results show that the proposed algorithm offers the same compression rate as the Full Search. The LHMEA with TPA significantly improves on HEXBS and shows a direction for improving other fast motion estimation algorithms, for example Diamond Search.

Keywords

video compression, motion estimation, linear algorithm, hashtable, hexagonal search.

School of Systems Engineering, Reading University, Reading, UK, RG6 6AA

Correspondence to: Yunsong Wu, PEDAL Group, School of Systems Engineering,

The University of Reading, Reading RG6 6AY, United Kingdom.

E-Mail:

1. INTRODUCTION

In this paper, we propose a Linear Hashtable Motion Estimation Algorithm (LHMEA) and a Two-Pass Algorithm (TPA), constituted by the LHMEA and Hexagonal Search (HEXBS), to predict motion vectors for inter-coding. The objective of our motion estimation scheme is to achieve good-quality video with very low computational complexity. There are a large number of motion prediction algorithms in the literature. This paper is concerned with only one class of such algorithms, the Block Matching Algorithms (BMA), which are widely used in MPEG-2, MPEG-4, and H.263. In a BMA, each block of the current video frame is compared to blocks of the reference frame in the vicinity of its corresponding position. The candidate with the least Mean Square Error (MSE) is considered the match, and the difference of their positions is the motion vector of the block in the current frame, to be saved in the corresponding position of the motion map. Motion estimation is computationally intensive and can consume up to 80% of the computational power of the encoder if the full search (FS) is used. It is therefore highly desirable to speed up the process without introducing serious distortion.
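As a baseline, the full search just described can be sketched as follows (a minimal Python sketch; the function name, the ±7 search range, and the use of NumPy are our illustrative assumptions, not details from the paper's encoder):

```python
import numpy as np

def full_search(cur_block, ref_frame, top, left, search_range=7):
    """Exhaustively match cur_block against every candidate position in
    ref_frame within +/- search_range of (top, left); return the motion
    vector of the candidate with the least mean squared error (MSE)."""
    n = cur_block.shape[0]
    h, w = ref_frame.shape
    best_mse, best_mv = float("inf"), (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > h or x + n > w:
                continue  # candidate block falls outside the frame
            cand = ref_frame[y:y + n, x:x + n]
            mse = float(np.mean((cur_block.astype(np.int64) - cand) ** 2))
            if mse < best_mse:
                best_mse, best_mv = mse, (dy, dx)
    return best_mv, best_mse
```

The quadratic number of candidate positions is exactly the cost that the fast algorithms below try to avoid.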

Over the last 20 years, many fast algorithms have been proposed to reduce the exhaustive checking of candidate motion vectors (MV). Fast block-matching algorithms (BMA) use different block-matching strategies and search patterns with various sizes and shapes, such as the Two-Level Search (TS), Two-Dimensional Logarithmic Search (DLS) and Subsample Search (SS) [1], the Three-Step Search (TSS), Four-Step Search (4SS) [2], Block-Based Gradient Descent Search (BBGDS) [3], and Diamond Search (DS) [[4], [5]] algorithms. A very interesting method called HEXBS has been proposed by Ce Zhu, Xiao Lin, and Lap-Pui Chau [6]. There are several variants of the HEXBS method, such as the Enhanced Hexagonal method [7], the Hexagonal method with Fast Inner Search [8], and the Cross-Diamond-Hexagonal Search algorithms [9]. Fast BMAs increase the search speed by taking the nature of most real-world sequences into account while also maintaining a prediction quality comparable to Full Search.

As shown in Table 1, most of these algorithms suffer from being easily trapped in a non-optimum solution.

Our LHMEA method attempts to predict the motion vectors using a linear algorithm and a hashtable [10]. In this paper we propose the LHMEA and the TPA. In the first-pass coding, we employ the LHMEA to search all Macroblocks (MB) in the picture. Motion Vectors (MV) generated in the first pass are used as predictors for the second-pass HEXBS motion estimation, which searches only a small number of the MBs. Because the LHMEA is based on a linear algorithm, which fully exploits the computer's arithmetic structure being optimized for addition, its computation time is relatively small. Meanwhile, HEXBS is one of the best motion estimation methods to date. The new method proposed in this paper achieves the best results so far among all the algorithms investigated. A test with invariant moments shows how powerful the hashtable method can be.

Contributions from this paper are:

(1) It achieves the best results among all investigated BMA algorithms.

(2) For the first time, the hashtable concept is used in the search for motion vectors in video compression.

(3) A linear algorithm is used in video compression to improve speed and allow for future parallel coding.

(4) The Two-Pass Algorithm (TPA) is proposed. The LHMEA is used for the first pass while HEXBS is used for the second pass. MVs produced by the first pass are used as predictors for the second pass, which makes up for the drawback of the coarse search in the hexagonal search. The same idea can be applied to nearly all similar fast algorithms, for example Diamond Search.

(5) Invariant moments are added to the hashtable to determine how many coefficients work best. We also show that the more information the hashtable holds, the better its results.

(6) Spatially related MB information is used not only in the coarse search but also in the inner fine search.

The rest of the paper is organized as follows. Section I continues with a brief introduction to HEXBS and its variants. The proposed LHMEA, the moments method, and the LAHSBTPA are discussed in Section II. Experiments conducted based on the proposed algorithm are presented in Section III. We conclude in Section IV with some remarks and discussion of the proposed scheme.

Table 1: Comparison of different fast search algorithms and their disadvantages

Fast Search Algorithm / Disadvantage
Three-Step (Two-Dimensional Logarithmic) Search (TSS) / PSNR is substantially lower; easily trapped in a non-optimum solution; inefficient at estimating small motions.
New Three-Step Search (NTSS) / Small motion compensation errors and much more robust than the TSS method.
Four-Step Search (FSS) / Performance is maintained for image sequences that contain complex movement such as camera zooming and fast motion, but easily trapped in a non-optimum solution.
Block-Based Gradient Descent Search (BBGDS) / May have significant problems when coding non-center-biased sequences; easily trapped in a non-optimum solution.
Hierarchical Motion Estimation (HME) / Propagation of false motion vectors.
New Diamond Search (DS) / Sensitive to motion vectors in different directions; easily trapped in a non-optimum solution.
Hexagonal Search (HEXBS) / Easily trapped in a non-optimum solution.

1.1 Hexagonal Search Algorithm

The Hexagonal Search method is an improvement on the Diamond Search (DS), and has shown significant gains over other fast algorithms such as DS. In contrast to the DS, which uses a diamond search pattern, the HEXBS adopts a hexagonal search pattern to achieve faster processing, since fewer search points are evaluated. The motion estimation process normally comprises two steps: first, a low-resolution coarse search to identify a small area where the best motion vector is expected to lie, followed by a fine-resolution inner search to select the best motion vector in that small region. The large central 5x5 search pattern used in HEXBS provides fast searching and, due to its larger size, gives consistently better motion estimates and directions. Another way to reduce the number of checking points is to let successive search patterns overlap: HEXBS requires only three extra points to be evaluated in each step. Most fast algorithms focus on speeding up the coarse search, using various smart ways to reduce the number of search points needed to identify a small area for the inner search. There are two main directions to improve the coarse search:

  1. usage of predictors [[8], [11]]
  2. early termination [11]
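The coarse-then-fine HEXBS procedure described above can be sketched as follows (a simplified Python sketch using SAD as the matching cost; the pattern coordinates and loop structure are our illustrative approximation of HEXBS, omitting the three-new-points overlap optimization):

```python
import numpy as np

# large hexagonal pattern (centre plus six vertices) and small inner pattern
LARGE_HEX = [(0, 0), (-2, 0), (2, 0), (-1, 2), (1, 2), (-1, -2), (1, -2)]
SMALL_PAT = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]

def sad(cur, ref, y, x):
    """Sum of absolute differences; infinite cost outside the frame."""
    n = cur.shape[0]
    h, w = ref.shape
    if y < 0 or x < 0 or y + n > h or x + n > w:
        return float("inf")
    return float(np.sum(np.abs(cur.astype(np.int64) - ref[y:y + n, x:x + n])))

def hexbs(cur_block, ref_frame, top, left, max_steps=16):
    """Coarse search with the large hexagon; when the centre wins,
    refine with the small (inner) pattern and return the motion vector."""
    cy, cx = top, left
    for _ in range(max_steps):
        costs = [(sad(cur_block, ref_frame, cy + dy, cx + dx), (dy, dx))
                 for dy, dx in LARGE_HEX]
        _, (dy, dx) = min(costs)
        if (dy, dx) == (0, 0):
            break                 # centre is best: switch to inner search
        cy, cx = cy + dy, cx + dx  # otherwise re-centre the hexagon
    costs = [(sad(cur_block, ref_frame, cy + dy, cx + dx), (dy, dx))
             for dy, dx in SMALL_PAT]
    _, (dy, dx) = min(costs)
    return (cy + dy - top, cx + dx - left)
```

Only a handful of points are costed per step, which is where the speed-up over the full search comes from.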

A new algorithm [11] was introduced on top of HEXBS, similar to the Motion Vector Field Adaptive Search Technique (MVFAST) based on DS. The algorithm significantly improves on the pre-existing HEXBS in both image quality and speed-up by initially considering a small set of predictors as possible motion vector candidates. A modified hexagonal pattern is then applied, using the best motion vector candidate as the center of the search. Another prediction set is proposed in the literature [[13], [14]]. In general, search blocks correlated with the current one can be divided into three categories, as in Figure 1:


Figure 1. Blocks correlated with the current one

(1) Spatially correlated blocks (A0, B0, C0, D0),

(2) Neighboring blocks in the previous frame (A1, B1, C1, D1, E1, F1, G1, H1)

(3) Co-located blocks in the previous two frames (X2 and X3), which provide the acceleration motion vector.
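Gathering candidates from the three categories above might look like the following sketch (our own illustration; the dict-based bookkeeping and the exact neighbour offsets are assumptions, not the cited papers' precise prediction sets):

```python
def predictor_set(mv_cur, mv_prev, mv_prev2, bx, by):
    """Gather candidate motion-vector predictors for block (bx, by):
    spatial neighbours already coded in the current frame, the co-located
    neighbourhood in the previous frame, and an acceleration term from the
    frame before that. mv_* are dicts mapping (bx, by) -> (dy, dx)."""
    cands = set()
    # (1) spatially correlated blocks (left, top, top-right, top-left)
    for dx, dy in ((-1, 0), (0, -1), (1, -1), (-1, -1)):
        mv = mv_cur.get((bx + dx, by + dy))
        if mv is not None:
            cands.add(mv)
    # (2) co-located block and its neighbours in the previous frame
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            mv = mv_prev.get((bx + dx, by + dy))
            if mv is not None:
                cands.add(mv)
    # (3) acceleration: 2*MV(t-1) - MV(t-2) for the co-located block
    p1, p2 = mv_prev.get((bx, by)), mv_prev2.get((bx, by))
    if p1 is not None and p2 is not None:
        cands.add((2 * p1[0] - p2[0], 2 * p1[1] - p2[1]))
    cands.add((0, 0))  # always try the zero vector
    return cands
```

The search then starts from the best candidate in this set instead of from the co-located position alone.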

In addition to the coarse search improvements, inner search improvements include:

  1. 4-point search [8]
  2. 8-point search [11]
  3. inner group search [11].

Figure 2. Inner Search Method

2. LINEAR ALGORITHM AND HEXAGONAL SEARCH BASED TWO-PASS ALGORITHM (LAHSBTPA)

Most current hexagonal search algorithms are predictive methods that focus on relations between the current frame and previous frames. They approach the global minimum on the assumption that a local minimum is the global minimum, which may not always be the case. What we want is a fast method that discovers the predictor from current-frame information, using spatially related MB or pixel information. Such a method should avoid being trapped in a local minimum while being fast, accurate, and independent of finding the right predictors. We have therefore designed a vector hashtable lookup and block matching algorithm. It is a more efficient way to perform an exhaustive search, because it uses global information in the reference block. The block-matching algorithm processes each block to set up a hashtable. By definition, a hashtable is a dictionary in which keys are mapped to array positions by a hash function.

We try to find as few variables as possible to represent the whole macroblock. Through some preprocessing steps, "integral projections" are calculated for each macroblock. These projections differ from algorithm to algorithm; the aim is to find the best projection function. The algorithm we present here has two projections. One is the massive projection, a scalar denoting the sum of all pixels in the macroblock, which is also the DC coefficient of the macroblock. The other is the coefficient A of Y = Ax + B (where Y is the luminance and x is the pixel location). Each of these projections is mathematically related to the error metric. Under certain conditions, the value of the projection indicates whether the candidate macroblock will do better than the best-so-far match. The major algorithm we discuss here is the linear algorithm.

2.1 Linear Hashtable Motion Estimation Algorithm (LHMEA)

In previous research, when trying to find the block that best matches a predefined block in the current frame, matching was performed using the SAD (the difference between the current block and the reference block). In the Linear Hashtable Motion Estimation Algorithm (LHMEA), we only need to compare two coefficients of two blocks. In existing methods, the MB moves inside a search window centered on the position of the current block in the current frame. In the LHMEA, the coefficients move inside the hashtable to find the matched blocks. If the coefficients are powerful enough to hold enough information about the MB, the motion estimators should be accurate. The LHMEA thus increases accuracy, reduces computation time, and may allow for a new era of video encoding. A linear algorithm is the easiest and fastest to compute, because computer arithmetic units are built on addition; if most of the calculations of video compression are done with a linear algorithm, we can save a great deal of compression time. It is also very easy to port to parallel machines in the future, which will benefit real-time encoding. In the program, we use a polynomial approximation to obtain y = mx + c, where y is the luminance value of a pixel and x is the location of the pixel in the macroblock, scanned from left to right and from top to bottom. The coefficients m and c are what we put into the hashtable.

m = (N Σ xᵢyᵢ − Σ xᵢ Σ yᵢ) / (N Σ xᵢ² − (Σ xᵢ)²)    (1)

c = (Σ yᵢ − m Σ xᵢ) / N    (2)

where N is the number of pixels in the macroblock and the sums run over all pixels i.

Based on our experience with the encoder, we scaled m to keep its value around 100-1000. This improved considerably on our previous results, in which m was always zero in the hashtable, leaving only c. In this way, we initially realized the calculation of the hashtable.
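Building and querying such a hashtable can be sketched as follows (a minimal Python sketch; the least-squares fit via `np.polyfit`, the quantisation of m and c, and the 8x8 block size are our own illustrative choices, with the x1000 scaling of m only mirroring the 100-1000 remark above):

```python
import numpy as np

def block_key(block):
    """Least-squares fit y = m*x + c of luminance y against raster-scan
    pixel index x; the quantised (m, c) pair is the hashtable key.
    The scale factor on m is an assumption, not the paper's exact value."""
    y = block.astype(np.float64).ravel()   # left-to-right, top-to-bottom scan
    x = np.arange(y.size, dtype=np.float64)
    m, c = np.polyfit(x, y, 1)             # slope and intercept
    return (int(round(m * 1000)), int(round(c)))

def build_hashtable(ref_frame, n=8):
    """Hash every n x n block position in the reference frame by its key."""
    table = {}
    h, w = ref_frame.shape
    for top in range(h - n + 1):
        for left in range(w - n + 1):
            key = block_key(ref_frame[top:top + n, left:left + n])
            table.setdefault(key, []).append((top, left))
    return table

# matching a current block then becomes a dictionary lookup instead of a
# window search:  candidates = table.get(block_key(cur_block), [])
```

The table is built once per reference frame, after which each macroblock match costs one key computation and one lookup.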

2.2 Moments Invariants

Besides the coefficients from the linear algorithm, we put invariant moments into the hashtable as a test. The set of moments we consider is invariant to translation, rotation, and scale change. We consider that moments represent much more information than the coefficients m and c proposed in the LHMEA. As the experimental results show, moments yield some improvement on the hashtable method. By performing experiments with moments, we attempt to understand how many coefficients work best for the hashtable. Second, we try to show that the more information the hashtable has, the better it performs.

In this paper, the moments of two-dimensional functions are defined as follows [14]:

For a 2-D continuous function f(x,y) the central moments are defined as

μ_pq = ∫∫ (x − x̄)^p (y − ȳ)^q f(x, y) dx dy    (3)

where x̄ = m₁₀/m₀₀ and ȳ = m₀₁/m₀₀.

The normalized central moments, denoted η_pq, are defined as η_pq = μ_pq / μ₀₀^γ, where γ = (p + q)/2 + 1 for p + q = 2, 3, …
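In their discrete form, the moment definitions above can be computed as in this sketch (our own illustration; `hu1` shows the first Hu invariant, φ₁ = η₂₀ + η₀₂, as one example of a translation-, scale-, and rotation-invariant quantity built from the normalized moments):

```python
import numpy as np

def central_moment(img, p, q):
    """Discrete mu_pq = sum_x sum_y (x - xbar)^p (y - ybar)^q f(x, y)."""
    y, x = np.mgrid[:img.shape[0], :img.shape[1]].astype(np.float64)
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    return (((x - xbar) ** p) * ((y - ybar) ** q) * img).sum()

def normalized_moment(img, p, q):
    """eta_pq = mu_pq / mu_00^gamma with gamma = (p + q)/2 + 1."""
    gamma = (p + q) / 2.0 + 1.0
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** gamma

def hu1(img):
    """First Hu invariant phi_1 = eta_20 + eta_02."""
    return normalized_moment(img, 2, 0) + normalized_moment(img, 0, 2)
```

Because the central moments are taken about the centroid, translating a block leaves them unchanged, which is what makes them usable as hashtable coefficients.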