Polytechnic University, Dept. Electrical and Computer Engineering
EE3414 Multimedia Communication System I
Spring 2006, Yao Wang
______
Homework 9 (JPEG Image Coding)
(Solution)
Written Assignment:
1. Describe briefly how JPEG compresses an image. You may want to break down your discussion into three parts:
a. How does JPEG compress an 8x8 image block (three steps are involved)?
b. How does JPEG compress a gray-scale image?
c. How does JPEG compress an RGB color image?
The basic coding unit in JPEG is an 8x8 block. Each 8x8 block goes through three steps: DCT, quantization, and run-length coding. A gray-scale image is divided into non-overlapping 8x8 blocks, and each block is processed as above. When the input image is RGB, each color component (R, G, and B) can be compressed directly as a gray-scale image. Alternatively, the RGB values of each pixel can be converted to YCbCr values, and the Cb and Cr images can be further downsampled by a factor of 2 in both the horizontal and vertical directions. The Y, Cb, and Cr images can then be coded independently, each as a gray-scale image. In the interleaved mode, a minimum coding unit (MCU) consists of 4 Y blocks, 1 Cb block, and 1 Cr block. The 3 components are processed together by coding 1 MCU at a time.
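The first two stages described above (splitting into non-overlapping 8x8 blocks and applying the DCT to each block) can be sketched in Python. This is a minimal illustration under stated assumptions, not the JPEG reference implementation: the function names `split_into_blocks` and `dct_2d` are hypothetical, the image dimensions are assumed to be multiples of 8, and the level shift, quantization, zigzag scan, and entropy coding are omitted.

```python
import math

def split_into_blocks(image, n=8):
    """Return the non-overlapping n x n blocks of a 2-D list, in raster order.
    Assumes the image height and width are multiples of n."""
    blocks = []
    for r in range(0, len(image), n):
        for c in range(0, len(image[0]), n):
            blocks.append([row[c:c + n] for row in image[r:r + n]])
    return blocks

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)
    # Normalization factors: sqrt(1/n) for index 0, sqrt(2/n) otherwise.
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    # cos_table[u][x] = cos((2x + 1) * u * pi / (2n))
    cos_table = [[math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n)] for u in range(n)]
    return [[scale[u] * scale[v] * sum(block[x][y] * cos_table[u][x] * cos_table[v][y]
                                      for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]

# A flat 16x16 gray-scale image (hypothetical data): four 8x8 blocks,
# each with all of its energy in the DC coefficient.
image = [[100] * 16 for _ in range(16)]
blocks = split_into_blocks(image)
coeffs = dct_2d(blocks[0])
```

For a flat block, only the DC coefficient `coeffs[0][0]` is non-zero (up to floating-point error), which is why smooth image regions compress well after quantization.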
2. Describe what a scalable bit stream means and why scalability is a desired feature for image coding and transmission. What is the difference between quality scalability and spatial scalability?
A scalable bit stream is one that can be truncated at any point, with the decoded image quality depending on the number of bits retained: fewer retained bits lead to lower quality. Scalability is desired for image streaming over a heterogeneous network, where an image pre-coded at a high bit rate can be downloaded by clients with different bandwidth constraints. A user with low bandwidth (such as a cell phone user) can download only a portion of the bit stream, whereas a user with high bandwidth (such as a user connected to the internet via a Fast Ethernet card) can download the entire bit stream.
Quality scalability means that the reconstructed image with a truncated bit stream has the same spatial resolution as the reconstructed image from the entire bit stream, but the color reconstruction quality is worse.
Spatial scalability means that the image reconstructed from a truncated bit stream has a lower spatial resolution (a smaller image) than the image reconstructed from the entire bit stream, but the reconstruction accuracy for each pixel color is similar. Quality scalability and spatial scalability can be combined. A fully scalable stream enables both spatial scalability and quality scalability.
3. Suppose the DCT coefficient matrix for a 4x4 image block is as shown below (dctblock).
a. Quantize its DCT coefficients using the quantization matrix Q given below, assuming QP=1. Determine the quantized coefficient indices and quantized values.
Solution:
For each coefficient value “f” in “dctblock”, find its corresponding quantization step size “q”. The quantized index is Qindex(f)=floor((f+q/2)/q), and the quantized value is Q(f)=Qindex(f)*q. Applying this procedure to each coefficient in “dctblock” yields:
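As a sketch of the rounding rule itself (separate from the assignment's result matrices, which depend on the given dctblock and Q), the two formulas can be written in Python. The function name `quantize` and the sample (f, q) pairs are hypothetical and are not taken from the assignment.

```python
import math

def quantize(f, q):
    """Return (quantized index, quantized value) for coefficient f and step size q."""
    index = math.floor((f + q / 2) / q)  # Qindex(f) = floor((f + q/2) / q)
    return index, index * q              # Q(f) = Qindex(f) * q

# Hypothetical (coefficient, step size) pairs for illustration only.
examples = [(13, 10), (15, 10), (-37, 8)]
results = [quantize(f, q) for f, q in examples]
```

Adding q/2 before taking the floor rounds f/q to the nearest integer, so the rule also behaves sensibly for negative coefficients (for example, -37 with q=8 maps to index -5 and value -40).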
b. Represent the quantized indices using the run-length representation. That is, generate a series of symbols: the first is the quantized DC index, each following symbol consists of a run of zeros and the non-zero index that follows it, and the last symbol is EOB (end of block).
Solution:
The runlength representation is {85,(0,-5),(0,1),(2,-2),(0,2),(0,2),(0,-1),EOB}
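The symbol generation can be sketched in Python as follows. The function name `run_length` is hypothetical, and the input list is an assumed zigzag-ordered sequence of the 16 quantized indices, inferred from the run-length answer stated above rather than read from the quantized matrix. Trailing zeros are absorbed by EOB.

```python
def run_length(zigzag):
    """Convert zigzag-ordered quantized indices to [DC, (run, value), ..., 'EOB']."""
    symbols = [zigzag[0]]          # first symbol: the quantized DC index
    run = 0                        # zeros seen since the last non-zero index
    for value in zigzag[1:]:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    symbols.append("EOB")          # any remaining zeros are implied by EOB
    return symbols

# Assumed zigzag sequence for the 4x4 block, consistent with the stated answer.
zigzag = [85, -5, 1, 0, 0, -2, 2, 2, -1, 0, 0, 0, 0, 0, 0, 0]
symbols = run_length(zigzag)
```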
c. Encode the DC index and the run-length symbols from (b) using the JPEG coding method, with the coding tables given in the lecture note. For this problem, assume that the quantized index for the DC coefficient of the previous block is 60.
Solution: The DC prediction error is 85-60=25. This is in category 5 (Huffman code “110”), number 25 (“11001”), so the total codeword is “11011001”.
The first AC symbol is (0,-5). The non-zero value is “-5”, which is in category 3, number 2. The run-length symbol (0/3) has codeword “100” and number 2 has the binary codeword “010”, so the total codeword is “100010”.
The second AC symbol is (0,1). The non-zero value is “1”, which is in category 1, number 1. The run-length symbol (0/1) has codeword “00” and number 1 has codeword “1”, so the total codeword is “001”.
Continuing in this manner, (2,-2) is represented by (2/2) (“11111000”) and number 1 (“01”), yielding “1111100001”; (0,2) is represented by (0/2) (“01”) and number 2 (“10”), yielding “0110”; (0,-1) is represented by (0/1) (“00”) and number 0 (“0”), yielding “000”. EOB has the codeword “1010”.
The entire bit stream is “11011001, 100010, 001, 1111100001, 0110, 0110, 000, 1010”.
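The category/number coding used in this solution can be sketched in Python. This is an illustration under stated assumptions: the dictionaries hold only the Huffman codewords quoted in the solution above (they are not complete JPEG tables), and the names `category`, `amplitude_bits`, and `encode_block` are hypothetical.

```python
# Only the codewords quoted in the solution; not the full coding tables.
DC_CODES = {5: "110"}
AC_CODES = {(0, 1): "00", (0, 2): "01", (0, 3): "100", (2, 2): "11111000"}
EOB_CODE = "1010"

def category(value):
    """Number of bits needed for |value| (0 for value 0)."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Binary 'number' within the category; negatives use value + 2^cat - 1."""
    cat = category(value)
    if cat == 0:
        return ""
    if value < 0:
        value += (1 << cat) - 1
    return format(value, "0{}b".format(cat))

def encode_block(symbols, previous_dc):
    """Return one codeword string per symbol of a run-length coded block."""
    diff = symbols[0] - previous_dc                    # DC prediction error
    words = [DC_CODES[category(diff)] + amplitude_bits(diff)]
    for symbol in symbols[1:]:
        if symbol == "EOB":
            words.append(EOB_CODE)
        else:
            run, value = symbol
            words.append(AC_CODES[(run, category(value))] + amplitude_bits(value))
    return words

symbols = [85, (0, -5), (0, 1), (2, -2), (0, 2), (0, 2), (0, -1), "EOB"]
words = encode_block(symbols, previous_dc=60)
bit_stream = "".join(words)
```

Keeping the codewords as a list before joining them makes it easy to compare each entry with the per-symbol codewords derived by hand.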
4. Determine the Y, Cr, Cb values for a pixel with an (R,G,B) value of R=200, G=150, B=50.
Solution:
Using the conversion formula given in the lecture note
we obtain Y=153.55, Cb=69.55, Cr=161.1. By rounding, we can obtain integer values:
Y=154, Cb=70, Cr=161.
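The arithmetic can be checked with a short Python sketch. The conversion formula from the lecture note is not reproduced in this text, so the coefficients below are an assumption: they are a commonly used RGB-to-YCbCr matrix with a 128 offset on the chroma components, chosen because they reproduce the stated values. The function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """RGB to YCbCr with assumed coefficients (not confirmed by the lecture note)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128
    cr = 0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, cb, cr

y, cb, cr = rgb_to_ycbcr(200, 150, 50)
rounded = tuple(round(v) for v in (y, cb, cr))  # nearest-integer values
```

With these assumed coefficients, the unrounded results agree with Y=153.55, Cb=69.55, Cr=161.1 up to floating-point error.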