CHAPTER 1
1. DIGITAL IMAGE
Digital video has become a very important element in many areas of our society: business, education, entertainment, etc. Its distribution covers a wide range of technologies, from Digital Video Broadcasting (DVB-T) in Digital Terrestrial Television (DTT) to the publication of content based on web technology. The increase in video consumption is due to the combination of two factors: lower costs and greater capacity, both in communications and in storage.
The increase in network speeds, together with their low cost, has led to the massive use of these networks both at home and in professional production environments.
As for storage and processing systems, lower costs have allowed home users to keep content libraries on their computers, together with powerful content creation tools that until recently were available only in costly production environments. This drop in prices has also allowed production companies to update their equipment in less time and with greater benefit.
This migration of processes from a traditional environment to an office environment has required the creation and continuous modification of work standards in order to satisfy the strong demands imposed by this type of content.
This chapter reviews the basic concepts of image capture and creation for professional production, in order to establish a baseline image quality. This baseline image gives us a starting point from which to optimize the resources used in each process.
1.1 BLACK AND WHITE
1.1.1 IMAGE CAPTURE
The image captured by our cameras is a two-dimensional projection of the real, three-dimensional scene; this projection is a distribution of the light energy reflected by the scene. To capture this image as digital data, it is necessary to convert this light energy into a digital image. This requires three steps:
· Definition of rows and columns (axes u and v) at which the luminance values of the image are taken. This process yields discrete luminance values from the continuous ones: a value is obtained at each point (u,v).
· Quantization of the values obtained in the previous step using a certain number of bits that the computer can handle.
· Sequencing of the values at the points (u,v), so that the set of values defines the image.
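The three steps above can be sketched in a few lines of Python. This is only an illustration, not a model of any real sensor: the function scene_luminance is a hypothetical stand-in for the continuous light distribution, and the grid size and bit depth are arbitrary choices.

```python
def scene_luminance(u, v):
    """Hypothetical continuous luminance in [0, 1]: a horizontal ramp."""
    return u  # brightness grows from left (0.0) to right (1.0)


def capture(rows, cols, k):
    """Sample, quantize and serialize the scene into a flat list of integers.

    Assumes rows > 1 and cols > 1 so that positions can be normalized.
    """
    max_level = 2 ** k - 1
    stream = []
    for r in range(rows):                  # step 1: sample on a grid of rows...
        for c in range(cols):              # ...and columns
            u = c / (cols - 1)             # normalized horizontal position
            v = r / (rows - 1)             # normalized vertical position
            value = scene_luminance(u, v)
            level = round(value * max_level)   # step 2: quantize to k bits
            stream.append(level)           # step 3: serialize row by row
    return stream


pixels = capture(rows=2, cols=4, k=3)
print(pixels)  # [0, 2, 5, 7, 0, 2, 5, 7]
```

With 3 bits the ramp is reduced to levels between 0 and 7, and the two rows are emitted one after the other as a single sequence.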
As shown in Illustration 1-1, the image is inverted by the effect of the converging lens.
ILLUSTRATION 1-1 IMAGE BEHAVIOR THROUGH THE LENS
The image plane holds a two-dimensional representation of reality, and it is in this plane that the image sensor is placed. The sensor is divided into rows and columns with a cell at each pair (x,y); these cells report the amount of light that strikes each of them (Illustration 1-2), giving a numeric representation of the distribution of the light that reaches the sensor.
ILLUSTRATION 1-2. SENSOR DIVIDED IN CELLS
The data generated by the sensor in response to the incident light energy has to be sampled at regular intervals in order to extract this information sequentially, pixel by pixel (Illustration 1-3).
ILLUSTRATION 1-3. DATA SERIALIZATION
Whether the data is to be stored or processed through mathematical operations, it must be encoded as a sequence of bits, so each pixel is defined by a number of bits (k) within the data stream. The range of values that the image can take is bounded by a minimum value (0) and a maximum value (2^k - 1). With seven bits (k = 7), the values from black to white range from 0 to 127 (128 values); with 8 bits they range from 0 to 255. As a general rule, the more bits used to encode an image, the more faithfully the original image is reproduced. When k is 1, a single bit is used for the encoding, giving a black and white image with no shades of gray.
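The relation between the number of bits and the range of values can be checked with a short sketch. With k bits a pixel can take 2^k values, from 0 to 2^k - 1. The helper names below are illustrative, and the requantize function is an added assumption showing one simple way to move a value between bit depths.

```python
def gray_range(k):
    """Return (minimum, maximum, number of levels) for k bits per pixel."""
    levels = 2 ** k
    return 0, levels - 1, levels


def requantize(value, k_in, k_out):
    """Map a value coded with k_in bits to the nearest k_out-bit level."""
    max_in = 2 ** k_in - 1
    max_out = 2 ** k_out - 1
    return round(value * max_out / max_in)


for k in (1, 7, 8):
    print(k, gray_range(k))
# 1 (0, 1, 2)
# 7 (0, 127, 128)
# 8 (0, 255, 256)
```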
Several methods exist for data serialization; the simplest is to read out the values pixel by pixel, delivering the information of each pixel in turn.
Illustration 1-4 shows one well-known method, Frame Transfer (CCD-FT). In this method, after the exposure, the image values pass to storage registers, and the values are then transferred line by line to the output registers; once the output registers have been emptied into memory, the values of the next line are transferred.
ILLUSTRATION 1-4. FRAME TRANSFER
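The order of operations in a frame transfer readout can be sketched as follows. This is a simplified simulation under stated assumptions: the sensor is modeled as a list of rows, the stages are plain Python containers, and timing and charge effects are ignored. The function name is illustrative.

```python
from collections import deque


def frame_transfer_readout(image_area):
    """Simulate the readout order of a frame-transfer sensor.

    image_area is a list of rows (lists of pixel values) after exposure.
    Returns the pixel values in the order they reach memory.
    """
    # 1. Move the whole frame from the image area to the storage registers.
    storage = deque(list(row) for row in image_area)
    for row in image_area:
        row[:] = [0] * len(row)    # the image area is free for a new exposure

    memory = []
    while storage:
        # 2. Transfer the next line to the output register.
        output_register = deque(storage.popleft())
        # 3. Empty the output register into memory, pixel by pixel.
        while output_register:
            memory.append(output_register.popleft())
    return memory


frame = [[1, 2, 3], [4, 5, 6]]
print(frame_transfer_readout(frame))  # [1, 2, 3, 4, 5, 6]
```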
1.1.2 IMAGE FORMAT
The image format, or definition, is given by the number of columns M and the number of rows N, so the number of pixels is M x N. If the image is encoded with k bits per pixel, the uncompressed image has a storage size (S) of:
S = M x N x k
The image format is usually specified through the ratio between the number of columns and the number of rows or, equivalently, between the number of pixels in a horizontal line and the number of pixels in a vertical one. An image with a ratio of M/N therefore has M horizontal pixels (number of columns) for every N vertical pixels (number of rows). Thus 16/9 indicates that for every 16 pixels in a horizontal line there are 9 vertical lines.
For a format of 1080 horizontal lines in 16/9, the number of pixels in a horizontal line is 1080 x 16/9 = 1920, so the image has 1920 x 1080 = 2,073,600 pixels. If it is encoded with 7 bits per pixel, there are 1920 x 1080 x 7 = 14,515,200 bits (1,814,400 bytes).
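The arithmetic of this example can be reproduced with a small helper. It is a sketch with illustrative names; it uses exact fractions so that an aspect ratio that does not give a whole number of columns is rejected rather than rounded.

```python
from fractions import Fraction


def image_size(lines, aspect, k):
    """Return (columns, pixels, bits, bytes) for an uncompressed image."""
    columns = lines * aspect               # M = N x (M/N)
    if columns.denominator != 1:
        raise ValueError("aspect ratio does not give a whole number of columns")
    columns = int(columns)
    pixels = columns * lines               # M x N
    bits = pixels * k                      # S = M x N x k
    return columns, pixels, bits, bits / 8


print(image_size(lines=1080, aspect=Fraction(16, 9), k=7))
# (1920, 2073600, 14515200, 1814400.0)
```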
1.2 IMAGE IN COLOR
A monochrome image is obtained by sampling the intensity of the light arriving from the scene. To capture images in color, the beam is filtered into three primary colors and captured by three different CCDs, one per color: red, green and blue (RGB).
1.2.1 RGB
Each pixel is represented by three numbers that indicate the intensity of each of the colors. If 8 bits are used per color, 24 bits are needed per pixel, so the size of a color image with k bits per pixel and color is:
S = M x N x 3 x k
Any color of the additive system can be represented by combining these three colors. In the specific case of a white pixel, the three components contribute to the luminance in the following proportions:
Y = 0.299 x R + 0.587 x G + 0.114 x B
Illustration 1-5 shows how the original image is decomposed into its three RGB components. The lips, which are the reddest part of the image, appear brighter in the R channel and darker in the other channels. The sky, which is white in the original image, has a maximum value in each of the components.
ILLUSTRATION 1-5 FRACTION OF A NATURAL IMAGE IN ITS THREE COMPONENTS
Illustration 1-6 shows an electronically generated image of color bars, one of the most widely used signals for adjusting electronic equipment because it contains the possible combinations of the three RGB components.
ILLUSTRATION 1-6 ELECTRONIC IMAGE OF BARS AND THEIR SEPARATION IN THE THREE PRIMARY COLORS
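The combinations contained in the bars can be enumerated directly: with each component either off or at its maximum there are 2 x 2 x 2 = 8 bars. The sketch below orders them by decreasing luminance, which is the usual arrangement of color bars but is an assumption here, since the text does not state the order. The luminance weights are the ones used in this chapter.

```python
from itertools import product


def luminance(r, g, b):
    """Luminance with the weights used in this chapter."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# Every on/off combination of (R, G, B), brightest first.
bars = sorted(product((1, 0), repeat=3),
              key=lambda rgb: luminance(*rgb), reverse=True)

names = {
    (1, 1, 1): "white", (1, 1, 0): "yellow", (0, 1, 1): "cyan",
    (0, 1, 0): "green", (1, 0, 1): "magenta", (1, 0, 0): "red",
    (0, 0, 1): "blue", (0, 0, 0): "black",
}
for rgb in bars:
    print(rgb, names[rgb])
```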
1.2.2 Y, CR, CB
One of the most common ways to represent images is to separate the luminance and chrominance components. A key aspect of image compression is that the human eye is more sensitive to variations in brightness than to variations in color, so if we separate the luminance from the chrominance, each can be treated separately.
Considering that:
Y = 0.299 x R + 0.587 x G + 0.114 x B
where Y is the luminance component, we can define two chrominance components, Cr and Cb, as the difference between the red component and the luminance and the difference between the blue component and the luminance, respectively:
Cr = R - Y
Cb = B - Y
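From the values of Y, Cr and Cb, the RGB components may be recovered by inverting the definitions of Y, Cr and Cb:
R = Y + Cr
B = Y + Cb
G = (Y - 0.299 x R - 0.114 x B) / 0.587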
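A minimal sketch of the conversion in both directions, assuming the unscaled differences described in the text (Cr is the red component minus the luminance, Cb is the blue component minus the luminance). Broadcast standards additionally scale and offset these differences, which is not shown here. The function names are illustrative.

```python
def rgb_to_ycrcb(r, g, b):
    """Split an RGB triple into luminance and two color differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = r - y          # red minus luminance
    cb = b - y          # blue minus luminance
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Recover RGB by inverting the definitions used in rgb_to_ycrcb."""
    r = y + cr
    b = y + cb
    # Solve Y = 0.299 R + 0.587 G + 0.114 B for G.
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


y, cr, cb = rgb_to_ycrcb(200, 120, 50)
print(y, cr, cb)
print(ycrcb_to_rgb(y, cr, cb))
```

For a white or gray pixel (R = G = B) both differences are zero, so all the information is carried by Y.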
Once the components are separated, the luminance and the chrominance may be encoded in different ways. This difference in encoding is usually expressed through three figures that identify the number of samples encoded in a block of 2x2 pixels. In the case of 4:4:4, the three figures mean that four samples of Y are encoded, together with four of Cr and four of Cb. In the case of 4:2:2, for every four samples of Y, two of Cr and two of Cb are encoded, because the color components are encoded for only half of the horizontal pixels. In the case of 4:2:0, the color components are encoded for half of the pixels both horizontally and vertically, as shown in Illustration 1-7.
ILLUSTRATION 1-7 ENCODING EXAMPLES 4:4:4, 4:2:2 AND 4:2:0
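The effect of each scheme on a chrominance plane can be sketched as follows. This version simply keeps the top-left sample of each block, which is an assumption made for brevity; real encoders may average or filter the samples instead. The plane is a plain list of rows and the function name is illustrative.

```python
def subsample(plane, scheme):
    """Reduce a chrominance plane (list of rows) according to the scheme."""
    if scheme == "4:4:4":
        step_x, step_y = 1, 1      # keep every sample
    elif scheme == "4:2:2":
        step_x, step_y = 2, 1      # half of the horizontal samples
    elif scheme == "4:2:0":
        step_x, step_y = 2, 2      # half horizontally and half vertically
    else:
        raise ValueError(f"unsupported scheme: {scheme}")
    # Keep the top-left sample of each step_x by step_y block.
    return [row[::step_x] for row in plane[::step_y]]


cr = [[10, 11, 12, 13],
      [20, 21, 22, 23],
      [30, 31, 32, 33],
      [40, 41, 42, 43]]
for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, subsample(cr, scheme))
```

The luminance plane is left untouched in all three schemes; only Cr and Cb are reduced.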
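For each of these cases, counting one Y sample per pixel plus the Cr and Cb samples that are kept, the sizes will be:
S (4:4:4) = M x N x 3 x k
S (4:2:2) = M x N x 2 x k
S (4:2:0) = M x N x 1.5 x k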
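Counting the samples kept per pixel in each scheme (one for Y plus the fraction of Cr and of Cb that is retained) gives the uncompressed size directly. The sketch below assumes image dimensions that divide evenly into 2x2 blocks; the table and function names are illustrative.

```python
from fractions import Fraction

# Samples stored per pixel: one for Y plus the fraction kept of Cr and of Cb.
SAMPLES_PER_PIXEL = {
    "4:4:4": Fraction(3),        # 1 + 1 + 1
    "4:2:2": Fraction(2),        # 1 + 1/2 + 1/2
    "4:2:0": Fraction(3, 2),     # 1 + 1/4 + 1/4
}


def size_bits(m, n, k, scheme):
    """Uncompressed size in bits of an M x N image with k bits per sample."""
    return int(m * n * k * SAMPLES_PER_PIXEL[scheme])


for scheme in SAMPLES_PER_PIXEL:
    print(scheme, size_bits(1920, 1080, 8, scheme))
# 4:4:4 49766400
# 4:2:2 33177600
# 4:2:0 24883200
```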
The 4:1:1 case does not use a square of 2x2 pixels but four consecutive pixels, so it indicates that the color components are encoded for one pixel in every four (horizontally). 3:1:1 indicates that for every four pixels in a horizontal line, three luminance samples and one chrominance sample are encoded. If one more number is added, it indicates that there is an alpha channel and how that channel is encoded.
ILLUSTRATION 1-8 ENCODING FOR 4:1:1 AND 3:1:1
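Counting the samples per group of four pixels, the sizes will be:
S (4:1:1) = M x N x 1.5 x k
S (3:1:1) = M x N x 1.25 x k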
1.3 IMAGE FILES
Today there are many standardized formats for storing and processing images. Developers have to select the type of image and the appropriate compression for each process: managing an image that needs retouching for production is not the same as preparing a file for publication on a web page. In some cases it will be necessary to store the uncompressed material for image processing; in others, the highest possible compression will be needed, even at the cost of changing characteristics such as the image size.
We will look at the formats that use arrays to store pixel values, not at vector formats.
1.3.1 TIFF
The Tagged Image File Format (TIFF) was originally developed by Aldus and Microsoft, but it now belongs to Adobe Systems. The latest version was published in 1992 (Adobe Developers Association, Final – June 3, 1992). A TIFF file can contain one or several images, from black and white images up to true color images, with various compression schemes.
One of the features of this format that makes it very attractive in professional production environments is the use of an alpha channel, which allows the creation of the masks and transparency effects needed in both professional photography and professional video.
The architecture is based on the use of tags that define the characteristics of the image, such as the palette, resolution, dimensions, location of the data, etc. These tags are grouped in a directory that is referenced from the header at the beginning of the file, and they provide flexibility when creating new data and tag types defined by the user.
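The tag-based layout can be illustrated by reading the first tag directory of a TIFF file. The sketch below follows the TIFF 6.0 layout (an 8-byte header with the byte order, the number 42 and the offset of the first directory, followed by 12-byte tag entries) and runs on a tiny hand-built example rather than a real image. The function name is illustrative, and values are returned as raw bytes without interpretation.

```python
import struct


def read_tiff_tags(data):
    """Return {tag_number: (field_type, count, raw_value)} for the first directory."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"                       # little-endian
    elif byte_order == b"MM":
        endian = ">"                       # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(entry_count):
        entry_offset = ifd_offset + 2 + 12 * i     # each entry is 12 bytes
        tag, field_type, count, value = struct.unpack_from(
            endian + "HHI4s", data, entry_offset)
        tags[tag] = (field_type, count, value)
    return tags


# Hand-built example: header, two SHORT tags (256 = width, 257 = length), end marker.
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 2)
    + struct.pack("<HHIHH", 256, 3, 1, 640, 0)
    + struct.pack("<HHIHH", 257, 3, 1, 480, 0)
    + struct.pack("<I", 0)
)
tags = read_tiff_tags(sample)
width = struct.unpack("<H", tags[256][2][:2])[0]
height = struct.unpack("<H", tags[257][2][:2])[0]
print(width, height)  # 640 480
```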
1.3.2 GIF
The Graphics Interchange Format (GIF) was created by CompuServe in 1987. It has been one of the most widely used formats in web environments for several reasons. One is the possibility of holding several images inside the same file, so that it can store sequences that give motion to the stored images. Another is the small size of the files it generates: the format can use palettes of 2 to 256 colors chosen from among the 16.8 million colors available, which allows few bits to be used for the color encoding because each palette is restricted to a maximum of 256 colors.
For example, a drawing that uses 8 different colors does not need one byte to store the color information of each pixel; only 3 bits are required, which means a saving of more than fifty percent in the storage space and transmission of the image.
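The figures in this example follow from the number of bits needed to index a palette. The sketch below computes only that index size and the saving relative to one byte per pixel; it ignores the LZW compression that GIF applies on top, and the function names are illustrative.

```python
def bits_per_pixel(colors):
    """Minimum number of bits needed to index a palette of `colors` entries."""
    return max(1, (colors - 1).bit_length())


def saving_vs_byte(colors):
    """Fraction of space saved compared with one byte (8 bits) per pixel."""
    return 1 - bits_per_pixel(colors) / 8


print(bits_per_pixel(8))     # 3
print(saving_vs_byte(8))     # 0.625
print(bits_per_pixel(256))   # 8
```

With 8 colors the saving is 62.5 percent, consistent with the "more than fifty percent" stated above.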
The format is used mainly for creating animated icons, where the few bits of color encoding combine with lossless compression to give great performance. This, together with the possibility of creating animations, makes it an ideal format for these environments.
1.3.3 PNG
Portable Network Graphics (PNG) was initially developed as a free format to replace GIF files, among other reasons to avoid license payments. It was designed for representing images on the Net, with support for three types of image: grayscale (up to 16 bits per pixel), true color (up to 3 x 16 bits per pixel) and indexed color (up to 256 colors).
It also incorporates an alpha channel, for work in different photography or video production environments.
1.3.4 JPEG
It was created by the Joint Photographic Experts Group (JPEG) and established as a standard by ISO (CCITT, 1992). Today it is the most widely used compression format for still images. The first JPEG encoding that emerged from the collaboration between ITU and ISO achieved a compression of 15:1 without loss of subjective quality; with higher compression factors, a loss of quality can be perceived.
Although it is a standard created for still images, it has also been used to encode moving images as sequences of images, and for a long time it was the system used by professional-quality video servers. The advantage of being able to edit at any frame of the sequence without increasing the computational complexity made it widely used in production.