Multimedia Techniques

Module 2: Building Blocks

Text

Text is the most widely used and flexible means of presenting information on screen and conveying ideas. The designer should not necessarily try to replace textual elements with pictures or sound, but should consider how to present text in an acceptable way and how to supplement it with other media. For a public system, where the eyesight of its users will vary considerably, a clear, reasonably large font should be used. Users will also be put off by the display of large amounts of text, which they will find hard to scan. To present tourist information about a hotel, for example, the information should be given concisely under clear, separate headings such as location, services available, prices, and contact details.

Strictly speaking, text is created on a computer, so it doesn't really extend a computer system the way audio and video do. But understanding how text is stored will set the scene for understanding how multimedia is stored. Processing words (not called word-crunching!) is a major use of computers today. Words are stored character by character. Characters can be more than letters: they can be digits or punctuation. Even the carriage return generated when you hit the Return key is stored as a character.

Computers deal with all data by turning switches off and on in a sequence. We represent an off switch as "0" and an on switch as "1". These 0's and 1's are called bits. Everything in a computer is ultimately represented by sequences of bits. If the sequence were of length 2, we could have 00, 01, 10, or 11: four items. Similarly, a sequence of length 3 can represent 8 items (000, 001, 010, ...), and a sequence of length 4 can represent 16 items (0000, 0001, 0010, ...). In general, n bits can represent 2^n items. There are about 128 characters that a computer has to store, which requires a sequence of length 7. In practice, 8 bits are used instead of 7 (the eighth bit is used to check the data).
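
To make the arithmetic concrete, here is a small sketch in Python that prints how many items each sequence length can represent and shows that letters, digits, punctuation, and even the return key are stored as numeric codes (ASCII codes in this sketch):

```python
# How many distinct items can n bits represent? 2^n.
for n in (2, 3, 4, 7, 8):
    print(f"{n} bits can represent {2 ** n} items")

# Characters are stored as numbers: ord() gives a character's code,
# chr() maps a code back to its character.
for ch in ["A", "a", "7", ",", "\n"]:  # letters, a digit, punctuation, return
    print(repr(ch), "is stored as", ord(ch))
```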

Hypertext

Hypertext has been defined in a number of ways:

Ted Nelson (dubbed: "The father of hypertext") – "A combination of natural language text with the computer's capacity for interactive branching" or "Dynamic display of a nonlinear text which cannot be printed conveniently on a conventional page"

"Representation of a body of information in a form that captures all the inherent interlinks in the information"

"Organization of a document as a collection of nodes connected by direct links"

"A system in which a rich structure of interconnections is created & used within online electronic documents"

"A computer-based system with the ability to perform high speed, branching transactions on textual elements, to aid thinking and to aid communication"

"text that recognizes a stimulus (i.e., a mouse click) and is associated with a "link" to another location.”

"a database that has active cross-reference and allows the reader to "jump" to other parts of the database as desired."

The basic elements of hypertext systems are:

  1. Nodes
  2. Links
  3. Navigation
  4. Structure and Metrics of Hyperdocuments
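
Here is a minimal sketch in Python of how the first three elements fit together; the node names, their text, and the navigate helper are all invented for illustration:

```python
# Nodes hold the text; links record which nodes each node connects to.
nodes = {
    "home":     "Welcome to the hotel guide.",
    "location": "The hotel is in the city centre.",
    "prices":   "Rooms start at 50 per night.",
}
links = {
    "home":     ["location", "prices"],
    "location": ["home"],
    "prices":   ["home"],
}

def navigate(start, path):
    """Navigation: follow a sequence of links from a starting node."""
    current = start
    for target in path:
        assert target in links[current], f"no link from {current} to {target}"
        current = target
    return nodes[current]

print(navigate("home", ["location", "home", "prices"]))
```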

Sound

Sound is generally known as the vibrational transmission of mechanical energy that propagates through matter as a wave (through gases and fluids as a compression wave, and through solids as both compression and shear waves) and that can be perceived by a living organism through its sense of hearing.

Sound is further characterized by the generic properties of waves: frequency, wavelength, period, amplitude, speed, and direction (sometimes speed and direction are combined as a velocity vector, or wavelength and direction are combined as a wave vector).

The mechanical vibrations that can be interpreted as sound can travel through all forms of matter: gases, liquids, solids, and plasmas. However, sound cannot propagate through a vacuum. The matter that supports the sound is called the medium. Sound propagates as waves of alternating pressure deviations from the equilibrium pressure (or, for transverse waves in solids, as waves of alternating shear stress), causing local regions of compression and rarefaction. Matter in the medium is periodically displaced by the wave and thus oscillates. The energy carried by the sound wave is split equally between the potential energy of the extra compression of the matter and the kinetic energy of the oscillations of the medium. The scientific study of the propagation, absorption, and reflection of sound waves is called acoustics.

MIDI:

MIDI (Musical Instrument Digital Interface) is a communications standard developed in the early 1980s for electronic musical instruments and computers. MIDI provides a protocol for passing detailed descriptions of a musical score, such as the notes, the sequence of notes, and which instrument will play them. MIDI is not digitized sound, however. In general, MIDI data is used in the following circumstances (a byte-level sketch follows the list):

  • Digital audio won’t work if you don’t have enough RAM, hard disk space, CPU processing power, or bandwidth.
  • You have a high-quality MIDI sound source.
  • You have complete control over the playback hardware.
  • You don’t need spoken dialog.
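
As a sketch of what "detailed descriptions" means at the byte level, the Python snippet below builds the three bytes of a MIDI note-on message; the note_on helper is invented for illustration, and no real device I/O is performed:

```python
# A MIDI note-on message is three bytes: a status byte (0x90 = note-on,
# low nibble = channel) followed by the note number and the velocity.
def note_on(channel, note, velocity):
    assert 0 <= channel < 16 and 0 <= note < 128 and 0 <= velocity < 128
    return bytes([0x90 | channel, note, velocity])

msg = note_on(channel=0, note=60, velocity=100)  # middle C, moderately loud
print(msg.hex(" "))  # -> 90 3c 64
```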

Digital Audio:

Digital audio data is the actual representation of a sound, stored in the form of thousands of individual numbers called samples. The digital data represents the instantaneous amplitude of a sound at discrete slices of time. Because it is not device dependent, digital audio sounds the same every time it is played. Digital sound is used for music CDs.

Any sound can be digitized - sounds from any source, natural or prerecorded. Digital sound is sampled sound: every nth fraction of a second, a sample of sound is taken and stored as digital information in bits and bytes. The larger the sample size, the better the data describes the recorded sound.
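
A short worked example shows how quickly sampled sound adds up; the figures below assume CD-quality settings (44.1 kHz sampling rate, 16-bit samples, stereo):

```python
# Uncompressed audio size = rate x sample size x channels x duration.
sample_rate = 44_100   # samples per second
sample_size = 16       # bits per sample
channels = 2           # stereo
seconds = 60

total_bits = sample_rate * sample_size * channels * seconds
print(f"{total_bits / 8 / 1_000_000:.1f} MB per minute")  # about 10.6 MB
```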

In general, digital audio is used in the following circumstances:

  • You don’t have control over the playback hardware.
  • You have the computing resources and bandwidth to handle digital files.
  • You need spoken dialog.

Sound Cards

A sound card (also known as an audio card) is a computer expansion card that facilitates the input and output of audio signals to and from a computer under the control of computer programs. Typical uses of sound cards include providing the audio component for multimedia applications such as music composition, editing video or audio, presentation/education, and entertainment (games). Many computers have sound capabilities built in, while others require additional expansion cards to provide audio capability.

Sound cards usually feature a digital-to-analog converter, which converts recorded or generated digital data into an analog format. The most basic sound card is a printed circuit board that uses four components to translate between analog and digital information:

  • An analog-to-digital converter (ADC)
  • A digital-to-analog converter (DAC)
  • An ISA or PCI interface to connect the card to the motherboard
  • Input and output connections for a microphone and speakers

Instead of separate ADCs and DACs, some sound cards use a coder/decoder chip, also called a CODEC, which performs both functions.

Imagine using your computer to record yourself talking. First, you speak into a microphone that you have plugged into your sound card. The ADC translates the analog waves of your voice into digital data that the computer can understand. To do this, it samples, or digitizes, the sound by taking precise measurements of the wave at frequent intervals.


An analog-to-digital converter measures sound waves at frequent intervals.

The number of measurements per second, called the sampling rate, is measured in kHz. The faster a card's sampling rate, the more accurate its reconstructed wave is.

If you were to play your recording back through the speakers, the DAC would perform the same basic steps in reverse. With accurate measurements and a fast sampling rate, the restored analog signal can be nearly identical to the original sound wave.
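
The sketch below mimics the ADC's two steps in Python: it measures an analog wave (modelled as a sine function) at a fixed sampling rate and quantizes each measurement into an 8-bit sample; the tone frequency and rates are arbitrary illustration values:

```python
import math

frequency = 440.0    # Hz, the "analog" tone being recorded
sample_rate = 8000   # measurements per second (the card's sampling rate)
levels = 2 ** 8      # 8-bit samples give 256 quantization levels

for n in range(5):                                       # first few samples
    t = n / sample_rate                                  # time of measurement
    amplitude = math.sin(2 * math.pi * frequency * t)    # -1.0 .. 1.0
    sample = round((amplitude + 1) / 2 * (levels - 1))   # 0 .. 255
    print(f"t={t:.6f}s  analog={amplitude:+.4f}  digital={sample}")
```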

Even at high sampling rates, however, the conversion process causes some reduction in sound quality. The physical process of moving sound through wires can also cause distortion. Manufacturers use two measurements to describe this reduction in sound quality:

  • Total Harmonic Distortion (THD), expressed as a percentage
  • Signal to Noise Ratio (SNR), measured in decibels

A lower THD and a higher SNR indicate better quality. Some cards also support digital input, allowing people to store digital recordings without converting them to an analog format.
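
For reference, SNR can be computed from signal and noise amplitudes as in the sketch below; the 20·log10 convention shown is a common one, though a card vendor's exact measurement procedure may differ:

```python
import math

def snr_db(signal_rms, noise_rms):
    # Signal-to-noise ratio in decibels; larger means a quieter card.
    return 20 * math.log10(signal_rms / noise_rms)

print(f"{snr_db(1.0, 0.001):.0f} dB")    # -> 60 dB
print(f"{snr_db(1.0, 0.00001):.0f} dB")  # -> 100 dB (better)
```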

Other Sound Card Components

In addition to the basic components needed for sound processing, many sound cards include additional hardware or input/output connections, including:

  • Digital Signal Processor (DSP): Like a graphics processing unit (GPU), a DSP is a specialized microprocessor. It takes some of the workload off of the computer's CPU by performing calculations for analog and digital conversion. DSPs can process multiple sounds, or channels, simultaneously. Sound cards that do not have their own DSP use the CPU for processing.
  • Memory: As with a graphics card, a sound card can use its own memory to provide faster data processing.
  • Input and Output Connections: Most sound cards have, at the very minimum, connections for a microphone and speakers. Some include so many input and output connections that they have a breakout box, which often mounts in one of the drive bays, to house them. These connections include:
      • Multiple speaker connections for 3-D and surround sound
      • Sony/Philips Digital Interface (S/PDIF), a digital audio transfer format that uses either coaxial or optical connections for input to and output from the sound card
      • Musical Instrument Digital Interface (MIDI), used to connect synthesizers or other electronic instruments to the computer
      • FireWire and USB connections, which connect digital audio or video recorders to the sound card

Anatomy of a Sound Card
PC sound cards typically have all the components in this picture. Some have only one output, which may be amplified (Amp) or not (Buffer amp). These components may also be built directly into the motherboard. (Illustration courtesy of Peter Hermsen.)

Standards

Image

Image data can be stored in the computer in two ways: as raster images and as vector images. The appearance of both types depends on the display resolution and the capabilities of your computer’s graphics hardware and monitor. Image files are compressed to save memory and disk space. Some image formats use compression within the file itself, for example GIF, JPEG, and PNG.

Raster Graphics

Raster imaging is the technique of dividing the entire image area into pixels, or small dots, and then recording which color is to be displayed in each dot-sized area. These colors combine to form the image. In Windows, such images are called bitmapped graphics. Bitmaps are used for photo-realistic images and for complex drawings requiring fine detail.

Vector Graphics

Instead of storing an image as dots, it is often better to store the lines, arcs, and curves that make up the image as a group of such elements. This technique of storing image data is known as vector imaging, and such images are called vector graphics.

Not all types of images can be stored as vector images.
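
The toy sketch below contrasts the two storage strategies for the same five-pixel line; the data structures are invented for illustration and do not correspond to any real file format:

```python
# Raster: the image is stored as a grid of pixels, one color per cell.
raster_line = ["white", "white", "black", "black", "white"]

# Vector: the image is stored as drawing elements; pixels are generated
# only at display time, so the description scales to any resolution.
vector_line = {"width": 5, "background": "white",
               "elements": [("segment", {"start": 2, "end": 3,
                                         "color": "black"})]}

def rasterize(vector):
    """Generate display pixels from the vector description."""
    pixels = [vector["background"]] * vector["width"]
    for kind, params in vector["elements"]:
        if kind == "segment":
            for x in range(params["start"], params["end"] + 1):
                pixels[x] = params["color"]
    return pixels

assert rasterize(vector_line) == raster_line
```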

Color

Color is a vital component of multimedia. Color is the frequency of a light wave within the narrow band of the electromagnetic spectrum to which the human eye responds. Because the eye’s receptors are sensitive to red, green, and blue light, by adjusting combinations of these three additive primary colors, the eye and brain will interpolate the combinations of colors in between. The reflected light that reaches your eye from a printed page is made up of tiny halftone dots of a few primary colors.

In contrast, the screen of a computer monitor is, like a sun, a source of light. On the back of the glass face of a monitor are thousands of phosphorescing chemical color dots (red, green, and blue), which are bombarded by electrons that paint the screen at very high speeds. These dots are each about .30mm or less in diameter and are positioned very carefully and very close together.

The red, green, and blue dots light up when hit by the electron beam, and the eye sees the combination of red, green, and blue (RGB) light and interpolates it. When one of the primary colors is subtracted from this RGB mix, the subtractive primary color is perceived, as follows:

RGB Combination                      Perceived Color
Red only                             Red
Green only                           Green
Blue only                            Blue
Red and green (blue subtracted)      Yellow
Red and blue (green subtracted)      Magenta
Green and blue (red subtracted)      Cyan
Red, green, and blue                 White
None                                 Black
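
The same table can be expressed in code, with each additive primary either fully on (255) or off (0):

```python
# RGB triples for the additive combinations listed above.
mixes = {
    (255, 0, 0):     "Red",
    (0, 255, 0):     "Green",
    (0, 0, 255):     "Blue",
    (255, 255, 0):   "Yellow",   # blue subtracted
    (255, 0, 255):   "Magenta",  # green subtracted
    (0, 255, 255):   "Cyan",     # red subtracted
    (255, 255, 255): "White",
    (0, 0, 0):       "Black",
}
for rgb, name in mixes.items():
    print(rgb, "->", name)
```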

Monitors and Color

Multimedia is typically presented on color monitors that display 8 bits of color information per pixel in a matrix of 640 pixels across and 480 pixels down (640x480).

In the HSB (hue, saturation, brightness) and HSL (hue, saturation, lightness) models, you specify hue as an angle from 0 to 360 degrees on a color wheel, and saturation, brightness, and lightness as percentages. Lightness or brightness is the percentage of black or white that is mixed with a color. A lightness of 100 percent yields white; 0 percent is black; the pure color has a 50 percent lightness. Saturation is the intensity of the color: at 100 percent saturation the color is pure; at 0 percent saturation the color is white, black, or gray.
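
As a sketch of the HSL specification in code, Python's standard colorsys module (which calls the model HLS and takes each component as a 0-to-1 fraction rather than degrees or percentages) can convert such values to RGB; the wrapper below is invented for illustration:

```python
import colorsys

def hsl_to_rgb(hue_deg, saturation_pct, lightness_pct):
    # colorsys expects hue, lightness, saturation, each as a fraction.
    r, g, b = colorsys.hls_to_rgb(
        hue_deg / 360, lightness_pct / 100, saturation_pct / 100)
    return round(r * 255), round(g * 255), round(b * 255)

print(hsl_to_rgb(0, 100, 50))   # pure red at 50% lightness -> (255, 0, 0)
print(hsl_to_rgb(0, 100, 100))  # 100% lightness -> white (255, 255, 255)
print(hsl_to_rgb(120, 0, 50))   # 0% saturation -> gray (128, 128, 128)
```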

Color Palettes

Palettes are mathematical tables that define the color of a pixel displayed on the screen. On the Macintosh, these tables are called color lookup tables or CLUTs. In Windows, the term palette is used. The most common palettes are 1, 4, 8, 16, and 24 bits deep:

Color Depth    Colors Available
1-bit          Black and white (or any two colors)
4-bit          16 colors
8-bit          256 colors (good enough for color images)
16-bit         Thousands of colors (excellent for color images)
24-bit         More than 16 million colors (totally photo-realistic)
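
A minimal sketch of how such a lookup table works for an 8-bit image; the palette contents and pixel values here are invented for illustration:

```python
# The image stores one byte per pixel; each byte indexes a 256-entry
# table (the palette, or CLUT) that holds the actual RGB values.
palette = [(i, i, i) for i in range(256)]  # a gray-scale palette...
palette[1] = (255, 0, 0)                   # ...with entry 1 redefined as red

pixels = bytes([0, 1, 255, 1])             # the image: four palette indices
displayed = [palette[p] for p in pixels]   # what the monitor shows
print(displayed)  # [(0, 0, 0), (255, 0, 0), (255, 255, 255), (255, 0, 0)]
```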

Palette flashing

When the 256 colors of an 8-bit palette are used, only one combination of those 256 colors can be displayed at a time. If we change the colors in the current palette by remapping, there will be an annoying flash of strange colors. This palette flashing is a serious practical problem for multimedia designers.

All the techniques for handling the palette-flashing problem involve design solutions:

The simplest solution is to map all the images in your project to a single, shared palette. The disadvantage here is that you trade the best 256 colors for showing a single image for an “average” 256 colors shared among all images.

A less simple but more effective technique is to fade each image to white or black before showing the next image. Black and white are usually present in all palettes.

Image Types

.BMP / Bitmapped file
.DIB / Device Independent Bitmap
.GIF / Graphics Interchange Format file
.JPEG / Joint Photographic Experts Group
.WPG / WordPerfect Graphics file
.CDR / CorelDRAW file format
.TIF / Tagged Image File
.PIC / Lotus 1-2-3 picture file format

Image Compression

  1. RLE (Run Length Encoding)

Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs: for example, relatively simple graphic images such as icons, line drawings, and animations.

For example, consider a screen containing plain black text on a solid white background. There will be many long runs of white pixels in the blank space, and many short runs of black pixels within the text. Let us take a hypothetical single scan line, with B representing a black pixel and W representing white:

WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW

If we apply the run-length encoding (RLE) data compression algorithm to the above hypothetical scan line, we get the following:

12W1B12W3B24W1B14W

Interpret this as twelve W's, one B, twelve W's, three B's, etc.
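
A minimal RLE encoder and decoder in Python, reproducing the scan-line example above:

```python
import re
from itertools import groupby

def rle_encode(data):
    # Each run of identical characters becomes "<count><character>".
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(data))

def rle_decode(encoded):
    # Parse "<count><character>" pairs and expand each run.
    return "".join(ch * int(n) for n, ch in re.findall(r"(\d+)(\D)", encoded))

line = "W" * 12 + "B" + "W" * 12 + "B" * 3 + "W" * 24 + "B" + "W" * 14
encoded = rle_encode(line)
print(encoded)  # -> 12W1B12W3B24W1B14W
assert rle_decode(encoded) == line
print(len(line), "characters ->", len(encoded))  # 67 characters -> 18
```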

The run-length code represents the original 67 characters in only 18. Of course, the actual format used for the storage of images is generally binary rather than ASCII characters like this, but the principle remains the same. Even binary data files can be compressed with this method; file format specifications often dictate repeated bytes in files as padding space. However, newer compression methods such as DEFLATE often use LZ77-based algorithms, a generalization of run-length encoding that can take advantage of runs of strings of characters (such as BWWBWWBWWBWW).

Common formats for run-length encoded data include PackBits, PCX and ILBM. Run-length encoding performs lossless data compression and is well suited to palette-based iconic images. It does not work well at all on continuous-tone images such as photographs, although JPEG uses it quite effectively on the coefficients that remain after transforming and quantizing image blocks. Run-length encoding is used in fax machines (combined with other techniques into Modified Huffman coding). It is relatively efficient because most faxed documents are mostly white space, with occasional interruptions of black. Data that have long sequential runs of bytes (such as lower-quality sound samples) can be RLE compressed after applying a predictive filter such as delta encoding.