Image Security and Encryption
By Dr. Bjarne Berg-Saether
Submitted in Partial Fulfillment of the Requirements of
INFO-8200 Principles of Security
University of North Carolina – Charlotte
Spring 2005
Table of Contents
Paper Abstract
Motivation and Background
JPEG Static Images Background
Other Compression Methods
The JPEG Compression Method
JPEG Streaming Images Background
Encryption Algorithm
Cycle Control
Limitations
Risks
APPENDIX A – The Encryption Code
Works Cited
Paper Abstract
As computers matured from simple number-crunching machines into text-based systems, the need to secure the information they hold has increased. While securing textual information is rather simple, the newer use of computers to stream images, music, and movies has raised new concerns about image and sound security. This research paper reviews the current status of image encryption and security as presented in the research literature. The paper focuses specifically on the JPEG and MJ2 standard formats. In addition, the paper proposes a conceptual framework for a proof-of-concept application that can encrypt images. The paper also discusses how such a simple application can be incorporated into a larger application and the requirements that must be met for it to work.
Motivation and Background
As computers matured from simple number-crunching machines into text-based systems, the need to secure the information they hold increased. While securing textual information is rather simple, the newer use of computers to stream images, music, and movies has raised new concerns about image and sound security. While static files are relatively easy to secure with existing technology, continuous data streams are much harder to secure. In addition, the increased reliance on IP over the Internet as the core protocol for transmitting this type of data has exposed many users to significant risks. As an example, many public security cameras in cities, parking lots, and airports now transmit their images in the clear across the Internet. The vast majority of these images are unsecured and unencrypted.
This exposes these images to several risks. First, the images can be monitored by others; they can also be copied and viewed later. A more serious risk is the potential for criminal activity. If an outdoor camera records a certain business area, it may provide a false sense of security as the viewer comes to rely more on the camera than on physical patrols. The images from that camera can be recorded by a criminal and later played back to the viewer, who believes he is seeing real-time images. This would allow the criminal to engage in break-ins, assaults, or other activity while the viewer remains unaware.
The problem above illustrates the need for strong image security for these cameras. In addition, cycle control is needed. Even with encrypted images, the stream can be slowed so that the viewer sees images that lag substantially in time while other activities take place. This slow-down in frames should also be monitored by a real security system.
JPEG Static Images Background
In 1982, the ISO formed the Photographic Experts Group (PEG) to research methods of transmitting video, still images, and text over ISDN lines. The goal was to produce industry standards for the transmission of graphics and image data over digital communications networks (Murray & vanRyper, 1994).
Other Compression Methods
Prior to JPEG, most compression technologies could not compress images containing a large number of colors, and most could not support 24-bit raster images. GIF, for example, can only represent 256 colors (8 bits per pixel). The GIF compression method (LZW) also has a hard time recognizing repeated patterns when images contain "noise" (e.g., faxed images). Another common image format, BMP, relies on the RLE method and therefore has the same problems, even though it supports the same bit depth as JPEG (24 bits, roughly 16 million colors).
The JPEG Compression Method
JPEG is not a single algorithm but a lossy method for compressing images. This means that a compressed image has some values removed and cannot be restored 100% to its original form. There are many factors in this lossy compression algorithm, but a key one is that high-frequency color information is removed from the image. This information is carried in the chrominance components, Cb and Cr. The high-frequency grayscale (luminance) information is kept mostly intact, since it is critical to the contrast of the image (see Illustration 1).
(Illustration 1: Effects of chrominance component compression)
This lossy processing is very unlike the RLE and LZW methods mentioned above. JPEG also has limited support for an alternative 2D Differential Pulse Code Modulation (DPCM) scheme that does not prevent loss of data but can predict where the loss will occur, and can therefore restore more of a highly compressed image when it is decoded for viewing. Another way to encode very large images is to tile them into many smaller images. JPEG supports pyramidal, composite, and simple tiling and can therefore also encode and decode very large images.
The complete process of encoding an image is complex. First, a transform is performed to change the image into a basic color scheme (luminance plus chrominance); then samples of pixels are grouped together and the chrominance parts are analyzed for reduction. JPEG's first major compression step (roughly 50%) is achieved by downsampling the chrominance so that a single pair of chrominance samples spans a 2-by-2 area of luminance pixels. This means that 6 values (4 luminance and 2 chrominance) are stored for each block instead of the 12 values normally needed.
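To make this step concrete, the following sketch shows how such 2-by-2 chrominance subsampling could be coded. The function name subsample420 and the separate Y, Cb, and Cr arrays are illustrative assumptions (real encoders operate on their own internal buffers), and the sketch assumes the image width and height are even.

    #include <cstddef>
    #include <vector>

    // Illustrative 2-by-2 chrominance subsampling: every block of four
    // luminance (Y) pixels keeps its 4 Y values but shares one averaged Cb
    // and one averaged Cr sample, so 6 values are stored instead of 12.
    struct SubsampledImage {
        std::vector<unsigned char> Y;       // full resolution, width * height
        std::vector<unsigned char> Cb, Cr;  // quarter resolution, (width/2) * (height/2)
    };

    SubsampledImage subsample420(const std::vector<unsigned char>& Y,
                                 const std::vector<unsigned char>& Cb,
                                 const std::vector<unsigned char>& Cr,
                                 std::size_t width, std::size_t height)
    {
        SubsampledImage out;
        out.Y = Y;                                   // luminance kept intact
        out.Cb.resize((width / 2) * (height / 2));
        out.Cr.resize((width / 2) * (height / 2));

        for (std::size_t by = 0; by < height / 2; ++by) {
            for (std::size_t bx = 0; bx < width / 2; ++bx) {
                std::size_t p00 = (2 * by) * width + 2 * bx;  // top-left pixel of the block
                std::size_t p01 = p00 + 1;                    // top-right
                std::size_t p10 = p00 + width;                // bottom-left
                std::size_t p11 = p10 + 1;                    // bottom-right

                // Average the four chrominance samples of the block into one.
                out.Cb[by * (width / 2) + bx] =
                    static_cast<unsigned char>((Cb[p00] + Cb[p01] + Cb[p10] + Cb[p11]) / 4);
                out.Cr[by * (width / 2) + bx] =
                    static_cast<unsigned char>((Cr[p00] + Cr[p01] + Cr[p10] + Cr[p11]) / 4);
            }
        }
        return out;
    }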
The key to the next step in the process is JPEG's use of the Discrete Cosine Transform (DCT) algorithm. The DCT provides the baseline compression that all JPEG compression systems are required to follow. In the DCT system, the color detail that is removed during JPEG compression cannot be detected by the human eye, and the number of distinct values that must be stored can be reduced. In simplified terms, the DCT processes blocks of pixels and removes redundant data from the image. A major step in the DCT is the conversion of each 8-by-8 block into a frequency map that captures the average block value as well as the strength and direction of change across the block (i.e., across the height or width of the pixel block).
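As a rough illustration of this frequency map, the sketch below computes the forward 8-by-8 DCT directly from its textbook definition. The function name forwardDCT8x8 is illustrative; production encoders use much faster factored implementations or dedicated hardware.

    #include <cmath>

    // Naive forward 8x8 DCT-II, straight from the textbook formula.
    // in[]  : 8x8 block of pixel samples, level-shifted to roughly -128..127
    // out[] : 8x8 block of frequency coefficients; out[0][0] is the block
    //         average (DC term), the remaining entries measure the strength
    //         of change across the block in each direction.
    void forwardDCT8x8(const double in[8][8], double out[8][8])
    {
        const double PI = 3.14159265358979323846;
        for (int u = 0; u < 8; ++u) {
            for (int v = 0; v < 8; ++v) {
                double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
                double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
                double sum = 0.0;
                for (int x = 0; x < 8; ++x)
                    for (int y = 0; y < 8; ++y)
                        sum += in[x][y]
                             * std::cos((2 * x + 1) * u * PI / 16.0)
                             * std::cos((2 * y + 1) * v * PI / 16.0);
                out[u][v] = 0.25 * cu * cv * sum;
            }
        }
    }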
After this, each block is re-processed using a set of weighted coefficients. This is done by dividing each of the 64 DCT values in the block (8-by-8) by a coefficient known as the quantization value and rounding the result to an integer. If the quantization value is high, the accuracy of the DCT value is reduced; the image therefore has higher compression but worse quality. The benefit of this re-processing is that the image is optimized for viewing by the human eye according to a set of semi-fixed rules. The core idea is to make sure that similar groupings of pixels receive the same encoding so that they appear uniform. These semi-fixed rules are in fact quantization tables that are optimized separately for chrominance and luminance.
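A minimal sketch of this divide-and-round step, with its inverse on the decoder side, could look as follows. The function names and the separate quantization-table parameter are assumptions made for illustration.

    #include <cmath>

    // Quantize an 8x8 block of DCT coefficients: divide each coefficient by
    // its entry in the quantization table and round to the nearest integer.
    // Larger table entries discard more precision (higher compression,
    // lower quality).
    void quantizeBlock(const double dct[8][8], const int table[8][8], int out[8][8])
    {
        for (int u = 0; u < 8; ++u)
            for (int v = 0; v < 8; ++v)
                out[u][v] = static_cast<int>(std::lround(dct[u][v] / table[u][v]));
    }

    // Dequantization on the decoder side simply multiplies back; the rounding
    // error introduced above is what makes JPEG lossy.
    void dequantizeBlock(const int quant[8][8], const int table[8][8], double out[8][8])
    {
        for (int u = 0; u < 8; ++u)
            for (int v = 0; v < 8; ++v)
                out[u][v] = static_cast<double>(quant[u][v] * table[u][v]);
    }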
The key is that these tables have higher quantization values for chrominance, thereby compressing the color information more heavily while compressing the luminance of the image less. The quantization coefficients can be manipulated by changing the desired quality setting. It is important to note that the black-white-gray values are maintained mostly intact for contrast purposes. As a result, images containing many colors (e.g., real-world shades) compress very well in JPEG, typically around 90%, while black-and-white documents see only marginal compression benefits. To determine the compression ratio, the creator can manipulate the Q factor (quality setting). However, each image has its own best Q factor, so a compromise has to be made in streaming video, where many images are transmitted.
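The mapping from a user-selected quality setting to actual quantization values is not fixed by the standard. The sketch below follows the common convention used by the IJG reference software, where qualities below 50 scale the base table up and qualities above 50 scale it down; the function name and the clamping to baseline limits are illustrative. A chrominance base table with larger values than the luminance table is scaled the same way, which is what gives color the heavier compression described above.

    #include <algorithm>

    // Scale a base quantization table by a quality factor (1-100), following
    // the widely used IJG convention: qualities below 50 expand the table
    // values, qualities above 50 shrink them.
    void scaleQuantTable(const int base[8][8], int quality, int out[8][8])
    {
        quality = std::max(1, std::min(100, quality));
        int scale = (quality < 50) ? 5000 / quality : 200 - 2 * quality;

        for (int u = 0; u < 8; ++u) {
            for (int v = 0; v < 8; ++v) {
                int q = (base[u][v] * scale + 50) / 100;
                out[u][v] = std::max(1, std::min(255, q));  // keep within baseline JPEG limits
            }
        }
    }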
The most computationally costly of all these steps is the DCT quantization of each block. Therefore, this is best done in hardware (i.e., dedicated chips) or pre-compiled software. A small but important note is that "Images containing large areas of a single color do not compress very well. In fact, JPEG will introduce 'artifacts' into such images that are visible against a flat background, making them considerably worse in appearance than if you used a conventional lossless compression method." (Murray & vanRyper, 1994).
After all this processing, the weighted coefficients are re-processed to remove redundancy. This is normally done with the Huffman variable word-length algorithm, but a binary arithmetic entropy encoder can also be employed to increase compression by approximately 15% without losing any additional data (it is, however, slower to encode and decode). The result of all this processing is a compressed image that can be interpreted by a decoder, typically found in a browser, a TV, or PC hardware such as screens and projectors.
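Before the Huffman or arithmetic coder runs, the quantized coefficients are conventionally reordered in a zigzag pattern so that the zero-valued high-frequency coefficients cluster at the end, where they collapse into short run-length symbols. The sketch below shows that reordering and a simplified (zero-run, value) pairing; the real JPEG symbol format additionally splits values into size categories and uses a dedicated end-of-block marker, so this is an illustration rather than the exact bitstream syntax.

    #include <utility>
    #include <vector>

    // Standard zigzag order for an 8x8 block (row-major indices): the
    // low-frequency coefficients come first, the high-frequency ones --
    // which quantization usually zeroes out -- trail off at the end.
    static const int kZigZag[64] = {
         0,  1,  8, 16,  9,  2,  3, 10,
        17, 24, 32, 25, 18, 11,  4,  5,
        12, 19, 26, 33, 40, 48, 41, 34,
        27, 20, 13,  6,  7, 14, 21, 28,
        35, 42, 49, 56, 57, 50, 43, 36,
        29, 22, 15, 23, 30, 37, 44, 51,
        58, 59, 52, 45, 38, 31, 39, 46,
        53, 60, 61, 54, 47, 55, 62, 63
    };

    // Turn a quantized 8x8 block into (zero-run, value) pairs -- the symbols
    // that a Huffman or arithmetic coder would then assign variable-length
    // codes to.
    std::vector<std::pair<int, int>> runLengthEncode(const int block[64])
    {
        std::vector<std::pair<int, int>> symbols;
        int zeroRun = 0;
        for (int i = 0; i < 64; ++i) {
            int value = block[kZigZag[i]];
            if (value == 0) {
                ++zeroRun;
            } else {
                symbols.emplace_back(zeroRun, value);
                zeroRun = 0;
            }
        }
        if (zeroRun > 0)
            symbols.emplace_back(zeroRun, 0);  // trailing zeros: stand-in for "end of block"
        return symbols;
    }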
JPEG Streaming Images Background
A streaming image process requires a version marker, which indicates the capability needed to decode the data stream. Multiple version markers can indicate the functionality needed, the image processing chain, the encoding methods, and the preferred display execution (i.e., pyramid tiling). While JPEG is the basis for a streaming image, the static representation could only present the image as interlaced lines.
Naturally, one could decode the lines of the image non-sequentially (i.e., every second line, and then "fill in the blanks"). However, this would be a poor graphics solution, with the image quality changing dramatically while being viewed. A better solution is progressive image building: transferring a very highly compressed image first and supplementing it with better images on the fly.
In 2000, a format called MJ2 (or MJP2) for streaming JPEGs was launched. It is a sequence of JPEG 2000 images that also carries audio. MJ2 encodes each frame separately using JPEG 2000 and does not require the inter-frame coding used by earlier standards. The standard requires substantial computing power to encode and decode images, but it is well suited to rapid transfer over low-bandwidth networks such as the Internet.
Encryption Algorithm
The core issue with an encryption algorithm is that it has to be both efficient and secure. Since JPEG images are subject to an 8-step compression process when created, as well as an 8-step decompression process when viewed, the computing power needed for the encryption has to be minimized. This is particularly true for streaming images, where high volumes are likely to occur.
(Illustration 2: The compression and encryption process)
The proposed solution consists of a simple executable generated from C++ code that takes the compressed JPEG image, which is in binary form, and manipulates the binary file with an encryption key, producing an encrypted binary file that is protected from viewing by unauthorized people. Since file management is awkward when reading blocks of data back and forth from a binary file, it is more efficient to read the binary file into memory and address it as an object with an associated pointer.
The solution uses a secret key file consisting of 128 characters (bytes); converted to binary, this key file provides 1,024 bits that are used for encryption. The creation of the secret key is not part of this program; it can be produced by a random generator or a third party and distributed through normal secure channels (e.g., RSA, DES, or SSL). The secret key is then stored on both the sender's and the receiver's side and has to be protected from unauthorized access. If this key is accessed, the security of the encrypted images is compromised.
The proposed solution first reads the key file into an object based on an array with an associated pointer. This object is called the "key"; it is stored in binary form and accessed as such in memory. Second, the program reads the binary JPEG file into an object defined as an array, referred to for simplicity as the image buffer. A challenge is that while we know the fixed size of the key file (128 bytes), we do not know the size of the JPEG image. Therefore, we have to determine the file size by examining the beginning and the end of the file, and define the array dynamically based on that size.
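A minimal sketch of this file handling is shown below. The function names readKey and readImage are illustrative, the key length of 128 bytes follows the description above, and error handling is reduced to a bare minimum.

    #include <cstddef>
    #include <fstream>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Read the fixed-size 128-byte secret key into memory.
    std::vector<unsigned char> readKey(const std::string& keyPath)
    {
        std::ifstream in(keyPath, std::ios::binary);
        std::vector<unsigned char> key(128);
        if (!in.read(reinterpret_cast<char*>(key.data()), key.size()))
            throw std::runtime_error("key file missing or shorter than 128 bytes");
        return key;
    }

    // Read the JPEG file into a dynamically sized buffer.  The size is not
    // known in advance, so we open at the end of the file, ask for the
    // position, and size the buffer from that before reading the whole file
    // in one pass.
    std::vector<unsigned char> readImage(const std::string& imagePath)
    {
        std::ifstream in(imagePath, std::ios::binary | std::ios::ate);
        if (!in)
            throw std::runtime_error("cannot open image file");
        std::streamsize size = in.tellg();       // position at end == file size
        in.seekg(0, std::ios::beg);

        std::vector<unsigned char> buffer(static_cast<std::size_t>(size));
        if (!in.read(reinterpret_cast<char*>(buffer.data()), size))
            throw std::runtime_error("failed to read image file");
        return buffer;
    }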
Another challenge is the need for simple processing of the encryption. For simplicity, the solution loops through the buffer array and sub-loops through the key array. For each 8-bit block that is read, the buffer bits are flipped against the values in the current 8-bit block of the key being processed. This creates a stream of bits that is processed sequentially and transposed based on the values of the key file. The result is that the buffer array now contains bits that are "flipped": if the key value is '1' for a given position and the buffer value is '0', the value in the new (encrypted) buffer becomes 1. This is the bitwise exclusive-or (XOR) operation. The overall rules of the process are: '0 and 0 becomes 0; 1 and 1 becomes 0; 1 and 0 becomes 1; 0 and 1 becomes 1'. The implication is that if even a single bit of the key file is changed, the image can no longer be decrypted correctly: every buffer position that was combined with that key bit is inverted during decryption, and the resulting JPEG stream is corrupted.
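Because XOR is its own inverse, running the same routine a second time with the same key restores the original image. A minimal sketch of the encryption loop, using the buffers produced by the file-reading sketch above (the function name xorWithKey is illustrative):

    #include <cstddef>
    #include <vector>

    // XOR every byte of the image buffer with the key, cycling through the
    // 128-byte key as an inner loop.  Applying the same function again with
    // the same key undoes the operation, so this both encrypts and decrypts.
    void xorWithKey(std::vector<unsigned char>& buffer,
                    const std::vector<unsigned char>& key)
    {
        for (std::size_t i = 0; i < buffer.size(); ++i)
            buffer[i] ^= key[i % key.size()];   // 0^0=0, 1^1=0, 1^0=1, 0^1=1
    }

To encrypt, the image buffer is read, passed through xorWithKey, and written back out; the receiver runs exactly the same steps with the shared key to decrypt.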