Research Paper on Enhancing Data Compression Rate Using Steganography


Tamanna Garg

School of Computer Science & Engineering

Bahra University, Shimla Hills, India.

Sonia Vatta

School of Computer Science & Engineering

Bahra University, Shimla Hills, India.

ABSTRACT- This paper describes a compression algorithm based on steganography. The algorithm has been used to develop an application that helps users hide large text documents inside small images. With the developed compression application, the maximum number of bits hidden per pixel can be increased to eight. After the data is hidden inside an image, no visible distortion appears. The application is also compatible with all document and image formats, and it automatically converts the output stego image to BMP format.

Keywords- steganography, cryptography, LSB, embedding, extraction, secret key, compression.

I. INTRODUCTION

In the contemporary era there is a dire need to convey confidential information secretly. Steganography serves this purpose by hiding information inside a carrier. In other words, steganography hides a document in a carrier in such a way that the existence of the hidden information cannot even be suspected.

Image steganography is very prominent nowadays. Hiding a small amount of information in small or large images is easy, but hiding a large amount of information in small images is very complicated. This research work describes a compression algorithm that has been used to design an application capable of hiding large documents in small images without any visible changes. The developed compression algorithm can hide up to eight bits of data per pixel.

The primary objective of steganography is to avoid drawing attention to the transmission of hidden information. The basic terms used in steganography systems are the cover message, the secret message, the secret key and the embedding algorithm. The cover message is the carrier of the message, such as an image, video, audio, text or some other digital media; here the carrier is an image. The secret message is the information that needs to be hidden in the suitable digital media; in this work the secret information is text in any format. The secret key is usually used to encrypt the message for more security. The embedding algorithm is the method used to embed the secret information in the cover message; in this research work, the embedding algorithm is the compression algorithm.

In steganography, before the hiding process starts, the sender must select an appropriate message carrier, an effective message to be hidden and a secret key used as a password. A robust steganography algorithm that can encrypt the message effectively must also be selected. The sender then sends the hidden message to the receiver using any modern communication technique. After receiving the message, the receiver decrypts the hidden message using the extraction algorithm and the secret key.

Figure 1: General Steganography Approach

II. REVIEW OF LITERATURE

In any field, the literature review provides strong support in finding the questions on which to carry out research. The review of literature reveals that further investigation in the field is required. Many research developments related to this work have therefore been taken into consideration.

James C. Judge, in his work 'Steganography: Past, Present, Future', stated that steganography is the term applied to any number of processes that will hide a message within an object, where the hidden message will not be apparent to an observer [1]. Muhalim bin Mohamed Amin et al, in their work 'Information Hiding Using Steganography', described a system that enhances the compression rate using the LSB technique by randomly dispersing the bits of the message in the image. This technique makes it harder for unauthorized people to extract the original message [2]. T. Morkel et al, in their work 'An Overview of Image Steganography', asserted that different applications have different requirements of the steganography technique used. For example, some applications may require absolute invisibility of the secret information, while others require a larger secret message to be hidden [3].

In another study, entitled 'An Overview of Steganography', Shawn D. Dickman stated that steganography is a useful tool that allows covert transmission of information over an overt communications channel [4]. Namita Tiwari et al, in 'Evaluation of Various LSB based methods of Image Steganography on GIF File Format', noted that many different carrier file formats can be used, but digital images are the most popular because of their frequency on the Internet [5]. Yongzhen Zheng et al, in their work 'Identification of Steganography Software based on Core Instructions Template Matching', proposed an approach based on the principles of the LSB replacement steganography algorithm, which was used to identify steganography software by core instructions template matching [6]. Dipesh Agrawal and Samidha Diwedi, in their research 'Analysis of random bit image steganography techniques', propounded that many steganography techniques can be used for hiding a secret message in an image, such as least significant bit (LSB), layout management schemes, and replacing only the 1's or only the 0's in the lower nibble of a byte [7].

Saddaf Rubab and Dr. M. Younus, in their project 'Improved Image Steganography Technique for Colored Images using Huffman Encoding with Symlet Wavelets', presented a newly devised algorithm to hide text in a colored image of any size using Huffman encoding and the 2D wavelet transform. The results showed that image quality degradation is negligible. The algorithm gives more capacity for larger image sizes, enhances security and preserves the image quality. Because the Huffman codes are inserted into the three components of the colored image, the scheme becomes more complicated [8]. Shamim Ahmed Laskar and Kattamanchi Hemachandran, in their work 'High Capacity data hiding using LSB Steganography and Encryption', proposed a high capacity data embedding approach that combines steganography and cryptography. The combination of these two methods enhances the security of the embedded data. The main objective of this work was to provide resistance against visual and statistical attacks as well as high capacity [9].

Hemalatha Sharma et al, in their project 'A Secure and High Capacity Image Steganography Technique', presented a novel image steganography technique to hide multiple secret images and keys in a color cover image using the Integer Wavelet Transform (IWT). However, the disadvantage of the approach is that it is susceptible to noise if spatial domain techniques are used to hide the key [10]. Elham Ghasemi et al, in their work 'High Capacity Image Steganography Based on Genetic Algorithm and Wavelet Transform', described the application of the wavelet transform and a genetic algorithm (GA) in a novel steganography scheme. A GA based mapping function is employed to embed data in discrete wavelet transform coefficients in 4x4 blocks on the cover image. The optimal pixel adjustment process (OPAP) is applied after embedding the message. This work introduced a novel steganography technique to increase the capacity and the imperceptibility of the image after embedding [11].

Rahul Jain and Naresh Kumar, in their research 'Efficient data hiding scheme using lossless data compression and image steganography', described a data hiding scheme using image steganography and compression. The embedding capacity of the image is improved by preprocessing the secret message with a lossless data compression technique. This preprocessing reduces the size of the secret data by a significant amount and thus permits more data to be hidden in the same image [12]. Prashant Dahake, in his work 'An Efficient Encryption Using Data Compression towards Steganography', stated that compactness is achieved using a data compression technique, namely arithmetic coding. In the proposed system, additional security is provided by an encryption technique, which can use any cryptographic algorithm and is applied to the compressed data [13].

The above review first presented the general definition of steganography given by a researcher. It then described the work done by several researchers on steganography to enhance the quality as well as the size of the data hidden in digital media.

Close observation revealed a problem in hiding large text information in a small image. The aim was therefore to develop a technique that could enhance the compression rate in order to hide large amounts of information in small images.

III. OBJECTIVES

The main goal of this research work is to enhance the data compression rate by designing and applying a compression algorithm on BMP images to facilitate the hiding of larger text in an image. This project has the following objectives:

To explore techniques of hiding data using the encryption module of this project.
To explore techniques of retrieving secret data using the decryption module.
To design a compression algorithm.
To enhance data compression rate by using the designed algorithm.
To create a tool that can be used to hide the data inside a 24-bit colored image.

IV. OLD TECHNIQUE AND PROPOSED TECHNIQUE

1. OLD TECHNIQUE

The old technique was based on the LSB algorithm. LSB (Least Significant Bit) substitution is the process of adjusting the least significant bits of the pixels of the carrier image. It is a simple approach for embedding a message into an image. Least significant bit insertion varies according to the number of bits in an image. For an 8-bit image, the least significant bit, i.e. the 8th bit of each byte of the image, is changed to a bit of the secret message. For a 24-bit image, the least significant bits of the color components (red, green and blue) of each pixel are changed. LSB is effective with BMP images because the compression in BMP is lossless. However, hiding a secret message inside a BMP file using the LSB algorithm requires a large cover image. LSB substitution is also possible for GIF formats, but the problem with a GIF image is that changing the least significant bit of a pixel changes its index into the color palette, which can produce a visibly different color. The problem can be avoided by using only gray scale GIF images: a gray scale image contains 256 shades that change gradually, so the alterations are very hard to detect. For JPEG, direct LSB substitution is not possible because the format uses lossy compression. The old technique therefore uses LSB substitution on losslessly stored images for embedding the data. There are many approaches available for hiding data within an image; one of the simple least significant bit substitution approaches is the "Optimum Pixel Adjustment Procedure" (OPAP). The simple OPAP algorithm below explains the procedure of hiding sample text in an image.

Step1: A few least significant bits (LSB) are substituted with data to be hidden.

Step2: The pixels are arranged in a manner of placing the hidden bits before the pixel of each cover image to minimize the errors.

Step3: Let n LSBs be substituted in each pixel.

Step4: Let d = decimal value of the pixel after the substitution, d1 = decimal value of the last n bits of the pixel (before the substitution), and d2 = decimal value of the n bits hidden in that pixel.

Step5: If |d1 - d2| <= (2^n)/2, then no adjustment is made in that pixel.

Else

Step6: If (d1 < d2), d = d - 2^n. If (d1 > d2), d = d + 2^n.

This "d" is converted to binary and written back to pixel.

This method of substitution is simple, makes the data easy to retrieve, gives better image quality and provides enhanced security.
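The adjustment steps above can be sketched in Java for a single 8-bit pixel value. This is a minimal sketch, not the implementation used in the paper: the class and method names are hypothetical, d1 is assumed to be the last n bits of the pixel before substitution, and the check that skips an adjustment which would leave the range 0..255 is an added assumption.

```java
// Hypothetical sketch of the pixel adjustment steps for one 8-bit value.
class OpapSketch {

    // Hides the n-bit value `secret` in `original` (0..255), then adjusts the result.
    static int embed(int original, int secret, int n) {
        int mask = (1 << n) - 1;
        int d1 = original & mask;          // last n bits of the original pixel
        int d2 = secret & mask;            // n bits to hide
        int d = (original & ~mask) | d2;   // Step 1: plain LSB substitution
        int half = 1 << (n - 1);           // (2^n)/2

        if (Math.abs(d1 - d2) > half) {    // Step 5 does not hold, so apply Step 6
            int adjusted = (d1 < d2) ? d - (1 << n) : d + (1 << n);
            // Assumption: keep the unadjusted value if the result leaves 0..255.
            if (adjusted >= 0 && adjusted <= 255) {
                d = adjusted;
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // 144 = 10010000; hiding 111 gives 151 by plain substitution,
        // and the adjustment moves it to 143 = 10001111, which is closer to 144.
        System.out.println(embed(144, 7, 3)); // prints 143
    }
}
```

The adjusted value still carries the hidden bits in its last n positions, because adding or subtracting 2^n does not touch them.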

Figure 2: General LSB technique

2. PROPOSED TECHNIQUE

The proposed algorithm is basically an extension of the original LSB technique, which is quite vulnerable. Instead of hiding data in the least significant bits of the RGB components of a pixel, the data is hidden as shown below:

Let the data to be hidden be the word “ABC”.

ASCII code of A= 65 and corresponding binary is 01000001.

ASCII code of B= 66 and corresponding binary is 01000010.

ASCII code of C= 67 and corresponding binary is 01000011.

Let the first pixel’s RGB component be: -

Red component is replaced with binary of 65 i.e. A.

Let the second pixel’s RGB component be: -

Green component of second pixel is replaced with binary of 66 i.e. B.

Let the third pixel’s RGB component be: -

Blue component of third pixel is replaced with binary of 67 i.e. C.

And the process continues until all the pixels get exhausted.
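The "ABC" example can be sketched in Java as follows. This is only an illustration under stated assumptions: pixels are assumed to be packed as 0xRRGGBB integers, the sample pixel values are made up, and the class and method names are hypothetical.

```java
// Hypothetical sketch: character i overwrites one whole color component of pixel i,
// rotating through red, green and blue.
class ChannelReplaceSketch {

    static int[] embed(int[] pixels, String text) {
        if (text.length() > pixels.length) {
            throw new IllegalArgumentException("cover image too small");
        }
        int[] out = pixels.clone();
        for (int i = 0; i < text.length(); i++) {
            int shift = 16 - 8 * (i % 3);        // 16 = red, 8 = green, 0 = blue
            int code = text.charAt(i) & 0xFF;    // 8-bit character code
            out[i] = (out[i] & ~(0xFF << shift)) | (code << shift);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] cover = {0x102030, 0x405060, 0x708090}; // made-up sample pixels
        for (int p : embed(cover, "ABC")) {
            System.out.println(Integer.toHexString(p)); // prints 412030, 404260, 708043
        }
    }
}
```

Because a whole component is overwritten rather than a single bit, the color of each touched pixel can change noticeably.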

The stego image obtained after the algorithm completes its execution is distorted, and it is easy to detect that some kind of alteration has been made to the image. To enhance the security of the secret message, the resulting stego image is therefore covered with a new cover image; this is the first level of security. By just looking at the resulting image, no one would be able to tell that something is hidden inside it. The new cover image can be the same as or different from the original.

In order to increase the storage capacity of the image, a compression algorithm has been used; each component of an RGB pixel is represented with 8 bits. So, the maximum compression would be 8 bits per pixel and minimum would be 1 bit per pixel.
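One way to read the 1 to 8 bits per pixel range is as the smallest per-pixel payload that fits the message into the image. The Java sketch below shows that arithmetic only; it is an interpretation rather than the paper's formula, and the class and method names are hypothetical.

```java
// Hypothetical sketch: smallest number of bits per pixel (1..8) that fits the message.
class CapacitySketch {

    // Returns a value in 1..8, or -1 if the message does not fit even at 8 bits per pixel.
    static int bitsPerPixel(long messageBytes, long pixelCount) {
        if (pixelCount <= 0) {
            return -1;
        }
        long bits = messageBytes * 8;
        long needed = Math.max(1, (bits + pixelCount - 1) / pixelCount); // ceiling division
        return needed <= 8 ? (int) needed : -1;
    }

    public static void main(String[] args) {
        long pixels = 100L * 100L; // a 100 x 100 image
        System.out.println(bitsPerPixel(1250, pixels));  // prints 1
        System.out.println(bitsPerPixel(10000, pixels)); // prints 8
        System.out.println(bitsPerPixel(10001, pixels)); // prints -1
    }
}
```

Under this reading, a 100 x 100 image holds 1,250 bytes at 1 bit per pixel and 10,000 bytes at 8 bits per pixel.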

The proposed steganography algorithm comprises two techniques: a data hiding technique and a data retrieving technique. The data hiding technique, as the name suggests, is used to hide the secret message and key in the cover image, while the data retrieving technique is used to retrieve the key and the hidden secret message from the stego image. The data is therefore protected in the image without being revealed to an unauthorized party.

A. Proposed embedding technique.

Inputs: - Text file, cover image 1, cover image 2 and secret key.

Output: - Stego image.

Begin

Select a text file, convert it into binary form and calculate the number of bits in it.
Select a carrier image (cover image 1) for hiding purpose, find the number of pixels, convert it into RGB image and call the compression function.
If bits calculated are compatible with the image resolution, then

Start sub iteration 1

Replace red component of the first pixel with first character.

Replace green component of the second pixel with second character.

Replace blue component of the third pixel with third character.

And repeat iterations until pixels exhaust.

Stop sub iteration 1

Else

Repeat sub iteration 1

Find necessary compression ratio and perform sub iteration 2.

Sub iteration 2

Replace necessary bits as defined by the compression ratio in immediate component of each pixel.

Store the information about bits embedded in a binary address file.

Stop sub iteration 2

Provide a security key to encrypt the data for better security.
Select 2nd cover image to hide the distorted stego image.

End
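Sub iteration 2 of the embedding technique can be sketched in Java as writing k message bits into one color component of each pixel. This is a hedged sketch, not the paper's code: the 0xRRGGBB pixel layout, the rotation through red, green and blue, the most-significant-bit-first order and all names are assumptions, and the secret key and second cover image steps are left out.

```java
// Hypothetical sketch: k message bits (1..8) go into the low bits of one color
// component per pixel, rotating through red, green and blue.
class KBitEmbedSketch {

    static int[] embed(int[] pixels, byte[] data, int k) {
        if (k < 1 || k > 8) {
            throw new IllegalArgumentException("k must be between 1 and 8");
        }
        int totalBits = data.length * 8;
        if ((long) pixels.length * k < totalBits) {
            throw new IllegalArgumentException("cover image too small");
        }
        int[] out = pixels.clone();
        int mask = (1 << k) - 1;
        int bitPos = 0;
        for (int p = 0; bitPos < totalBits; p++) {
            int chunk = 0;
            for (int b = 0; b < k; b++, bitPos++) {
                // Read message bits most significant first; pad the last chunk with zeros.
                int bit = bitPos < totalBits
                        ? (data[bitPos / 8] >> (7 - bitPos % 8)) & 1
                        : 0;
                chunk = (chunk << 1) | bit;
            }
            int shift = 16 - 8 * (p % 3);   // 16 = red, 8 = green, 0 = blue
            out[p] = (out[p] & ~(mask << shift)) | (chunk << shift);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] stego = embed(new int[]{0x000000, 0xFFFFFF}, new byte[]{0x41}, 4);
        System.out.println(Integer.toHexString(stego[0])); // prints 40000
        System.out.println(Integer.toHexString(stego[1])); // prints fff1ff
    }
}
```

With k = 8 this reduces to replacing a whole component with one character, as in sub iteration 1. The value of k would have to be recorded, for example in the binary address file mentioned above, so that the receiver can read the bits back.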

B. Proposed Extraction technique.

Input: - Stego image and secret key.

Output: - Secret text file.

Begin

Browse the stego image.
Choose the folder in which you want to extract the hidden text file.
Provide necessary security key.
Convert the binary file into human readable form.

End
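The step that converts the hidden data back to readable form can be sketched in Java for the case where each character occupies one whole color component. The sketch assumes 0xRRGGBB pixels, a red, green, blue rotation and a message length known to the receiver; none of these details is specified above, the names are hypothetical, and the secret key check is omitted.

```java
// Hypothetical sketch: reads one character per pixel from the rotating color component.
class ChannelExtractSketch {

    static String extract(int[] stegoPixels, int length) {
        if (length > stegoPixels.length) {
            throw new IllegalArgumentException("length exceeds pixel count");
        }
        StringBuilder text = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int shift = 16 - 8 * (i % 3);   // 16 = red, 8 = green, 0 = blue
            text.append((char) ((stegoPixels[i] >> shift) & 0xFF));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        int[] stego = {0x412030, 0x404260, 0x708043}; // made-up pixels carrying "ABC"
        System.out.println(extract(stego, 3)); // prints ABC
    }
}
```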

The main focus of the proposed steganography technique is to hide text files in images, compress the text files so as to increase the overall storage capacity, apply a secret key to the resulting stego image and transfer the secret message without any vulnerability or threat.

Figure 3: General Layout of Proposed System.

This system is able to maintain the accuracy and confidentiality of the data. The system works by hiding the text files in images using a secret key and is also able to retrieve the data back from the stego image.

V. IMPLEMENTATION OF SYSTEM

The system has been developed in Java. It comprises two main interfaces, one for the embedding process and the other for the extraction process.

Overview of System

The embedding form is shown below:

Figure 4: Embedding form of application

The embedding form shown above comprises three main browsing fields: one for the text file to be embedded, a second for the image in which the file will be embedded, and a third for the cover image that hides the underlying distortion. One important point to note is that the cover file may or may not be the same as the one used for the hiding process. After filling in these fields, the next step is to check the encryption checkbox. The user need not worry about the underlying compression procedure, which is performed automatically by the system. The user then needs to provide the secret key twice for verification; various validations are applied here. The secret key is embedded inside the image along with the text file. Once the data has been keyed in and the secret key has been entered, the new stego image can be saved to a different image location. The user can then send the new stego image via the internet or email to other parties without revealing the secret data inside the image. If the other parties want to extract the hidden data, they need to upload the new stego image using the system itself and provide the secret key to retrieve the text file hidden inside the image.