JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING
HAND GESTURE RECOGNITION BASED ON COLLABORATIVE AUGMENTED REALITY ENVIRONMENT FOR HUMAN-COMPUTER INTERACTION
1 AKHIL KHARE, 2VINAYA KULKARNI, 3 Dr. UMESH KUMAR SINGH
1 Research Scholar, JNU, Jodhpur, India
ABSTRACT: With the rapid development of virtual reality, easy and efficient virtual interfaces between human and computer are widely needed. Computer vision can capture a great deal of information at very low cost, so it has become increasingly popular for building user interfaces. Consequently, much high-end research aims to provide more natural interfaces for human-computer interaction through computer vision. Hand gesture is one of the most powerful and natural interfaces for vision-based interaction: the hand replaces the cumbersome and inefficient devices currently used, such as the mouse and keyboard. In this paper, different algorithms for parsing hand gestures are presented. These techniques do not employ any extra device such as a head-mounted display or gloves, nor do they use special cameras operating beyond the visible spectrum (or any active technique requiring some form of structured light). Thus, with a simple video camera and bare hands, a person can interact with the computer.
KEY WORDS: Augmented Reality, Virtual Interface Between Human and Computer, Hand Gesture Recognition, Hand Modelling, Real-Time Multi-Hand Posture Recognition.
ISSN: 0975-6779 | NOV 10 TO OCT 11 | VOLUME 01, ISSUE 02 | Page 55
1. INTRODUCTION
Nowadays, research is ongoing into providing more natural interfaces for human-computer interaction based on computer vision. The hand is a natural and powerful means of communication that conveys information very effectively, and hand gesture is the most popular and effective medium for communication in a virtual environment. In the early days, devices such as head-mounted displays and digital gloves were imposed on users. These devices were troublesome to use, made users uncomfortable and limited their movement. Vision-based gesture recognition using the bare hand, on the other hand, is becoming very popular because it imposes no device on the user's body; instead, it provides a natural hand-gesture interface for human-computer interaction. The whole process of hand gesture recognition is broadly divided into three steps. The first is segmentation, in which the hand is separated from the background using methods such as colour segmentation. Next, the features of the hand are extracted (feature detection). Finally, with the help of the extracted features, hand gestures are classified into three groups: communicative, manipulative and controlling gestures. A communicative gesture expresses an idea or concept; a manipulative gesture is used to interact with virtual objects in the virtual environment; controlling gestures are used to control a system. In this paper we focus on vision-based recognition of hand gestures: the first part discusses the basic steps of hand gesture recognition, these steps are then described in detail along with different techniques for hand gesture recognition, and finally the conclusion is presented.
2. LITERATURE ANALYSIS
One of the methods, proposed by Rokade et al. [1], uses thinning of the segmented image, but it needs more computation time to detect different hand postures.
Another method is based on elastic graph matching, but it is sensitive to lighting changes [2]. In the system proposed by Stergiopoulou and Papamarkos, a YCbCr colour segmentation model was used, but the background must be plain and uniform [3].
In another method, CSS features were used by Chin-Chen Chang for hand posture recognition [4]. In that work, a boosted cascade of classifiers trained by AdaBoost with Haar-like features is used to accelerate evaluation and to recognize two hand postures for human-robot interaction. It combines Haar-like features with a colour segmentation method to improve the accuracy of hand-region detection, and a topological method is then used to classify the different hand postures.
In the method proposed by Shuying Zhao [5], a Gaussian distribution model (for building a complexion model) is used for hand segmentation. Using Fourier descriptors and a BP neural network, an improved algorithm is developed that has good descriptive ability and good self-learning ability; the method is flexible and realistic. In the system proposed by Wei Du and Hua Li, statistics-based and contour-based features are used for stable hand detection [6].
In a system developed by Utsumi [7], a simple hand model is constructed from reliable image features; this system uses four cameras for gesture recognition. In the FingerMouse system developed by Quek, a hand gesture replaces the mouse for certain actions [8]. In this system only one gesture, the pointing gesture, is defined, and the shift key on the keyboard serves as the mouse button. Segan developed a system [9] that uses two cameras to recognize three gestures and to track the hand in 3D; by extracting feature points on the hand contour, the thumb and pointing finger are detected.
In the system presented by Triesch, multiple cues such as motion, stereo and colour are combined in a robust gesture recognition algorithm [10]. This system is used in human-robot interaction, helping a robot grasp objects kept on a table. In the real-time multi-hand posture recognition system for human-robot interaction, Haar-like and topological features were used along with a colour segmentation technique [11]. This method gives accurate results, and a rich set of features can be extracted.
3. THE GESTURE RECOGNITION SYSTEM:
Different methods and technologies have been used for gesture recognition, but the major steps are the same. The complete system is divided into three parts.
3.1 HAND SEGMENTATION: This step is also known as hand detection. It involves detecting and extracting the hand region from the background and segmenting the hand image. Different features such as skin colour, shape, motion and anatomical models of the hand are used in different methods. The output of this step is a binary image in which skin pixels have value 1 and non-skin pixels have value 0. Some of the methods for hand detection are:
Colour: Different colour models can be used for hand detection, such as YCbCr, RGB and YUV.
Shape: Characteristics of hand shape, such as topological features, can be used for hand detection.
Learning detectors from pixel values: Hands can be detected from their appearance and structure, for example with the AdaBoost algorithm.
3D model based detection: Using multiple 3D hand models, multiple hand postures can be estimated.
3.2 FEATURE EXTRACTION: The next important step is hand tracking and feature extraction. Tracking means finding the frame-to-frame correspondence of the segmented hand image in order to understand the hand movement. The following are some of the techniques for hand tracking.
a) Template based tracking: If images are acquired frequently enough, the hand can be tracked by correlation-based template matching: by comparing and correlating the hand across successive frames, its position can be followed.
b) Optimal estimation technique: Hands are tracked from multiple cameras to obtain a 3D hand image.
c) Tracking based on mean shift algorithm: The object of interest is characterized by its colour distribution and spatial gradient, and the mean shift algorithm is used to track the skin-colour area of the human hand.
Two types of features are used: global statistical features, such as the centre of gravity, and contour-based (local) features, which include fingertips and finger roots. Both are used together to increase the robustness of the system.
3.3 GESTURE RECOGNITION: The goal of hand gesture recognition is to interpret the meaning that the hand's location and posture convey. From the extracted features, multiple hand gestures are recognized. Different methods can be used, such as template matching, methods based on principal component analysis, boosting, contour and silhouette matching, model-based recognition and Hidden Markov Models (HMM). A hand gesture is a movement of the hands and arms used to express an idea, convey a message or instruct an action. From a psychological point of view, a hand gesture has three phases:
a) Preparation: bringing the hand to the starting posture of the gesture.
b) Nucleus: the main gesture itself.
c) Retraction: bringing the hand back to the resting position.
Finding the starting and ending positions of the nucleus phase is a difficult task because different persons have different hand shapes and movements.
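Of the recognition methods listed above, template matching is the simplest to sketch: each gesture class is stored as a reference feature vector, and an observation is assigned to the nearest template. The gesture names and feature values below are invented purely for illustration:

```python
import numpy as np

# Hypothetical gesture templates: one reference feature vector per class.
templates = {
    "point": np.array([1.0, 0.0, 0.0]),
    "open":  np.array([0.0, 1.0, 1.0]),
    "fist":  np.array([0.0, 0.0, 0.0]),
}

def classify(features):
    """Assign the observation to the nearest template (Euclidean)."""
    return min(templates, key=lambda g: np.linalg.norm(features - templates[g]))

label = classify(np.array([0.9, 0.1, 0.0]))   # nearest to "point"
```

Template matching works on static postures; recognizing the nucleus phase of a *dynamic* gesture is where sequence models such as HMMs are typically brought in instead.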
3.4 HAND MODELLING: Different model-based solutions for hand gestures are proposed here. A typical vision-based hand gesture recognition system consists of a camera, a feature extraction module, a gesture classification module and a set of gesture models. The necessary hand features are extracted from the frames captured by the camera. These features are classified as:
a) High-level features based on three-dimensional models.
b) View-based features, which use the image itself as the feature.
c) Low-level features measured directly from the image.
A hand gesture is a movement of the hand in the air, so a spatial model is required to represent these movements. Two types of models are used: volumetric models, which describe the shape and appearance of the hand, and skeletal models, which represent hand posture.
4. CLASSIFICATION OF VISION BASED GESTURE RECOGNITION METHODS:
There are a number of methods that are used for hand gesture recognition. Some of the vision based hand gesture recognition systems are discussed below.
4.1 HAND MODELLING WITH HIGH LEVEL FEATURES: In this method, multiple images are captured by multi-viewpoint cameras and the hand regions are extracted from them. By integrating all these multi-viewpoint images, a hand posture can be constructed. This model is compared with a hand model to recognize the hand posture with the help of hand tracking.
4.2 VIEW BASED APPROACH: The hand is modeled using a collection of 2D intensity images, and the gestures are modeled as sequences of views. These approaches are also called appearance-based approaches, and they have been used successfully for face recognition. Eigenspace approaches are used within the view-based approaches. They provide an efficient representation of a large set of high-dimensional points using a small set of orthogonal basis vectors. These basis vectors span a subspace of the training set called the eigenspace, and a linear combination of the basis vectors can be used to approximately reconstruct any of the training images.
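The eigenspace idea can be sketched in a few lines of NumPy: the SVD of the centred training set yields the orthogonal basis, and projecting an image onto the top-k basis vectors followed by the back-projection is exactly the "linear combination" reconstruction described above. The data here are random stand-ins for flattened hand images:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "training images": 20 samples, each an 8x8 image flattened to 64-d.
X = rng.normal(size=(20, 64))
mean = X.mean(axis=0)

# SVD of the centred data; rows of Vt are the orthonormal eigenvectors.
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 10
basis = Vt[:k]                       # top-k basis vectors span the eigenspace

def reconstruct(img):
    coeffs = basis @ (img - mean)    # project into the eigenspace
    return mean + coeffs @ basis     # linear combination of basis vectors

approx = reconstruct(X[0])           # low-dimensional approximation of X[0]
```

The payoff is that recognition can then compare k-dimensional coefficient vectors instead of full images, at the cost of some reconstruction error controlled by k.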
4.3 LOW LEVEL FEATURES: This method is based on the assumption that detailed information about hand shape is not necessary for humans to interpret sign language, and on the observation that all human hands have approximately the same hue and saturation. A low-level feature set is used to recognize hand postures. This method achieves approximately 97% accuracy.
4.4 GESTURE SEGMENTATION METHOD BASED ON COMPLEXION MODEL:
In this method, a gesture segmentation approach based on a complexion (skin-colour) model and a background model, combined with Fourier descriptors and a BP neural network, is discussed. Hand segmentation is done by selecting a colour space and building the complexion model and the background model. Using the complexion model alone may produce interference from regions of similar complexion; this interference can be removed by using the background model along with the complexion model. The gesture is then recognized using Fourier descriptors and a BP neural network. Contours are described effectively by Fourier descriptors because they are invariant to rotation, translation and scale. By calculating the Fourier coefficients of the border points of the gesture, the Fourier descriptors are obtained, and the hand gesture can be identified quickly. Gesture classification based on a BP neural network solves the problems of low recognition rate and of gesture segmentation against an intricate background. The experimental results show that the method is flexible, realistic, exact and fit for many applications in virtual reality, but under strong light and shadow the result is still not perfect.
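The invariance claims for Fourier descriptors follow directly from how they are normalised, which the NumPy sketch below demonstrates on a synthetic contour (the exact normalisation used in [5] may differ): dropping the DC term F[0] removes translation, dividing by |F[1]| removes scale, and keeping only magnitudes removes rotation and starting-point dependence.

```python
import numpy as np

def fourier_descriptor(contour, n=8):
    """Normalised Fourier descriptor of a closed 2-D contour given as
    an (N, 2) array of boundary points."""
    z = contour[:, 0] + 1j * contour[:, 1]   # complex representation x + iy
    F = np.fft.fft(z)
    # Skip F[0] (translation), normalise by |F[1]| (scale), keep
    # magnitudes only (rotation / start point).
    return np.abs(F[1:n + 1]) / np.abs(F[1])

# A unit circle, and the same circle scaled by 3 and translated.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
moved = 3.0 * circle + np.array([5.0, -2.0])

d1 = fourier_descriptor(circle)
d2 = fourier_descriptor(moved)       # identical descriptor despite the transform
```

In the surveyed method such descriptor vectors, rather than raw contours, are what the BP neural network is trained on.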
Fig. 1 Segmentation result: (a) and (c) original pictures; (b) and (d) segmentation results.
4.5 MULTIHAND POSTURE RECOGNITION BASED ON HAAR LIKE FEATURE AND TOPOLOGICAL FEATURES:
This method uses a Haar-like feature detector and a topological method along with colour segmentation to achieve accurate hand posture recognition.
A rich set of Haar-like features is computed from the integral image. Each Haar-like feature is composed of connected "black" and "white" rectangles, and its value is obtained by subtracting the sum of the pixel values under the white rectangle from the sum under the black rectangle.
Hand region segmentation: this is an important step in hand posture recognition, accomplished by two techniques, Haar-like features and a colour-based model. The Haar-like detector is used to detect both left- and right-hand postures. Initially the input image is transformed into an integral image, because the features can be computed quickly from it. A set of Haar-like features is then computed from the integral image. Edge and rotated Haar-like features were proposed in this algorithm to give a better description of the hand posture; as a result, more than 60,000 features can be extracted from each sub-window of the input image of size 24×30.
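The reason the integral image makes this fast is that any rectangle sum reduces to four table lookups, so each Haar-like feature costs a handful of additions regardless of rectangle size. A minimal NumPy sketch of an upright two-rectangle edge feature (rotated features need a separate diagonal table, omitted here):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero first row/column, so any rectangle
    sum needs only four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y): 4 lookups."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle edge feature: black (left) half minus white (right)
    half, following the black-minus-white convention described above."""
    black = rect_sum(ii, x, y, w // 2, h)
    white = rect_sum(ii, x + w // 2, y, w // 2, h)
    return black - white

# 24x30 sub-window with a bright left half -> strong edge response.
img = np.zeros((24, 30), dtype=np.uint8)
img[:, :15] = 10
ii = integral_image(img)
val = haar_two_rect_horizontal(ii, 0, 0, 30, 24)   # 15*24*10 - 0 = 3600
```

The 60,000+ figure quoted above comes from enumerating every position and scale of such rectangle patterns inside the 24×30 window.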
Fig 2 Haar-like features
For feature extraction, the following steps are followed: a) The hand region is converted to a binary image, in which the hand area has pixel value 1 and the non-hand region has value 0.