A Survey of Different Shape Analysis Techniques

2 A survey of different shape analysis techniques Part One – Prepared by Huang Nan

Li Li Ping, Huang Nan

Computer Science Department, FIU

Abstract

Shape analysis methods play an important role in the computer vision

applications. It is used for object recognition, matching, registration and analysis [1]. In this paper we will describe two major shape analysis categories, one is focus on the shape boundary (or contour) points and the other one is focus on the global (or interior) method. In this paper we will describe several typical algorithms for each category and try to analysis and compare the relevant advantage and disadvantage within these two methods.

1 Introduction

Retrieving image object is one of the important function of modern multimedia application system which used in the filed of entertainment, art, science, business and engineering. Retrieving images by their contents, instead of by other characters, is more and more becoming a operation strategy. The fundamental task for content-based image retrieval is the technique used for comparing sample image and the model images in image database and try to find the images which have some kind of relations such as color, shape with the sample image [2].

There are two general methods for image comparison: intensity-based (color and texture) and geometry-based ( shape). We found that the user more prefer to used shape retrieval method instead of intensity-based method. However, image retrieval by shape is still considered one of the most difficult aspects of content-based search [2].

Shape retrieval algorithm analyze the objects in a scene. In the followed sections, we will focus on the shape representation and description concepts of shape analysis. Shape representation methods are result in a non-numeric representation of the original shape, but the important characters of the original shape are preserved. On the contrary, shape description refers to the methods that result in a numeric descriptor of the shapes and is a setup subsequent to shape representation [1].

Use shape description method, there need to create a shape descriptor vector from the target shape. The purpose of the description is to use the shape descriptor vector to separate the different shapes, because the descriptor vector holds some unique characters for each shape [1].

Shape matching is the process to comparing sample shapes and model shapes. The sample shape is matched to one or several model shapes by comparing the shape descriptor vectors using some algorithm.

One of the arguments of shape description is how to evaluate the result quality of a shape description method? Not all methods are appropriate for every kind of shape and applications. For example, the presence of noise, the properties of the shapes (regular or irregular shapes) [1].

But there are several common criteria need to be followed when develop or evaluate the shape description algorithms [1]:

·  Accessibility: means that the algorithm used to create a shape descriptor should consider the memory and running time cost of the system, it should be affordable and in reasonable time frame

·  Scope: means how many kinds of shapes (regular shape, irregular shapes, etc…) can be represent by this kind of algorithm

·  Uniqueness: means whether the result descriptor vector is one to one match with the original image or there is in case one vector maps to multiple original images

In the followed sections, we will begin with the discussion the boundary based (contour) algorithms and global (or interior) method in separate, then try to analysis for what kind of images areas they are suitable to apply.

2 Boundary based image retrieval algorithm

Boundary scalar transform techniques

Boundary scalar transform is very straightforward in the process. The basic concept is to use a one-dimensional function to represent the two-dimensional shape boundary. This method typically use one-dimensional function to represent the two-dimensional shape boundary, and the result is scaleable but not a graph, an image or other values which like the shape.

2-D shape boundary can be represented using a real or complex 1-D function. In [1], there are several algorithms were introduced in this area. For example,

1.  Turning function: Which used a tangent angle versus arc length, the tangent angle at some point is measured relative to the tangent angle at the initial point. This method is typically suitable to measure polygonal shapes

2.  Shape centroid: The approach is that select a centroid point of the shape, and the values of the 1-D function are the distances between shape centroid point and boundary points. The boundary points are selected based on the criteria that the central angles are equal. (Refer to the Figure 1)

Figure 1: The centroid to boundary distance approach

3.  Instead of calculate the distance from the centroid to the boundary points. Chang, Hwang and Buehrer [1] purpose another way that construct the distance function from the centroid to the feature points. The features points are the points of high curvature. Weather use the high curvature points from the computing curvature of the boundary curve, or perform a polygonal approximation of the shape and use the polygon vertices as the feature points.

4.  Using a sequence of line segment moments as a 1-D function. Line segments are obtained by partitioning the radial line form the center of the mass to the boundary point. Segments are partitioned into parts within the shape and parts outside the shape.

5.  Using Arc Height Function (AHF) to describe the shape boundary. The methoridozy of arc height concept is shown in Figure 2 and defined as followed: An arc chord with a predefined length is insert on the boundary. A vertical line is cross the arc chord and separate to two same length parts. also reach the boundary at point C. The length of is called the arc height at position A. As the arc chord is moved along the curve, a mapping between arc length and arc height defined the AHF. The AHF is then used to represent the shape.

Figure 2: The Arc Height Concept

Boundary Space Domain Technique

Different with the Boundary scalar transform technique. The Boundary space domain technique use shape boundary as input and produce the result as the format of graph or pictorial [1]. This kind of technique is typically suitable for the shape recognition. Because typically the process of the shape recognition system is always beginning with the early preprocessing phase to higher levels where the final interpretation of the visual scene is performed [1]. One of the most important characters of such kind of system is that the result is a image, a graph, instead of scalar results which we analysis in the previous section.

Chain Code

When process the boundary image representation and description, efficient data compression ratio and faithful reproduction are both essentially required. Freeman [3] proposed the chain code scheme, which describes arbitrary image boundary in a simple and efficient manner. In this approach, an arbitrary boundary image is represented by a sequence of small vectors of unit length and limited set of possible directions.

The basic idea of the chain code algorithm is: As the successive points of a continuous image boundary are adjacent to each other in the view of an grid area, so that each data point of the boundary can be represented in a sequence coincides with each of the grid points surrounding a current data point.

First, define a N neighboring grid points are defined in a counter-clockwise sense, starting from the one which is horizontally from the right. N >= 8 are called generalized chain code. Detailed refer to the Figure 3.

3 / 2 / 1
4 / 0
5 / 6 / 7

Figure 3: Freeman’s generalized chain code

Second, a directed straighten line segment linked two adjacent points of the image boundary are called link. And a chain is defined as an ordered sequence of links.

Third, the boundary image is represented by the chain code by from some initialized point and cross though the whole boundary as the clock-wise sense, each point is converted to the value of the Figure 3 based on the direction of the link with the previous point.

The example can be get from the followed Figure 4.

x

Figure 4 (a): A sample closed object, the starting point is x

{1,2,2,0,0,7,7,7,6,6,

5,4,4,5,0,0,2,2,2,2,} (Clock-wise cross the boundary)

Figure 4 (b): the chain code presentation of the object that based on Figure 3 ‘s definition

Chain code represent boundary image by bits/link, because each pixel of the boundary is represent by a octal digital from 0 to 7, so we can use 3 bits of space to store the each chain code and the algorithm can help us to achieve data compression goals. The coding schemes mentioned above assume that a boundary image is represented by a sequence of straight-line segments connecting tow adjacent grid points.

Freeman’s chain code description of the boundary image is used to extract the critical points which he then used to generate a shape description that is invariant to translation, rotation, and scale [1]. It can be made invariant to rotations of 900, because the 900 rotation causes only a circular shift the chain code.

There are some enhanced boundary image encoding schemes based on Freeman’s ideas. For example, in[1], there introduced a modified chain code to perform a symmetry analysis. The shape boundary is represented in a hierarchical way so that at the highest level a lower number of polygon vertices is used, while at the lowest level the finest polygonal approximation is utilized. The search for the symmetric axis position is performed by starting at the highest level and shifting to lower levels as the position of the symmetry axis becomes move accurately determined.

Pairwise object recognition algorithm

Pairwise geometric histograms (PGH) are a representation used for the recognition of polygonal shapes. PGH is a robust solution for the recognition of arbitrary 2D shape and a shape may be reconstructed from its PGH representation [7].

The basic idea of pairwise geometric histograms is: The boundary image (typically polygonal shape) are represented by its edgepoints. And use the consecutive edgepoints to define the line segments which organizing the shape boundary. In [4], the PGH is calculated using the following algorithm: Let each line segment be a reference line on its turn., then comparing the reference line to all other lines and calculate the relative angle q between the reference line and each line, and the perpendicular minimum and maximum distance(dmin and dmax ) . The histogram values are increased by one on the indexes corresponding to the angle q and the line segment from the dmin to dmax .

Figure 5: Relative angle and perpendicular distances between reference line and other line

Boundary Approximation

There are other descriptions algorithms to represent the boundary image. One of the typical methods is polygonal approximation.

Polygonal approximations are to use the polygonal line to approximately representing the boundary image. These methods are based on the use of the minimal error, the minimal polygon perimeter, the maximal internal polygon area, or the minimal external polygon area are approximation criteria [1]. In [5], a hierarchical method is applied for the polygonal approximation algorithm.

In [5], the strategy is focus on how to use systematic and mathematically well-founded way of to find the descriptions or simplified representations of curves and boundary of the image, the answer is: To approximate them with some family of functions.

The method is based on multiscale approximations with lines. A crucial first step is the generation of a hierarchic family of polygonal approximations. From this family the descriptive shape properties are derived by analysis of the features that are stable over scale. The split-and-merge algorithm was used to create the polygonal approximation. The scale-space approach was used to track the position of inflection points on the boundary curve. Stable shape features are those which remain unchanged over scale. An important aspect of this approach is that the result is relevant transformational invariance [5]. Detailed mathematics algorithms please refer to [5].

Boundary Decomposition

In [6], a algorithm is developed by Hong-Chih Liu and Mandayam to recognized and locate partially distorted shapes without regard to their orientation, location and size. The idea of this approach is to estimate the orientational, scaling, and transnational data between the target image and model shape by using a small number of control points extracted from both shapes. The algorithm is not only computationally simple, but also works reasonably well in the presence of a moderate amount of noise.

The detailed process as:

·  Boundary segmentation. First calculate the curvature function from the target image, the points of local maxima and minima are determined from the smoothed curvature function and are assigned as the control points to break the boundary of the shape into segments. Each shape is represented by an ordered sequence of line segments with control points as vertex points.

·  Boundary matching. First step, compare every segment of the target with every segment of the shape from the template data bank. If the length ratio and angle between these two segment under the target values assigned by user, these segments can be used for the next step. Second step, groups of segments are compared and a group was disqualified if less than three consecutive segments matched.

·  Finally, the distance between two shapes is measured to measure the similarity between there two shapes. The target shape is assigned to the class corresponding to the minimum distance.

The choice of the proper shape recognition method is always a compromise between recognition power and computational complexity. Chain Code algorithm can not preserve information on the exact shape of a boundary image, because it only shows the probabilities for the different directions present in a boundary. Thus, there may be many shapes with the same chain code. However, the chain code histogram is fast to calculate and it needs only a small amount of space, then this algorithm is typically suitable for real-time application. The pairwise histogram is computationally heavy and it requires more memory space. If the images are non-polygonal, the polygonal approximation must be made which requires more time, then this method is typically suitable for images which contain polygonal objects.