Publication Information

Author(s)

First Name / Middle Name / Surname / Role / Email
Jiang / Yuan / Yu / ASABE Member /

Affiliation

Organization / Address / Country
College of Biosystems Engineering and Food Science, Zhejiang University / 268 Kuaixuan Road, Hangzhou, 310029, People’s Republic of China / China
First Name / Middle Name / Surname / Role / Email
Ren / Ye /

Affiliation

Organization / Address / Country
College of Biosystems Engineering and Food Science, Zhejiang University / 268 Kuaixuan Road, Hangzhou, 310029, People’s Republic of China / China

Author(s)

First Name / Middle Name / Surname / Role / Email
Shen / Chuan /

Affiliation

Organization / Address / Country
College of Biosystems Engineering and Food Science, Zhejiang University / 268 Kuaixuan Road, Hangzhou, 310029, People’s Republic of China / China

Publication Information

Pub ID / Pub Date
073128 / 2007 ASABE Annual Meeting Paper

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the American Society of Agricultural and Biological Engineers (ASABE), and its printing and distribution does not constitute an endorsement of views which may be expressed. Technical presentations are not subject to the formal peer review process by ASABE editorial committees; therefore, they are not to be presented as refereed publications. Citation of this work should state that it is from an ASABE meeting paper. EXAMPLE: Author's Last Name, Initials. 2007. Title of Presentation. ASABE Paper No. 07xxxx. St. Joseph, Mich.: ASABE. For information about securing permission to reprint or reproduce a technical presentation, please contact ASABE at or 269-429-0300 (2950 Niles Road, St. Joseph, MI 49085-9659 USA).

An ASABE Meeting Presentation

Paper Number: 073128

Measurement of 3-D position of tomato based on binocular stereo vision

Huanyu Jiang, Assistant Professor

College of Biosystems Engineering and Food Science, Zhejiang University, 268 Kuaixuan Road, Hangzhou, 310029, People’s Republic of China

Ye Ren, Graduate Student

College of Biosystems Engineering and Food Science, Zhejiang University, 268 Kuaixuan Road, Hangzhou, 310029, People’s Republic of China

Chuan Shen, Graduate Student

College of Biosystems Engineering and Food Science, Zhejiang University, 268 Kuaixuan Road, Hangzhou, 310029, People’s Republic of China

Written for presentation at the

2007 ASABE Annual International Meeting

Sponsored by ASABE

Minneapolis Convention Center

Minneapolis, Minnesota

17 - 20 June 2007

Abstract. This paper presents a new stereovision-based algorithm for locating tomatoes. Color features are used to segment the image and recognize the tomato. From the segmentation result, the position of the fruit’s centroid is obtained by centroid-based matching. The algorithm then computes the depth of points on the tomato surface by area-based matching, using the grey-level correlation of neighboring regions. A limited candidate region and a two-threshold method are used to reduce the computational burden and improve precision. The error ranged within [...] when the working distance was below 550 mm.

Keywords. stereovision; location; centroid-based matching; area-based matching; tomato

Introduction

To save labor, research on fruit-harvesting robots has been carried out [1][2]. Before tomatoes can be picked, the mature fruits must be recognized and located. Visual perception is an important source of information for autonomous navigation. Stereovision improves on other visual approaches by adding the third dimension, depth, so that objects within the sensed scene can be localized more accurately. Stereo vision systems have two important advantages for automatic applications: (1) their capability to provide useful position information and (2) their insensitivity to shadows and changes in lighting conditions [3]. Kassay (1992) [4] mounted a stereo color CCD on the end-effector and used the Hough transform to find the centers of arcs for fruit picking.

This research develops a fast algorithm to recognize and locate mature tomatoes from stereovision images. First, the images are segmented by thresholding. Area-based matching is then combined with centroid-based matching to compute the depth needed by picking end-effectors.

MATERIALS AND METHODS

System Architecture

The stereo vision system consists of two identical cameras (Model TMC7DSP, Pulnix), two frame grabber boards and an industrial computer. Color information must be extracted to judge whether a fruit is ripe. The stereovision sensor system therefore uses two analog cameras, each providing three parallel analog video signals, R (red), G (green) and B (blue), corresponding to the NTSC (National Television System Committee) color primaries. The R, G, B video signals were converted to an 8-bit color digital image by a frame grabber board (Matrox Meteor/RGB PCI) installed in the industrial computer. Images were recorded at a resolution of 640 × 480 pixels in TIFF format. The view axis of the left camera was parallel to, and at the same height as, that of the right camera, and both cameras were fired at the same time by a trigger.

Calibration

System calibration is a necessary step before acquiring images with a stereo vision camera. It determines the external parameters, the internal parameters and the relative position of the two cameras. Traditional calibration methods solve a set of non-linear equations based on the triangulation principle. In this paper, the SRI Small Vision System software was used; it includes the Smallv standalone application, which calibrates the system by means of its calibration toolbox. The calibration procedure requires a set of at least 5 image pairs of a planar checkerboard target; it finds the image features in each pair and then performs the calibration calculation.

Recognition

1. Grey transform

Tomatoes were cultivated in boxes placed in a greenhouse. Ripe fruits differ clearly from unripe ones and from other obstacles such as leaves, stems and stem-supporting poles. After the color images are converted to grey images, the pixel value is:

. (1)

If a pixel appears red in the original image, its R value is larger than its G value. Accordingly, the R value is enhanced and the G value weakened. The chromatism is expressed as:

. (2)

The color images are transformed to grey images using the chromatism, and a suitable threshold value is then chosen to segment the tomato images.
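As a rough illustration of this step, the sketch below assumes the chromatism is the plain difference R − G clamped to the 8-bit range. This is an assumption, since the exact form of equation 2 is not reproduced here; the function names and the threshold value of 80 are likewise hypothetical.

```python
def chromatism(pixel):
    """Assumed chromatic difference: red minus green, clamped to 0..255."""
    r, g, _b = pixel
    return max(0, min(255, r - g))


def segment(image, threshold):
    """Return a binary mask: 1 where the assumed chromatism exceeds the threshold."""
    return [[1 if chromatism(p) > threshold else 0 for p in row] for row in image]


# Tiny 2x3 RGB image: reddish "fruit" pixels mixed with greenish "leaf" pixels.
image = [
    [(200, 40, 30), (60, 150, 50), (210, 60, 40)],
    [(50, 140, 60), (190, 50, 45), (70, 160, 55)],
]
mask = segment(image, threshold=80)
```

In practice the threshold would be read off the histogram of the chromatism image rather than fixed by hand.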

Fig.1 Original Left Image And Right Image

2. Image segmentation

In this study, the chromatism value was found to differ greatly between the background and the objects, so a fixed threshold value determined from the histogram could separate the mature fruits from the background. Noise must then be removed, because the segmented images still contain some non-tomato regions. The area of each blob (its number of pixels) is used as the criterion to distinguish noise from objects: if the area is less than 100 pixels, the blob is removed as noise. Fig.2 shows the result of recognition.
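The area criterion can be sketched as a flood fill over the binary mask. The 4-connectivity and the function name are assumptions made for illustration; only the 100-pixel limit comes from the text.

```python
from collections import deque


def remove_small_blobs(mask, min_area=100):
    """Keep only 4-connected blobs with at least min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] != 1 or seen[sy][sx]:
                continue
            # Flood-fill one blob and collect its pixels.
            blob, queue = [], deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Blobs smaller than min_area are treated as noise and dropped.
            if len(blob) >= min_area:
                for y, x in blob:
                    out[y][x] = 1
    return out
```

A 12 × 12 block (144 pixels) survives this filter, while an isolated pixel is removed.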

Fig.2 The Result Of Recognition

Location

Stereovision was used to locate the tomato. It measures the disparity between the two images, and the depth is then calculated by triangulation. Fig.3 illustrates the geometric model involved in depth detection by triangulation. The distance between the cameras is the baseline, represented by b. The focal length is f, and R represents the range or depth to the sensed object. $X_L$ gives the position in pixels (image domain) of the object in the left image, whereas $X_R$ gives the position, also in pixels, of the same point in the right image. The disparity is therefore defined as $d = X_L - X_R$.

As the following equation states, depth and disparity are inversely proportional:

$R = \dfrac{b\,f}{d}$ (3)
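A minimal numeric sketch of this relation, assuming the usual parallel-axis form R = b·f/d with d = XL − XR. The baseline, focal length and pixel positions below are hypothetical values, not the calibration of the system described here.

```python
def depth_from_disparity(x_left, x_right, baseline_mm, focal_px):
    """Depth R = b * f / d for parallel cameras, with d = x_left - x_right in pixels."""
    d = x_left - x_right
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return baseline_mm * focal_px / d


# Hypothetical numbers: 60 mm baseline, 800 px focal length, 40 px disparity.
depth_mm = depth_from_disparity(340, 300, baseline_mm=60, focal_px=800)
```

With the focal length in pixels and the baseline in millimetres, the depth comes out in millimetres. Halving the disparity doubles the depth, which is the inverse proportionality described above.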

1. Centroid-based matching

Because the target is a solid object, this research uses centroid matching to locate it. From the recognition result, the coordinates of the centroids in the left and right images are easily calculated. As shown in Fig.3, they must be the projections of the same point, the centroid of the tomato. This gives the first feature point.
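The centroid step can be sketched as follows, assuming each image has already been reduced to a binary mask containing a single tomato blob. The helper name and the toy masks are hypothetical.

```python
def centroid(mask):
    """Return the (x, y) centroid of all pixels set to 1 in a binary mask."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        raise ValueError("mask contains no object pixels")
    return sum_x / count, sum_y / count


# Toy masks: the same 3x3 blob, shifted 3 pixels to the left in the right image.
left_mask = [[1 if 1 <= y <= 3 and 4 <= x <= 6 else 0 for x in range(10)] for y in range(5)]
right_mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(10)] for y in range(5)]

(xl, yl), (xr, yr) = centroid(left_mask), centroid(right_mask)
centroid_disparity = xl - xr  # feeds the depth relation R = b * f / d
```

Because the two centroids are matched directly, no window search is needed for this first feature point.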

Fig.3 Centroid-matching

2. Area-based matching

Approximate depth information for the tomato can be obtained by centroid matching. However, relying on a single point may be too sensitive to noise, and the exact picking position has still not been found. To make the process more robust, area-based matching is used to locate surface points after centroid-based matching. The height and width of the rectangle bounding the tomato can be computed from the recognition result. The region of interest (ROI) for area matching is then defined from this height and width together with the coordinates of the centroids, as shown in Fig.4.

Fig.4 The Region Of Interest

Fig.5 Candidate Region

2.1 Matching element

An appropriate matching element must be chosen before designing the stereo matching algorithm. We use the grey level of neighboring areas as the matching element; since the original images are RGB color images, a linear transform is required first. Ohta (1980) derived three orthogonal color features by calculating the variances of different color images:

$I_1 = \dfrac{R + G + B}{3}, \qquad I_2 = \dfrac{R - B}{2}, \qquad I_3 = \dfrac{2G - R - B}{4}$ (4)

Experiments showed that I3 is the most effective feature as the matching element. I2 carries too little information, and although I1 contains much of the information, I3 has stronger discriminating power within this ROI, which matters more here.
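For illustration, the sketch below computes the three features for a single pixel, assuming the commonly cited form of Ohta's features: I1 = (R + G + B)/3, I2 = (R − B)/2 and I3 = (2G − R − B)/4. The function name and the sample pixel values are hypothetical.

```python
def ohta_features(r, g, b):
    """Return (I1, I2, I3) for one RGB pixel, using the commonly cited Ohta features."""
    i1 = (r + g + b) / 3       # overall intensity
    i2 = (r - b) / 2           # red versus blue
    i3 = (2 * g - r - b) / 4   # green versus red and blue
    return i1, i2, i3


# A reddish "fruit" pixel and a greenish "leaf" pixel.
fruit = ohta_features(200, 40, 30)
leaf = ohta_features(60, 150, 50)
```

On these sample values I3 is strongly negative for the reddish pixel and positive for the greenish one, which illustrates why a green-versus-red feature can separate fruit from foliage.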

2.2 Candidate region

In essence, area-based matching repeatedly computes the similarity between windows of the left and right images and keeps the maximum, which consumes much time. To reduce the computational burden and improve precision, we apply the following two constraints to choose an appropriate candidate region.

1) Only points on the tomato surface in the left image (source image) can have a matching template.

2) The ROI is determined by the centroid and the rectangle bounding the tomato, so the coordinates of points in the ROI are relative to the centroid and independent of the original image coordinates. The position of a point in the left image is therefore close to its position in the right image (target image). Accordingly, the candidate region is defined as follows: for a point (x,y) in the left image, the candidate region in the right image is the rectangle with center (x,y), height 2m+1 and width 2n+1, as shown in Fig.5.
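The second constraint can be sketched as a small helper that returns the bounds of the candidate rectangle. Clipping to the ROI borders is an added assumption, and the function and parameter names are hypothetical.

```python
def candidate_region(x, y, n, m, width, height):
    """Inclusive bounds (x_min, x_max, y_min, y_max) of the (2n+1) x (2m+1) rectangle
    centred at (x, y), clipped to a width x height region."""
    return (
        max(0, x - n),
        min(width - 1, x + n),
        max(0, y - m),
        min(height - 1, y + m),
    )


# Hypothetical half-sizes n = 4, m = 2 inside a 640 x 480 region.
x_min, x_max, y_min, y_max = candidate_region(10, 5, n=4, m=2, width=640, height=480)
```

Away from the borders the rectangle spans 2n+1 columns and 2m+1 rows, so the number of windows compared per source point is fixed at (2n+1)(2m+1) instead of growing with the image size.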

2.3 Area-based matching algorithms

Several measures can be used to calculate the difference between two windows in the left and right images, such as the sum of squared differences (SSD), expressed as:

(5)

L(x,y) and R(x,y) are the grey levels of the point (x,y) in the left and right images respectively, n and m are the width and height of the window, and d is the disparity. Expanding equation 5 gives:

(6)

The first term is the energy of the template window, a constant, while the third is the energy of the target window in the right image, which changes slowly with i and j. The second term is the cross-correlation of the template window and the target window. What matters is the behavior of each term at the correspondence point: there, the second term reaches its maximum. Equation 6 can therefore be normalized:

(7)

As known, [...]. We use R rather than SSD for two reasons:

R is not affected by differences in brightness between the left and right images.

The computational burden is lower.
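For reference, one standard way to write the SSD, its expansion and the normalized correlation, consistent with the definitions of L, R, n, m and d above, is sketched here. The index ranges and the sign convention for d (right window shifted by −d, matching d = XL − XR) are assumptions. The correlation is written C only to avoid clashing with the right-image grey level R(x,y); it is the quantity the text calls R.

```latex
\begin{align*}
\mathrm{SSD}(x,y,d)
  &= \sum_{i=1}^{n}\sum_{j=1}^{m}\bigl[L(x+i,\,y+j) - R(x+i-d,\,y+j)\bigr]^{2} \\
  &= \sum_{i,j} L(x+i,\,y+j)^{2}
     \;-\; 2\sum_{i,j} L(x+i,\,y+j)\,R(x+i-d,\,y+j)
     \;+\; \sum_{i,j} R(x+i-d,\,y+j)^{2}, \\[4pt]
C(x,y,d)
  &= \frac{\sum_{i,j} L(x+i,\,y+j)\,R(x+i-d,\,y+j)}
          {\sqrt{\sum_{i,j} L(x+i,\,y+j)^{2}\;\sum_{i,j} R(x+i-d,\,y+j)^{2}}},
  \qquad 0 \le C \le 1 .
\end{align*}
```

The bound follows from the Cauchy-Schwarz inequality for non-negative grey levels. In this form, multiplying every right-image grey level by a gain k > 0 scales the numerator and the denominator by the same factor k, so C is unchanged while the SSD is not. This is the sense in which the normalized measure is insensitive to a brightness difference between the two images.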

Because the surface of the tomato is smooth, incomplete diffuse reflection cannot be avoided. Incomplete diffuse reflection has long been a major problem in stereovision research. Our experiments show that the grey level of the normal tomato surface differs from that of regions with incomplete diffuse reflection, where R changes suddenly and markedly. We therefore use a two-threshold method to overcome this problem:

(8)

The quantities in equation 8 are, in order: the maximum R in each row of the candidate region; the second maximum R in the same row; the minimum R in that row; the largest of all the row maxima; and the range of R within a row. T1 is the first threshold and T2 is the second threshold.

The first threshold, T1, serves two purposes: 1) it ensures that the gap between the maximum and the second maximum R is large enough, so that ambiguous matching points are removed to improve precision; 2) it limits the range of R in each row to overcome the influence of incomplete diffuse reflection. The second threshold then simply selects the maximum among all remaining candidates to determine the corresponding point in the target image.
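The sketch below shows a normalized-correlation search along one row, with a simple ambiguity test. It is only one plausible reading of the first threshold, not the rule of equation 8. The assumptions are these: a candidate is rejected when the best and second-best scores differ by less than t1, the search covers a single row rather than the full (2n+1) × (2m+1) candidate region, and all names are hypothetical.

```python
import math


def ncc(left, right, x, y, xr, yr, half):
    """Normalized cross-correlation of two (2*half+1)^2 windows centred at
    (x, y) in the left image and (xr, yr) in the right image."""
    num = energy_l = energy_r = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            a = left[y + dy][x + dx]
            b = right[yr + dy][xr + dx]
            num += a * b
            energy_l += a * a
            energy_r += b * b
    denom = math.sqrt(energy_l * energy_r)
    return num / denom if denom else 0.0


def best_match_in_row(left, right, x, y, n, half, t1):
    """Search columns x-n..x+n of row y in the right image.
    Return the best column, or None when the best score is ambiguous."""
    lo = max(half, x - n)
    hi = min(len(right[0]) - half - 1, x + n)
    scores = sorted(
        ((ncc(left, right, x, y, xr, y, half), xr) for xr in range(lo, hi + 1)),
        reverse=True,
    )
    if not scores:
        return None
    # Assumed ambiguity test: best and second-best scores must differ by at least t1.
    if len(scores) >= 2 and scores[0][0] - scores[1][0] < t1:
        return None
    return scores[0][1]
```

On a textureless patch every window scores the same, so the function returns None rather than an arbitrary column. This matches the intent of discarding ambiguous matches.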

Because tomatoes are similar in size, a depth range based on the centroids can be set. A tomato surface point is kept only if its depth falls within this range. This yields the depth of the tomato surface points, which can serve various end-effectors.
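This final filter can be sketched as below, assuming the range is a symmetric tolerance around the centroid depth. The tolerance value, the point format and the function name are hypothetical.

```python
def filter_by_depth(points, centroid_depth, tolerance):
    """Keep (x, y, depth) points whose depth lies within +/- tolerance of the centroid depth."""
    return [p for p in points if abs(p[2] - centroid_depth) <= tolerance]


# Hypothetical surface points (x, y, depth in mm) around a centroid at 500 mm.
points = [(10, 12, 495.0), (11, 12, 620.0), (12, 13, 510.0), (13, 13, 380.0)]
kept = filter_by_depth(points, centroid_depth=500.0, tolerance=40.0)
```

Points far in front of or behind the fruit, which are typically mismatches, are dropped before the depths are passed to the end-effector.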