Author(s)

First Name / Middle Name / Surname / Role / Email
Geetika / / Dilawari / Research Engineer, ASABE Member /

Affiliation

Organization / Address / Country
Oklahoma State University / 211 Ag Hall, Dept of BAE, OSU,Stillwater,OK-74075 / USA

Author(s)

First Name / Middle Name / Surname / Role / Email
Carol / / Jones / Assistant Professor, ASABE Member /

Affiliation

Organization / Address / Country
Oklahoma State University / 212 Ag Hall, Dept of BAE, OSU,Stillwater,OK-74075 / USA

Publication Information

Pub ID / Pub Date
073030 / 2007 ASABE Annual Meeting Paper

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the American Society of Agricultural and Biological Engineers (ASABE), and its printing and distribution does not constitute an endorsement of views which may be expressed. Technical presentations are not subject to the formal peer review process by ASABE editorial committees; therefore, they are not to be presented as refereed publications. Citation of this work should state that it is from an ASABE meeting paper. EXAMPLE: Author's Last Name, Initials. 2007. Title of Presentation. ASABE Paper No. 07xxxx. St. Joseph, Mich.: ASABE. For information about securing permission to reprint or reproduce a technical presentation, please contact ASABE at 269-429-0300 (2950 Niles Road, St. Joseph, MI 49085-9659 USA).

An ASABE Meeting Presentation

Paper Number: 073030

Estimating Quality of Canola Seed Using a Flatbed Scanner

Geetika Dilawari, Research Engineer (PhD candidate)

Carol Jones, Assistant Professor

Written for presentation at the

2007 ASABE Annual International Meeting

Sponsored by ASABE

Minneapolis Convention Center

Minneapolis, Minnesota

17 - 20 June 2007

Abstract. Machine vision techniques have been applied to grade, size, and classify grains such as wheat, rice, lentils, pulses, and soybeans, but little work has been done on grading canola with machine vision. The main objective of this study was to use a flatbed scanner to grade canola into samples with less than 2% foreign material (pure samples) and samples with more than 2% foreign material (impure samples). Samples with 0%, 2%, 5%, 10%, 20%, 40%, and 60% foreign material were used. Mean intensity values of the red (R), green (G), and blue (B) domains of the sample images were recorded and analyzed using histogram and discriminant analysis. The analysis showed that canola could be categorized into pure and impure samples, and that the samples fell broadly into three groups.

Keywords. Canola grading, flatbed scanners.

Introduction

Canola is usually graded by visual inspection following the U.S. standard guidelines (USDA). Canola grade is affected mainly by green seeds, damaged seeds, broken seeds, conspicuous and inconspicuous admixture, contaminated grain, animal and insect excreta, foreign material, and stones. Many researchers have been working on electronic techniques to grade different grain types. Machine vision and image spectroscopy are among the techniques being used for this purpose.

Machine vision identifies and separates grains by digital image processing. Impurities are separated from good grain using physical features such as size, shape, texture, color, and other morphological parameters. Morphological and textural parameters have also been used to assess the quality of rice (Bal et al., 2006). High-resolution images from three-chip charge-coupled device (CCD) color cameras have been used to identify different grain kernels with an algorithm based on a kernel signature that combined the shape, length, and color of the kernels. Although this algorithm identified different grain kernels, it was not tested on damaged kernels or foreign material content (Paliwal et al., 1999). Image analysis with a back-propagation neural network as the classifier has been used to identify grain types on the basis of color and textural features (Visen et al., 2003). A study on quantifying foreign material (barley) in wheat showed that back-propagation neural networks and statistical classifiers can both classify wheat and barley admixtures correctly, with the neural network performing better than the statistical classifier. The study identified a need to improve the algorithm because classification accuracy was lower for a barley admixture of 1.2% (Tahir et al., 2006). Machine vision has also been used to determine the percentage of dockage in a grain sample before and after it passes through a cleaner, and thus to test the performance of the grain cleaner (Paliwal, 2004). All of the studies discussed above used a CCD camera as the image acquisition device.

CCD cameras yield high-resolution images but are quite expensive. A less expensive machine vision system that has been used and tested by many researchers is the flatbed scanner (Paliwal et al., 2004; Shahin et al., 2001; Shahin et al., 2006). Flatbed scanners with back-propagation neural networks as the classifier have been used to classify cereal grains using color, textural, and morphological features of the samples (Paliwal et al., 2004). Machine vision techniques using flatbed scanners have also been applied to determine the seed size distribution of lentils, seed sizing of pulses, color and size grading of pulse grains, seed size uniformity of soybeans, and quality of rice (Shahin et al., 2001; Shahin et al., 2004; Shahin et al., 2002; Shahin et al., 2006; Kumar and Bal, 2006).

Grading, sizing, and classification of cereal and pulse grains have been done using various morphological, color, and textural features. Some researchers have also used thresholding based on histogram analysis, together with other morphological features, to segment out damaged soybean seeds (Shatadal and Tan, 2003). As discussed above, machine vision techniques have been applied to rice, wheat, pulses, soybeans, and lentils, but little work has been done on applying them to grade canola. The objective of this study was therefore to grade canola into samples with less than 2% foreign material (pure samples) and samples with more than 2% foreign material (impure samples) using a flatbed scanner.

Materials and Methods

Canola seed samples were prepared with 0%, 2%, 5%, 10%, 20%, 40%, and 60% foreign material such as straw, pieces of wood, dead grass, and a few small insects. The percentage of impurities in a sample was determined by weight. The sample with 60% foreign material contained moldy canola seeds in addition to the other foreign materials. From each sample, five subsamples of 45 g were used for further testing.

Pure samples were defined as samples with 2% or less foreign material, and impure samples as samples with more than 2% foreign material.
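The by-weight definition can be made concrete with a small calculation. The sketch below assumes that each 45 g subsample carries exactly the nominal impurity level of its parent sample, which the text does not state; the function names are illustrative only.

```python
# Illustrative helpers; the 45 g subsample mass and the 2% cutoff come from the text.
LEVELS = [0, 2, 5, 10, 20, 40, 60]   # percent foreign material by weight
SUBSAMPLE_G = 45.0                    # mass of one subsample in grams
PURE_CUTOFF = 2.0                     # "pure" means 2% or less foreign material


def composition(percent, total_g=SUBSAMPLE_G):
    """Return (foreign_g, canola_g) for a subsample at the nominal level."""
    foreign = total_g * percent / 100.0
    return foreign, total_g - foreign


def label(percent):
    """Apply the pure/impure definition used in this section."""
    return "pure" if percent <= PURE_CUTOFF else "impure"


for p in LEVELS:
    foreign, canola = composition(p)
    print(f"{p:>2}%: {foreign:5.2f} g foreign, {canola:5.2f} g canola -> {label(p)}")
```

Under this assumption, a 2% subsample would contain about 0.9 g of foreign material in 45 g, which gives a sense of how little material separates the two classes.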

Image and Data Acquisition

Samples were scanned using a color flatbed scanner (CanoScan 8400, Canon USA Inc., Lake Success, NY). A wooden frame measuring 0.127 m x 0.127 m was used to hold each sample uniformly distributed on the scanner. For each sample, a 512 by 512 pixel image was captured at 150 dpi (dots per inch). Kodak gray cards (Catalog No. E1527795, Eastman Kodak Company, 1999) were used for color calibration of the scanner. Calibration was done at the start of image acquisition and after every two images thereafter. Because the gray card reflects 18% of the incident light and the maximum allowable deviation in reflectance is 1%, a correction was applied only if the variation in reflectance values between subsamples exceeded 1%. The mean values, that is, the average intensity values, of the red (R), green (G), and blue (B) domains were recorded using Adobe Photoshop 2.0, an image editing program. The RGB model, which represents the visible spectrum, assigns each pixel of a color image an intensity value between 0 (black) and 255 (white) for each of the R, G, and B components (Adobe Photoshop Elements 2.0). Because luminosity in Adobe Photoshop 2.0 is a weighted sum of the nonlinear red, green, and blue signals, its data were not used for analysis. The data for the five subsamples of each sample were recorded and averaged to give mean R, G, and B values. These averaged values were then used for further analysis.
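The per-domain averaging described above can be sketched in a few lines. This is not the Adobe Photoshop procedure used in the study. It is a minimal illustration that assumes each image is available as rows of (R, G, B) tuples with values from 0 to 255; the function names and the tiny 2 x 2 images are made up for the example and are not measured data.

```python
from statistics import mean


def mean_rgb(pixels):
    """Mean intensity per domain for an image given as rows of (R, G, B) tuples."""
    flat = [px for row in pixels for px in row]
    return tuple(mean(px[c] for px in flat) for c in range(3))


def average_subsamples(images):
    """Average the per-image domain means over the subsamples of one sample."""
    per_image = [mean_rgb(img) for img in images]
    return tuple(mean(m[c] for m in per_image) for c in range(3))


# Made-up 2 x 2 "images", used only to show the arithmetic.
img_a = [[(100, 80, 60), (110, 90, 70)],
         [(120, 100, 80), (130, 110, 90)]]
img_b = [[(102, 82, 62), (112, 92, 72)],
         [(122, 102, 82), (132, 112, 92)]]

print("image a mean (R, G, B):", mean_rgb(img_a))
print("sample mean (R, G, B): ", average_subsamples([img_a, img_b]))
```

In practice the pixel rows would come from the 512 by 512 scans, and five subsample images would be averaged for each impurity level.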

Results and Discussion

The averaged data for each domain were plotted in Microsoft Excel; Figures 1, 2, and 3 show these data. The R and B domains clearly distinguished between pure and impure samples. The average pixel intensity in the R domain was significantly lower for pure samples than for impure samples. Similarly, the average pixel intensity in the B domain was significantly higher for pure samples than for impure samples. In the G domain, it was difficult to distinguish between the samples with 0%, 2%, 5%, 10%, and 20% foreign material, but the samples with 40% and 60% foreign material showed relatively higher average pixel intensity.

Figure 1 Red Histogram data

Figure 2 Green Histogram data

Figure 3 Blue Histogram data

A linear discriminant analysis was then carried out in JMP statistical software (version 6.0) on the mean values for all the samples to classify the samples according to their percent impurities. This method measures the distance from each point in the data set to each group's multivariate mean (centroid) and assigns the point to the closest group. The distance measure used is the Mahalanobis distance, which takes into account the variances and covariances between the variables (Statistics and Graphics Guide, JMP 6.0). Figure 4 shows a canonical plot of the points and their multivariate means, which separates the groups in two dimensions.
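The nearest-centroid rule described above can be illustrated with a short sketch. It is not the JMP implementation: it assumes a pooled within-group covariance matrix and ignores prior probabilities, and the group names "A" and "B" and their (R, G, B) values are placeholders, not data from this study.

```python
def mean_vec(rows):
    """Multivariate mean (centroid) of a list of equal-length points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_cov(groups):
    """Pooled within-group covariance matrix (assumed common to all groups)."""
    p = len(next(iter(groups.values()))[0])
    s = [[0.0] * p for _ in range(p)]
    dof = 0
    for rows in groups.values():
        m = mean_vec(rows)
        for r in rows:
            d = [r[j] - m[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    s[a][b] += d[a] * d[b]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_sq(x, centroid, cov):
    """Squared Mahalanobis distance d' S^-1 d between x and a centroid."""
    d = [xi - ci for xi, ci in zip(x, centroid)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))


def classify(x, groups):
    """Assign x to the group whose centroid is closest in Mahalanobis distance."""
    cov = pooled_cov(groups)
    return min(groups, key=lambda g: mahalanobis_sq(x, mean_vec(groups[g]), cov))


# Placeholder mean (R, G, B) values for two hypothetical groups.
GROUPS = {
    "A": [(100, 90, 120), (102, 91, 118), (98, 92, 121), (101, 89, 119)],
    "B": [(120, 91, 100), (122, 93, 101), (119, 90, 99), (121, 92, 102)],
}

print(classify((101, 90, 119), GROUPS))
```

Because the distance is scaled by the covariance matrix, a difference in a domain that varies little within groups counts for more than the same difference in a noisy domain.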

Figure 4 Canonical plot obtained from discriminant analysis using RGB domain

In the canonical plot, each multivariate mean is marked by a circle whose size corresponds to a 95% confidence limit for the mean. Groups that are significantly different are represented by non-intersecting circles. Figure 4 shows that the sample with 0% impurities, the pure sample, is significantly different from all the other samples. Samples with 2%, 5%, 10%, and 20% impurities appear to be similar, and samples with 40% and 60% impurities can also be classified as one group. The results of this analysis are summarized in a classification table, Table 1.

Table 1 Classification Table for different canola samples* classified using discriminant analysis

Percent Impurities (rows: actual; columns: assigned by the analysis) / 0% / 2% / 5% / 10% / 20% / 40% / 60%
0% / 5 / 0 / 0 / 0 / 0 / 0 / 0
2% / 0 / 2 / 1 / 2 / 0 / 0 / 0
5% / 0 / 0 / 2 / 3 / 0 / 0 / 0
10% / 0 / 0 / 1 / 2 / 2 / 0 / 0
20% / 0 / 1 / 0 / 2 / 2 / 0 / 0
40% / 0 / 0 / 0 / 0 / 0 / 2 / 3
60% / 0 / 0 / 1 / 0 / 0 / 0 / 4

* Number of samples for each type = 5
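The counts in Table 1 can be summarized as agreement rates. The sketch below reads each row as the actual impurity level and each column as the level assigned by the discriminant analysis; this reading is an assumption, although it is consistent with every row summing to five.

```python
LEVELS = ["0%", "2%", "5%", "10%", "20%", "40%", "60%"]

# Counts copied from Table 1 (rows assumed actual, columns assumed assigned).
TABLE = [
    [5, 0, 0, 0, 0, 0, 0],
    [0, 2, 1, 2, 0, 0, 0],
    [0, 0, 2, 3, 0, 0, 0],
    [0, 0, 1, 2, 2, 0, 0],
    [0, 1, 0, 2, 2, 0, 0],
    [0, 0, 0, 0, 0, 2, 3],
    [0, 0, 1, 0, 0, 0, 4],
]

total = sum(map(sum, TABLE))
correct = sum(TABLE[i][i] for i in range(len(LEVELS)))
per_level = {lvl: TABLE[i][i] / sum(TABLE[i]) for i, lvl in enumerate(LEVELS)}

print(f"exact-level agreement: {correct}/{total} = {correct / total:.1%}")
for lvl, acc in per_level.items():
    print(f"  {lvl:>3}: {acc:.0%}")
```

Exact-level agreement is modest because neighboring impurity levels are confused with one another, while the 0% row is recovered completely.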

The classification table shows the similarities between the color properties of the different samples. Although the impure samples were confused with one another, the pure sample with 0% impurities was never misclassified, and no impure sample was classified as 0%. It can therefore be inferred that the pure sample is significantly distinguishable from the impure samples. When a similar analysis was done using only the R and G domains, no significant improvement in the results was observed; Figure 5 shows the corresponding canonical plot. Classification was less accurate when the R-B or G-B combination of domains was used. On the basis of this analysis, the seven samples can be broadly categorized into three groups. Group 1: 0% impurities, Group 2: 2% to 20% impurities, and Group 3: 40% to 60% impurities.
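The effect of collapsing the seven levels into broader groups can be checked directly against Table 1. The sketch below assumes the grouping suggested by the canonical plot (0%; 2% to 20%; 40% and 60%) and reads table rows as actual levels and columns as assigned levels; both are assumptions made for illustration.

```python
LEVELS = ["0%", "2%", "5%", "10%", "20%", "40%", "60%"]

# Counts copied from Table 1 (rows assumed actual, columns assumed assigned).
TABLE = [
    [5, 0, 0, 0, 0, 0, 0],
    [0, 2, 1, 2, 0, 0, 0],
    [0, 0, 2, 3, 0, 0, 0],
    [0, 0, 1, 2, 2, 0, 0],
    [0, 1, 0, 2, 2, 0, 0],
    [0, 0, 0, 0, 0, 2, 3],
    [0, 0, 1, 0, 0, 0, 4],
]

# Assumed groupings: three groups from the canonical plot, and 0% against the rest.
THREE_GROUPS = {"0%": 1, "2%": 2, "5%": 2, "10%": 2, "20%": 2, "40%": 3, "60%": 3}
ZERO_VS_REST = {lvl: ("0%" if lvl == "0%" else "rest") for lvl in LEVELS}


def group_agreement(table, levels, groups):
    """Count samples whose assigned level falls in the same group as the actual level."""
    hits = sum(
        count
        for i, row in enumerate(table)
        for j, count in enumerate(row)
        if groups[levels[i]] == groups[levels[j]]
    )
    return hits, sum(map(sum, table))


for name, groups in [("three groups", THREE_GROUPS), ("0% vs rest", ZERO_VS_REST)]:
    hits, total = group_agreement(TABLE, LEVELS, groups)
    print(f"{name}: {hits}/{total} = {hits / total:.1%}")
```

Under these assumptions only one sample, a 60% sample assigned to 5%, falls outside its group, which is consistent with the broad three-group reading of the results.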