This research evaluates texture and shape features, using one of the most commonly used neural network architectures, for cereal grain classification. The classification accuracy of texture and shape features with a neural network was evaluated for four paddy (rice) varieties, namely, Karjat-6 (K6), Ratnagiri-2 (R2), Ratnagiri-4 (R4), and Ratnagiri-24 (R24). Algorithms were written to extract features from high-resolution images of kernels of the four grain types, and these features were used as inputs for classification. Different feature models were tested for their ability to classify these cereal grains, and the effect of using different parameters on classification accuracy was studied. The most suitable feature set for accurate classification was identified: the shape feature set outperformed the texture feature set in almost all instances of classification.
1. Introduction
The application of computer vision technologies, along with capturing, processing, and analyzing images, can be effectively used in nondestructive assessment of visual quality characteristics in agricultural and food products [1]. Seed image analysis includes the techniques of acquisition, preprocessing, and interpretation, resulting in quantification and classification of seeds. The objective of this research is to identify and classify grain kernels of four paddy types to minimize the losses incurred during harvesting, production, and marketing by sowing the proper type of seeds in the farm.
Grain kernels considered as agricultural objects are of variable sizes, shapes, colors, and textures. The accuracy of the classifying algorithm depends on the extracted features which can be further processed to identify the class of the seed. The precision of computer vision can further be exploited to detect the seeds infected by insects or to detect damaged grain kernels. This process of perfectly classified seeds without any damage or infections is essential to increase the productivity of specific grains.
2. Related Work
Substantial work in seed texture and shape feature extraction using image processing has been carried out under the application of computer vision in agriculture.
Granitto et al. [2, 3] extracted features corresponding to the morphology of weed seeds. The final six parameters selected for classification were the ratio of semiaxis lengths of the main principal axis [h1/h2], the ratio of seed and enclosing box areas [A/((h1+h2)×(v1+v2))], the square root of seed area [SQRT(A)], and moments of the planar mass distribution with respect to the principal axes [M20, M21, and M22]. Features corresponding to textural characteristics of the weed seeds were also extracted. Two different matrices, the gray level cooccurrence matrix and the gray level run-length matrix, were used to describe seed surface texture. The final two parameters selected for classification were the contrast along the main principal axis direction and the cluster prominence along the secondary principal axis direction. Chen et al. [4] found a total of 30 morphological features to be extracted for identifying corn varieties.
Zhao et al. [5] extracted the 11 geometric features of corn kernels based on binary image including contour points, perimeter, area, circular degrees, equivalent diameter, major length, minor length, stretching of the length of the rectangle, maximum inscribed circle, and the smallest excircle. Texture features such as mean, variance, smoothness, third moment, consistency, entropy, and 7 statistical invariant moments from the gray image were obtained. Kiratiratanapruk and Sinthupinyo [6] adopted texture features such as energy, contrast, correlation, and homogeneity based on gray level cooccurrence matrix (GLCM) and local binary pattern (LBP) for corn image classification.
Paliwal et al. [7] extracted area, perimeter, major axis length, minor axis length, elongation, roundness, the Feret diameter, and compactness features. Shouche et al. [8] extracted geometric features and shape related features described by moments for feature analysis of the cereal grains. Geometry related features including area, perimeter, major and minor axis lengths, compactness, axis ratio, shape factor 1 to shape factor 5, and spread and slenderness were measured from the binary images. Standard or raw, central, normalized central, and invariant moments were computed from the digital images of each grain and mean, standard deviation (S.D.), and standard error (S.E.) were calculated. Standard moments m00, m10, m20, m02, m30, and m12; central moments like mu00, mu20, mu02, and mu30; normalized central moments nu20, nu02, and nu30; and invariant moments M1,M2,M3, and M4 all showed coefficients of variation below 5%. The other moments showed higher coefficients of variation. Visen et al. [9] used eight morphological features, namely, area, perimeter, length of major axis, length of minor axis, elongation, roundness, the Feret diameter, and compactness.
Paliwal et al. [10] extracted area, perimeter, major axis length, minor axis length, maximum radius, minimum radius, mean radius, four invariant shape moments, and 20 harmonics of Fourier descriptors [11]. Fourier descriptors were calculated for boundary pixels as well as radii. In addition, eight gray level cooccurrence matrix (GLCM) and six gray level run-length matrix (GLRM) models were extracted as textural features. Paliwal et al. [12] extracted 51 morphological and 56 textural features. Dubey et al. [13] used 45 morphometric parameters. Singh et al. [14] extracted morphological features from grain kernels; textural features were also derived from the gray level cooccurrence matrix (GLCM) and gray level run-length matrix (GLRM). Pourreza et al. [15] extracted 131 textural features, including 32 gray level textural features, 31 LBP features, 31 LSP features, 15 LSN features, and 10 gray level cooccurrence matrix (GLCM) and 12 gray level run-length matrix (GLRM) models for each monochrome image of the bulk wheat samples.
Wiwart et al. [16] determined the following descriptors for the image of each wheat kernel represented by a single blob (ROI—region of interest): area, perimeter, circularity, the Feret diameter, minimal Feret diameter, aspect ratio, roundness, and solidity. Huang [17] calculated a pair of orthogonal eigenvectors of the covariance matrix. The geometric features, the principle axis length (Lp), secondary axis (Ls), the centroid, axis number (Lp/Ls), area (A), perimeter (P), and compactness (4πA/P2) were computed using eigenvectors for areca nuts. Li et al. [18] extracted fourteen shape characteristic parameters of cottonseeds: the area, perimeter, NCI ratio, circular degree, center of gravity X, center of gravity Y, major diameter, short diameter, second moment X(Mx2), second moment Y(My2), second moment XY(Mxy), major axis of oval, short axis of oval, and shape coefficient of oval.
Adjemout et al. [19] found the recognition procedure on the basis of shape features separately for corn, oats, barley, and lentil. 15 shape features, the perimeter, the surface, the circularity, major axis, minor axis, Hu’s moments, and central moments of second order, were calculated from the preprocessed images. Spatial gray level dependence method was used for extracting texture features such as second angular moment (SAM) which gives information about the homogeneity of texture, contrast (CONT) which measures the local variation of texture and supports the great transitions from the gray levels, entropy (ENT) which evaluates the degree of organization of the pixels, variance (VAR), differential inverse moment (IM), and correlation (COR).
Mebatsion et al. [20] determined three geometric features, namely, aspect ratio (AR), major diameter (MD), and roundness (Ceq), using ellipse fitting and Green's transformation of curve integrals. The geometric characteristics of individual, nontouching grain kernels were estimated after transforming the digital images into representative polygons defined by coordinates on the natural boundary of the grain kernels. The morphological classification model was defined using SFX, AR, MD, and Ceq.
3. Proposed Methodology
The block diagram for the proposed methodology is as shown in Figure 1. The steps are as follows.
Block diagram.
3.1. Material and Grain Samples
The materials used were a Sony 18.9-megapixel digital camera and a black cloth sheet as the background for photographs of the paddy seeds. The Seed Testing Laboratory, Pune, India, provided the grain samples used in this study. Unclean commercial samples of four paddy grains, K6, R2, R4, and R24, were collected.
3.2. Image Capturing
The images were acquired using the above-specified digital camera from different distances. A random number of seeds was placed on the black background, with the kernels not touching each other. The sample images and the number of seeds of each type are shown in Table 1.
Table 1: Types of grain varieties and number of seeds.

Type of seed    Number of seeds
K6              1919
R2              1863
R4              1780
R24             3417
3.3. Image Preprocessing
The image analysis software was developed in Matlab version 7.12.0.635 (R2011a). In order to extract object features, image segmentation and any necessary morphological filtering were done. A total of sixty-four features, comprising twenty texture features and forty-four shape features, were extracted by the algorithm; the obtained statistics were fed to the ANN classifier for performance evaluation.
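The paper does not name the segmentation method. The sketch below assumes a simple global Otsu threshold on the grayscale image, with bright kernels on the black background; the function names are hypothetical and stand in for whatever the original Matlab pipeline used.

```python
import numpy as np

def otsu_threshold(gray):
    """Global Otsu threshold for an 8-bit grayscale image.

    Returns the gray level t that maximizes the between-class variance;
    pixels > t are treated as foreground (seed kernels).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                    # gray-level probabilities
    levels = np.arange(256)
    omega = np.cumsum(p)                     # class-0 probability up to t
    mu = np.cumsum(p * levels)               # cumulative mean up to t
    mu_total = mu[-1]
    # between-class variance for every candidate threshold t
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0
    return int(np.argmax(sigma_b))

def segment(gray):
    """Binary mask of seed pixels (bright kernels on a dark background)."""
    return gray > otsu_threshold(gray)
```

Morphological filtering (e.g., removing small spurious blobs) would follow this step before feature extraction.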
3.4. Feature Extraction
3.4.1. Texture Features Extraction
Texture consists of texture primitives, texels, which are contiguous sets of pixels with some tonal and/or regional property. Texture can be characterized by the intensity (color) properties of texels and the structural (spatial) relationships of texels. Textures are highly scale dependent. Texture features are categorized as statistical/structural and syntactical. In statistical texture analysis, texture features are computed from the statistical distribution of observed combinations of intensities at specified positions relative to each other in the image. According to the number of intensity points (pixels) in each combination, statistics are classified into first-order, second-order, and higher-order statistics.
(a) Second-Order Statistics Features. First-order statistics features provide information related to gray level distribution of the image but not about relative position of the various gray levels within image. Second-order statistics features do this where pixels are considered in pairs. Two parameters or more, such as relative distance and the orientation among the pixels, are used.
(b) Gray Level Cooccurrence Matrices. The gray level cooccurrence matrix (GLCM) method is a way of extracting second-order statistical texture features. The GLCM gives information about the distribution of gray level intensities. A GLCM is a matrix where the number of rows and columns is equal to the number of gray levels, G, in the image. The matrix element P(i,j∣Δx,Δy) is the relative frequency with which two pixels, separated by a pixel distance (Δx,Δy), occur within a given neighborhood, one with intensity i and the other with intensity j. One may also say that the matrix element P(i,j∣d,θ) contains the second-order statistical probability values for changes between gray levels i and j at a particular displacement distance d and at a particular angle θ. The image features derived from the GLCM are contrast, correlation, energy, homogeneity, and entropy; these are explained below and are taken from [11].
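A minimal NumPy sketch of how such a matrix can be built follows. This version is normalized so the entries sum to 1; Matlab's graycomatrix differs in defaults (e.g., scaling and optional symmetry), so this is illustrative rather than a reproduction of the original code.

```python
import numpy as np

def glcm(image, offset, levels):
    """Normalized gray level cooccurrence matrix P(i, j | dy, dx).

    `image` holds integer gray levels in [0, levels); `offset` is (dy, dx)
    in the [row col] convention, e.g. (0, 1) for 0 degrees at distance 1.
    """
    dy, dx = offset
    P = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # count the pair (current pixel, neighbor at the offset)
                P[image[r, c], image[r2, c2]] += 1
    return P / P.sum()
```

For example, a 3x3 binary image with offset (0, 1) yields six horizontal pixel pairs, and each matrix entry is the fraction of pairs with that gray level combination.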
(1) Contrast. This is also called sum of squares variance. This measure of contrast or local intensity variation will favour contributions from P(i,j) away from the diagonal; that is, i≠j. It is a measure of the intensity contrast between a pixel and its neighbor over the whole image. The value is 0 for a constant image.
Contrast equation:
(1) CTR = ∑_{n=0}^{G−1} n² {∑_{i=1}^{G} ∑_{j=1}^{G} P(i, j)}, |i − j| = n.
When i and j are equal, the cell is on the diagonal and (i − j) = 0. These values represent pixels entirely similar to their neighbor, so they are given a weight of 0. If i and j differ by 1, there is a small contrast, and the weight is 1. If i and j differ by 2, contrast is increasing and the weight is 4. The weights continue to increase quadratically as (i − j) increases.
(2) Correlation. Correlation is a measure of the gray level linear dependence between pixels at the specified positions relative to each other. It is a measure of how correlated a pixel is to its neighbor over the whole image. The range of correlation is [−1, 1]. Correlation is 1 or −1 for a perfectly positively or negatively correlated image. Correlation is NaN for a constant image.
Correlation equation:
(2) COR = ∑_{i=0}^{G−1} ∑_{j=0}^{G−1} (i − μ_i)(j − μ_j) P(i, j) / (σ_i σ_j),
where μ_i, μ_j and σ_i, σ_j are the means and standard deviations of the row and column marginals of P.
(3) Energy. Energy returns the sum of squared elements in the GLCM. The range is [0, 1]. Energy is 1 for a constant image.
Energy equation:
(3) Energy = ∑_{i=0}^{G−1} ∑_{j=0}^{G−1} {P(i, j)}².
(4) Homogeneity. Homogeneity returns a value that measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal. The range is [0, 1]. Homogeneity is 1 for a diagonal GLCM.
Homogeneity equation:
(4) HOM = ∑_{i=0}^{G−1} ∑_{j=0}^{G−1} P(i, j) / (1 + |i − j|).
(5) Entropy. A homogeneous scene has high entropy, while inhomogeneous scenes have low first-order entropy. Maximum entropy is reached when all probabilities are equal.
Entropy equation:
(5) Entropy = −∑_{i=0}^{G−1} ∑_{j=0}^{G−1} P(i, j) × log(P(i, j)).
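The five texture measures above can be sketched from a normalized GLCM as follows. The logarithm base for entropy is not stated in the text, so base 2 is assumed here (the base only rescales the value).

```python
import numpy as np

def glcm_features(P):
    """Contrast, correlation, energy, homogeneity, and entropy of a
    normalized GLCM P (entries sum to 1)."""
    G = P.shape[0]
    i, j = np.indices((G, G))
    contrast = np.sum((i - j) ** 2 * P)
    # marginal means and standard deviations for correlation
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    correlation = np.sum((i - mu_i) * (j - mu_j) * P) / (sd_i * sd_j)
    energy = np.sum(P ** 2)
    homogeneity = np.sum(P / (1.0 + np.abs(i - j)))
    nz = P[P > 0]                      # skip zero entries: 0 * log 0 = 0
    entropy = -np.sum(nz * np.log2(nz))
    return contrast, correlation, energy, homogeneity, entropy
```

As a sanity check, a perfectly diagonal uniform GLCM gives contrast 0, correlation 1, and homogeneity 1, matching the limiting cases described above.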
To study the effect of parameters on the classification ability of the ANN, twenty texture features, namely, the five features described above computed for each of four offsets, were used to classify the grains into four categories. An offset (the spatial relationship of a pair of pixels) specifies the distance and direction between the pixel of interest and its neighbor, and corresponds to an angle. Figure 2 illustrates the offset values that specify the common angles at a pixel distance of 1 from the center pixel. Consider
(6) Offset = {[0 1] for 0°; [−1 1] for 45°; [−1 0] for 90°; [−1 −1] for 135°}.
Offset values.
3.4.2. Shape Features Extraction
To obtain a set of complete and compact descriptors characterizing the seeds, algorithms were written to extract region descriptors. Region descriptors characterize the arrangement of pixels within the area. There are many techniques that can be used to obtain region descriptors of an object. Here, we have extracted two forms of region descriptors: basic descriptors (geometric features) and statistical descriptors (shape related features) defined by moments.
Feature analysis of grains includes extraction of a total of eight geometric features, five shape factors, ten standard moments, seven central moments (μ), seven invariant moments (φ), and seven normalized central moments from the high-resolution images of kernels of paddy grains. Geometric descriptors include area, perimeter, major axis length, minor axis length, axis ratio, and compactness. From the values of axis length, perimeter, and area, shape factors 1–5 were derived, as described in [8, 21]:
(7)
Shape factor 1 = 1 / compactness,
Shape factor 2 = Major axis length / Area,
Shape factor 3 = Area / (Major axis length)³,
Shape factor 4 = Area / ((Major axis length / 2) × (Major axis length / 2) × π),
Shape factor 5 = Area / ((Major axis length / 2) × (Minor axis length / 2) × π).
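The shape factors can be sketched directly from the measured area, perimeter, and axis lengths. Compactness is assumed here to be 4πA/P² (the convention given for [17] in Section 2, equal to 1.0 for a circle), so shape factor 1 is its reciprocal.

```python
import math

def shape_factors(area, perimeter, major, minor):
    """Shape factors 1-5 from basic geometric measurements of a kernel."""
    compactness = 4.0 * math.pi * area / perimeter ** 2   # 1.0 for a circle
    sf1 = 1.0 / compactness
    sf2 = major / area
    sf3 = area / major ** 3
    sf4 = area / ((major / 2.0) * (major / 2.0) * math.pi)
    sf5 = area / ((major / 2.0) * (minor / 2.0) * math.pi)
    return sf1, sf2, sf3, sf4, sf5
```

For a circle, shape factors 1, 4, and 5 all equal 1; elongated kernels deviate from these values, which is what gives the factors their discriminating power.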
Moments include standard, central, normalized central, and invariant moments. Moments were also used to find the spread and slenderness of individual grains according to Hu [22]. In all, the moments include ten standard moments m00, m01, m10, m02, m20, m03, m30, m11, m12, and m21; seven central moments μ11, μ02, μ20, μ03, μ30, μ12, and μ21; seven normalized central moments η11, η02, η20, η03, η30, η12, and η21; and seven invariant moments φ1, φ2, φ3, φ4, φ5, φ6, and φ7. Standard formulae from [11, 23] were used to calculate the moments. Moments describe a shape's layout (the arrangement of its pixels), a bit like combining area, compactness, irregularity, and higher-order descriptions together [24].
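The standard moment hierarchy can be sketched for a binary kernel image as follows; this follows the standard definitions from the image processing literature rather than the authors' Matlab code, and only the first Hu invariant is shown for brevity.

```python
import numpy as np

def raw_moment(img, p, q):
    """Standard (raw) moment m_pq of a binary or grayscale image."""
    y, x = np.indices(img.shape)
    return np.sum((x ** p) * (y ** q) * img)

def central_moment(img, p, q):
    """Central moment mu_pq, measured about the centroid (translation
    invariant)."""
    m00 = raw_moment(img, 0, 0)
    xbar = raw_moment(img, 1, 0) / m00
    ybar = raw_moment(img, 0, 1) / m00
    y, x = np.indices(img.shape)
    return np.sum(((x - xbar) ** p) * ((y - ybar) ** q) * img)

def normalized_central_moment(img, p, q):
    """Normalized central moment eta_pq = mu_pq / m00^(1 + (p+q)/2)
    (adds scale invariance)."""
    m00 = raw_moment(img, 0, 0)
    return central_moment(img, p, q) / m00 ** (1.0 + (p + q) / 2.0)

def hu_phi1(img):
    """First Hu invariant moment, phi1 = eta20 + eta02 (translation,
    scale, and rotation invariant)."""
    return (normalized_central_moment(img, 2, 0)
            + normalized_central_moment(img, 0, 2))
```

Because φ1 is built from normalized central moments, two identical blobs at different positions in the image yield the same value, which is exactly why invariant moments are useful for comparing kernels photographed at arbitrary locations.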
The above sixty-four features are used to study their effect on the classification ability of the ANN.
4. Artificial Neural Network Architectures
The artificial neural network (ANN) classifier is emerging as one of the best suited classifiers for pattern recognition and can be regarded as an extension of many classical classification techniques. ANNs are based on the concept of the biological nervous system. They explore many hypotheses simultaneously using massive parallelism instead of sequentially executing a program of instructions.
Pattern classification was done using a two-layer (i.e., one-hidden-layer) backpropagation supervised neural network with 20 hidden neurons, trained with the Levenberg-Marquardt algorithm. A backpropagation network (BPN) consists of an input layer, one or more hidden layers, and an output layer, and has the ability to generalize. The number of neurons was varied to see whether any significant improvement in performance resulted; as no improvement was observed, 20 neurons were used to train the network. The choice of the BPN classifier was based on previous research conducted in [4]. The transfer function used was the tangent sigmoid. The data division function divided the targets into three sets, Training (70%), Validation (15%), and Testing (15%), using random indices. The feature vectors were split into Training and Testing sets, and the accuracy was computed on the Testing set. The trained neural network was tested with the testing samples to find how well the network would do when applied to data from the real world. As one measure of how well the neural network fit the data, the confusion matrix was plotted across all samples. The sixty-four features were used as inputs to the neural network and the type of the seed was used as the target. Given an input, which constitutes the features of a seed, the neural network is expected to identify the type of the seed, which is achieved by neural network training. Figure 3 shows the ANN topology.
ANN topology: xi is the xth input feature; ii is the ith input node; ji is the jth hidden layer neuron; ki is the kth output layer neuron and yi is the yth output; wij is the weight between the input and hidden layers and who is the weight between the hidden and output layers.
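The training setup above can be sketched in plain NumPy. The original work used Matlab's pattern recognition network with Levenberg-Marquardt training on 64 features and 4 classes; the sketch below is a simplified stand-in (plain gradient descent, synthetic two-class data) intended only to illustrate the one-hidden-layer, tan-sigmoid, 20-neuron topology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the seed feature vectors: two well-separated
# 2-D clusters, one per class (the real network had 64 inputs, 4 outputs).
X0 = rng.normal(loc=-2.0, scale=0.5, size=(50, 2))
X1 = rng.normal(loc=+2.0, scale=0.5, size=(50, 2))
X = np.vstack([X0, X1])
T = np.vstack([np.tile([1.0, 0.0], (50, 1)),
               np.tile([0.0, 1.0], (50, 1))])   # one-hot targets

# One hidden layer of 20 tan-sigmoid neurons, as in the paper.
n_in, n_hid, n_out = 2, 20, 2
W1 = rng.normal(0, 0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.5, (n_hid, n_out)); b2 = np.zeros(n_out)
lr = 0.05

for epoch in range(500):
    H = np.tanh(X @ W1 + b1)            # hidden activations
    Y = np.tanh(H @ W2 + b2)            # output activations
    dY = (Y - T) * (1 - Y ** 2)         # backprop through output tanh
    dH = (dY @ W2.T) * (1 - H ** 2)     # backprop into hidden layer
    W2 -= lr * H.T @ dY / len(X); b2 -= lr * dY.mean(axis=0)
    W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(axis=0)

pred = np.argmax(np.tanh(np.tanh(X @ W1 + b1) @ W2 + b2), axis=1)
accuracy = np.mean(pred == np.argmax(T, axis=1))
```

The predicted class is the output neuron with the largest activation, mirroring how the confusion matrix in the paper is built from network outputs versus target classes.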
The objectives of this study are as follows:
to extract sixty-four texture-n-shape features and form three different feature sets, namely, shape, texture, and texture-n-shape feature set,
to compare the performance of three feature sets for classification of K6, R2, R4, and R24 paddy types,
to find the most suitable feature set from the three feature sets for accurate classification.
5. Mathematical Model Using Set Theory
(1) Let S be the system that classifies seeds of the four varieties: S = {I, O, P1, P2, F, G | φs}.
(2) Identify I/P as I:
(8)S={I,…},
where I∈ images.
(3) Identify O/P as O:
(9) S = {I, O, …},
O = {T | T is the type of seed from the four types to be classified}.
Three feature set models are derived from the above set as follows:
First feature set: texture ⊆ F; the first twenty features.
Second feature set: shape ⊆ F; the next forty-four features.
Third feature set: texture-n-shape = F.
(4) Identify Process as P1:
(12) f2 = {net() | net() returns a pattern recognition neural network to classify inputs according to target classes},
f3 = {train() | train() trains the neural network and returns the new network}, f3 : net, F → N.
(5) Identify Process as P2:
(13) S = {I, O, P1, P2, …},
P2 = {f4() | f4() performs the match of a seed with the features from the feature vector},
f4 : N, F → O.
(6) Identify failure cases as F:
(14)S={I,O,P1,P2,F,…}.
Failure occurs when
the training is not properly done,
the images are taken in variable illumination,
the different types of seeds have the same features.
(7) Identify success cases as G:
(15)S={I,O,P1,P2,F,G}.
Success is defined as follows: for a given set of images, the system gives the output with an exact match.
6. Results and Discussions
The effect of texture, shape, and texture-n-shape features on the sensitivity per class was studied using the selected ANN configuration described above. Table 2 shows the sensitivity and total accuracy for the different feature sets and different grain varieties using a BPN classifier. The results showed that the classification accuracy of the ANN was best most of the time when shape features were used. This indicates that the texture of the seed has less discriminating power than its shape. The sensitivity and accuracy are calculated as follows.
Table 2: Sensitivity per class and total accuracy (%).

Feature set                 Seed    Sensitivity for class    Total accuracy
Texture features            K6      86.9                     82.61%
                            R2      83.6
                            R4      70.4
                            R24     85.0
Shape features              K6      82.0                     88.00%
                            R2      96.3
                            R4      90.4
                            R24     93.6
Texture-n-shape features    K6      80.6                     87.27%
                            R2      96.1
                            R4      88.5
                            R24     94.8
Sensitivity. The sensitivity for the ith class is defined as the number of patterns correctly predicted to be in class i with respect to the total number of patterns in class i:
(16) Sensitivity = n_ii / f_i,
where f_i is the number of patterns associated with class i, given by f_i = ∑_{j=1}^{C} n_ij, i = 1, …, C, and n_ii is the number of times patterns were predicted to be in class i when they really are in that class, that is, the number of correctly classified patterns.
Accuracy. The accuracy is defined as the total number of correctly classified patterns divided by the number of testing patterns:
(17) Accuracy = (1/N) ∑_{i=1}^{C} n_ii,
where C is the number of classes (here four), N is the number of testing patterns, and n_ii is the same as in (16).
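Both measures follow directly from the confusion matrix; a minimal sketch (rows are true classes, columns are predicted classes, a common convention assumed here):

```python
import numpy as np

def sensitivity_and_accuracy(confusion):
    """Per-class sensitivity n_ii / f_i and overall accuracy
    (1/N) * sum(n_ii) from a confusion matrix."""
    confusion = np.asarray(confusion, dtype=float)
    n_ii = np.diag(confusion)            # correctly classified per class
    f_i = confusion.sum(axis=1)          # total true patterns per class
    sensitivity = n_ii / f_i
    accuracy = n_ii.sum() / confusion.sum()
    return sensitivity, accuracy
```

For example, the confusion matrix [[8, 2], [1, 9]] gives sensitivities 0.8 and 0.9 and an overall accuracy of 0.85.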
Figure 4 shows the sensitivity for the different grain varieties. Class-1 in the plot indicates K6, class-2 indicates R2, class-3 indicates R4, and class-4 indicates R24.
Plot of sensitivity classwise.
7. Conclusions and Future Scope
The texture, shape, and texture-n-shape features were extracted from images of individual grains and assessed for classification of the grains. The accuracies obtained were 82.61%, 88.00%, and 87.27% with texture, shape, and texture-n-shape features, respectively. The most satisfactory results were delivered by the shape feature set. The texture feature set gave lower accuracy than the other sets because the differences between the texture features (contrast, energy, and homogeneity) of the different varieties are negligible. It can be concluded that the invariant, standard, and central moments of shape have a significant role in discriminating the paddy varieties. Thus shape moments have the potential to improve the classification accuracy of computer vision systems used for classification of paddy grains.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The authors thank the Seed Testing Laboratory, Pune, India, for providing the grains for this study.
References
[1] Chaugule A., Mali S., "Seed technological development—a survey," Proceedings of the International Conference on Information Technology in Signal and Image Processing, ACEEE, 2013, pp. 71–78.
[2] Granitto P. M., Navone H. D., Verdes P. F., Ceccatto H. A., "Weed seeds identification by machine vision."
[3] Granitto P. M., Verdes P. F., Ceccatto H. A., "Large-scale investigation of weed seed identification by machine vision."
[4] Chen X., Xun Y., Li W., Zhang J., "Combining discriminant analysis and neural networks for corn variety identification."
[5] Zhao M., Wu W., Zhang Y. Q., Li X., "Combining genetic algorithm and SVM for corn variety identification," Proceedings of the International Conference on Mechatronic Science, Electric Engineering and Computer (MEC '11), Jilin, China, August 2011, pp. 990–993, doi:10.1109/MEC.2011.6025631.
[6] Kiratiratanapruk K., Sinthupinyo W., "Color and texture for corn seed classification by machine vision," Proceedings of the 19th IEEE International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS '11), Chiang Mai, Thailand, December 2011, doi:10.1109/ISPACS.2011.6146100.
[7] Paliwal J., Visen N. S., Jayas D. S., "Evaluation of neural network architectures for cereal grain classification using morphological features."
[8] Shouche S. P., Rastogi R., Bhagwat S. G., Sainis J. K., "Shape analysis of grains of Indian wheat varieties."
[9] Visen N. S., Paliwal J., Jayas D. S., White N. D. G., "Specialist neural networks for cereal grain classification."
[10] Paliwal J., Visen N. S., Jayas D. S., White N. D. G., "Cereal grain and dockage identification using machine vision."
[11] Gonzalez R. C., Woods R. E., Eddins S. L.
[12] Paliwal J., Visen N. S., Jayas D. S., White N. D. G., "Comparison of a neural network and a non-parametric classifier for grain kernel identification."
[13] Dubey B. P., Bhagwat S. G., Shouche S. P., Sainis J. K., "Potential of artificial neural networks in varietal identification using morphometry of wheat grains."
[14] Singh C. B., Jayas D. S., Paliwal J., White N. D. G., "Identification of insect-damaged wheat kernels using short-wave near-infrared hyperspectral and digital colour imaging."
[15] Pourreza A., Pourreza H., Abbaspour-Fard M., Sadrnia H., "Identification of nine Iranian wheat seed varieties by textural analysis with image processing."
[16] Wiwart M., Suchowilska E., Lajszner W., Graban Ł., "Identification of hybrids of spelt and wheat and their parental forms using shape and color descriptors."
[17] Huang K.-Y., "Detection and classification of areca nuts with machine vision."
[18] Li J., Chen B., Shao L., Tian X., Kan Z., "Variety identification of delinted cottonseeds based on BP neural network."
[19] Adjemout O., Hammouche K., Diaf M., "Automatic seeds recognition by size, form and texture features," Proceedings of the 9th International Symposium on Signal Processing and its Applications (ISSPA '07), Sharjah, United Arab Emirates, February 2007, pp. 1–4, doi:10.1109/ISSPA.2007.4555428.
[20] Mebatsion H. K., Paliwal J., Jayas D. S., "Automatic classification of non-touching cereal grains in digital images using limited morphological and color features."
[21] Symons S. J., Fulcher R. G., "Determination of wheat kernel morphological variation by digital image analysis, I: variation in eastern Canadian milling quality wheats."
[22] Hu M. K., "Visual pattern recognition by moment invariants."
[23] Jain A. K., "Moment representation."
[24] Nixon M. S., Aguado A. S.