Blended Features Classification of Leaf-Based Cucumber Disease Using Image Processing Techniques


Introduction
Plant leaf infections pose a serious risk to agriculture and therefore to the national economy. The development of farming products, which account for a valuable part of that economy, is quite imperative.
In image processing, pipelined techniques have produced remarkable advances with high accuracy, yet a few concerns remain. First, their performance relies heavily on feature extraction and subsequent feature selection for the infected leaf image, from which the salient features are extracted; second, pipelined procedures are moderately unpredictable. Noisy images, irregular lighting, and cluttered backgrounds in the dataset cannot be neglected.
These factors may degrade feature quality and reduce the detection rate. For this reason, a viable technique is used to eliminate noise and other hazards [1]. Detecting plant leaf disease by observing symptoms with the naked eye involves a high level of complexity. Because of this complexity, the large number of crop diseases, and current pathology issues, even farm experts and vegetable pathologists often fail to diagnose a specific disease and therefore reach wrong conclusions.
An automated computerized approach to recognizing and classifying plant leaf disease would support plant pathologists, who otherwise detect disease through visual inspection [2]. With the use of graphics processing units (GPUs), numerous applications of artificial intelligence (AI) and machine learning have grown rapidly, which prompts interest in new models and methodologies. Automated identification is therefore required for the detection and classification of plant leaf disease [2]. Feature extraction [3] and feature selection are essential for image representation. Many feature extraction and classification approaches have been introduced for organic products such as fruits. For example, a hybrid method has been used for the detection and recognition of citrus fruit diseases. Lesion spot detection and the fusion of geometric and texture features are used to select the best features, and PCA is used to score the features. That method uses a dataset of diseased citrus fruit covering anthracnose, canker, scab, greening, and melanose. It achieved an accuracy of 97% on the citrus disease dataset, 89% on the combined dataset, and 90.4% on the local dataset [4]. The steps of image processing, namely preprocessing, segmentation, feature extraction, and classification, are depicted in Figure 1.
Classical image processing draws on advanced Computer Vision (CV) procedures for disease detection and increases the accuracy of the results. Segmentation approaches include thresholding [5], adaptive thresholding [6], segmentation based on Neural Networks (NN) [7,8], bend segmentation [9], and edge-detection-based segmentation [10]. These methods can be applied to plant leaf disease detection. Image processing techniques are used to check for different diseases and insect damage in crops. Deep learning techniques also help in identifying infections. Detection using a Convolutional Neural Network (CNN) improves accuracy, precision, and the clarity of the results [11].
As for the challenges, different methods are used to compare the current work with previous work. The challenges include detection speed, occlusion, and lighting problems [5].

Problem Statement.
Generally, Computer Vision (CV) based techniques for identification and classification consist of five main steps: preprocessing, feature extraction, feature selection, feature fusion, and classification. In such pipelines, many challenges must be addressed to increase the efficiency and accuracy of classification.
The visual quality of cucumber leaf images, on which accurate feature extraction depends, is the first challenge in the preprocessing stage. Low- and high-contrast spots on leaves, as well as curved edges, may affect classification accuracy. The changes required for the dataset include scaling for data augmentation. Feature fusion and feature selection use texture and shape features; therefore, an appropriate method for feature extraction and selection is needed to improve both the accuracy of the results and the classification. Moreover, in computer-based approaches, choosing visually salient features during feature selection is the main challenge for improving accuracy and recognition rate.

Author's Contributions.
The significant contributions of this approach lie in the preprocessing, feature extraction, and classification steps. In preprocessing, the image is resized and transformed to greyscale, and the Otsu threshold converts the intensity image into a binary image. Tan-Triggs is used for normalization, which has not been used in existing work. The second contribution of this algorithm is the amalgamation of three features. These features are extracted and then fused using serial-based feature fusion. The dataset for cucumber leaf disease is usually too small, and preprocessing is the primary issue; the preprocessing step is necessary for this reason. Contrast enhancement problems, as well as light variation, can make detecting cucumber leaf disease difficult.

Paper Organization.
The remainder of the paper is arranged as follows: Section 2 describes related work. Section 3 presents the proposed model along with its diagram and all subsequent steps. Section 4 presents feature extraction and selection; Section 5 presents feature fusion; Section 6 covers classification; Section 7 presents the experiments and results; and Section 8 gives the conclusion.

Related Work
In related work, a number of techniques have been presented for the recognition and classification of plant leaf disease. For the detection of crop disease, a CNN model was used for training and testing on a plant leaf dataset of different classes. Existing deep learning models are applied, which are specific and easily applicable to crop attacks. These models are trained with a small amount of data but give the best results for specific objects [6]. They include very high-performing techniques for disease recognition. For better results, classification using many classifiers gives satisfactory results [7].
Zhang et al. [12] described an IoT-based approach that handles the problem of selecting discriminant features of diseased parts. They explained that feature fusion and clustering are of equal importance, and PHOG features are extracted. A complete segmentation process is carried out, and in this way plant disease can be recognized.
When different conventional techniques are compared with CNN models on images taken from the dataset, the reported results show the conventional techniques performing better than a deep model. Losses may occur due to the mismanagement of CLR infection on coffee leaves, which results in the drop of immature leaves. The severity of plant disease can be described with less effort using remote sensing bands and different vegetation indices. This is because of the multifaceted relationship between remote sensing indices and leaf disease severity [9]. Multiscreening with multifeatures is used for disease detection [10]. The diagnosis of leaf diseases is important in the cultivation of crops. Production can be maintained through observation, which requires a high degree of learning and practice. A novelty in the domain of artificial intelligence is the retention of observed data, which is further enhanced by merging different tools with AI techniques that help in diagnosing crop diseases. The challenging task is the visualization of features or interesting parts of a leaf image.
This task is difficult in the case of automated detection [13]. One proposed segmentation and detection method for plant leaf disease fuses superpixels and k-means clustering. The PHOG feature extraction method uses the color components and the greyscale image. The accuracy achieved by that method is 92.15% with fivefold cross-validation [12].

Proposed Model and Benefit
This section elaborates the three major steps of the proposed model: preprocessing, feature extraction, and leaf disease classification. Each step is further divided into multiple subparts in sequence. The first step consists of preprocessing and normalization. Figure 1 shows the second step, feature extraction based on the texture and shape features of cucumber leaf images, whereas Figure 2 shows the proposed model for cucumber leaf disease identification and classification.

Dataset Expansion.
The dataset for cucumber leaf disease is expanded by increasing its number of instances. The main reason for enlarging the number of images is to improve accuracy and decrease the error rate. The number of instances is increased using MATLAB commands, including a flipping function. Managing the dataset in this way helps reach the optimal solution.
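The flipping step described above can be illustrated with a minimal Python sketch. It is not the MATLAB code used in this work: the function names are hypothetical, images are assumed to be plain lists of pixel rows, and the choice of one horizontal and one vertical flip per image is an assumption, since the text does not state which flips were applied.

```python
def flip_horizontal(image):
    """Mirror an image left-to-right; the image is a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror an image top-to-bottom."""
    return [list(row) for row in reversed(image)]


def augment_with_flips(images):
    """Return each original image followed by its two flipped copies."""
    augmented = []
    for image in images:
        augmented.extend([image, flip_horizontal(image), flip_vertical(image)])
    return augmented


# A tiny 2 x 3 "image" stands in for a leaf photograph.
leaf = [[1, 2, 3],
        [4, 5, 6]]
expanded = augment_with_flips([leaf])
```

Under these assumptions every input image contributes three instances, so the dataset triples in size.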

Preprocessing.
Preprocessing plays an essential role in image enhancement: it refines edges, eliminates noise, and removes blurriness, and it is also required for image augmentation. The preprocessing step involves converting a color image into a greyscale image. The intensity of the color image is converted into a single channel with an image size of 640 × 480. Figure 3 shows the cucumber dataset. The purpose of image preprocessing is to expose the interesting part (i.e., the diseased part). Normalization is a process in which data is rearranged to remove redundancy and to bring together all related (logical) data or interesting parts. All normalization techniques are applied in the preprocessing stage. First, RGB images from the dataset are converted into greyscale and resized, and then the normalization techniques are applied.
Tan-Triggs is a function that normalizes the data in a vector or matrix. Our work uses Tan-Triggs normalization and the Otsu threshold, which binarizes the greyscale image. The dataset consists of six diseased classes of cucumber: downy mildew, powdery mildew, anthracnose, blight, Corynespora, and angular leaf spot. The dataset was collected from a private source [14] and excludes healthy images.
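The greyscale conversion and Otsu binarization mentioned above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the implementation used in the experiments: the helper names are hypothetical, the luminance weights are the common ITU-R BT.601 values, and pixels strictly above the threshold are mapped to 1.

```python
def to_greyscale(rgb_image):
    """Convert rows of (R, G, B) tuples to 8-bit grey levels (BT.601 weights, assumed)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_image]


def otsu_threshold(grey_image, levels=256):
    """Return the grey level that maximizes the between-class variance."""
    histogram = [0] * levels
    for row in grey_image:
        for value in row:
            histogram[value] += 1
    total = sum(histogram)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    background_count, background_sum = 0, 0
    for level in range(levels):
        background_count += histogram[level]
        background_sum += level * histogram[level]
        if background_count == 0:
            continue  # no pixels at or below this level yet
        foreground_count = total - background_count
        if foreground_count == 0:
            break  # every pixel is already in the background class
        mean_bg = background_sum / background_count
        mean_fg = (total_sum - background_sum) / foreground_count
        variance = background_count * foreground_count * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(grey_image, threshold):
    """Map pixels above the threshold to 1 and all others to 0."""
    return [[1 if value > threshold else 0 for value in row] for row in grey_image]


# Dark background pixels and a few bright "lesion" pixels.
grey = [[10, 12, 11, 200],
        [9, 210, 205, 13]]
threshold = otsu_threshold(grey)
mask = binarize(grey, threshold)
```

Because the threshold is chosen from the image histogram itself, no manual tuning per image is needed, which is presumably why it suits a small dataset with varying illumination.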

Tan-Triggs Normalization.
Tan-Triggs is used to enhance the local texture features of the image for disease recognition under varying lighting conditions. Illumination Normalisation (IN) is used in the preprocessing stage; it involves gamma correction, nonlinear filtering, and Gaussian filtering [15].
Tan-Triggs consists of several steps: gamma correction, difference of Gaussians, masking, and contrast equalization. Details are as follows [16].

Illumination Normalization and Reflectance.
The image is formed by illumination reflected from the scene, so it contains an illumination component and a reflectance component; these components are related as in equation (1). In equation (1), J is the incident light; it depends on the circumstances and varies relative to the reflectance.
We also take the logarithm of the J term, so that the components are combined by a sum and their range is compressed, as in equation (2).
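Equations (1) and (2) did not survive in this copy of the text. The standard illumination-reflectance model, which matches the description above, is sketched below in LaTeX. Only J appears in the text; the symbols I (observed image) and R (reflectance) are assumptions introduced for illustration and may differ from the authors' notation.

```latex
% Assumed symbols: I = observed image, J = incident illumination, R = reflectance.
\begin{align}
  I(x, y) &= J(x, y)\, R(x, y) \\
  \log I(x, y) &= \log J(x, y) + \log R(x, y)
\end{align}
```

Taking the logarithm turns the product into a sum, so a slowly varying illumination term can be suppressed by the filtering steps that follow.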

Gamma Correction.
Gamma correction is a nonlinear transformation of a grey-level image. It maps the intensity of every pixel as in equation (3), where g(J) increases the dynamic range of the pixels. It enhances the dynamic range in dark areas while compressing it in bright areas.
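Since equation (3) is not reproduced here, the following Python sketch assumes the common power-law form g(J) = J^γ on intensities scaled to [0, 1]. The function name is hypothetical, and γ = 0.2 is the default suggested by Tan and Triggs, not a value stated in this paper.

```python
def gamma_correct(grey_image, gamma=0.2, max_level=255):
    """Map each pixel J to max_level * (J / max_level) ** gamma (assumed power law)."""
    return [[max_level * (value / max_level) ** gamma for value in row]
            for row in grey_image]


# Two dark pixels, one mid-tone pixel, and one saturated pixel.
dark_and_bright = [[0, 16, 64, 255]]
corrected = gamma_correct(dark_and_bright)
```

With γ below 1, the gap between the dark values 0 and 16 is stretched far more than the gap between 64 and 255, which is the dark-region enhancement described above.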

Complexity
The Difference of Gaussians (DoG) filter is used for edge detection, and each Gaussian is described by its standard deviation sigma (σ). One Gaussian, with a small σ, is used only to remove noise. The other Gaussian, with a larger σ, removes high-frequency details from the pixels. We can therefore obtain the high-frequency edges by subtracting the low-frequency image from the first.
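A minimal pure-Python sketch of this filter is given below. The function names are hypothetical, borders are handled by replicating edge pixels, and the values σ = 1 and σ = 2 are the defaults suggested by Tan and Triggs rather than settings reported in this paper.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel with radius ceil(3 * sigma)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        n = len(row)
        blurred.append([
            sum(kernel[k + radius] * row[min(max(x + k, 0), n - 1)]
                for k in range(-radius, radius + 1))
            for x in range(n)
        ])
    return blurred


def transpose(image):
    """Swap rows and columns."""
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def difference_of_gaussians(image, sigma_inner=1.0, sigma_outer=2.0):
    """Subtract the wide (low-frequency) blur from the narrow (denoised) blur."""
    narrow = gaussian_blur(image, sigma_inner)
    wide = gaussian_blur(image, sigma_outer)
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(narrow, wide)]


# A vertical step edge between columns 3 and 4.
step = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
edges = difference_of_gaussians(step)
```

On the step image the response is negative just left of the edge and positive just right of it, while a flat image gives a response of approximately zero everywhere.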

Contrast Equalization.
Contrast equalization rescales the pixel intensities while maintaining the contrast. The pixel intensities are obtained using equation (4). Figure 4 shows the greyscale and Tan-Triggs normalized images of a cucumber leaf.
A complete diagram of the proposed model for cucumber leaf disease recognition is presented in Figure 2.
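Equation (4) is not reproduced in this copy. The sketch below follows the two-stage contrast equalization published by Tan and Triggs, which is assumed to be what the section refers to. The function names are hypothetical, and α = 0.1 and τ = 10 are the defaults from that method, not values stated in this paper.

```python
import math


def _robust_scale(values, alpha, clip=None):
    """Return (mean(|v|^alpha))^(1/alpha), optionally clipping |v| at `clip`."""
    powered = [(abs(v) if clip is None else min(clip, abs(v))) ** alpha
               for v in values]
    return (sum(powered) / len(powered)) ** (1.0 / alpha)


def contrast_equalize(image, alpha=0.1, tau=10.0):
    """Two-stage rescaling followed by a tanh limiter (assumed Tan-Triggs form)."""
    width = len(image[0])
    pixels = [v for row in image for v in row]

    # Stage 1: divide by a robust global measure of contrast.
    scale = _robust_scale(pixels, alpha) or 1.0  # avoid dividing by zero
    pixels = [v / scale for v in pixels]

    # Stage 2: repeat with large values clipped at tau.
    scale = _robust_scale(pixels, alpha, clip=tau) or 1.0
    pixels = [v / scale for v in pixels]

    # Final step: compress any remaining extreme values into [-tau, tau].
    pixels = [tau * math.tanh(v / tau) for v in pixels]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# A filtered patch with one extreme outlier (30.0).
patch = [[-0.5, 0.2],
         [0.1, 30.0]]
equalized = contrast_equalize(patch)
```

The sign of every pixel is preserved, and the outlier is limited to at most τ, so a single bright highlight no longer dominates the normalized image.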

Features Extraction and Selection
Feature extraction in image processing extracts the interesting part of an image (i.e., the diseased part). Feature selection then reduces the dimensionality. In the proposed system, texture, shape, and color features are extracted using HOG, LBP, and color features. HOG stands for Histogram of Oriented Gradients, LBP stands for Local Binary Pattern, and SVM stands for Support Vector Machine. Depending on its parameters, every feature returns a different feature vector. The detailed diagram is shown in Figure 5. The details of every feature are described as follows.
HOG features are widely used for object detection. They are represented as a single feature vector in which each element represents a segment of an object, and they are mostly computed with a sliding window over each position in an image and combined with an SVM classifier. Therefore, we use these features for the detection of disease in cucumber leaves. The preprocessing phase resizes the original image so that positions in the image can be examined at several scales. The visualization of the HOG feature for cucumber anthracnose is shown in Figure 6. The local binary pattern, which is used as the texture feature, was first proposed by [17] in 2002. LBP can be modeled mathematically using blocks. The image is divided into overlapping blocks of the same size. In each block, the grey-level value of the center pixel is compared with those of its neighboring pixels, with the center value serving as the threshold. If the neighbor's grey-level value is greater than the center pixel value, the position is marked as "1"; otherwise, the position is marked as "0" [18]. A 7-layer CNN is used with the LBP. LBP is used for the feature extraction of disease [14].
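The thresholding rule just described can be sketched for a 3 × 3 neighborhood as follows. This is an illustrative Python version with hypothetical names. It follows the rule as stated in the text (a neighbor strictly greater than the center is marked 1), whereas many published formulations also mark ties as 1. The clockwise bit order starting at the top-left neighbor is an assumption.

```python
# Clockwise neighbor offsets (row, column), starting at the top-left pixel.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(grey_image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col)."""
    center = grey_image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        if grey_image[row + dr][col + dc] > center:  # rule as stated in the text
            code |= 1 << bit
    return code


def lbp_image(grey_image):
    """Compute LBP codes for all interior pixels (the border is skipped)."""
    rows, cols = len(grey_image), len(grey_image[0])
    return [[lbp_code(grey_image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


def lbp_histogram(codes, bins=256):
    """Count how often each LBP code occurs; this histogram is the texture feature."""
    histogram = [0] * bins
    for row in codes:
        for code in row:
            histogram[code] += 1
    return histogram


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
codes = lbp_image(patch)
feature = lbp_histogram(codes)
```

For this patch, only the four neighbors 7, 8, 9, and 7 exceed the center value 6, so bits 4 to 7 are set and the code is 240. The sketch uses a plain 256-bin histogram; the 1 × 59 LBP vector reported in the experiments suggests a uniform-pattern variant, which is not implemented here.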
Color features are used to identify diseases in plants or crops. Different color spaces are explored in [11], and features are then extracted from the different color channels.
In the above figure, the first two tables show the difference of pixels, and the third table shows the threshold value and the corresponding binary code. The formal form of the LBP feature is given by the equation in [18]. In that equation, the terms L_a and M_a give the location of the center pixel, i_a is the brightness of the center pixel, i_q is the brightness of its adjacent pixels, and f is the sign function. The local binary pattern adapts to different texture conditions in detail and is used in many fields, including texture recognition [19] and hyperspectral image classification [20]. LBP feature extraction for a diseased anthracnose leaf, with the pixel-wise LBP image, is shown in Figures 5 and 7, which show the calculation of feature extraction and feature scoring of HOG, LBP, and color, respectively.

Features Fusion
Feature fusion plays an important role in machine learning and computer vision. Several features are joined into a new feature vector by applying serial-based fusion. To reach the optimal solution, the HOG, LBP, and color feature vectors are fused.
Feature reduction is important in most disease recognition processes, as it eliminates unwanted features and removes redundant features from the images [14]. As a result, classification becomes more accurate. The fusion method is very helpful for getting better results. This paper implements a probability-based process for the removal of inappropriate features [21]. The features are named feature factors F_1, F_2, and F_3; these factors belong to R, where R represents the real values. HOG, LBP, and color features are considered, which are positive-valued features. The feature vectors are described by equations (7)-(9). Equation (7) shows the fused feature vector of HOG, where i = 1, F is the feature vector, and H denotes the HOG feature.
Equation (8) shows the fused feature vector of LBP, where j = 1, F is the feature vector, and L shows the LBP feature.
Equation (9) shows the fused feature vector of color, where k = 1, F is the feature vector, and C shows the color feature.
HOG, LBP, and color features are extracted and fused together. Figure 8 shows the complete feature extraction of texture, color, and shape features.
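Serial-based fusion amounts to concatenating the individual feature vectors end to end. A minimal Python sketch is shown below. The function name is hypothetical, and the zero-filled placeholder vectors only reuse the lengths reported in the experiments section (1 × 5940 for HOG, 1 × 59 for LBP, and 1 × 6120 for color).

```python
def serial_fuse(*feature_vectors):
    """Concatenate feature vectors end to end into one fused vector."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


# Placeholder vectors; real values would come from the feature extractors.
hog_features = [0.0] * 5940
lbp_features = [0.0] * 59
color_features = [0.0] * 6120
fused = serial_fuse(hog_features, lbp_features, color_features)
```

Under these lengths the fused vector has 5940 + 59 + 6120 = 12119 entries per image, which illustrates why a reduction step before classification is worthwhile.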

Classification
Classification plays an essential role in image processing. To attain satisfactory results from the multi-SVM, many experiments were performed using the selected features. Many classifiers are used to classify the disease. The KNN classifiers include medium, weighted, and fine KNN. The geometric family of classifiers includes the varieties of SVM, which are instantiated with linear, quadratic, cubic, and Gaussian kernel functions. The tree category contains the ensemble boosted and bagged trees. Probabilistic classifiers include naïve Bayes, multi-kind naïve Bayes, Bayesian logistic regression, and Bayesian networks. These classifiers are used to evaluate performance on the cucumber leaf dataset, based on the performance measures used later in the results section.
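As an illustration of the KNN family, the sketch below implements a plain k-nearest-neighbor classifier in Python. It is not the MATLAB classifier used in the experiments: the function name and the toy data are hypothetical, Euclidean distance is assumed, and k = 1 is used on the assumption that it corresponds to the "fine KNN" preset.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=1):
    """Classify `query` by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy two-dimensional feature vectors for two disease classes.
train_features = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_labels = ["anthracnose", "anthracnose", "blight", "blight"]

prediction = knn_predict(train_features, train_labels, [5, 4], k=1)
```

In practice the query and training vectors would be the fused (and reduced) feature vectors rather than two-dimensional points.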

Experimental Results and Analysis
A set of experiments is presented for the final analysis of the suggested framework. In the first experiment, HOG and BRISK features are fused and then fed to the classifiers. In the second test, all types of features (shape, texture, and color) are fused and given to the classifiers, and in the final test, the features reduced by the proposed method are supplied to the classifiers. The key reason for these experiments is to analyze the performance of each step involved in the proposed methodology. A true positive means that a diseased sample is predicted as diseased, and a true negative means that both the actual and predicted values are negative. A false negative means that the actual value is positive and the predicted value is negative. The results are computed using the formulas for CR, FPR, sensitivity, specificity, precision, and FNR [22].
There are many cross-validation (CV) methods available in MATLAB, including k-fold, HoldOut, and LeaveMOut. The machine learning procedure mainly involves tuning, assembly of the trained model, and performance evaluation of the proposed model. Besides measuring performance, CV methods also perform estimation and configuration, which makes them efficient and effective. An effective bootstrap method for cross-validation (BBC-CV) is used for selecting the best results and performance [23]. Features were combined with Principal Component Analysis (PCA) with key points, i.e., the principal component and index. We perform 10-fold cross-validation on each trial to get better results and generate classification data for the cucumber dataset. All tests were performed in MATLAB 2018a on a personal laptop with a 64-bit Windows 8.1 operating system and 6 GB of RAM.
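The formulas from [22] are not reproduced in this copy. The Python sketch below uses the standard definitions of these measures in terms of true/false positives and negatives. The function name and the example counts are hypothetical, and CR is assumed to denote the overall correct classification rate (accuracy).

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures for one class versus the rest."""
    total = tp + fp + tn + fn
    return {
        "CR": (tp + tn) / total,           # assumed: correct classification rate
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
        "precision": tp / (tp + fp),
        "FPR": fp / (fp + tn),             # false positive rate
        "FNR": fn / (fn + tp),             # false negative rate
    }


# Made-up counts for illustration only; they are not results from this paper.
example = binary_metrics(tp=45, fp=5, tn=40, fn=10)
```

For a six-class problem such as this one, these quantities would typically be computed per class from the confusion matrix and then averaged. Whether the paper averages them this way is not stated.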

Angular Leaf Spot.
The spots may be various shades of light yellow, and the condition of the leaves becomes very severe when nitrogen is low. Angular leaf spot produces angular spots bounded by the veins of the leaf and is caused by the bacterial pathogen P. syringae pv. lachrymans.
This disease is mostly caused by infection with the Pseudomonas bacterium. For protection, induced systemic resistance (ISR) mediated by Plant Growth-Promoting Rhizobacteria (PGPR) is used [24].

Powdery Mildew.
Powdery mildew is a severe disease that causes mildew infection in cucumber leaves. Infection of the leaves reduces growth and causes premature loss of vegetation, resulting in economic loss. Powdery mildew is caused by a phytopathogenic fungus. There is a fast and divergent increase of the fungus in infected leaves treated with Milana. Milana may be suitable for plant disease defense in the integrated management of powdery mildew [25].

Downy Mildew.
Downy mildew spreads from the leaves and is caused by the pathogen known as Pseudoperonospora. The growth of downy mildew is driven by increased leaf temperature. Unhealthy regions with complex backgrounds show the infected parts [26].

Anthracnose.
Anthracnose is caused by the fungal pathogen known as Colletotrichum. These pathogens impede the growth of cucumber leaves [3].

Corynespora.
The infection in leaves starts from small brown spots with an expanding yellow halo. Leaves become uneven in shape, and the plant eventually loses its leaves. Symptoms increase rapidly with the fungal pathogen known as Corynespora [8].
The graphical illustration of certain classification methods is shown in Figure 9. The other classifiers, including bagged KNN, weighted KNN, and cubic SVM, are also described below. From the above discussion and analysis, the first experiment gives the best results compared with the other experiments. In our proposed method, the feature vector size is 1 × 5940 for HOG, 1 × 59 for LBP, and 1 × 6120 for color. The confusion matrix of the cucumber leaf dataset for Test 1 on ensemble fine KNN confirms the classification results and is shown in Table 2. Table 3 shows the experimental results of experiment 1.

Experiment 2.
In the second experiment, HOG, LBP, and color features are extracted. 300 features for HOG and LBP are taken with 10-fold cross-validation. In this experiment, fine KNN and subspace KNN give 94.60% and 94.50% accuracy, respectively, with 84.0% specificity and sensitivity, 84.6% precision, a 5.4% FNR, and a 0.011 FPR.
These classifiers give the highest accuracy compared with the others. The instances for the feature vector (FV2) are doubled. Performance measures include specificity, sensitivity, precision, FNR, FPR, and time. The confusion matrix in Table 4 confirms the classification results. Table 5 shows the performance evaluation of experiment 2. A graphical representation of the classification methods in terms of accuracy and time for experiment 2 is shown in Figure 10.
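The 10-fold cross-validation used in these experiments can be sketched in Python as follows. This is not the MATLAB routine used by the authors: the function names are hypothetical, the shuffling seed is arbitrary, and `evaluate` stands for any user-supplied function that trains on the training indices and returns a score on the held-out indices.

```python
import random


def k_fold_indices(sample_count, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(sample_count))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


def cross_validate(sample_count, evaluate, k=10, seed=0):
    """Average evaluate(train_idx, test_idx) over the k train/test splits."""
    folds = k_fold_indices(sample_count, k, seed)
    scores = []
    for held_out, test_idx in enumerate(folds):
        # Every fold except the held-out one forms the training set.
        train_idx = [i for fold_no, fold in enumerate(folds)
                     if fold_no != held_out for i in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k


# Example: 25 samples split into 10 folds of size 2 or 3.
folds = k_fold_indices(25, k=10)
```

Each sample is used for testing exactly once and for training in the other nine rounds, so the reported accuracy is an average over ten disjoint test sets.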

Experiment 3.
In the third experiment, HOG and LBP features are extracted. 500 features for HOG and LBP are taken with 10-fold cross-validation. In this experiment, fine KNN and subspace KNN give 94.2% and 82.5% specificity, 83.66% precision, a 5.8% FNR, and a 0.011 FPR, respectively.
These classifiers give the highest accuracy compared with the others. The instances for the feature vector (FV2) are doubled to improve accuracy. Performance measures include specificity, sensitivity, precision, FNR, FPR, and time. The confusion matrix in Table 6 confirms the classification results. The performance of the remaining classifiers, including fine SVM, cubic SVM, fine Gaussian SVM, fine tree, and weighted KNN, in terms of specificity, sensitivity, precision, FNR, and FPR, is shown in Table 7. The subspace classifier improves accuracy using 10-fold cross-validation. A graphical representation of the classification methods in terms of accuracy, time, precision, and specificity for experiment 3 is shown in Figure 11.

Conclusion
The proposed approach is used for cucumber leaf disease detection, based on preprocessing, normalization, feature extraction, feature selection, fusion, and classification. From the above discussion, it is concluded that cucumber leaf disease can be addressed using HOG, LBP, and color features. The texture and shape features help in recognizing the disease in leaves, and the color features help in recognizing the diseased part of a leaf. Furthermore, feature selection and feature fusion are important for improving the performance measures, including accuracy, specificity, sensitivity, and precision. The proposed method achieves 94.6% accuracy, which is better than that of existing work.

Data Availability
The data used to support the findings of this study can be obtained from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
All the authors contributed equally to this work and were involved in its development at every phase. The submitted version of the work has been read and approved by all authors.