Cotton is one of the economically significant agricultural products in Ethiopia, but it is exposed to various constraints in the leaf area. Most of these constraints are diseases and pests that are hard to detect with the naked eye. This study aimed to develop a model to improve the detection of cotton leaf diseases and pests using a deep learning technique, the convolutional neural network (CNN). To do so, the researchers used common cotton leaf diseases and pests, namely bacterial blight, spider mite, and leaf miner. A K-fold cross-validation strategy was used to split the dataset and improve the generalization of the CNN model. For this research, nearly 2400 specimens (600 images in each class) were used for training. The model was implemented in Python version 3.7.3 using the deep learning package Keras with a TensorFlow backend, and Jupyter was used as the development environment. The model achieved an accuracy of 96.4% in identifying classes of leaf diseases and pests in cotton plants. This shows the feasibility of its use in real-time applications and the potential need for IT-based solutions to support traditional or manual disease and pest identification.
In Ethiopia, agriculture is the basis of the national economy: 85% of livelihoods and 90% of total foreign trade come from this sector [
Even though agriculture is the backbone of Ethiopia’s economy, advanced technologies have so far not been explored for automation in agricultural science, and production and quality suffer heavily from various diseases and pests. Recently, emerging technologies have attracted many researchers to the detection and classification of cotton leaf diseases and pests. In Ethiopia, several constraints reduce the yield and quality of the product. In particular, potential diseases or pests on Ethiopian cotton are identified in traditional ways. A wide area of farmland is suitable for cotton plantation, but only limited research attention is given to cotton crop production. Traditionally, experts detect and identify such plant diseases and pests with the naked eye, which offers low accuracy in detecting diseases. In response to high demand, advanced technologies have been used to build systems that assist the manual recognition of plant diseases and pests and increase the accuracy of corrective measures. With the help of such technologies, plant diseases have been reduced and productivity has increased, which in turn has helped the economy. For that reason, information technology-based solutions in the agricultural sector are highly significant for Ethiopia’s economic, social, and environmental development through increased cotton production.
About 80–90% of the diseases and pests that occur affect the leaves of the cotton plant [
The cotton plant is susceptible to several disorders (biotic and abiotic constraints) caused by temperature fluctuation, diseases, and pests. Worldwide, cotton yields are nearly 576 kg per hectare, with only 10% production loss attributed to different cotton leaf diseases. The United States of America (USA) is a major exporter of cotton and earned 5.1 billion US dollars from it in 2016, but well-known native pests have been a cause of the destruction of cotton farms [
Detecting these diseases with the naked eye complicates cotton production and reduces identification accuracy. Even an expert may fail to assess and diagnose the diseases with the naked eye, and this inadequate technique leads to greater loss of cotton crops. Because of such mistaken conclusions, unnecessary pesticides that harm healthy cotton are often applied. Leaving a farm out of production for even a short time affects the nation’s overall GDP [
Considering the issues raised in the statement of the problem, the researchers posed the following research questions: Which technique is suitable for diagnosing cotton diseases and pests? How can an automatic cotton disease and pest diagnosis system be developed? How can the performance of the model be determined?
Deep learning combines image processing and data analysis and opens a path to further findings. Following its success in other fields, it has now entered the domain of agriculture. Today, several deep learning architectures used in computer vision, such as the CNN (convolutional neural network), RNN (recurrent neural network), DBN (deep belief network), and DBM (deep Boltzmann machine), perform tasks with high accuracy. The most prominent of these for this research work is the CNN [
Nowadays, CNN techniques are used to detect different objects and to perform automatic drawings of instructions for analysis purposes [
Deep learning draws an attention in order to maximize the performances to classify different tasks which help to promise the human intervention data [
. To make an efficient and effective interface system, the human plays an important role. Graph convolutional neural networks, a novel deep learning framework, addressed these issues by differentiating four-class motor imagery intentions based on the similarity of electroencephalography electrodes. The four motor imagery tasks were predicted with the highest accuracy [
This research study focused on developing an identification model for cotton leaf diseases and pests using a deep learning technique called the convolutional neural network. Three common diseases and pests, namely bacterial blight, leaf miner, and spider mite, have been affecting cotton productivity and quality. The model applied supervised learning to a dataset of 2400 images with four prime feature extraction processes. The datasets are limited to four different feature descriptors. Taking into consideration the time constraints and the reach of the cotton-growing regions, the research focused on the southern part of Ethiopia, namely Arba Minch, Shele, and Woyto. The MelkaWorer agricultural research center was also proposed as a focus area because it is responsible for cotton farms in SNNPR. Deep learning techniques were used to perform automatic feature extraction from the input datasets.
According to Shuyue [
A study [
In [
The research study [
In [
The research [
This study used design science research to build and evaluate an approach that creates innovations and defines ideas, practices, technical capabilities, and products using qualitative or quantitative data. One of the outputs of the design science research methodology (DSRM) is a model, which is a conceptual representation and abstraction of datasets. According to Hevner [
DSRM processes’ model.
Among different entry points, “problem-centered initiation” is the best fit for this design science research. The problem-centered initiation entry point is applicable because the problem is being observed by the researchers and business within the cotton disease identification domain [
The sample leaf images used in this research comprise both primary and secondary data. Primary data are data collected fresh for the first time. In this study, the primary data were collected from July to August 2019 from the Arba Minch, Shele, and Woyto cotton farms in SNNPR, where cotton is widely planted and infection is high, whereas the secondary data in each class were obtained from the MelkaWorer agricultural research center, found in the Afar region and SNNPR.
For this study, the researchers used a purposive (judgmental) sampling technique, which is nonprobabilistic, selecting three infected classes and one healthy class from the population. During data collection, 2400 images were captured and distributed into four equal classes (bacterial blight, healthy, leaf miner, and spider mite) to train with a balanced dataset, as shown in Figure
Dataset classes: (a) bacterial blight, (b) leaf miner, (c) spider mite, and (d) healthy.
The data acquisition system in this research was used to generate clear, unbiased, and simplified digital images of cotton leaves for the sample database for further analysis and processing. The aim was to provide the digitizing system with uniform lighting or balanced illumination. The images, captured using a smartphone camera and a digital camera, were then transferred to a computer, displayed on a screen, and stored on the hard disk in PNG format as digital color images.
Feeding preprocessed images into a network is the first and most basic task in any image processing project. Common image preprocessing tasks are vectorization, normalization, image resizing, and image augmentation. In this research, these tasks were carried out with the OpenCV library in Python before further deep learning processing [
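The four preprocessing tasks named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the study used OpenCV on color images, whereas this example assumes a tiny grayscale image stored as nested lists, and the function names (`resize_nearest`, `normalize`, `flip_horizontal`, `vectorize`) are hypothetical.

```python
from typing import List

Image = List[List[int]]  # assumed: grayscale image as rows of 0-255 pixel values


def resize_nearest(image: Image, new_h: int, new_w: int) -> Image:
    """Resize by nearest-neighbour sampling of the source pixels."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def normalize(image: Image) -> List[List[float]]:
    """Scale 8-bit pixel values into the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def flip_horizontal(image: list) -> list:
    """A simple augmentation: mirror the image left to right."""
    return [row[::-1] for row in image]


def vectorize(image: list) -> list:
    """Flatten a 2-D image into a 1-D feature vector."""
    return [pixel for row in image for pixel in row]


# Made-up 4x4 image used only to exercise the functions.
raw = [
    [0, 64, 128, 255],
    [10, 20, 30, 40],
    [255, 255, 0, 0],
    [5, 15, 25, 35],
]
small = resize_nearest(raw, 2, 2)    # image resizing
scaled = normalize(small)            # normalization
augmented = flip_horizontal(scaled)  # augmentation
vector = vectorize(scaled)           # vectorization
```

In practice the same steps would be applied per color channel and at the input resolution expected by the network.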
Deep learning overcomes shortcomings of machine learning feature extraction, such as extracting features manually, by using a robust technique called a CNN [
The dataset partitioning technique used is K-fold cross-validation.
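The abstract and the results both refer to K-fold cross-validation with 10 folds on a balanced dataset of 2400 images (600 per class). The sketch below shows how such a split could be generated. Stratifying the folds by class, the seed, the class identifiers, and the function name `stratified_k_fold` are assumptions made for illustration; the paper does not say how the folds were built.

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# Class identifiers are illustrative spellings of the four classes in the paper.
CLASSES = ["bacterial_blight", "healthy", "leaf_miner", "spider_mite"]


def stratified_k_fold(
    labels: List[str], k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k folds."""
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold keeps the class balance.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(
            index for j, fold in enumerate(folds) if j != i for index in fold
        )
        yield train, validation


# 600 labels per class, mirroring the balanced dataset described in the text.
labels = [name for name in CLASSES for _ in range(600)]
splits = list(stratified_k_fold(labels, k=10))
```

With these sizes each fold holds out 240 images (60 per class) for validation and trains on the remaining 2160, and every image is used for validation exactly once.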
Two image capturing devices, a smartphone and a digital camera, were used to collect cotton leaf images for this research. The proposed model was implemented using Python version 3.7.3. The model was trained with the deep learning package Keras (version 2.2.4-tf) with a TensorFlow backend; TensorFlow version 1.14.0 was recommended for the proposed system. To evaluate the performance, many experimental setups were conducted with the help of a graphical user interface built with Tkinter. In terms of hardware, training and testing were carried out on a CPU instead of a GPU.
To evaluate the performance of the system, the researchers used various techniques at different stages, namely during development and at the end. First, the researchers evaluated the performance of the prototype on the test dataset using the confusion matrix and four evaluation metrics derived from it: F1-score, precision, recall, and accuracy. Second, for subjective evaluation, the researchers used a questionnaire in which domain experts rated the performance of the prototype, as shown in Figures
Cotton disease and pest identification prototype identified as bacteria.
Cotton disease and pest identification prototype identified as spider mite.
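The objective evaluation described above relies on a confusion matrix and the precision, recall, F1-score, and accuracy derived from it. A minimal sketch of those computations follows. The labels and predictions are made-up toy values, not results from the study, and the function names are hypothetical.

```python
from typing import Dict, List


def confusion_matrix(
    y_true: List[str], y_pred: List[str], classes: List[str]
) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_report(
    matrix: List[List[int]], classes: List[str]
) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1-score for each class."""
    report = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp  # predicted as i, but wrong
        fn = sum(matrix[i]) - tp                 # truly i, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def accuracy(matrix: List[List[int]]) -> float:
    """Share of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


classes = ["blight", "healthy", "miner", "mite"]
# Toy labels for illustration only.
y_true = ["blight", "blight", "blight", "healthy", "healthy",
          "miner", "miner", "mite", "mite", "mite"]
y_pred = ["blight", "blight", "mite", "healthy", "healthy",
          "miner", "healthy", "mite", "mite", "blight"]

matrix = confusion_matrix(y_true, y_pred, classes)
report = per_class_report(matrix, classes)
overall_accuracy = accuracy(matrix)
```

Reporting per-class precision and recall alongside accuracy makes it visible when one class, for example a pest that resembles a healthy leaf, is confused more often than the others.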
The first task in designing this model is image acquisition from the field with a digital camera and a smartphone. Then, image preprocessing techniques are applied to prepare the acquired images for further analysis. After this, the preprocessed images are fed into the CNN algorithm for feature extraction with the neural network. The features best suited to represent the image are then extracted using an image analysis technique. Based on the extracted features, the training and testing data used for identification are prepared. Finally, the trained knowledge base classifies a new image into its disease or pest class, as shown in Figure
Cotton leaf diseases and pests recognition model process.
A CNN architecture consists of two broad sections: feature learning and classification. In general, the cotton images are fed into an input layer, and the network ends with an output layer. The hidden part consists of different layers, as shown in Figure
Developed CNN architecture for training.
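The layer sizes of the developed architecture are not given in the text, so the following sketch only illustrates how feature maps shrink through the feature learning section before reaching the classification section. The input size (128 x 128 x 3), the kernel sizes, and the filter counts are hypothetical; the size formulas are the standard ones for unpadded convolution and pooling.

```python
from typing import List, Optional, Tuple


def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size: int, pool: int, stride: Optional[int] = None) -> int:
    """Spatial size after pooling; the stride defaults to the pool size."""
    stride = pool if stride is None else stride
    return (size - pool) // stride + 1


# Hypothetical stack: (layer kind, window size, number of filters).
layers = [("conv", 3, 32), ("pool", 2, None), ("conv", 3, 64), ("pool", 2, None)]

size, channels = 128, 3  # assumed RGB input of 128 x 128 pixels
shapes: List[Tuple[int, int, int]] = []
for kind, window, filters in layers:
    if kind == "conv":
        size = conv_output_size(size, window)
        channels = filters
    else:
        size = pool_output_size(size, window)
    shapes.append((size, size, channels))

# Length of the vector handed to the dense classification layers.
flattened = size * size * channels
```

Tracking these shapes by hand is a quick way to check that a chosen input resolution and layer stack leave a reasonably sized vector for the dense layers.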
During experimentation, different experiments were conducted to obtain an efficient model by customizing various parameters, which gave different results. Those parameters are dataset color, number of epochs, augmentation, optimizer, and dropout. According to Serawork Wallelign [
For this new model, the researchers trained with three different numbers of epochs: 50, 100, and 150. The model achieved the best performance at 100 epochs, as shown in Figure
Color and augmentation parameters’ experiment result.
Effects of number of epochs and regularization methods during experiment.
Training accuracy and validation accuracy of the model.
The researchers observed the highest training accuracy, 0.990, at the 100th epoch. The graphs show the training and validation success rates that the network achieved during the process, as shown in Figure
Training loss and validation loss of the model.
To analyze the performance of the model, the final result was obtained using K-fold cross-validation with 10 folds. The RGB image dataset with augmentation improved the model’s performance by 15%. The researchers used a transfer learning CNN model, and the grayscale dataset achieved 98.6% accuracy [
For the prototype, the researchers followed the conventions of the digital forensic investigation process, namely ISO and IEC, to assess the quality of the prototype in terms of efficiency, effectiveness, fault tolerance, helpfulness, learnability, and control. For the time being, the prototype is tested as a desktop application built with Tkinter, a graphical user interface toolkit for the Python programming language.
For the questionnaire, evaluators rated each of five closed-ended questions as extremely satisfied, very satisfied, somewhat satisfied, not so satisfied, or not at all satisfied, and answered one open-ended question. The questionnaires were distributed to Ethiopian cotton farm experts, as shown in Figure
Prototype evaluation questionnaire distributed to Ethiopian cotton farm experts.
Model performance evaluation result.
Questions | Respondents | Extremely satisfied (%) | Very satisfied (%) | Somewhat satisfied (%) | Not so satisfied (%) | Not at all satisfied (%) | Total |
---|---|---|---|---|---|---|---|
Q1 | 15 | 93 | 7 | 0 | 0 | 0 | 100% |
Q2 | 15 | 60 | 20 | 20 | 0 | 0 | 100% |
Q3 | 6 | 33 | 33 | 34 | 0 | 0 | 100% |
Q4 | 4 | 0 | 75 | 25 | 0 | 0 | 100% |
Q5 | 5 | 0 | 60 | 40 | 0 | 0 | 100% |
In the overall evaluation of the cotton leaf disease and pest identification prototype, evaluators selected the extremely satisfied option 60% of the time across all questions and 20% for the very satisfied and somewhat satisfied options. For the open-ended question, almost all experts expressed constructive thoughts on the overall performance of the system and prototype. This result shows that the cotton leaf disease and pest prototype performed well in problem-solving ability and in making correct predictions, as shown in Figure
Graphical representation of model performance evaluation result.
This deep learning-based model was implemented using Python and the Keras package, and Jupyter was used as the development environment. Different experiments were conducted in this study to obtain an efficient model by customizing various parameters such as dataset color, number of epochs, augmentation, and regularization methods. The RGB image dataset with augmentation improved the model’s performance by 15%. The number of epochs and the regularization methods were also significant, boosting model performance by 10% and 5.2%, respectively. The proposed prototype achieved an accuracy of 96.4% in identifying each class of leaf disease and pest in cotton plants. Such automated systems can assist farmers and experts in identifying cotton diseases and pests from visual leaf symptoms. The results show that the designed system helps farmers reduce the complexity, time, and cost of diagnosing leaf diseases.
The main challenge in developing a deep learning object detection model was collecting a large number of high-quality training images with different shapes, sizes, backgrounds, light intensities, and orientations in each class. Future researchers should therefore try to address such challenges in their work and should not only identify diseases and pests but also suggest remedies. Ethiopia launched a satellite in 2019, which gives future researchers an opportunity to access high-resolution satellite images for training high-performance deep learning models.
During data collection, 3117 images were collected from those varied environments.
The authors declare that they have no conflicts of interest.
This research was funded by Arba Minch University, affiliated with the Ministry of Science and Higher Education, Ethiopia.