Grey Blight Disease Detection on Tea Leaves Using Improved Deep Convolutional Neural Network

We propose a novel deep convolutional neural network (DCNN) using inverted residuals and linear bottleneck layers for diagnosing grey blight disease on tea leaves. The proposed DCNN consists of three bottleneck blocks, two pairs of convolutional (Conv) layers, and three dense layers. The bottleneck blocks contain depthwise, standard, and linear convolution layers. A digital single-lens reflex camera was used to collect 1320 images of tea leaves from the North Bengal region of India to prepare the tea grey blight disease dataset. The non-grey-blight tea leaf images in the dataset were categorized into two subclasses: healthy leaves and leaves with other diseases. Image transformation techniques such as principal component analysis (PCA) color augmentation, random rotations, random shifts, random flips, resizing, and rescaling were used to generate augmented images of tea leaves. The augmentation techniques enlarged the dataset from 1320 images to 5280 images. The proposed DCNN model was trained and validated on 5016 images of healthy, grey blight infected, and other diseased tea leaves. The classification performance of the proposed and existing state-of-the-art techniques was tested using 264 tea leaf images. The classification accuracy, precision, recall, F measure, and misclassification rate of the proposed DCNN are 98.99%, 98.51%, 98.48%, 98.49%, and 1.01%, respectively, on the test data. The test results show that the proposed DCNN model performs better than the existing techniques for tea grey blight disease detection.


Introduction
Tea farming is an ever-growing industrial sector with increasing production demand, as tea is the second most consumed beverage worldwide, next to water. India is the second largest producer of tea and produced around 1250 million kg in the year 2020 [1]. Tea foliar diseases are of great concern as they directly affect the harvest, and fungal diseases in particular have a huge impact on the quality and quantity of the produce. In particular, grey blight caused by Pestalotiopsis theae is one of the most widely reported diseases from all major tea-growing countries of the world [2]. Mechanical damage to plants incurred by the use of farming equipment initiates infection and disease development. The fungus attacks the maintenance leaves of the tea plant, which ensure nourishment to the tender foliage, indirectly resulting in huge crop loss [3]. The grey blight disease symptoms appear in the middle part of the leaf as brown concentric spots, which later turn grey with brown margins and spread throughout the whole leaf. Detection and diagnosis of symptoms are crucial to controlling the spread of the disease towards sustainable production. Tea cultivation regions are usually large and include mountainous terrains that are difficult to investigate on a routine basis. In tea plantations, conventional methods of disease detection have become ineffective as they rely on intensive manpower and highly specific instruments [4]. Moreover, incorrect diagnosis of the disease leads to inappropriate use of fungicides, adding to production costs and environmental pollution. These challenges in diagnosing grey blight infection on tea leaves using conventional techniques motivated the development of an automatic diagnosis technique.
Computer vision and machine learning techniques have recently been employed in a variety of crops to accurately diagnose diseases and pest attacks based on characteristic symptoms [5][6][7]. This approach relies on the extraction of features from leaf images and their identification and classification using an artificial neural network (ANN) [8]. Deep learning, an advanced machine learning technique that uses deep convolutional neural networks (DCNN) for crop disease identification, is gaining increased application due to its automatic feature extraction ability, accuracy, and robustness in detection [9][10][11]. For tea disease detection and classification, a few machine learning-based approaches have been employed with considerable performance [5,[12][13][14][15].
The forthcoming sections of this article are organized as follows. Section 2 discusses the existing state-of-the-art techniques for leaf disease detection and highlights the significance of the research. Section 3 elaborates on the preparation of the tea grey blight disease detection dataset and the development of the proposed DCNN model. In Section 4, the performance of the proposed DCNN model on tea grey blight disease detection is reviewed and compared with the performance of advanced plant leaf disease detection models and existing transfer learning techniques. Finally, concluding remarks and future directions of the research are discussed in Section 5.

Related Works
Grey blight, a fungal disease caused by Pestalotiopsis-like species, is a widespread disease affecting tea crops in many tea-growing countries, including India, resulting in huge losses in tea production. The disease typically affects tea leaves from June to September in India. Initially, small brownish spots on the upper surface of the leaves enlarge slowly [16]. These spots may be of various sizes and shapes with an irregular outline. Later, the spot looks dark brown with a greyish appearance at the center and is surrounded by narrow concentric zonation at the leaf margin. The host range of the grey blight pathogen includes guava, strawberry, oil palm, kiwi fruit, mango, pine, and avocado. The authors in [17] found that this disease is caused by Pestalotiopsis, Neopestalotiopsis, and Pseudopestalotiopsis using multilocus DNA sequence-based identification.
This disease has caused around 17-46% crop loss in India and 10-20% yield loss in Japan [3,18]. The grey blight disease has reduced tea quality and production by up to 50% in the major tea-growing regions of China and Taiwan [19,20]. The disease is spreading extensively in the tea gardens of North Bengal in India and in other countries such as Korea, China, Kenya, Japan, and Sri Lanka. Advanced artificial intelligence techniques such as machine learning and deep learning have played a significant role in disease detection for various plant leaves, and DCNNs are the most successful plant disease detection techniques using leaf images [21,26]. Table 1 compares the numerous machine learning-based detection approaches proposed by different articles.
The extensive literature survey shows the significance of DCNN models in tea leaf disease detection. The survey also identified the following challenges faced by the existing techniques for tea grey blight disease detection. The first challenge concerns the visual symptoms: the brown blight, white blight, and bud blight symptoms are similar to those of grey blight in tea leaves [13,[27][28][29], which leads disease detection models to misclassify the diseases. Second, only a small number of research studies have considered diagnosing grey blight disease in tea crops [30,31], even though grey blight is one of the most common yield-restricting diseases of tea crops in India. Finally, the existing techniques have not achieved significant performance in grey blight disease detection; at best, a tea leaf disease detection model achieved 94.45% classification accuracy [30]. These observations show the significance of proposing a novel approach for diagnosing grey blight disease with better performance than the existing works. The approach should also distinguish grey blight symptoms from brown blight, white blight, and bud blight symptoms. The development and training process of the proposed grey blight disease detection model is discussed in Section 3.

Materials and Methods
This section describes the proposed DCNN model and the dataset for grey blight disease detection. Section 3.1 explains the collection and preparation of the grey blight disease dataset. The construction and training process of the proposed DCNN model is explained in Section 3.2.

Data Collection and Preparation.
In the present study, tea gardens located in North Bengal, India, were visited during 2020-2021 to examine the disease pattern of grey blight. Almost all of the 27 tea gardens visited were infected with grey blight disease with different symptoms. Images of the symptoms were taken using a Canon digital single-lens reflex camera of 500 pixels. A total of 1320 images of healthy, grey blight diseased, and other diseased leaves were captured to prepare the tea grey blight disease dataset. Figure 1 shows sample leaf images from the captured data.
Data augmentation techniques, such as principal component analysis (PCA) color augmentation, random rotations, random shifts, random flips, resizing, and rescaling, were used to create 3960 additional images in the tea grey blight disease dataset. PCA is an unsupervised machine learning technique generally used for dimensionality reduction; recently, PCA-based techniques have been used for augmentation in various image classification applications [32]. The PCA color-augmented versions of sample images from the tea grey blight disease dataset are shown in Figure 2.
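The PCA color augmentation described above can be sketched as follows. This is a minimal "fancy PCA" sketch in the style popularized by AlexNet, not the authors' implementation; the function name `pca_color_augment` and the jitter scale `alpha_std` are illustrative assumptions.

```python
import numpy as np

def pca_color_augment(image, alpha_std=0.1, rng=None):
    """Fancy-PCA color augmentation sketch (assumed parameters).

    `image` is an H x W x 3 float array in [0, 1]. A random multiple of the
    principal components of the RGB pixel distribution is added to every
    pixel, shifting the image color along its natural axes of variation.
    """
    rng = np.random.default_rng() if rng is None else rng
    pixels = image.reshape(-1, 3)                     # flatten to N x 3 RGB rows
    centered = pixels - pixels.mean(axis=0)           # zero-mean channels
    cov = np.cov(centered, rowvar=False)              # 3 x 3 RGB covariance
    eigvals, eigvecs = np.linalg.eigh(cov)            # principal components
    alphas = rng.normal(0.0, alpha_std, size=3)       # random component weights
    shift = eigvecs @ (alphas * eigvals)              # per-channel color shift
    return np.clip(pixels + shift, 0.0, 1.0).reshape(image.shape)
```

Because the shift is constant across pixels, the augmentation changes overall color balance while preserving edges and lesion shapes, which is why it suits disease-symptom imagery.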
The augmented images were used to increase the amount of data and to balance the sample count in each class of the tea grey blight dataset. Augmented data were added only to the training and validation datasets. The tea grey blight disease dataset was split into training, validation, and testing datasets, as illustrated in Table 2.
The training and validation datasets were used for the training and validation of the proposed DCNN and the standard transfer learning techniques. Each class in the training data consists of 1584 tea leaf images. The test data were used to test the performance of the proposed DCNN and the existing transfer learning techniques; the test dataset contains only originally collected data. The subsequent subsection explains the layered architecture and training process of the proposed DCNN model for the tea grey blight detection task.
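The counts stated in the text can be checked with simple arithmetic. The snippet below is a sketch of the split implied by the description (1320 originals augmented to 5280 images, a 264-image test set of original images only, and 1584 balanced training images per class), not code from the study.

```python
# Dataset split arithmetic implied by the text and Table 2 (a sketch,
# not the authors' code).
ORIGINAL_IMAGES = 1320
AUGMENTED_TOTAL = 4 * ORIGINAL_IMAGES        # 1320 -> 5280 after augmentation
TEST_IMAGES = 264                            # originally collected images only
CLASSES = 3                                  # healthy, grey blight, other diseases
TRAIN_PER_CLASS = 1584                       # balanced training classes

train_images = TRAIN_PER_CLASS * CLASSES                           # training set
validation_images = AUGMENTED_TOTAL - TEST_IMAGES - train_images   # remainder for validation
train_val_images = train_images + validation_images                # used for training + validation
```

The arithmetic is consistent with the abstract: 4752 training plus 264 validation images give the 5016 images used for training and validation, leaving 264 original images for testing.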

Classification Model Design and Training.
The proposed DCNN model consists of a sequence of 13 layers. Its design was inspired by the architectures of MobileNet and VGG19Net [33,34]: it uses the inverted residual connections and bottleneck layers from MobileNet, and the convolutional layer pairs and downsampling process from VGG19Net. Figure 3 shows the layered architecture of the proposed DCNN model.
The first layer in the proposed DCNN model, named the input layer, resizes the input image to 224 * 224 * 3 pixels. The resized image is forwarded as input to a pair of two-dimensional convolutional (Conv2D) layers with a filter size of 128, a kernel size of 3 * 3 with a stride of 1 * 1, and a ReLU activation function. A max-pooling layer was introduced as the fourth layer of the proposed DCNN model to downsample the convolutional layer output; it uses a 2 * 2 kernel with 1 * 1 strides. The downsampled data are forwarded to a sequence of three bottleneck blocks. Figure 4 illustrates the internal layers of the bottleneck block.

Computational Intelligence and Neuroscience
Each bottleneck block consists of four internal layers: convolutional (Conv) layers, depthwise convolutional (Dwise Conv) layers, linear convolutional layers, and adder (Add) layers. The Conv layer performs the convolutional operation to extract feature information from the output of the previous layer. The Dwise Conv layers perform the convolution operation on the output of the Conv layer with a single filter per channel, and thus require far less computation than traditional Conv layers. The linear Conv layer, introduced after the Dwise Conv layer in the bottleneck block of the proposed DCNN model, implements the convolution operation with a linear activation function. The Add layer combines the output of the linear Conv layer with the input of the current bottleneck block. All the Conv layers in the bottleneck block use a kernel size of 3 * 3, a stride of 1 * 1, and the ReLU activation function.
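The computational saving of the depthwise convolution can be illustrated by counting weights. The helper names below are hypothetical; the 128-filter, 3 * 3 configuration is taken from the layer description above, and the comparison is a general property of depthwise separable convolutions rather than a figure reported by the authors.

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a standard convolution: out_ch filters of size k x k x in_ch."""
    return k * k * in_ch * out_ch

def depthwise_separable_params(in_ch, out_ch, k):
    """Weight count of a depthwise convolution (one k x k filter per input
    channel) followed by a 1 x 1 pointwise projection to out_ch channels."""
    return k * k * in_ch + in_ch * out_ch

# For the 128-channel, 3 x 3 configuration described in the text:
standard = conv_params(128, 128, 3)                   # 147456 weights
separable = depthwise_separable_params(128, 128, 3)   # 17536 weights
```

For this configuration the depthwise separable form needs roughly 8x fewer weights (and proportionally fewer multiply-accumulates), which is the motivation for using Dwise Conv layers inside the bottleneck blocks.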
The output of bottleneck block 3 is forwarded to a second pair of Conv layers, which perform the convolution operation with a filter size of 64, a kernel size of 3 * 3, and a stride of 1 * 1. A max-pooling layer with a 2 * 2 filter and a 1 * 1 stride was introduced after this pair of Conv layers to perform downsampling. The downsampled data are forwarded to a sequence of three fully connected (Dense) layers with sizes of 1024, 512, and 3, respectively. The final dense layer classifies the input data into three classes using the softmax activation function. The Bayes grid search technique was used to optimize the hyperparameter values for the proposed DCNN model; it identified an optimized batch size of 32, categorical cross-entropy as the loss function, Adam as the optimizer, and a learning rate of 0.001. The proposed DCNN was trained on the grey blight dataset with the optimized hyperparameters for up to 1000 epochs. The training progress of the proposed DCNN model is shown in Figure 5. The model achieved a training accuracy of 99.53% and a training loss of 0.042 on the final training epoch. The model performance was also validated using the validation dataset; Figure 6 illustrates the epoch-wise validation performance of the proposed DCNN model on the grey blight disease dataset. The model achieved a validation accuracy of 99.27% and a validation loss of 0.096 on the final training epoch. The trained DCNN model architecture and weights were stored for the testing and deployment processes. Section 4 discusses the test performance of the proposed DCNN model on the test dataset and compares it with recent transfer learning techniques.
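The 13-layer sequence and the hyperparameters described above can be summarized declaratively. This is a configuration sketch assembled from the description in the text, not the authors' implementation; each bottleneck entry stands for the Conv/Dwise Conv/linear Conv/Add sub-layers detailed earlier.

```python
# Declarative sketch of the proposed 13-layer DCNN, assembled from the
# description in the text (not the authors' code).
LAYERS = [
    {"type": "input", "resize": (224, 224, 3)},
    {"type": "conv2d", "filters": 128, "kernel": (3, 3), "strides": (1, 1), "activation": "relu"},
    {"type": "conv2d", "filters": 128, "kernel": (3, 3), "strides": (1, 1), "activation": "relu"},
    {"type": "maxpool", "pool": (2, 2), "strides": (1, 1)},
    {"type": "bottleneck"},   # Conv + Dwise Conv + linear Conv + Add
    {"type": "bottleneck"},
    {"type": "bottleneck"},
    {"type": "conv2d", "filters": 64, "kernel": (3, 3), "strides": (1, 1), "activation": "relu"},
    {"type": "conv2d", "filters": 64, "kernel": (3, 3), "strides": (1, 1), "activation": "relu"},
    {"type": "maxpool", "pool": (2, 2), "strides": (1, 1)},
    {"type": "dense", "units": 1024},
    {"type": "dense", "units": 512},
    {"type": "dense", "units": 3, "activation": "softmax"},
]

# Hyperparameters reported as the Bayes grid search result.
HYPERPARAMS = {
    "batch_size": 32,
    "loss": "categorical_crossentropy",
    "optimizer": "adam",
    "learning_rate": 0.001,
    "max_epochs": 1000,
}
```

The list makes the layer count explicit: one input layer, two Conv2D pairs, two max-pooling layers, three bottleneck blocks, and three dense layers together form the 13 layers.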

Results and Discussion
This section compares the performance of the proposed DCNN model with recent transfer learning techniques for grey blight disease detection, namely AlexNet, DenseNet201, InceptionV3Net, MobileNetV2, NASNet Large, ResNet152, VGG19Net, and XceptionNet. A total of 264 tea leaf images were used to test the performance of the proposed and existing models. Figure 7 shows the confusion matrix of the proposed DCNN on the test dataset.
The test outcomes of the proposed DCNN as True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts are shown in Table 3. The confusion matrices and scores of the existing transfer learning techniques are included in the Supplementary Materials (available here).
Classification accuracy, precision, recall, F measure, misclassification rate, and the receiver operating characteristic (ROC) curve were used as metrics for estimating the performance of the grey blight disease detection models on tea leaves. The TP, TN, FP, and FN counts were used to calculate the performance metric scores. Equations (1)-(5) show the standard formulas used to calculate the accuracy, precision, recall, F measure, and misclassification rate [35]:

Accuracy = (TP + TN) / (TP + TN + FP + FN), (1)

Precision = TP / (TP + FP), (2)

Recall = TP / (TP + FN), (3)

F measure = (2 * Precision * Recall) / (Precision + Recall), (4)

Misclassification Rate = (FP + FN) / (FP + FN + TP + TN). (5)

Table 4 illustrates the class-wise performance metric scores and the weighted average performance scores of the proposed DCNN model on the test data.
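Equations (1)-(5) can be computed directly from the confusion matrix counts. The function below is a straightforward transcription of the standard formulas; the helper name `confusion_metrics` is illustrative.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, F measure, and misclassification
    rate (equations (1)-(5)) from confusion matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total                # eq. (1)
    precision = tp / (tp + fp)                  # eq. (2)
    recall = tp / (tp + fn)                     # eq. (3)
    f_measure = 2 * precision * recall / (precision + recall)  # eq. (4)
    misclassification = (fp + fn) / total       # eq. (5)
    return accuracy, precision, recall, f_measure, misclassification
```

Note that the misclassification rate is the complement of accuracy (they always sum to 1), which matches the reported 98.99% accuracy and 1.01% misclassification rate.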
The average performance metric scores of the existing transfer learning techniques are given in the Supplementary Materials and were compared with those of the proposed model. First, the average classification accuracy of the proposed DCNN model was compared with the transfer learning techniques; the result is illustrated in Figure 8.
The classification accuracy comparison shows that the proposed DCNN model achieved better accuracy than the AlexNet, DenseNet201, InceptionV3Net, MobileNetV2, NASNet Large, ResNet152, VGG19Net, and XceptionNet models. The proposed DCNN model achieved a classification accuracy of 98.99% on test data, which is 4.27% higher than the second-best performing model, DenseNet201. The proposed DCNN model was also trained on the original and augmented datasets separately, and the test accuracies of the two resulting models are compared in Figure 9. The comparison shows that the augmented dataset increased the performance of the proposed DCNN model by 16.27% over the model trained on the original dataset alone.
Similarly, the average precision of the proposed DCNN model is compared with the transfer learning techniques in Figure 10. The proposed DCNN model achieved an average precision of 98.51% on the grey blight leaf disease dataset, which is 6.43% higher than the DenseNet201 model and much higher than the other transfer learning techniques.
In addition, the proposed DCNN model achieved a recall score of 98.48% on the grey blight test dataset. The comparison of the recall scores of the proposed DCNN and the existing techniques is shown in Figure 11. The comparison illustrates that the proposed DCNN model achieved a better recall score than the AlexNet, DenseNet201, InceptionV3Net, MobileNetV2, NASNet Large, ResNet152, VGG19Net, and XceptionNet models.
Furthermore, the F measures of the proposed DCNN and the existing models on the grey blight dataset are illustrated in Figure 12. The proposed DCNN model achieved an F measure of 98.48% on test data, better than transfer learning techniques such as AlexNet, DenseNet201, InceptionV3Net, MobileNetV2, NASNet Large, ResNet152, VGG19Net, and XceptionNet, and 6.44% higher than the DenseNet201 model.
The misclassification rate is also an important performance metric, giving the percentage of samples incorrectly classified by a model. Figure 13 shows the misclassification rates of the proposed DCNN and the existing models on test data. The proposed DCNN model reached a misclassification rate of 1.01% on the grey blight dataset, much lower than the other techniques.
The receiver operating characteristic (ROC) curve represents the performance of a classification model over all classification thresholds [36]. The ROC curves of the proposed and transfer learning techniques for the individual classes in the dataset are shown in Figure 14. The area under the ROC curve of the proposed DCNN on the grey blight disease class was 97%, higher than that of the transfer learning techniques.
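A one-vs-rest ROC curve of the kind plotted per class can be computed from predicted scores and binary labels as sketched below. Function names and the toy data in the usage note are illustrative, not from the study.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs swept over every distinct score threshold.

    `labels` are binary (1 = the class of interest, 0 = rest); at each
    threshold a sample is predicted positive when its score >= threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    points.append((1.0, 1.0))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

For a perfectly separating classifier (e.g. scores [0.9, 0.8, 0.2, 0.1] with labels [1, 1, 0, 0]) the curve passes through (0, 1) and the area is 1.0; a value such as the 97% reported for the grey blight class means the curve comes close to that corner.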
The comparison results show that the classification accuracy, precision, recall, F measure, and ROC of the proposed DCNN model were superior to those of recent transfer learning techniques such as AlexNet, DenseNet201, InceptionV3Net, MobileNetV2, NASNet Large, ResNet152, VGG19Net, and XceptionNet.
Similarly, the classification performance of the proposed DCNN model was compared with existing state-of-the-art tea leaf disease detection models such as Improved_Deep_CNN [37], AX-RetinaNet [31], and MergeModel [38]. The existing models were trained and tested on the tea grey blight disease dataset. Figure 15 shows the performance comparison of the proposed and existing models on grey blight disease detection in tea leaves.
The comparison result illustrates that the proposed DCNN model performed better than the existing tea leaf disease detection models.

Conclusions
Grey blight is one of the most yield-affecting diseases of tea crops worldwide. Tea plantation regions are generally large areas with mountainous terrain, making it difficult to diagnose the disease across the entire area manually. This article proposed a novel deep convolutional neural network (DCNN) for the automatic diagnosis of grey blight disease in tea crops. Tea leaf data were collected from the North Bengal region of India for the training and testing of the DCNN model. Data augmentation techniques, namely principal component analysis (PCA) color augmentation, random rotations, random shifts, random flips, resizing, and rescaling, were used to increase the number of samples in the training dataset. The dataset consists of three classes: grey blight, healthy, and other diseases. A total of 4752 images were used for training and 264 images for validation of the proposed DCNN model.

Data Availability
The data used to support the findings of this study can be obtained from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Supplementary Materials
The experimental results of the transfer learning and existing tea leaf disease detection models are illustrated in the supplementary file. Figure 1 shows the confusion matrix of the DenseNet201 model for tea grey blight disease detection. Table 1 illustrates the TP, TN, FP, and FN of the DenseNet201 model, and Table 2 represents its class-wise and average accuracy, precision, recall, F measure, and misclassification rate. Figure 2, Table 3, and Table 4 illustrate the confusion matrix, confusion matrix scores, and performance of the ResNet152 model, respectively. The confusion matrix of the InceptionV3Net is shown in Figure 3; Table 5 shows its TP, TN, FP, and FN scores, and Table 6 illustrates its performance on tea grey blight disease detection. The confusion matrix, confusion matrix scores, and class-wise performance of the NASNet Large model are shown in Figure 4, Table 7, and Table 8, respectively. Figure 5 depicts the MobileNetV2 confusion matrix; its scores are shown in Table 9, and Table 10 shows how effective MobileNetV2 is at detecting tea grey blight. Figure 6, Table 11, and Table 12 illustrate the confusion matrix, confusion matrix scores, and performance of the VGG19Net model, respectively. The confusion matrix of the XceptionNet for grey blight detection is shown in Figure 7; its scores are represented in Table 13, and its classification performance is illustrated in Table 14. The confusion matrix, confusion matrix scores, and class-wise performance of the AlexNet model are shown in Figure 8, Table 15, and Table 16, respectively. Moreover, the confusion matrices of the existing tea leaf disease detection models, such as AX-RetinaNet, MergeModel, and Improved_Deep_CNN, are shown in Figures 9, 10, and 11, respectively. Tables 17 and 18 illustrate the confusion matrix scores and class-wise performance of the AX-RetinaNet.
Table 19 shows the confusion matrix scores of the MergeModel, and its class-wise performance is shown in Table 20. Tables 21 and 22 illustrate the confusion matrix scores and class-wise performance of the Improved_Deep_CNN, respectively.