The Potential Application of Innovative Methods in Neural Networks for Surface Crack Recognition of Unshelled Hazelnut



Introduction
Hazelnut (Corylus avellana L.), the edible seed of the hazel tree, has been a feature of the human diet since prehistory. Hazelnut is mainly distributed on the coasts of the Black Sea region of Turkey, in southern Europe, and in some areas of the USA (Oregon and Washington). It is also cultivated in other countries, such as New Zealand, China, Azerbaijan, Chile, and Iran [1,2].
The appearance of fruits and nuts is a primary criterion in the purchasing decisions of consumers, and it plays an important role in the design of agricultural machinery, equipment, and facilities for proper conveying, separation, and cracking processes [3].
In some countries, especially in the Middle East region, most hazelnuts are cracked using semi-industrial or handcrafted machines and marketed as open-shell [4][5][6]. Separating the closed-shell or cracked hazelnuts from the open-shell ones is currently based on visual inspection carried out by workers. However, manual inspection is potentially unhealthy, time-consuming, expensive, and inconsistent.
Shelling and cracking operations are the most important processes in hazelnut processing. These operations lead to broken and damaged kernels due to the mechanical forces applied to the nut. Damaging the kernels during the shelling process dramatically reduces the market value of hazelnuts [7]. To increase the shelling performance, hazelnuts were separated into different size groups by using a cylindrical or a vibrational sizer [8].
In the process of hazelnut cracking, because of the differences in size, shell stiffness, and shape of hazelnuts, many of them do not become open-shell and leave the cracking machine in the form of cracked or closed-shell. Separating closed-shell and cracked hazelnuts from open-shell ones increases the marketability of the final product. Also, during postharvest drying, hazelnuts experience shell cracks, which could accelerate quality deterioration and microbial contamination of hazelnuts during storage. Separating these cracked hazelnuts could reduce the waste of products in subsequent processing operations [9].
Most of the studies on hazelnut have focused on the shape or quality characteristics of the closed-shell nut and kernel, including dimension, shape, color, defects, mechanical properties, and cultivar classification [10][11][12][13][14][15]. To detect underdeveloped hazelnuts among fully developed ones, an acoustic sorter system was developed to separate empty hazelnuts using features extracted from the wavelet transform [16] and a combined feature vector of length 78 [17]. Menesatti et al. [15] demonstrated the potential of modern multivariate techniques using shape-based methods to discriminate between four traditional Italian hazelnut cultivars. Defective hazelnut kernels were identified automatically with multivariate analysis methods in RGB images [10]. Solak and Altinişik [18] used image processing and the k-means clustering technique to classify hazelnut varieties.
Although several studies have been carried out to identify empty hazelnuts or classify hazelnut varieties, little research has been done on separating closed-shell or cracked hazelnuts from open-shell ones. Thus, this study is aimed at accurately identifying and classifying hazelnuts as open-shell or as cracked or closed-shell using proposed and common pretrained DCNN models.

Materials and Methods
2.1. Hazelnut Samples. Hazelnut samples were collected from the local markets in Iran (Rahimabad district of Gilan). In total, 16 kg of hazelnuts was purchased from three different dealers whose jobs were related to the processing of this product. Samples were then mixed and divided into two classes: open-shell and cracked or closed-shell (Figure 1). Open-shell hazelnuts have a large crack on their surface, and their shell can be easily removed by hand, whereas cracked hazelnuts have a tiny crack, and removing their shell by hand is very hard. The average length, width, thickness, and geometric mean diameter of the samples were
2.2. Deep Convolutional Neural Networks. Deep learning was recently used in many research efforts offering modern techniques in image processing and data analysis, with promising results and considerable potential [32]. In this study, a powerful deep learning technique, namely, a deep convolutional neural network (DCNN), was used to recognize and classify hazelnuts. DCNNs have two major advantages over traditional shallow neural networks, namely sparse interaction and parameter sharing, which make the DCNN a superior classifier to other machine learning classifiers [33]. To minimize interference and achieve high accuracy during training and validation, some techniques, such as image augmentation, transfer learning, batch normalization, and dropout, were used.
Augmentation is also a good way to enhance the model's performance. In this study, data augmentation was applied only to the training set and included horizontal flipping, vertical flipping, and contrast adjustment [33,34]. The augmentations of hazelnut images are shown in Figure 3.
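The three augmentations named above can be illustrated with a minimal, framework-independent sketch. It is not the code used in this study: the grayscale nested-list image representation, the function names, and the contrast factor of 1.5 are assumptions made only for illustration.

```python
from typing import List

# Assumption: a grayscale image stored as rows of pixel values in [0, 1].
Image = List[List[float]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom by reversing the row order."""
    return [row[:] for row in img[::-1]]


def adjust_contrast(img: Image, factor: float) -> Image:
    """Scale each pixel's distance from the mean intensity by `factor`.

    factor > 1 increases contrast, factor < 1 reduces it; results are
    clipped back to the [0, 1] range.
    """
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(1.0, max(0.0, mean + factor * (p - mean))) for p in row]
        for row in img
    ]


def augment(img: Image, contrast_factor: float = 1.5) -> List[Image]:
    """Return the original image plus its three augmented variants."""
    return [
        img,
        flip_horizontal(img),
        flip_vertical(img),
        adjust_contrast(img, contrast_factor),
    ]
```

In this sketch each training image yields four samples. Validation and test images would be left untouched, in line with the statement that augmentation was applied only to the training set.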

Transfer Learning.
Transfer learning (TL) aims to provide a framework to utilize previously acquired knowledge to solve new but similar problems much more quickly and effectively [35]. Transfer learning has been shown to be effective for many applications, as it uses knowledge of labeled training data from a source domain to increase a model's performance in a target domain that has little training data [33]. This technique is effective when it is not possible to train a network from scratch because the training dataset is small or the multi-task network is complex [36][37][38]. In this study, during hazelnut classification, transfer learning was implemented by training the pretrained models VGG-19, Inception-V3, and ResNet-50 on the captured hazelnut images.

Proposed Model.
The proposed neural network model takes fixed-size 224 × 224 × 3 RGB images as input to the first convolutional layer with 16 filters, followed by several additional convolutional layers with 32 to 256 filters. Our proposed DCNN comprises convolution layers, max-pool layers, and one fully connected layer. The rectified linear unit (ReLU) is used as the activation function in the CNN layers.
The pretrained DCNN models and the proposed model were executed in Colab (Google Colaboratory), which used Python 3.6, the TensorFlow backend, and the Keras library. The hazelnut images were stored on Google Drive and loaded when the script was executed. Input, kernel, and pooling size as well as the number of convolutional layers and the activation function of the models are listed in Table 1. For all models, to minimize training time, just one hidden layer was used in the fully connected part of the network. The network was trained with RMSprop optimization, and each hidden layer was activated using the ReLU function. The output layer used softmax as the activation function, which provides a probability for each output neuron. The common hyperparameters for the models were as follows: batch size 32, 50 epochs, learning rate 0.0001, and momentum 0.9. Figure 4 shows the architecture of the modified model that was suggested in this research. It consists of three pooling layers and three convolutional layers, with 16, 32, and 64 filters, respectively.
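As a rough check on the size of the three-layer architecture described above, the following sketch walks through the feature-map shapes and counts the trainable parameters. Several settings are assumptions, because Table 1 is not reproduced here: 3 × 3 kernels, stride 1 with "same" padding, 2 × 2 max pooling after every convolution, a 64-neuron hidden layer, and two softmax outputs. The function names are hypothetical.

```python
def conv_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights plus one bias per filter for a square 2D convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus one bias per neuron for a fully connected layer."""
    return n_in * n_out + n_out


def summarize(input_size=224, in_channels=3, filters=(16, 32, 64),
              kernel=3, pool=2, hidden=64, classes=2):
    """Return (rows, total) where rows lists (layer, output shape, params).

    Assumes 'same' padding, so only pooling changes the spatial size.
    """
    size, channels, total = input_size, in_channels, 0
    rows = []
    for n_filters in filters:
        params = conv_params(kernel, channels, n_filters)
        size //= pool            # 2x2 max pooling halves width and height
        channels = n_filters
        total += params
        rows.append((f"conv-{n_filters} + pool", (size, size, channels), params))
    flat = size * size * channels
    for name, n_in, n_out in (("hidden", flat, hidden), ("softmax", hidden, classes)):
        params = dense_params(n_in, n_out)
        total += params
        rows.append((name, (n_out,), params))
    return rows, total


if __name__ == "__main__":
    rows, total = summarize()
    for name, shape, params in rows:
        print(f"{name:18s} {str(shape):16s} {params:>10,d}")
    print(f"total trainable parameters: {total:,d}")
```

Under these assumptions the network has about 3.2 million trainable parameters, most of them in the first fully connected layer, which is far fewer than the 36.4 million reported for ResNet-50 in this study.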

Statistical Analysis.
To analyze and compare the performance of the DCNN models, four metrics were used: precision, accuracy, sensitivity (recall), and F1-score [33,34,39]. The values of these measures were computed using a confusion matrix for both the training and test datasets. However, to compare the models, only the results on the test dataset were used. The confusion matrix for the two classes is given in Table 2.

Results and Discussion
In this work, the performance of the different deep-learning architectures was evaluated in the classification of hazelnut samples.To compare models, the training process was carried out using the same settings for each model.Therefore, the input shape, the size of training, validation and test datasets, batch size, the learning rate, and the optimizer were the same for VGG-19, ResNet-50, Inception-V3, and the proposed models.Because the ResNet-50 showed low performance, this model was allowed to train up to 80 epochs.
In order to investigate the models' performance, the training and validation curves were compared (Figures 5-8). The trend of the curves in these figures showed a good fit for some of the models implemented for the classification of hazelnuts, which indicates that the amount of data used for training these models was sufficient. All investigated models, especially the proposed model, had good performance in the classification of hazelnut classes, with one exception: ResNet-50 showed high fluctuation and low classification performance (<70%). The lower performance of ResNet-50 could be related to its high structural complexity and high number of parameters (36.4 million). In other words, the amount of data for training ResNet-50 may have been insufficient. Few studies in the literature have classified hazelnuts based on their cracks using deep learning models, so there is little to compare our results with. However, comparable results were reported in some similar studies, such as the detection of concrete cracks using deep fully convolutional neural networks [39] and automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning [40]. Also, Chen et al. [41] reported similar results in a study using deep transfer learning for image-based plant disease identification. In these similar studies, ResNet-50 showed low performance and had higher fluctuation during training, whereas other models such as Inception-V3 and VGG-19 had high performance similar to that obtained in this study. Figure 8 shows the accuracy and loss of the proposed model. As this figure shows, the training and validation curves follow the same trend and overlap at the end of the training process. This indicates that overfitting does not occur in the proposed model with the parameters chosen during the training process. These results show that the proposed network discriminates well between the input hazelnut images.
The number of neurons in the hidden layer is one of the most important factors that can directly affect the neural network performance. Additional hidden neurons and hidden layers can improve the accuracy of the network; on the other hand, they also increase the computation time and the chance of overfitting [42]. In this study, to reduce the complexity of the network, just one hidden layer was used. The results (Table 3) showed that all models except ResNet-50 have good accuracy in detecting the two classes. In assessing the effect of the number of neurons in the hidden layer on the accuracy of estimation, the results indicated that the performance of the models does not seem to improve by further increasing the number of neurons. Therefore, to prevent overfitting, 64 neurons were selected as the optimal number in the hidden layer.
Journal of Food Processing and Preservation
Table 4 shows the effect of the number of filters in the CNN layers on the performance of the proposed network. The results showed that the network with three convolutional layers with 16, 32, and 64 filters had higher performance than the other structures. Also, using a dropout of 0.5 in the fully connected layers had no significant effect on the network accuracy.
The evaluation metrics of both the validation and test sets of the four CNN models are shown in Table 5. Here, the results are summarized for each model. The results showed that three of the DCNN models (VGG-19, Inception-V3, and the proposed model) achieved satisfactory results. Inception-V3 and the proposed model showed the best performance, with the highest accuracy of 98%, followed by VGG-19 (96%). The ResNet-50 model had the lowest accuracy (72%). Similar trends were obtained for the precision and recall measures. Inception-V3, the proposed model, and VGG-19 had the highest values for both precision and recall (>96%). The ResNet-50 model had the lowest values of precision and recall (73% and 71%, respectively). The classification accuracy of some DCNN models used in this study was higher than that obtained by Rahimzadeh and Attar [43], who used ResNet152, ResNet-50, and VGG-16 to classify pistachios into open and closed-shell. For these models, they reported 95%, 92%, and 90% accuracies, respectively. In a similar study, Omid [44] implemented a decision tree and fuzzy logic classifier for sorting pistachio nuts and reported 95.56% classification accuracy for the test dataset. Also, the F1-score, defined as the harmonic mean of precision and recall, was calculated. The F1-scores of Inception-V3, VGG-19, and the proposed model were approximately the same (98%) and were higher than that of ResNet-50 (72%). The results in Table 5 indicate that the proposed model, as well as Inception-V3 and VGG-19, is superior to the ResNet-50 model and can predict the two classes of hazelnuts with high precision, recall, and F1-score. Similar results were obtained for the proposed model in the study to classify the hazelnut varieties [29].
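As a worked check of the harmonic-mean definition, the rounded precision (73%) and recall (71%) reported above for ResNet-50 can be substituted directly. The symbols P and R are introduced here only for this illustration.

\[
F_1 = \frac{2 \, P \, R}{P + R} = \frac{2 \times 0.73 \times 0.71}{0.73 + 0.71} = \frac{1.0366}{1.44} \approx 0.72
\]

The result agrees with the F1-score of about 72% reported for ResNet-50. Because the harmonic mean never exceeds the arithmetic mean, a model cannot reach a high F1-score unless both precision and recall are high.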

Conclusion
This article proposed an effective method to classify hazelnuts based on their surface cracks. The proposed and pretrained models were trained on a dataset that included the images of two hazelnut classes: closed-shell and open-shell. In this study, four DCNN models were investigated for hazelnut classification into two classes based on the surface cracks. Overall, most of the investigated models showed satisfactory performance, with accuracy varying between 96.32 and 98.15%. Among the pretrained models, Inception-V3 had the highest accuracy (98.15%) and F1-score (97.91%), followed by VGG-19 with an accuracy of 96.32% and an F1-score of 96.35%. The custom-built model had a high accuracy (97.85%) and F1-score (96.84%). The ResNet-50 model had the lowest accuracy (71.92%) and F1-score (72.28%). The results revealed that the proposed model and Inception-V3 had the highest accuracies. Considering the F1-score as the harmonic mean of precision and recall, Inception-V3 and the proposed model also have similar performance in the detection of cracks on the hazelnut surface. However, given the smaller size and number of parameters, the proposed model is recommended for real-time detection tasks.

Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Additional Points
Practical Application. In some countries, most hazelnuts are cracked using semi-industrial or handcrafted machines and marketed as open-shell. Separating the cracked or closed-shell nuts from the open-shell nuts leads to an increase in the price of the final product. The process of separation is currently based on visual inspection, which is still a labor-intensive and imprecise process. The separation of hazelnuts through machine learning algorithms and DCNNs based on image processing can be extremely valuable for the rapid and automated separation of hazelnuts on an industrial scale.

Disclosure
The corresponding author would be the sole contact for the editorial process.

In the confusion matrix (Table 2), TO and FO are the percentages of the true and false open-shells, and TC and FC are the percentages of the true and false cracked or closed-shell, respectively. In the following equations (1-4), precision and recall metrics are computed just for the open-shell class. Similar relationships were used for cracked and closed-shell classes.
\[
\text{Accuracy} = \frac{TO + TC}{TO + FO + TC + FC},
\]
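The four metrics can be computed from the confusion-matrix entries as sketched below. Only the accuracy equation is taken from the text; the precision, recall, and F1 expressions are the standard definitions applied to the open-shell class and are an assumption about equations (1-4). Here FC (nuts predicted as cracked or closed-shell that are actually open-shell) plays the role of the false negatives for the open-shell class. The function name and the example numbers are hypothetical.

```python
def open_shell_metrics(to: float, fo: float, tc: float, fc: float) -> dict:
    """Metrics for the open-shell class from confusion-matrix entries.

    to: true open-shell, fo: false open-shell,
    tc: true cracked/closed-shell, fc: false cracked/closed-shell.
    Entries may be counts or percentages; the ratios are the same.
    """
    accuracy = (to + tc) / (to + fo + tc + fc)
    precision = to / (to + fo)   # share of predicted open-shell that is correct
    recall = to / (to + fc)      # share of actual open-shell that is found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical confusion matrix, not data from this study.
    for name, value in open_shell_metrics(to=48, fo=2, tc=49, fc=1).items():
        print(f"{name}: {value:.4f}")
```

Swapping the roles of the two classes (TC for TO and FC for FO) gives the corresponding values for the cracked or closed-shell class.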

Figure 4: The architecture of the proposed built model used for hazelnut classification.

Figure 5: The accuracy (a) and the loss (b) of the Inception-V3 pretrained model.

Table 1: The parameters of DCNN.

Table 2: The confusion matrix of the DCNN classifier for two classes.

Table 3: Accuracy of DCNN models at different numbers of neurons in the hidden layer.

Table 4: The effect of the number of filters at CNN layers on the accuracy of DCNN models.

Table 5: Classification performances of DCNN models used in this study.