Rapid Detection of Hybrid Maize Parental Lines Using Stacking Ensemble Machine Learning

Hybrid maize seed production is a relatively complex task because three distinct types of maize plants coexist in the field: female, male, and contaminant/off-type plants. The tassels of female and contaminant/off-type plants should be removed immediately after flowering begins, while male tassels should be retained to allow cross-pollination between male and female plants. An intelligent tassel classification system is therefore critical for hybrid purity decision-making. The primary contribution of this research is the integration of two widely used transfer learning architectures, Inception V3 and SqueezeNet, with stacking ensemble machine learning using four algorithms (logistic regression, support vector machine, random forest, and k-nearest neighbors) for rapid classification of tassel images. Tenfold cross-validation was used to evaluate model performance. Cloud computing with EfficientNet was also investigated to compare the predictive performance of the models. Model performance was assessed using four metrics: accuracy, AUC, precision, and recall. The results showed that the developed model properly distinguished male, female, and contaminant plants. Integrating the model with machine learning algorithms (logistic regression, SVM, random forest, and KNN) enables rapid recognition of off-type plants even when it is operated by personnel with limited seed technology skills in ideotype recognition. Among all the evaluated CNN architectures and stacking models, Inception V3-embedded images with a logistic regression metaclassifier outperformed the other models, with an accuracy of about 98%. SqueezeNet and EfficientNet provided comparable results for consistent tassel classification, with slightly lower performance measures. The model was also subjected to a multidimensional scaling (MDS) analysis to investigate and understand misclassification.
Male and female plants are clearly distinguished by MDS, but female and off-type/contamination plants are ambiguous. This indicates that the prediction errors were caused by highly similar data features among female and off-type images. The developed modern plant phenotyping model can be used to assist breeders/technicians in maintaining the quality of large-scale hybrid maize seed production activities in Indonesia.


Introduction
The Government of Indonesia has recently launched a national maize self-sufficiency acceleration program through various approaches, including high-yielding maize hybrid seeds, to increase domestic production. For instance, the Ministry of Agriculture supplied certified national superior maize hybrid variety seeds nationwide, although the implementation success rate remained constant at around 60% [1]. One of the problems faced in hybrid maize seeds is maintaining purity and quality from the parent seeds to produce hybrid seeds (i.e., first-generation seeds). Maize is a cross-pollinating crop that is easily contaminated by pollen from surrounding plants, which later reduces seed purity and yield potential.
Furthermore, during the implementation of this national program, the national research agency, the Indonesian Agency for Agricultural Research and Development (IAARD), was selected to execute this self-sufficiency acceleration program in collaboration with more than 30 private licensed seed producers. Hybrid maize seeds are formed from a cross between two desired maize parent lines, namely, the male and female lines. Generally, in the field, one row of male plants is sandwiched between four rows of female plants (1 : 4). However, during plant growth, contaminant plants usually emerge due to constraints such as volunteer plants and adulteration of maize seeds during preparation [2]. When entering the flowering phase, the female parents' tassels should be removed to allow pollen from the male plants to fertilize the female lines, leading to hybrid seed generation. In addition, field inspections must be carried out carefully and regularly to ensure that all female tassels and off-type plants have been properly removed. This activity is usually done manually by a specialist or group of specialists with different levels of plant breeding knowledge. An additional problem in hybrid seed production is that the production areas are large and scattered, which makes monitoring and executing activities at critical stages, such as eliminating tassels and off-type plants, quite difficult.
In the last decade, artificial intelligence technology has experienced significant scale-up developments, including its application in field/on-site research. Machine learning technology has improved, especially in the accuracy of object recognition. The introduction of a new nondestructive technique based on CNN has raised expectations for accelerating image recognition technology [3]. CNN is a deep learning method developed from the Multilayer Perceptron (MLP) and designed for processing two-dimensional data. CNN is a type of Deep Neural Network (DNN) because it applies a deep hierarchy of recognition layers to an object. CNN is generally applied to image data processing, including the development of self-driving cars, distance recognition, disease diagnosis, product marketing, and others.
Currently, modern plant phenotyping of maize using machine learning and deep learning can differentiate tassels between male and female parents as well as other contaminating plants. No off-type plants are permitted to remain in the field during the two parents' flowering. Tassels exhibit a variety of morphological characteristics, including curvature and cohesiveness. The challenges faced in tassel detection include tassels that do not emerge simultaneously, differences in tassel size and shape, differences in tassel colour and texture among lines, changes in illumination during image capture, and the background of the shot. Thus, the tassel classification should be conducted with care after the detection [4]. Image capture for automatic detection of maize Northern Leaf Blight disease and tassel type was also developed using a convolutional neural network and integrated into a smartphone application [5,6]. Additionally, the timing and number of tassel emergence in maize were estimated automatically with a neural network regression approach. Furthermore, [7] reported deep learning based on a recurrent neural network with preselected data to estimate maize yields in the USA. The inputs included weather data, soil quality, and soil moisture collected at one-hour intervals. These data were processed using machine learning, which predicted the spatial distribution of maize yields with an accuracy above 80%. Machine vision-based technology was used to separate seeds based on parameters such as colour, texture, size, and shape derived from photo images, which was later applied to seed handling and seed sorting activities [8].
Other research on object detection involved hyperparameter setting for kernel counting on maize ears using the CNN supervised learning method [9]. The results of the trained parameters revealed that the model is appropriate for automatically counting the number of seeds on an ear, with a correlation of more than 0.90. The integration of deep learning and machine learning into prediction and classification tasks has attracted increasing attention in recent years. By combining diverse base machine learning estimators of various types, it is possible to create a more performant model, referred to as an ensemble model [10]. Ensemble methods outperform more established machine learning techniques referred to as base learners. Stacking is an ensemble approach that is frequently used to improve prediction accuracy by combining two or more independent machine learning algorithms into a metamodel [11]. When it comes to prediction, stacking ensemble machine learning algorithms outperform single algorithms. The stacking ensemble technique has been applied to a variety of agricultural problems, including prediction, classification, regression, and feature selection.
To address quality control concerns in maize seed production, the study sought to develop efficient tassel recognition methods capable of rapidly capturing various types of maize tassels during generative development. The detection model is capable of accurately identifying a variety of tassel parameters, including height, width, branches, anther colour, glume colour, and other morphological characteristics of a particular line.

Study Area.
The study to classify maize hybrid parental lines was conducted in 2020 in two different seasons, namely, the rainy and dry seasons, at four different locations: Gowa, Bone, Maros, and Pangkep in South Sulawesi Province, Republic of Indonesia. Two lines were planted: the male parent of the Nasa 29 variety (i.e., the G102612 line) and the female parent (i.e., the MAL03 line). The planting system adopted a 1 : 4 ratio, in which one row of male plants was sandwiched between four rows of female plants (Figure 1). All lines/genotypes were bred and generated by the Ministry of Agriculture, the Republic of Indonesia, in collaboration with the International Maize and Wheat Improvement Centre (CIMMYT). Substantial variability was observed for all agronomic characteristics under various growing environment conditions across the country. During plant growth, contamination/off-type plants began to appear, necessitating their immediate removal from the fields.
To prevent genetic contamination from off-type plants, the two lines were planted separately in a 0.5 ha area with an isolation distance of approximately 300 m. Each genetic material was planted at a spacing of 70 cm × 25 cm with one plant per stand. Fertilizer was applied twice, at 10 and 30 days after planting (DAP), with urea (46% N) and NPK (15% N, 15% P, and 15% K) fertilizers at doses of 300 and 400 kg/ha, respectively.
When entering the flowering phase, the female parents' tassels should be removed to allow pollen from the male plants to fertilize the female lines, leading to hybrid seed generation. Arbitrary tassel image datasets were obtained for female parents, male parents, and contaminant plants. The contaminant plants (neither male nor female), such as the Mr14, CLYN230, and CY7 lines, frequently occurred as volunteer plants that should be quickly removed. The datasets collected from the parental lines differed in tassel type and were used to train and validate the deep learning CNN model. Images of the experimental sites were captured using an Inspire 2 drone equipped with a Zenmuse X5 Camera and 3-Axis Gimbal with a 15 mm f/1.7 lens, whereas the tassel images were captured using a high-resolution CCD Digital Camera (Samsung). The captured maize tassel datasets were utilized for training and validation, as shown in Table 1. All tassel images were divided into 80% for model training and the remaining 20% for validation. This method follows the recommendation of the majority of machine learning studies that the validation set should be 10%-30% of the total dataset [12].
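The 80/20 division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the stratification by class, the function name `stratified_split`, the random seed, and the class counts are all assumptions made for the example (the real counts are in Table 1).

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx), holding out val_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random order within the class
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])           # held out for validation
        train_idx.extend(indices[n_val:])         # used for training
    return sorted(train_idx), sorted(val_idx)

# Hypothetical class counts, NOT the values reported in Table 1.
labels = ["female"] * 50 + ["male"] * 30 + ["off_type"] * 20
train_idx, val_idx = stratified_split(labels)
```

Splitting within each class keeps the female, male, and off-type proportions similar in both subsets, which matters when one class has far fewer images than the others.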

Model Development.
The classification model for maize hybrid lines was developed using the Convolutional Neural Network (CNN) algorithm and was based on tassel-type features. CNN is a development of the Multilayer Perceptron (MLP) designed to process two-dimensional data. CNN is classed as a Deep Neural Network (DNN) because of its deep network structure, which has been applied in many image classification tasks. CNN is a type of neural network commonly used on image data to detect and recognize objects in an image. In general, it does not differ much from the usual neural network: it consists of neurons that have weights, biases, and activation functions. The main difference lies in the architecture, which is divided into two parts: the Feature Extraction Layer (FEL) and the Fully Connected Layer. The FEL converts an image into numerical features that represent the image. The FEL itself consists of two parts, namely, the convolutional layer and the pooling layer.
Much recent research on deep CNNs has focused on increasing accuracy on computer vision datasets, including improvements to network architecture. In this study, the classification of parental lines of a hybrid maize variety was developed by comparing several well-known CNN architectures, that is, Inception V3, SqueezeNet, and EfficientNet-B1. Inception V3 and SqueezeNet CNN analyses were done using Orange visual programming. EfficientNet-B1 was executed in the Google Colaboratory GPU deep learning service.

Ensemble Machine Learning.
Transfer learning was combined with a deep learning model and several machine learning algorithms in this study to determine the type of parental lines (female line, male line, and off-type lines). Transfer learning was used to create the features, which were generated using the Inception V3 and SqueezeNet models. The models were built using logistic regression (LR), support vector machine (SVM), random forest (RF), and k-nearest neighbors (KNN) algorithms. By comparing the contribution of different models, the best model can be chosen as the base model for the stacking stage. The overall framework for the detection of maize parental line images using ensemble machine learning is shown in Figure 2.
Image embedding was used to process tassel images for the classification of parental maize lines using Orange Software [13]. The embedding process generates a vector representation of each image as a large number of features from a deep network. The Keras Python library provides an interface to the Inception V3 and SqueezeNet tassel detection models. Inception V3 is among the CNN architectures that Szegedy first introduced [14]. Inception V3 was chosen considering that, in previous versions, the auxiliary classifiers did not contribute much during the training stage, and that the possibility of improving Inception V2 without drastically changing its modules should be investigated further. Inception V3 includes all the improvements over Inception V2, such as the use of the RMSProp optimizer, factorized 7 × 7 convolutions, batch norm in the auxiliary classifiers, and label smoothing (a type of regularizing component added to the loss formula to prevent overfitting). Inception V3 represents an image as a set of 2048 features, which can then be further processed using supervised or unsupervised machine learning techniques. Figure 3 depicts the ensemble process using transfer learning based on the Inception V3 model.
Another model assessed in the developed framework is SqueezeNet, which was applied through transfer learning from the pretrained ImageNet datasets. SqueezeNet transforms raw images into their vector representation (called image embedding) using a Deep Neural Network that was trained on millions of real-life images. SqueezeNet is a deep model with a microarchitecture that enables accurate image classification with fewer parameters [15]. Its key building block is the fire module, which comprises a squeeze convolutional layer whose filters feed into an expand layer with a mix of convolutional filters. Having fewer parameters offers several advantages, such as more efficient communication among servers during distributed training and faster training time due to the small model architecture. SqueezeNet implemented three significant strategies for building an efficient network architecture; that is, instead of 3 × 3 filters, SqueezeNet used 1 × 1 filters that expand into convolution filters (the fire module). Also, a bypass system was employed to increase the filters per fire module. Max-pooling and global average pooling were applied before generating the prediction/classification. Figure 4 depicts the ensemble process using transfer learning based on the SqueezeNet model. A fixed training/validation percentage sampling method was applied to test and score the tassel prediction performance by incorporating four popular machine learning methods: logistic regression, support vector machine, random forest, and k-nearest neighbors.
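To make the parameter saving of the fire module concrete, the sketch below counts weights and biases for one fire module and for a plain 3 × 3 convolution with the same input and output channels. The channel sizes (96 in, 16 squeeze, 64 + 64 expand) are illustrative assumptions in the style of the SqueezeNet design, not values taken from this study, and the helper names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out

def fire_params(c_in, squeeze, expand1x1, expand3x3):
    """Parameters of a fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    s = conv_params(c_in, squeeze, 1)        # squeeze layer reduces the channels
    e1 = conv_params(squeeze, expand1x1, 1)  # cheap 1x1 expand branch
    e3 = conv_params(squeeze, expand3x3, 3)  # 3x3 expand branch sees few channels
    return s + e1 + e3

# Assumed example sizes: 96 input channels, 128 output channels in total.
fire = fire_params(96, 16, 64, 64)
plain = conv_params(96, 128, 3)
print(fire, plain)  # 11920 110720
```

Under these assumed sizes the fire module needs roughly a ninth of the parameters of the plain 3 × 3 layer, because the 3 × 3 filters only operate on the 16 squeezed channels.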

Machine Learning Algorithms.
Machine learning has recently gained popularity in the field of pattern recognition, including image embedding, particularly in agricultural imaging. In this study, we examined a large number of features from a deep network via transfer learning of Inception V3 and SqueezeNet, as well as four popular machine learning models, that is, LR, RF, SVM, and KNN.
Logistic regression (LR) models the probability of an event occurring depending on the values of the independent variables, which may be numerical or categorical. In machine learning, LR is used to classify data observations by estimating the probability that an observation belongs to a particular category. For example, the probability of an off-type plant in the maize field can be predicted from various morphological information such as tassel shape, colour, glume, circularity, and silk colour. LR is also used to predict the effect of a series of variables on a binary response variable.
Support vector machine (SVM) is a type of machine learning method that classifies different objects by drawing a decision boundary, known as a hyperplane, near the extreme points in the dataset. Essentially, SVM finds the frontier that best segregates the two classes. SVM can be used on multidimensional datasets, and the data points are referred to as vectors because they have coordinates within the data space. When classes cannot be separated in the original space, a function is used to transform the data into a higher-dimensional space, for example, from two-dimensional to three-dimensional features. A kernel trick is adopted to reduce the computational cost: a kernel function takes vectors in the original space as inputs and returns their dot product in the feature space. Popular kernel types for transforming the data into high-dimensional features include RBF, sigmoid, and polynomial. Selecting an appropriate kernel allows objects to be classified with high accuracy.
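The kernel trick can be illustrated with a small numerical check. The sketch below uses a homogeneous degree-2 polynomial kernel, for which the explicit map from two to three dimensions is known, and shows that the kernel value computed in the original space equals the dot product in the feature space. The function names, the example vectors, and the `gamma` value are illustrative assumptions; the RBF kernel is included only for comparison.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel, computed directly in the original 2-D space."""
    return dot(x, y) ** 2

def poly2_feature_map(x):
    """Explicit 3-D feature map whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel; its feature space is infinite-dimensional, so only the trick is usable."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

x, y = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly2_kernel(x, y)                                # no mapping needed
k_explicit = dot(poly2_feature_map(x), poly2_feature_map(y))   # map, then dot product
```

Both routes give the same value, but the kernel route never builds the higher-dimensional vectors, which is where the computational saving comes from.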
Random forests (RF) combine the simplicity of decision trees with added flexibility, resulting in a large improvement in accuracy. RF is a simple yet effective machine learning method for classification or prediction. RF is built by creating bootstrap datasets, randomly selecting samples from the original dataset with replacement. Aggregating the ensemble improves the accuracy of RF and reduces the cost associated with storing and getting inferences from multiple models.
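The two ingredients named above, bootstrap sampling and aggregation, can be sketched without a full tree implementation. This is a minimal illustration under assumed names (`bootstrap_sample`, `majority_vote`) and toy inputs; the decision trees themselves are omitted.

```python
import random
from collections import Counter

def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement from range(n_samples)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def majority_vote(votes):
    """Aggregate the class labels predicted by the individual trees."""
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(42)
sample = bootstrap_sample(10, rng)                    # some indices repeat
out_of_bag = sorted(set(range(10)) - set(sample))     # indices never drawn

# Hypothetical votes from three trees for a single tassel image.
vote = majority_vote(["female", "off_type", "female"])
```

Each tree would be trained on its own bootstrap sample, and the samples left out of that draw (the out-of-bag set) can serve as a built-in validation set for that tree.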
K-nearest neighbors (KNN) is a classification method for a particular dataset based on previously classified data. KNN is a supervised learning method in which a new query instance is classified according to the majority category among its nearest neighbours. Proximity can be thought of as the inverse of distance or as inversely proportional to distance. The smaller the distance between two instances, the greater the "proximity" between the two cases. Thus, the k-nearest neighbors of an instance are defined as the k instances with the smallest distance/greatest proximity.
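The definition above translates almost directly into code. The sketch below is a minimal KNN classifier using Euclidean distance; the two-dimensional points stand in for image embeddings and are invented for the example, as is the function name `knn_predict`.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy 2-D points standing in for embedding vectors (assumed data).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["female", "female", "female", "male", "male", "male"]

pred_a = knn_predict(train_X, train_y, (0.5, 0.5))
pred_b = knn_predict(train_X, train_y, (5.5, 5.5))
```

With real embeddings the vectors would have 2048 (Inception V3) or 1000 (SqueezeNet) components, but the distance and voting logic is unchanged.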

Cross-Validation Algorithms.
K-fold cross-validation is a technique for estimating the average success rate of a system by repeatedly resampling the input data and testing the system on several different subsets. K-fold cross-validation begins by dividing the dataset into the desired number of folds, k. In the first fold, the first part of the data is treated as validation data and the remainder as training data. Then, using that portion of the data, the accuracy is determined, that is, the similarity or proximity of a measurement result to the actual number or data. In the second fold, the second part of the data is treated as validation data, and the remaining data is treated as training data. Accuracy is calculated in the same way for each subsequent part of the data until the k-th fold is reached. The k accuracy values are then averaged, and this average becomes the final accuracy. Figure 5 illustrates the overall process of k-fold cross-validation.
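The fold rotation described above can be sketched as follows. This is a minimal illustration with contiguous, unshuffled folds; the helper names and the majority-class "model" are assumptions made for the example, and in practice the data would usually be shuffled (and often stratified) before splitting.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) for each of k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def cross_val_accuracy(X, y, k, fit, predict):
    """Average validation accuracy over the k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in val_idx)
        scores.append(correct / len(val_idx))
    return sum(scores) / k

# Toy stand-in model: always predicts the most common training label.
fit_majority = lambda X, y: Counter(y).most_common(1)[0][0]
predict_majority = lambda model, x: model

X = list(range(10))
y = ["female", "male"] * 5
mean_accuracy = cross_val_accuracy(X, y, 5, fit_majority, predict_majority)
```

Every sample is used for validation exactly once, so the averaged score reflects performance on the whole dataset rather than on a single held-out portion.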

Model Stacking.
Stacking is a type of ensemble machine learning algorithm that utilizes metalearning algorithms to determine the optimal way to combine predictions from several base models. It is primarily used to train a new model using the output of multiple weak learners. Wolpert [16] introduced the stacking algorithm and opened a new direction in the field of combined models. In comparison, the stacking model outperformed the base models. Sigletos et al. [17] conducted a thorough comparison of the bagging, boosting, and stacking algorithms and found that the stacking algorithm offers significant robustness benefits. The model was developed through a two-stage training procedure. The model stacking process is described as follows: (1) Using k-fold cross-validation, train the base model on the same datasets as the test model. [...] selected in the previous step. While training with new features, the 5-fold cross-validation algorithm was used to ensure the robustness of the new features. Following that, LR was used as the metamodel to construct the model and make a final prediction in the subsequent stage. Figure 6 shows the overall flow of the model stacking process.
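The first stage of stacking, in which base models produce new features through cross-validation, can be sketched as below. This is a simplified illustration and not the authors' pipeline: the two base learners (nearest centroid and 1-nearest neighbour), the interleaved fold assignment, the toy points, and all function names are assumptions. The metamodel (LR in this study) would then be trained on the resulting metafeatures, typically using class probabilities rather than hard labels.

```python
import math

def fit_centroid(X, y):
    """Nearest-centroid base learner: store the mean point of each class."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {
        label: tuple(sum(col) / len(points) for col in zip(*points))
        for label, points in groups.items()
    }

def predict_centroid(model, x):
    return min(model, key=lambda label: math.dist(model[label], x))

def fit_1nn(X, y):
    """1-nearest-neighbour base learner: memorise the training set."""
    return list(zip(X, y))

def predict_1nn(model, x):
    return min(model, key=lambda pair: math.dist(pair[0], x))[1]

def out_of_fold_predictions(X, y, k, fit, predict):
    """Predict each sample with a model that never saw it during training."""
    oof = [None] * len(X)
    for fold in range(k):
        train_idx = [i for i in range(len(X)) if i % k != fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in range(fold, len(X), k):   # samples held out in this fold
            oof[i] = predict(model, X[i])
    return oof

# Toy 2-D points standing in for image embeddings (assumed data).
X = [(0.0, 0.0), (5.0, 5.0), (0.0, 1.0), (5.0, 6.0),
     (1.0, 0.0), (6.0, 5.0), (1.0, 1.0), (6.0, 6.0)]
y = ["female", "male"] * 4

base_learners = [(fit_centroid, predict_centroid), (fit_1nn, predict_1nn)]
# One row per sample, one column per base learner: the metamodel's inputs.
meta_features = list(zip(
    *(out_of_fold_predictions(X, y, 4, fit, predict) for fit, predict in base_learners)
))
```

Using out-of-fold predictions prevents the metamodel from learning from base-model outputs on data those models were trained on, which would overstate their reliability.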

EfficientNet-B1.
EfficientNet is a relatively new architecture, derived from neural architecture search, for efficiently scaling up the size of a CNN. The significant advantages of EfficientNet include higher accuracy, efficiency, and scaling of the CNN model [18]. This model is compact and may be combined with pretrained models for the purpose of determining the optimal scaling parameters for the computational process. The EfficientNet model was executed within the TensorFlow Keras applications ecosystem. EfficientNet comprises five modules that are further combined to build subblocks with skip connections. EfficientNet consists of eight versions, from EfficientNet-B0 to B7 (Figure 7). The model architecture is almost the same across versions except for the number of feature maps used.

Accuracy Assessment.
Four popular measures were employed to determine the robustness of the models, that is, classification accuracy (CA), precision, recall, and AUC (equations (1) to (3) and AUC) as follows:

$$\mathrm{CA} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$

True Negative (TN) values are data that are correctly classified as negative or false outputs. True Positive (TP) is data that is correctly classified as positive or true output. False Positive (FP) is data that is incorrectly classified as positive or true output. False Negative (FN) is data that is incorrectly classified as negative or false output. Also, a confusion matrix was used for summarizing the performance of the tassel classification model. The AUC measures how well the model separates the classes across different classification thresholds. It can be interpreted as the probability that the classification model ranks a randomly chosen positive example higher than a randomly chosen negative one. The values of AUC range between 0 and 1. For instance, a model whose predictions are 100% wrong has an AUC value of 0.0, whereas a model whose predictions are 100% correct has an AUC value of 1.0. AUC was preferred for this research because it measures how well predictions are ranked rather than using their absolute values.

During the onset of the flowering phase, the tassels of the female plants (Mal03) were removed to prevent pollination from other sources through dispersal agents. The pollen from the male plants (G102612) was used to pollinate the female plants, producing the first generation, called the hybrid. Field researchers carried out strict surveillance to ensure that the female tassels were carefully, regularly, and completely removed. This activity was done manually and involved many field researchers and technicians with different levels of expertise, taking five to ten days per hectare.
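The four measures can be computed from first principles as in the sketch below. It treats one class as "positive" and the rest as "negative" (a one-vs-rest view, which is an assumption about how the multiclass scores were derived), and it computes AUC directly from its ranking interpretation. The labels, scores, and function names are invented for the example.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FP, FN, TN) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def auc_from_scores(pos_scores, neg_scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Assumed toy labels and scores, with "off_type" as the positive class.
y_true = ["off_type", "off_type", "off_type", "female", "female", "male"]
y_pred = ["off_type", "off_type", "female", "female", "off_type", "male"]
tp, fp, fn, tn = confusion_counts(y_true, y_pred, "off_type")
ca = classification_accuracy(tp, fp, fn, tn)
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.1])
```

Because AUC depends only on how the scores are ordered, rescaling all scores by any increasing function leaves it unchanged, which is the ranking property noted above.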

Results and Discussion
Proper detection of the type of maize tassel will significantly affect the purity of the seeds being produced. Prior to seed marketing in Indonesia, the national standard requires 98 percent purity and a 2% germination rate. Additionally, off-type/contamination tassel removal must consider several factors such as plant agronomic and environmental parameters. The male/G102612 line was developed from the recombination of 5 drought-tolerant lines introduced by CIMMYT in 2005. Additionally, the female/Mal03 line was developed from a base population that was resistant to downy mildew disease as part of the Asian Biotechnology Maize Network (AMBIONET) project, which was a collaboration between CIMMYT and IAARD. The agronomic characters of the two parental lines and other contaminant lines are shown in Table 2.
The selection of appropriate lines should consider the genetic background and combining ability, either general combining ability or specific combining ability, of the crossed lines. In general, the ideal parental pair for a hybrid has a good specific combining ability value, taking into account the agronomic characters of the tested lines. Morphologically, the female/Mal03 parent was characterized by upright and compact tassels with a length ranging from 230 to 260 mm and 12 to 13 branches, with spikelets in each branch ranging from 23 to 73 spikelets per plant. The compact and erect tassel character and semierect leaf type are expected to minimize obstacles to pollination from male plants so that pollination occurs smoothly and the plants are more responsive toward high yield [19]. Mal03 was chosen as the female parent because of its excellent cob appearance, which enables good grain filling and increases seed yield. The male/G102612 parent has different morphological features compared to the female parent. G102612 was chosen as the male parent mainly because of its longer tassel, ranging from 320 to 400 mm, with a higher number of branches and spikelets surrounding the tassel. Furthermore, the good tassel performance enables the production of large amounts of pollen for the female plants through cross-pollination. Another merit of the male line was its compact tassel shape and its ability to release pollen slowly, which enhances pollination of the female plants. As for male performance, our previous research indicated that the G102612 line could release pollen up to 10 days after the initiation of the maize hybrid tassel.

Metamodel Comparison for Accurate Classifications.
Three popular deep learning models were trained and validated in two software environments, namely, GPU-based Google Colab and Orange Software. The model framework relies on detailed tassel morphological information as input, which is crucial for generating accurate classification. Two popular CNN models were also examined in our study as comparisons, that is, Inception V3 and SqueezeNet, using the Orange 3.1 visual programming environment. Inception V3 is pretrained on ImageNet datasets, and the embedding process makes use of activations from the model's penultimate layer, which represent the image as a vector. Transfer learning was involved by generating a vector representation of each image as a large number of features from the Inception model. SqueezeNet is built on ImageNet using the pretrained model weights. The embedding is the pre-SoftMax (flatten10) layer activation. As many as 2048 vector values (defined as N0 until N2047) were generated to allow visual assignment of tassels according to their classes (male, female, or off-type/contamination). The model has a total of 3,251,763 parameters that were further used to train and validate the tassel classification model.
To build models for predicting parental lines, we used traditional machine learning and ensemble methods (stacking model) in combination with pretrained Inception V3 and SqueezeNet. Four popular machine learning techniques were used in this study: logistic regression, support vector machine, random forest, and k-nearest neighbor. As with the EfficientNet datasets, 80% of the datasets were used for training the model and the remaining 20% for validation. Hyperparameter tuning was included in order to obtain optimal prediction, as shown in Table 3. Cross-validation was applied to derive predictions from training data and to train higher-level machine learning models. Tenfold cross-validation was adopted to evaluate all nine unstacked models examined. The EfficientNet B1 used in this study is part of the TensorFlow and/or Keras applications ecosystem of pretrained models, which was used to adapt the transfer learning paradigm and to design and implement the prediction framework with parameter optimization taken into account. TensorFlow is a popular software library used by engineers for deep learning computation [24]. Integrating TensorFlow with Keras allows the model to be built and executed more efficiently.
The input shape was set to 240 × 240-pixel image resolution, and advanced data augmentation schemes such as flip, rotation, zoom, and shear were applied prior to model training. EfficientNet B1 utilizes stochastic depth regularization to drop out neurons and entire paths of the network. The hyperparameters used in the simulation included a learning rate of 0.001 and 50 epochs, with three classification classes: Mal03, G102612, and contamination plant. The loss function used was the cross-entropy loss, which was optimized using the Adam optimizer. SoftMax activation, dropout, and average pooling were adopted to smooth out the last output dimension from the EfficientNet with noisy-student weights. The noisy student is a semisupervised learning technique that uses a larger or equal-sized student model and adds noise to the student during training [25]. The model has 7,890,051 parameters, comprising 7,828,003 trainable parameters and 62,048 nontrainable parameters. A summary of the model parameters of blocks 1A and 7b is shown in Table 4. These long blocks have a varying number of subblocks as the model version increases.
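The SoftMax output and cross-entropy loss mentioned above can be written out for the three tassel classes. This is a minimal numerical sketch, not the training code used in the study: the logit values are invented, and the class ordering in `CLASSES` is an assumption.

```python
import math

def softmax(logits):
    """Convert raw network outputs into class probabilities."""
    peak = max(logits)                          # subtract max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_index])

CLASSES = ["Mal03", "G102612", "contamination"]  # assumed ordering

# Hypothetical logits for a single image whose true class is Mal03.
probs = softmax([2.0, 0.5, 0.1])
loss = cross_entropy(probs, CLASSES.index("Mal03"))

# An uninformative prediction (equal logits) gives a loss of log(3).
uniform_loss = cross_entropy(softmax([0.0, 0.0, 0.0]), 0)
```

The loss approaches zero as the probability of the correct class approaches one, so minimising it with an optimizer such as Adam pushes the network toward confident, correct predictions.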
The output was presented as model loss and accuracy in Figure 8, which were later used to assess the impact of parameter setting on the model performance. The model was first tested against the training datasets to check the accuracy of tassel classification. The model matched features with good precision, with approximately 99.8% of the tassels correctly classified. Model validation indicated that the model's response to changes in the input caused a modest reduction in accuracy, to 94.9%. Adjusting the dropout layer parameter to an optimum value of 0.5 provided the maximum increase in model accuracy for distinguishing features.
Meanwhile, the loss value of the model during training and validation processes remains stagnant at 0.15. A few errors in classification may be attributed to the model's difficulties to quickly identify tassels with similar morphological shapes, particularly female and off-type/contaminant categories. Most tassels were vulnerable to contamination from the surrounding maize plant due to exposure in the field.
Four indices, accuracy, AUC, precision, and recall, were used to assess the model performances. With Inception V3 embeddings, LR with ridge regularization (Model 4) and SVM with a radial basis function (RBF) kernel (Model 2) correctly classified 97.50% (0.975) and 96.70% (0.967) of the test data, respectively. KNN and RF performed worse than LR and SVM, although their accuracy was still at an acceptable level, that is, 95.30% and 91.50%, respectively. Model improvement was also assessed by incorporating a newer CNN architecture, SqueezeNet, as a light image embedding widget to generate 1000 vector values (defined as N0 until N999) assigned to the tassel class. The model has a total of 1,587,000 parameters, the lowest number of parameters compared with EfficientNet and Inception V3. SVM and LR with SqueezeNet (Models 6 and 8, resp.) perform better at predicting tassels than the KNN and RF models (Table 5).
Two additional models were evaluated for accuracy using ensemble methods: Inception V3 (LR aggregate) and SqueezeNet (SVM aggregate). Since LR, SVM, and KNN outperformed all other models, they were used as the base models for stacking. Stacking works by deducing individual learners' biases in relation to the training set [26].
During the initial training stage, the three models generated new features, and cross-validation was used to demonstrate objective accuracy. The final ensemble model is validated using LR and SVM algorithms. The model selection process is critical to the performance of the stacking model: all base models should perform well. Following model selection, there are no additional limitations on the newly created features. The evaluation results of the stacking model and the other models are summarized in Table 6. Table 6 indicates that the Inception V3 stacking model with LR (Model 10) produced the most accurate results. Additionally, the SqueezeNet stacking model with SVM outperformed the KNN and RF combination, indicating that stacking significantly improved the model's performance. Combining Inception V3 deep learning with LR as the metalearner, the stacking ensemble produced the best classification performance. It also demonstrated slightly improved classification performance for the datasets when SqueezeNet-SVM metalearners were used.
Final estimator = logistic regression; classifiers = random forest, support vector machine, and KNN [23].
This result demonstrated that stacking LR models can significantly improve prediction performance without requiring additional parameter tuning. The stacking model outperformed the others in terms of accuracy, AUC, precision, and recall (Figure 9). EfficientNet B1 showed lower performance, although its scores are still comparable, with a difference of approximately 4% from the best model. By fine-tuning or changing the B version of EfficientNet from B1 to a higher version, it is possible to explore a deeper network and increase the model's accuracy. Model ensembles have been shown to be superior to traditional ML in making predictions [27]. Additionally, [28] reported that incorporating logistic regression, support vector machine, and neural network approaches into Fuzzy Dempster-Shafer (FDS) analysis significantly improves the classification accuracy of paddy rice images.
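The stacking arrangement discussed above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the feature matrix is synthetic (standing in for CNN image embeddings), the hyperparameters are library defaults or arbitrary choices, and the inner fold count of 5 is an assumption. LR, SVM (RBF kernel), and KNN act as base learners, and logistic regression, which applies an L2 (ridge) penalty by default in scikit-learn, acts as the metalearner.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for image embeddings: 300 samples, 40 features, 3 classes.
# (Assumption: real inputs would be embedding vectors from a pretrained CNN.)
X, y = make_classification(
    n_samples=300,
    n_features=40,
    n_informative=10,
    n_classes=3,
    random_state=0,
)

# Base learners; each is scaled inside its own pipeline to avoid data leakage.
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
]

# The metalearner is trained on out-of-fold predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # inner folds used to build the metalearner's training features
)

# Tenfold cross-validation of the whole stacked model.
scores = cross_val_score(stack, X, y, cv=10, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"Mean tenfold accuracy: {mean_accuracy:.3f}")
```

Training the metalearner on out-of-fold predictions is what keeps it from simply memorizing the base learners' fit to the training data; the accuracy printed here reflects only the synthetic data and says nothing about tassel images.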
A simple corresponding plot of prediction errors was created for each of the evaluation datasets in order to investigate the effect of the various tassel traits on classification ability (Figure 10). The prediction errors indicate how uncertain the model is when predicting the tassel type. The plot indicates that the majority of misclassification occurs in the female and contamination lines, whereas male lines exhibit less misclassification. Breeding of hybrid maize is essential for developing new plant varieties with higher yield potential and disease resistance. Tassel morphology is considered one of the most essential traits for generating high-purity maize hybrids [29]. Several lines were observed with slightly similar tassel morphological characteristics, including Mal03, CLYN 230, and CY7 (parental lines of other varieties), which can lead to misclassification. The male parent also showed attributes slightly similar to those of the Mr14 line (a parental line of other varieties).
Further, the CNN model is sensitive to slight changes in the tassel input. Tassel emergence mainly occurred between 50 and 54 days after planting under a tropical lowland environment. However, tassels that emerged 3-4 days earlier produced a different pattern of pollen shed and tassel hanging. After 7-10 days of emergence, the tassel gradually dries out, which could be attributed to environmental conditions. Illumination and image background were factors that affected the proper detection of the tassel at random moments. Under natural conditions, information extraction from the plant produces substantial amounts of noise from sources such as the surrounding leaves, skylight, and sunshine illumination. Furthermore, [30] explained that advances in computer vision allow the machine to identify and detect tassel location, although accuracy remains challenging because it is influenced by background information.
In addition, [31] reported that the overlapping technique used in this study is commonly used worldwide in maize tassel detection, with global regression [32][33][34] as well as local regression [35][36][37][38].

Interactive Data Visualization.
The CNN method is a complex model, and many researchers do not yet fully understand how the network operates and behaves in achieving good performance. In many cases, trial and error methods have been deployed to ascertain the tuning parameters and network architecture that produce optimal results. During the training process, a layer visualization was performed to observe how the model identified the inputs that were entered into it. The weights of the convolution process were visualized to estimate how well the model can be trained. A well-trained model usually has smooth and continuous filters, whereas an overfitting model displays a pattern with lots of noise [39]. The visualizations became more complex as they encoded the deeper feature extraction being performed. Furthermore, a fully connected layer was used to reconnect all nodes and determine which node was the most correlated with a particular class. Although the training visualization process was carried out with regard to the differences in tassel morphology, off-type tassels are sometimes complicated to analyse even by visual observation.
Figure 9: Comparison measures to determine the robustness of the models.
To explore misclassification in more depth and make sense of it, a multidimensional scaling (MDS) analysis was applied to the model. First, all vector parameters of the deep learning model were fed into a hierarchical clustering widget to generate a tassel dendrogram that groups the input parameters by similarity according to their target class.
The Euclidean distance was applied to calculate the distances between the embedded image parameters, and the result was displayed by constructing a connecting dendrogram. For further exploration, an MDS analysis was used to project the multidimensional data to a 2D space to allow a deeper understanding of the dataset's underlying pattern.
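The projection step can be illustrated with a short NumPy sketch. The text does not state which MDS variant was used, so classical (Torgerson) MDS is swapped in here because it has a closed-form solution; the helper names `euclidean_distance_matrix` and `classical_mds` and the synthetic 50-dimensional "embeddings" are assumptions made for illustration only.

```python
import numpy as np


def euclidean_distance_matrix(X):
    """Pairwise Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS: embed points so that their Euclidean
    distances approximate the entries of the distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Synthetic stand-in for image embeddings: three clusters of 20 points in 50-D,
# loosely mimicking three tassel classes (not data from the study).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 50))
embeddings = np.vstack([c + rng.normal(size=(20, 50)) for c in centers])

D = euclidean_distance_matrix(embeddings)
coords = classical_mds(D, n_components=2)  # one 2D point per embedding
```

Plotting `coords` coloured by class would give a scatter comparable in spirit to an MDS plot: classes whose embeddings are far apart form separate groups, while classes with similar features overlap.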
The workflow of the dendrogram and multidimensional data projection is shown in Figure 11. The highlighted MDS plot is shown in Figure 7; it is derived from three different maize line/genotype classes, each with various vector features related to the line/genotype class. There are three groups: blue for male plants (G102612), red for female plants (Mal03), and green for off-type plants. MDS separates the G102612 and Mal03 plants very well, whereas the separation is less pronounced between Mal03 and off-type plants (Figure 7). This indicates that the prediction errors mainly occurred between Mal03 and off-type plants due to their highly similar data features, which made it difficult for the model to accurately diagnose these two classes in several datasets. This unsupervised analysis is very useful in helping plant breeders gain insight into the morphological aspects and additional information from big data processing of plant features (over 3,251,763 parameters). Several studies indicated that problematic light conditions affect the overall classification of the model being tested [40]. Several false negative cases were found in the classification process, especially in tassels that had similar shapes. Breeders or technicians assigned to detasseling activities in the field are generally specialists, fully equipped with the technical knowledge to differentiate off-type lines based not only on tassel characters but also on other supporting factors such as leaf shape, type, leaf angle, and hair colour. Another essential method used to improve the accuracy of tassel-type classification was rearranging the input image around the relevant region of interest. This forced the network to identify the off-type line from the appropriate, focused area. Additionally, tassels should be captured at fixed intervals within a maximum time window, before they rupture, to prevent pollen from off-type lines from adulterating the hybrid maize seed produced.

Conclusions
The classification of lines was performed using deep learning and ensemble machine learning combinations. This pair of cutting-edge tools is capable of distinguishing between tassel morphological characteristics. Three popular CNN models were examined for the classification of hybrid maize parental lines based on their tassel characteristics: Inception V3, SqueezeNet, and EfficientNet. To summarize, stacking with advanced deep learning as the base learner and logistic regression as the metalearner may be considered the optimal classification model for tassel classification and removal during seed production. The accuracy of the stacking model Inception V3-logistic regression is 98 percent. It also shows that model stacking is effective for high-dimensional features. Similarly, SqueezeNet and EfficientNet B1 demonstrated slightly lower performance, despite the fact that their scores are comparable. To further investigate, a multidimensional scaling analysis was performed by projecting multidimensional data to a two-dimensional space. This analysis revealed that prediction errors occurred primarily between female and off-type plants due to the highly similar data features, making it difficult for the model to accurately diagnose these two classes across multiple datasets. The ensemble model is advantageous for assisting plant breeders in gaining insight into morphological aspects and additional information derived from big data processing of plant characteristics. Further research is required to incorporate machine learning into smartphones to enable in-field and real-time tassel classification.

Data Availability
All data used to support the findings of this work are available upon request from the corresponding author.

Conflicts of Interest
The authors declare no conflicts of interest.