Intelligent Deep Learning Enabled Oral Squamous Cell Carcinoma Detection and Classification Using Biomedical Images

Oral cancer is one of the most lethal malignant tumors globally, and it has become a challenging health issue in developing and low-to-middle-income countries. The prognosis of oral cancer remains poor because over 50% of patients are diagnosed at advanced stages. Earlier detection and screening models for oral cancer are mainly based on experts' knowledge, which necessitates an automated tool for oral cancer detection. Recent developments in computational intelligence (CI) and computer vision-based approaches help to accomplish enhanced performance in medical-image-related tasks. This article develops an intelligent deep learning enabled oral squamous cell carcinoma detection and classification (IDL-OSCDC) technique using biomedical images. The presented IDL-OSCDC model involves the recognition and classification of oral cancer in biomedical images. The proposed IDL-OSCDC model employs Gabor filtering (GF) as a preprocessing step to eliminate noise content. In addition, the NasNet model is exploited for the generation of high-level deep features from the input images. Moreover, an enhanced grasshopper optimization algorithm (EGOA)-based deep belief network (DBN) model is employed for oral cancer detection and classification. The hyperparameter tuning of the DBN model is performed using the EGOA algorithm, which in turn boosts the classification outcomes. The experimental outcomes of the IDL-OSCDC model on a benchmark biomedical imaging dataset highlighted its promising performance over the other methods, with maximum accuracy, precision, recall, and F-score of 95%, 96.15%, 93.75%, and 94.67%, respectively.


Introduction
Oral cancer is a leading cancer globally and is characterized by late diagnosis and high morbidity and mortality rates. Two-thirds of all occurrences arise in low- and middle-income countries (LMICs), and half of the cases are in South Asia [1,2]. Excessive use of alcohol and tobacco is the main risk factor for oral tumors. The major factor in South and Southeast Asia is betel quid chewing, which usually comprises slaked lime, betel leaf, and areca nut and might include tobacco [3]. Currently, these products are commercially offered in sachets and are popular with the public because of dynamic marketing strategies. Oral lesions are associated with late presentation, mainly in LMICs, where around two-thirds present at a late stage; consequently, the survival rate is poor [4]. Cancer management, particularly at a late stage, is very expensive [5]. The lack of knowledge among health professionals and the lack of public awareness concerning oral lesions are major reasons for late diagnosis. The diagnosis of oral potentially malignant disorders (OPMDs), which carry a risk of malignant transformation, is of great significance in reducing mortality and morbidity from oral tumors and has been the major emphasis of screening programs [6]. However, the application of such programs, which depend on visual inspection, has turned out to be challenging in real-world settings because they rely on healthcare professionals who may not be experienced or adequately trained to identify these lesions [7,8].
Early identification of oral squamous cell carcinoma (OSCC) is of significant importance for improved diagnosis, treatment, and survival [5,6]. Late diagnosis has hampered the quest for precision medicine despite advancements in the understanding of the molecular mechanisms of cancer. Thus, machine learning (ML) and deep learning (DL) models have been employed to improve recognition and thereby reduce cancer-specific death rates and morbidity [7]. Automated image examination clearly has the potential to assist pathologists and clinicians in the earlier detection of OSCC and in management decision-making.
The considerable heterogeneity in the presentation of oral cancer makes detection highly complex for healthcare professionals and is a common cause of delays in patient referral to oral lesion specialists [9]. In addition, early-stage OSCC lesions and OPMDs are generally asymptomatic and might appear as small, harmless lesions, leading to late presentation of the patient and eventually to diagnostic delay [10,11]. Advancements in the fields of deep learning and computer vision offer an effective way to develop adjunctive technology that could implement automatic screening of the oral cavity and provide feedback to individuals for self-examination and to healthcare professionals during patient examination.
Bhandari et al. [12] aimed to improve the performance of oral tumor detection and classification within a reduced processing time.
The presented technique comprises a convolutional neural network (CNN) with an adapted loss function to minimize the error in classifying and predicting oral tumors by supporting multiclass classification and reducing over-fitting of the data. Lu et al. [13] presented an automatic approach for oral tumor diagnosis on slide cytology images. The pipeline comprises per-cell focus selection, CNN-based classification, and fully convolutional regression-based nucleus recognition. The proposed method offers faster per-cell focus decisions at human-level accuracy. Song et al. [14] introduced an image classification method based on autofluorescence and white-light images with a DL method. The data are fused, extracted, and calculated to feed the DL neural network. They then compared and investigated the efficiency of regularization, convolutional neural network, and transfer learning techniques for classifying oral tumors.
Figueroa et al. [15] designed a DL training model that provides interpretability for its predictions and guides the network to remain focused and precisely delineate the tumorous region of the image. Lim et al. [16] developed a DL architecture called D'OraCa to categorize oral lesions using photographic images. It develops a mouth landmark recognition method for oral images and integrates it with oral cancer classification as guidance to enhance classification performance. Shamim et al. [17] applied and evaluated the effectiveness of six deep convolutional neural network (DCNN) models with transfer learning for recognizing precancerous tongue lesions from a small dataset. The DCNN models can distinguish between five kinds of tongue cancer and differentiate between benign and precancerous tongue lesions.
In comparison with conventional ML models, DL models take raw input and do not involve a complicated feature extraction process. Besides, heterogeneous patterns can result in variance across distinct instances, causing complexity in handcrafted features with restricted generalization ability. In addition, DL models exhibit high scalability owing to their capability of processing large amounts of data. The considerable heterogeneity in the presentation of oral lesions makes the detection process difficult and is considered to be the leading cause of delays in patient referrals to oral cancer specialists. In addition, early-stage OSCC lesions remain symptomless and may look like small, inoffensive lesions, resulting in late presentation of the patient and eventually leading to additional diagnostic delay.
Therefore, effective OSCC classification models need to be designed.
This article presents an intelligent deep learning enabled oral squamous cell carcinoma detection and classification (IDL-OSCDC) model using biomedical images.
The suggested model employs Gabor filtering (GF) as a preprocessing step to eliminate noise content. In addition, the NasNet model is exploited for the generation of high-level deep features from the input images. Moreover, an enhanced grasshopper optimization algorithm (EGOA)-based deep belief network (DBN) model is employed for oral cancer detection and classification. The hyperparameter tuning of the DBN model is performed using the EGOA algorithm, which in turn boosts the classification outcomes. The experimental outcomes of the IDL-OSCDC model are evaluated using a benchmark biomedical imaging dataset. The rest of the paper is organized as follows: Section 2 provides the proposed IDL-OSCDC model, and Section 3 offers the performance validation. Finally, Section 4 concludes the study.

The Proposed IDL-OSCDC Model
In this article, a novel IDL-OSCDC model is introduced for the identification and classification of oral tumors using biomedical images. At the initial stage, the IDL-OSCDC model utilizes the GF technique to remove noise content. Following this, the NasNet model is exploited for the generation of high-level deep features from the input images. Finally, the EGOA-DBN model is utilized to detect and categorize oral cancer. Figure 1 illustrates the overall process of the IDL-OSCDC technique.
2.1. Image Preprocessing Using GF Technique. In this study, the IDL-OSCDC model utilizes the GF technique to remove noise content. The GF is a bandpass filter that is effectively employed for a variety of image processing and machine vision applications. In 2D, the Gabor function is an oriented complex sinusoidal grating modulated by a 2D Gaussian envelope. In a 2D coordinate (a, b) scheme, the GFs comprising real and imaginary components are given as [18]

G_real(a, b) = exp(−(a′² + c²b′²)/(2σ²)) · cos(2πa′/δ + ψ),
G_imag(a, b) = exp(−(a′² + c²b′²)/(2σ²)) · sin(2πa′/δ + ψ),

where a′ = a cos θ + b sin θ and b′ = −a sin θ + b cos θ, in which δ implies the wavelength of the sinusoidal factor and θ signifies the orientation separation angle of the Gabor kernel. Notably, θ needs to be taken only from the interval [0°, 180°], as symmetry makes other directions redundant. ψ defines the phase offset, σ denotes the standard deviation (SD) of the Gaussian envelope, and c denotes the spatial aspect ratio (the default value is 0.5) identifying the ellipticity of the support of the Gabor function. The parameter σ is determined by δ and the spatial frequency bandwidth bw as

σ = (δ/π) · sqrt(ln 2/2) · (2^bw + 1)/(2^bw − 1).

Following preprocessing, the NASNet model generates deep features. The search space involved in NASNet is the factorization of the network into cells, which are in turn split into blocks. The number and type of cells/blocks are not predefined; they are optimized for the chosen dataset. The possible operations of a block comprise convolution, separable convolution, max pooling, average pooling, and identity mapping. A block maps two inputs into an output feature map by element-wise addition. When a cell receives a feature map of size H × W with a stride of 1, the outcome is a feature map of identical size. Figure 2 depicts the framework of the NASNet model.
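As an illustration of this preprocessing step, the real component of the Gabor kernel and a naive filtering pass can be sketched in numpy. This is a minimal sketch under the parameterization above (δ wavelength, θ orientation, ψ phase offset, σ envelope SD, c aspect ratio), not the implementation used in this work; the direct convolution is written for clarity rather than speed.

```python
import numpy as np

def gabor_kernel_real(size, delta, theta, psi=0.0, sigma=2.0, c=0.5):
    """Real component of a 2D Gabor kernel: a sinusoidal grating
    modulated by a Gaussian envelope (delta = wavelength, theta =
    orientation, psi = phase offset, sigma = Gaussian SD, c = aspect ratio)."""
    half = size // 2
    b, a = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # rotate coordinates by the orientation angle theta
    a_r = a * np.cos(theta) + b * np.sin(theta)
    b_r = -a * np.sin(theta) + b * np.cos(theta)
    envelope = np.exp(-(a_r**2 + (c * b_r)**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * a_r / delta + psi)

def gabor_filter(image, kernel):
    """Naive 'same'-size 2D convolution with zero padding, for illustration."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.empty_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out
```

In practice, a bank of such kernels over several orientations θ and wavelengths δ would be applied to each input image before feature extraction.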
Once the stride is 2, the size is decreased by a factor of 2. The cells are combined by an optimization method. The network search concentrates on three features: the cell structure, the number of stacked cells (N), and the number of filters in the first layer (F). Initially, N and F are fixed in the search; then, N and F of the first layer are changed to control the depth and width of the network. Once the search is complete, models of various sizes are created to fit the dataset. The cells are then connected by an optimization method to develop the NASNet architecture. Every cell is associated with two input states named hidden states. To provide higher accuracy, NASNetLarge sets N to 6, whereas the essential concern of NASNetMobile is running with restricted resources. For both normal and reduction cells, an input of size 224 × 224 × 3 is decreased to a size of 7 × 7 at the output with a chosen group of functions [20].

The DBN classifier is built by stacking restricted Boltzmann machines (RBMs). The input (visible) layer is placed at the front end of the model, and the features are passed through several hidden layers during the learning procedure. Finally, the proper class label is allocated at the output layer. An RBM comprises input and hidden layers with bidirectional links between the two layers. Consider m units in the input layer with vector v = (v_1, v_2, ..., v_i, ..., v_m) and n units in the hidden layer with vector h = (h_1, h_2, ..., h_n). The energy function of the RBM can be represented as

E(v, h; θ) = −Σ_i a_i v_i − Σ_j b_j h_j − Σ_i Σ_j v_i ω_ij h_j,

where θ signifies the parameters of the RBM, comprising the input-layer unit biases a_i, the hidden-layer unit biases b_j, and the link weights ω_ij between the nodes of the input and hidden layers. Based on the energy function of the RBM model, the joint distribution can be defined as

p(v, h; θ) = exp(−E(v, h; θ)) / R(θ),

where R(θ) = Σ_v Σ_h exp(−E(v, h; θ)) is termed the normalization factor.
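The energy function and joint distribution can be checked numerically by enumerating a toy binary RBM; the unit counts and random parameters below are illustrative only, chosen small enough that the normalization factor R(θ) can be summed exactly.

```python
import numpy as np
from itertools import product

# Tiny binary RBM: m = 3 visible units, n = 2 hidden units.
m, n = 3, 2
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(m, n))   # link weights w_ij
a = rng.normal(scale=0.1, size=m)        # visible biases a_i
b = rng.normal(scale=0.1, size=n)        # hidden biases b_j

def energy(v, h):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i w_ij h_j."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

# Normalization factor R = sum over all binary (v, h) configurations.
states_v = [np.array(s, dtype=float) for s in product([0, 1], repeat=m)]
states_h = [np.array(s, dtype=float) for s in product([0, 1], repeat=n)]
R = sum(np.exp(-energy(v, h)) for v in states_v for h in states_h)

def p_joint(v, h):
    """Joint distribution p(v, h) = exp(-E(v, h)) / R."""
    return np.exp(-energy(v, h)) / R
```

Summing p_joint over all configurations recovers 1, confirming that R(θ) normalizes the distribution.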
The independent probability distribution of the input layer can be formulated as

p(v; θ) = Σ_h p(v, h; θ).

As there exist no links among the nodes within the same layer, the conditional probability distributions of the layers can be defined as

p(h_j = 1 | v) = σ(b_j + Σ_i v_i ω_ij),
p(v_i = 1 | h) = σ(a_i + Σ_j ω_ij h_j),

where σ(x) = 1/(1 + exp(−x)) indicates the sigmoid function. The intention of the RBM is the maximization of the probability p(v) via modifying the biases a_i and b_j and the weights ω_ij. The RBM parameter set θ = {a_i, b_j, ω_ij} is obtained from the training data by the use of the maximum likelihood estimation approach. The gradient values of the parameters can be represented as

∂ log p(v)/∂ω_ij = ⟨v_i h_j⟩_data − ⟨v_i h_j⟩_model,

where ⟨·⟩_data signifies the expectation under the data distribution p(h | v) derived by the RBM, and ⟨·⟩_model characterizes the expectation under the probability p(v, h) provided by the reconstructed RBM. In practice, the parameter set θ is updated using the contrastive divergence method.
The parameters are then updated as

ω_ij ← ω_ij + (α/β) (⟨v_i h_j⟩_data − ⟨v_i h_j⟩_model),

with analogous updates for a_i and b_j, where α and β indicate the learning rate and batch size, respectively. Once the training of one RBM is done, its hidden layer becomes the visible layer of the succeeding RBM. Once every RBM is trained, the deep features are classified.
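One contrastive divergence (CD-1) update for a binary RBM can be sketched as below. This is a minimal numpy sketch assuming the batch-averaged gradient estimates above; the function name and parameter values are illustrative, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, alpha=0.1, rng=None):
    """One CD-1 step for a binary RBM.
    v0: batch of visible vectors (batch, m); W: (m, n) weights;
    a: visible biases (m,); b: hidden biases (n,)."""
    if rng is None:
        rng = np.random.default_rng(0)
    # positive phase: p(h = 1 | v0), then sample binary hidden states
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # negative phase: reconstruct v, then p(h = 1 | v1)
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    batch = v0.shape[0]
    # gradient estimates: <v h>_data - <v h>_model, averaged over the batch
    W += alpha * (v0.T @ ph0 - pv1.T @ ph1) / batch
    a += alpha * (v0 - pv1).mean(axis=0)
    b += alpha * (ph0 - ph1).mean(axis=0)
    return W, a, b
```

Stacking such RBMs and feeding each trained hidden layer as the next visible layer yields the greedy layer-wise DBN pretraining described above.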

Hyperparameter Optimization: EGOA Algorithm. The hyperparameter tuning of the DBN model is performed using the EGOA algorithm, which in turn boosts the classification outcomes. The GOA emulates the behavior of grasshopper insects.
This insect affects agriculture and crop productivity, and its life cycle comprises egg, nymph, and adulthood stages [21]. In the nymph stage, the key feature is moving and jumping like a rolling cylinder (with slow movement and small steps). In the adulthood stage, grasshoppers migrate longer distances in a swarm (with long-range and abrupt movements). Such behaviors are mathematically expressed by taking the position of the ith grasshopper x_i into account:

X_i = S_i + G_i + A_i,
where S_i signifies the social interaction of the ith grasshopper, given as

S_i = Σ_{j=1, j≠i}^{N} s(d_ij) d̂_ij,

in which d_ij = |x_j − x_i| indicates the distance between the ith and jth grasshoppers, d̂_ij = (x_j − x_i)/d_ij is the corresponding unit vector, and s denotes the strength of the social force function:

s(r) = f e^{−r/l} − e^{−r},
in which G_i and A_i represent the gravity force and wind advection for the ith grasshopper, respectively, and l and f indicate the attractive length scale and the intensity of attraction. The gravity and wind terms are given as

G_i = −g ê_g,  A_i = u ê_w,

where ê_w and ê_g represent the unit vectors in the direction of the wind and toward the center of the Earth, and g and u represent the gravitational constant and a constant drift, respectively. However, this position model cannot be directly used to solve optimization problems; hence, it is rewritten as

x_i^d = c ( Σ_{j=1, j≠i}^{N} c ((u_d − l_d)/2) s(|x_j^d − x_i^d|) ((x_j − x_i)/d_ij) ) + T_d,

where l_d and u_d represent the lower and upper bounds of the search region in the dth dimension, respectively; T_d denotes the dth dimension of the best solution found so far; and s is the social force function defined above. In this form, gravity is not taken into account, and the direction of the wind is assumed to point toward T_d. Here, c represents a reduction coefficient that shrinks the attraction, comfort, and repulsion zones:
c = c_max − t (c_max − c_min)/t_max,

where c_max and c_min represent the maximum value (set to 1) and minimum value (set to 0.00001) of c, respectively; t denotes the current iteration, and t_max represents the maximum number of iterations. Finally, the pseudocode of the GOA is given in Algorithm 1.
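The rewritten position update and the linearly decreasing coefficient c can be sketched as follows. The social force constants f = 0.5 and l = 1.5 are typical defaults assumed here, and the sphere function stands in for the actual DBN hyperparameter fitness; this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def s_func(r, f=0.5, l=1.5):
    """Social force: attraction intensity f, attractive length scale l."""
    return f * np.exp(-r / l) - np.exp(-r)

def c_coeff(t, t_max, c_max=1.0, c_min=1e-5):
    """Linearly decreasing reduction coefficient c."""
    return c_max - t * (c_max - c_min) / t_max

def goa_step(X, T, c, lb, ub, eps=1e-12):
    """One GOA position update (gravity ignored, wind directed toward
    the best solution T, as in the rewritten update equation)."""
    N, d = X.shape
    X_new = np.empty_like(X)
    for i in range(N):
        social = np.zeros(d)
        for j in range(N):
            if i == j:
                continue
            dist = np.linalg.norm(X[j] - X[i])
            unit = (X[j] - X[i]) / (dist + eps)
            social += c * (ub - lb) / 2.0 * s_func(dist) * unit
        X_new[i] = np.clip(c * social + T, lb, ub)
    return X_new

# Toy run: minimize the sphere function over [-5, 5]^2.
rng = np.random.default_rng(1)
lb, ub = -5.0, 5.0
X = rng.uniform(lb, ub, size=(20, 2))
t_max = 50
best = min(X, key=lambda x: np.sum(x**2)).copy()
for t in range(1, t_max + 1):
    X = goa_step(X, best, c_coeff(t, t_max), lb, ub)
    cand = min(X, key=lambda x: np.sum(x**2))
    if np.sum(cand**2) < np.sum(best**2):
        best = cand.copy()
```

In the proposed model, the fitness function would instead score a DBN configuration (e.g., by validation error) at each candidate position.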
In the EGOA, the opposition-based learning (OBL) approach is utilized for determining the opposite of the existing solution; the fitness function f then determines whether the opposite is superior to the existing solution. The fundamental definition of OBL is presented in [22]: for a real value x ∈ [l, u], the opposite value x̄ is computed as

x̄ = l + u − x.

This definition is generalized to n dimensions using

x̄_i = l_i + u_i − x_i,  i = 1, 2, ..., n,

where x̄ ∈ R^n refers to the opposite vector of the real vector x ∈ R^n. During the optimization procedure, the two solutions x and x̄ are evaluated, and the better solution is retained while the other is eliminated by comparing the fitness function. For example, if f(x̄) < f(x) (for minimization), x̄ is stored; otherwise, x is saved.
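The OBL step can be sketched in a few lines; the bounds, test point, and shifted sphere fitness below are illustrative assumptions, chosen only to show the opposite point winning the comparison.

```python
import numpy as np

def opposite(x, lower, upper):
    """Opposition-based learning: element-wise opposite point in [lower, upper]."""
    return lower + upper - x

def obl_select(x, lower, upper, fitness):
    """Keep the better of x and its opposite (minimization)."""
    x_opp = opposite(x, lower, upper)
    return x_opp if fitness(x_opp) < fitness(x) else x

# Illustrative usage: shifted sphere fitness over [0, 4]^3.
fitness = lambda v: float(np.sum((v - 1.0)**2))
lower, upper = np.zeros(3), np.full(3, 4.0)
x = np.array([3.5, 3.0, 2.0])
best = obl_select(x, lower, upper, fitness)
```

Applied to each grasshopper position, this doubles the effective exploration of the search space at the cost of one extra fitness evaluation per candidate.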

Results and Discussion
This section investigates the oral cancer classification performance of the IDL-OSCDC model using a benchmark Kaggle repository dataset [23]. The dataset includes images of lips and tongue, which are classified into cancerous and noncancerous groups. A sample image is demonstrated in Figure 3. Table 1 and Figure 5 report the extensive oral cancer classification performance of the IDL-OSCDC approach on the training and test datasets. The results are inspected under distinct sizes of training/testing (TR/TS) data. The experimental outcomes signify that the IDL-OSCDC model reaches proficient values under all sizes of TR/TS data. For instance, with a TR/TS split of 90 : 10, the IDL-OSCDC model provides accuracy, precision, recall, and F-score of 92.86%, 90%, 95%, and 91.81%, respectively.
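The reported measures follow the standard confusion matrix definitions, which can be sketched as below; the counts in the usage example are illustrative only and are not the confusion matrix from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures for binary classification."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # sensitivity
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

# Illustrative counts only (not the paper's confusion matrix).
acc, prec, rec, f1 = classification_metrics(tp=45, fp=5, tn=40, fn=10)
```

Note that the F-score is the harmonic mean of precision and recall, equivalently 2·TP/(2·TP + FP + FN).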
A brief precision-recall examination of the IDL-OSCDC model on different TR/TS datasets is portrayed in Figure 6. By observing the figure, it is noticed that the IDL-OSCDC model has accomplished maximum precision-recall performance under all datasets.

Computational Intelligence and Neuroscience
Initialize the parameter values, namely, the population size (N), c_max, c_min, and the maximum number of iterations (t_max)
Randomly produce a population X
Set the current iteration t = 1
While (t < t_max) do
  Calculate the fitness function f
  Choose the optimal solution T_d
  Update the value of c by equation (15)
  for i = 1 : N do
    Normalize the distances among the solutions in X
    Update the position of the ith solution
  end for
  t = t + 1
end while
Return the optimal solution T_d

ALGORITHM 1: Pseudocode of the GOA.

Figure 7 demonstrates the ROC inspection of the IDL-OSCDC model under different sets of training and testing data. The results indicate that the IDL-OSCDC model achieves the highest performance on the testing dataset. Figure 8 illustrates the training and validation accuracy of the IDL-OSCDC approach on the applied dataset. The figure conveys that the IDL-OSCDC model offers maximum training/validation accuracy in the classification process.
Next, Figure 9 represents the training and validation loss of the IDL-OSCDC model on the applied dataset. The figure reports that the IDL-OSCDC model exhibits reduced loss values. Table 2 provides a comparative study of the IDL-OSCDC technique with recent approaches [24]. Figure 10 inspects the detailed accuracy of the IDL-OSCDC model against other models. The figure reveals that the SVM method results in the least performance, with a low accuracy of 88.38%. In addition, the ANN-SVM technique reaches a slightly enhanced outcome with an accuracy of 90.48%, whereas the fuzzy technique depicts a moderately improved accuracy of 92.76%. Following this, the RF and CapsNet techniques show closer results than the other methods. However, the IDL-OSCDC model shows an effectual outcome with a maximum accuracy of 95%. Figure 11 examines the detailed precision, recall, and F-measure of the IDL-OSCDC model compared with other techniques. The figure exposes that the SVM system results in the least performance, with precision, recall, and F-measure of 89.82%, 90.65%, and 88.01%, respectively. Furthermore, the ANN-SVM model reaches a slightly enhanced outcome with precision,

Conclusion
In this article, a novel IDL-OSCDC model has been developed for the identification and classification of oral lesions using biomedical images. At the initial stage, the IDL-OSCDC model utilizes the GF technique to remove noise content. Following this, the NasNet model is exploited for the generation of high-level deep features from the input images. Finally, the EGOA-DBN model is utilized to detect and categorize oral cancer. The hyperparameter tuning of the DBN model is performed using the EGOA algorithm, which in turn boosts the classification outcomes. The experimental outcomes of the IDL-OSCDC model are evaluated using a benchmark biomedical imaging dataset. An extensive comparison study highlighted its promising performance over the other methods. In the future, advanced DL models can be utilized as classifiers to optimize the detection performance.

Data Availability
Data sharing is not applicable to this article as no datasets were generated during the current study.

Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.

Consent
Consent is not applicable in this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest to report regarding the present study.