Multimodal Brain Tumor Classification Using Convolutional Tumnet Architecture

Brain malignancy is among the most common and aggressive tumors, and grade IV disease carries a short life expectancy. A sound medical plan, comprising both diagnosis and therapy, is therefore a crucial step toward improving patient well-being. Brain tumors are commonly imaged with magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT). This paper proposes multimodal fused imaging with classification and segmentation of brain tumors using a deep learning method. MRI and CT brain tumor images of the same slices (308 slices of meningioma and sarcoma) are combined using three types of pixel-level fusion. The presence or absence of a tumor is classified using the proposed Tumnet technique, and the tumor area is then located. Tumnet is also applied to single-modal MRI/CT images (561 slices) for classification. The proposed Tumnet comprises 5 convolutional layers, 3 pooling layers with the ReLU activation function, and 3 fully connected layers. For the averaging method on MRI-CT images, the first-order statistical fusion metrics are SSIM (tissue) of 83%, SSIM (bone) of 84%, accuracy of 90%, sensitivity of 96%, and specificity of 95%; the second-order statistical fusion metrics are a fused-image standard deviation of 79% and an entropy of 0.99. The entropy value confirms the presence of additional features in the fused image. The proposed Tumnet yields a sensitivity of 96%, an accuracy of 98%, a specificity of 99%, and normalized values of mean 0.75, standard deviation 0.4, variance 0.16, and entropy 0.90.


Introduction
A tumor is an abnormal mass detected inside or on the brain; it is a solid or fluid-filled mass of aberrant tissue, also known as a neoplasm. According to global cancer registry data, cancer cases across both sexes account for approximately 18,000,000, with around 20,000 instances of brain tumors. Very high HDI (human development index) regions had the highest occurrence (102,260 cases, or 34.4%) and mortality (77,815 cases, or 32.3%) [1]. Several imaging approaches, namely MRI, PET, and CT, are used to diagnose a tumor's position and size. MRI is a noninvasive technique that produces comprehensive 3D anatomical images [2]. CT is another imaging technique that provides tumor information within a few seconds. PET, in contrast, is a functional imaging technique [3]. Image fusion collects the necessary data from several images and fuses them into a single image. The single fused image is more informative than any of the inputs and contains all the mandatory data [4]. The objective of image fusion is not just to reduce the amount of information but also to create images better suited for interpretation by humans and machines. Image fusion is useful in medical imaging because it improves radiologists' detection of abnormalities in CT and MR brain images [5]. It yields fused pictures that are more insightful than the separate inputs, making them more appropriate for classification problems [6]. It decreases the volume of data, retains significant features, removes artifacts, and provides an output image better suited for interpretation. Image fusion can be broadly categorized as follows:

(i) Multimodal image fusion
(ii) Multiview image fusion
(iii) Multifocus image fusion
(iv) Multitemporal image fusion

Multimodal fusion combines images from different imaging sensors and is used for medical diagnosis and security. A multiview image is a single-sensor image captured from different viewpoints. Multifocus images are captured at multiple focal lengths of the imaging equipment. Multitemporal refers to pictures taken at various intervals of time. There are various stages for grading image fusion processes. Among these four acquisition-based categories, multimodal fused images are the ones employed in biomedical processing. This is because they integrate multiple pieces of information into a single image, an essential requirement for physicians to conduct in-depth analysis and proceed with further assessments [7].
The three types of fusion are feature-level fusion, pixel-level fusion [5], and decision-level fusion. Pixel-level image fusion is considered the simplest and most effective method of analysis [8]. Unlike the alternatives, it directly integrates the original information from the source images to generate a combined image that is richer in information for both computer processing and visual perception [9]. In these approaches, the resulting fused image takes either the maximum, minimum, or average of corresponding pixels from the two input images. Feature-level fusion extracts features such as edges and textures and then fuses these supplementary features. In decision-level fusion, a decision is obtained from each source image through certain criteria, and the resulting information is then fused [9]. Among these techniques, pixel-level fusion is the simplest, feature-level fusion removes redundancy, and decision-level fusion is the most robust. Among image segmentation methods, segmentation based on artificial intelligence (AI) comparatively improves accuracy and also saves time. Artificial neural networks (ANNs) are among the most dominant AI techniques available; they can categorize and quantify lesions with pinpoint accuracy and mimic the clinical evaluation of a given problem [8].
An artificial neural network separates the defective regions of a picture by pixel-by-pixel processing [10]. After statistical features are extracted from the problematic regions, a supervised algorithm grades the image [11]. AI systems attract significant interest in the field of medical diagnosis using machine learning and image processing [12].
The most common application of artificial intelligence in medical image classification and recognition is the artificial neural network approach [13]. The artificial neural network is a significant method for the effective identification of brain tumors, and steps are taken in this work to detect brain tumors accurately [10]. Although other segmentation methods have their own merits and demerits, the convolutional neural network method bases its decisions on learning from the given set of images [11].
Our work's contribution is as follows:

Related Work
An extensive literature survey was conducted on the fusion, classification, and segmentation of brain tumor images, with a detailed analysis of the various techniques for each. Liu et al. [12] provided a thorough analysis of recent advances in deep learning-based pixel-level image fusion techniques. They discuss current state-of-the-art approaches, including single-shot and multiscale fusion methods, and analyze their benefits and drawbacks. Altaf [14] proposed a method for accurately delineating the gross tumor volume in brain gliomas using CT-MRI image fusion. The author presents a framework that integrates different image processing techniques, including registration, segmentation, and fusion, to achieve better tumor volume estimation. Selvakumar et al. [15] proposed a method for neoplastic segmentation and area calculation using fuzzy C-means and K-means clustering algorithms. The authors discuss the merits and demerits of each technique and show that the proposed method can achieve better segmentation results than other state-of-the-art methods. Rammurthy and Mahesh [16] proposed an MRI image-based deep learning method for the identification of brain tumors, employing a Whale Harris Hawks optimization algorithm to train a CNN for classification. A further review provides useful insights into cutting-edge methodologies for medical image registration and fusion [17]. Maqsood et al. [7] proposed a technique to identify brain tumors using image fusion based on CNN. Selvakumar et al. [15] also presented a narrative review of recent brain image segmentation methods, discussing techniques such as deep learning, clustering, and graph-based methods and highlighting their advantages and disadvantages. Pereira et al.
[18] developed a CNN-based method for segmenting brain tumors in MRI images. They used a small kernel of size 3 × 3 and attained an average Dice coefficient of 0.74. Ramamoorthy and Banu [19] presented a review of video enhancement techniques for medical and surveillance applications, discussing approaches such as contrast enhancement, noise reduction, and superresolution. Bhandari et al. [20] suggested a CNN-based segmentation of brain lesions; they used a 3D CNN architecture and attained a Dice coefficient of 0.80. Ranjbarzadeh et al. [21] developed a segmentation method for brain lesions in multimodal MRI images implementing deep learning and attention mechanisms. The authors of [22] proposed a thresholding-based method for medical image segmentation and discussed various applications of thresholding techniques. Arif et al. [23] presented a technique for brain tumor identification and classification by means of a biologically inspired orthogonal wavelet transform and deep learning techniques. Bahadure et al. [24] proposed a method for MRI-based brain tumor detection and feature extraction using biologically inspired BWT and SVM, obtaining an accuracy of 95.45% for tumor detection and 91.67% for feature extraction.
Rammurthy and Mahesh [16] proposed a deep learning classifier for the identification of brain tumors utilizing MRI images. The classifier relies on Whale Harris Hawks optimization (WHHO) and can accurately classify MRI images as tumor or no tumor. Çinar and Yildirim [25] suggested a hybrid CNN architecture for brain tumor detection on MRI images; the proposed architecture combines convolutional neural networks with handcrafted features and can accurately detect brain tumors. Nayak et al. [26] suggested a classification system for brain tumors using a dense EfficientNet, an approach that can accurately sort brain tumors into distinct groups. Isin et al. [27] provide an overview of deep learning techniques for MRI-based brain tumor image segmentation. Saravanan et al. [28] proposed a CNN-based approach for the identification and classification of glioma brain lesions. Pereira et al. [29] proposed an automatic brain tumor grading approach using CNN together with a quality assessment, which can accurately grade brain tumors based on their characteristics. Vankdothu and Hammed [30] proposed a recurrent convolutional neural network-based approach that can accurately classify brain tumor MRI images into different categories. Finally, a multilevel CNN model has been proposed for brain tumor classification in IoT healthcare systems [31].
The literature review highlights the increasing popularity of deep learning-based approaches for the detection and classification of tumors in MRI brain images. Despite these benefits, challenges remain, such as increasing detection and classification accuracy and managing the complexity of the deep learning model. The proposed system incorporates a pixel-level fusion technique for multimodal images and utilizes a simple thresholding technique for segmentation. A CNN model with a small kernel and minimal layers is utilized for the categorization of brain tumors. As a whole, this model improves classification accuracy.

Materials and Methods
3.1. Preprocessing. The images of the brain and its lesions can be downloaded from http://www.med.harvard.edu/AANLIB/home.html [32] and https://www.kaggle.com/datasets/navoneel/brain-mri-images-for-brain-tumordetection?resource=download [33]. The downloaded images are 256 × 256 GIF files. As preprocessing, the 256 × 256 images are resized to 227 × 227, since the Tumnet model was designed to process only 227 × 227 images, and the GIF format is converted to JPG for further processing. Meningioma and sarcoma types of brain tumors are taken from the databases. Of the 170 sets of meningioma and sarcoma available, 154 sets of MR-CT combinations (70 sets of meningioma and 84 sets of sarcoma) are taken as multimodal slices from the MedHarvard database, and 561 single-modal slices (280 MRI meningioma and 281 CT sarcoma images) are taken from the Kaggle database. This amounts to a total of 869 image slices for further processing. Our primary preprocessing concern here would be the removal of salt-and-pepper noise, but the images from these databases are already preprocessed and free of such noise; the only preprocessing step is therefore resizing the images to 227 × 227. Training and validation images account for 70% and 30% of the dataset, respectively.
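As an illustration, the resizing to 227 × 227 and the 70%/30% split described above can be sketched in Python. This is a minimal sketch, not the paper's MATLAB pipeline: it uses a nearest-neighbour resize on a stand-in array, and the GIF-to-JPG conversion step is omitted.

```python
import numpy as np

def resize_nearest(img, out_h=227, out_w=227):
    """Nearest-neighbour resize of a 2-D grayscale slice (e.g. 256x256 -> 227x227)."""
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h      # source row for each output row
    cols = np.arange(out_w) * in_w // out_w      # source column for each output column
    return img[rows][:, cols]

def train_val_split(n_slices, train_frac=0.7, seed=0):
    """Shuffle slice indices and split them 70% / 30% for training / validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_slices)
    n_train = int(round(train_frac * n_slices))
    return idx[:n_train], idx[n_train:]

img = np.zeros((256, 256), dtype=np.uint8)       # stand-in for one downloaded slice
resized = resize_nearest(img)
train_idx, val_idx = train_val_split(869)        # 869 slices in total, as above
print(resized.shape, len(train_idx), len(val_idx))   # (227, 227) 608 261
```

With the full set of 869 slices, this split yields 608 training and 261 validation images.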

Fusion
Using the Averaging Method. Images can be fused using the averaging method. This method takes two images, and the resultant image holds the average of their pixels [34]. Corresponding pixels of each image are added and divided by the number of images used. This is repeated over all pixels to obtain the output fused image. The fusion structure is depicted in Figure 1.
The fused image is given by

F(x, y) = K_i(I_1, I_2, …, I_N),

where K_i is an image fusion algorithm that produces fused images from N source images under M fusion structures. For the averaging method, the image fusion equation is

MC1(x, y) = (MRI(x, y) + CT(x, y)) / 2,

where MC1(x, y) is the fused MRI-CT image.
The image averaging method takes the corresponding pixels from the MRI and CT images [35] and fuses them by averaging [36]. This pixel-wise image fusion carries the dominant features from the MRI-CT image pair [8].
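The averaging method above, together with the maximum and minimum pixel-level alternatives compared later in the paper, can be sketched as follows. This is a minimal sketch assuming two registered, equal-size grayscale slices; the 2 × 2 arrays stand in for real 227 × 227 images.

```python
import numpy as np

def fuse(mri, ct, method="average"):
    """Pixel-level fusion of two registered, equal-size grayscale slices."""
    a, b = mri.astype(np.float64), ct.astype(np.float64)
    if method == "average":
        return (a + b) / 2.0        # MC1(x, y) = (MRI(x, y) + CT(x, y)) / 2
    if method == "maximum":
        return np.maximum(a, b)     # keep the brighter pixel of each pair
    if method == "minimum":
        return np.minimum(a, b)     # keep the darker pixel of each pair
    raise ValueError(f"unknown method: {method}")

mri = np.array([[100, 200], [50, 0]], dtype=np.uint8)
ct  = np.array([[ 60, 100], [90, 40]], dtype=np.uint8)
print(fuse(mri, ct, "average"))     # pixel averages: 80, 150, 70, 20
```

Note how the maximum and minimum variants discard one modality's value at every pixel, which is the behaviour the paper later identifies as a risk of missing tumor pixels.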

Image Segmentation with Deep Learning.
Image segmentation is an important aspect of computer vision and image processing because it allows us to interpret image content [27]. It can be used for image reduction, scene interpretation, and finding objects in medical and satellite images, among others. While several image segmentation techniques have been developed over time, deep learning for computer vision has enabled the evolution of numerous deep learning segmentation models [23]. Recurrent neural networks, convolutional neural networks, deep belief networks, and multilayer perceptrons (MLPs) are examples of deep learning techniques [24]. A CNN is a feed-forward neural network usually applied to analyzing visual images by processing information with a grid-like topology; here it is called Tumnet. The CNN, conceived along the lines of the human visual cortex, is employed to classify objects in a picture. Figures 2 and 3 present a block diagram of the proposed model and its corresponding flowchart [37]. In the proposed model, novelty is brought about by combining fusion and segmentation techniques. Input brain tumor images are initially preprocessed to make them compatible with the architecture used, fused, and passed through the architecture for feature extraction; image classification is then performed by a fully connected layer [38]. If a tumor is present, the tumor region is extracted and its area is calculated. The Tumnet method, which involves feature extraction and classification [25], is illustrated in Figure 3, and the detailed architecture of the Tumnet model is shown in Figure 4. The Tumnet algorithm extracts information from images by using the ideal number of hidden layers [26]. Convolution, ReLU, pooling, and fully connected layers make up the Tumnet model; the layer parameters and dimensions are listed in Table 1 [28].

Convolution Layer.
Convolution is a process in image processing by which features can be extracted. For example, simple low-pass filtering, high-pass filtering, and image segmentation operations all involve convolution [39]. Table 2 displays sample convolution kernels used to extract features. It is clear from these operations that feature extraction from images requires the convolution operation, which proceeds as

Z = X * f,

where X is the input image, f is the filter, and Z is the filtered image.
Based on the number of filters and layers, the CNN obtains the features of the given image in which the object is to be identified.
In general, the convolution operation is defined mathematically as

g(x, y) = Σ_i Σ_j f(x − i, y − j) h(i, j),

where f(x, y) is the input image, h is the kernel, and g(x, y) is the output image.
The kernel chosen for the convolution operation is 3 × 3, which makes the Tumnet model an optimal model to implement [12].
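A minimal sketch of valid-mode 2-D convolution, Z = X * f, with a 3 × 3 kernel can be written as follows. The low-pass (averaging) kernel is the one from Table 2; the 5 × 5 constant "image" is a toy input for illustration.

```python
import numpy as np

def conv2d(X, f):
    """'Valid'-mode 2-D convolution Z = X * f (kernel flipped, no padding)."""
    kh, kw = f.shape
    fr = f[::-1, ::-1]                      # flip the kernel for true convolution
    out_h, out_w = X.shape[0] - kh + 1, X.shape[1] - kw + 1
    Z = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            Z[i, j] = np.sum(X[i:i + kh, j:j + kw] * fr)
    return Z

low_pass = np.ones((3, 3)) / 9.0            # 3x3 averaging (low pass) kernel
X = np.full((5, 5), 9.0)                    # a constant toy "image"
Z = conv2d(X, low_pass)
print(Z.shape)                              # (3, 3); every entry is ~9.0
```

A 5 × 5 input convolved with a 3 × 3 kernel shrinks to 3 × 3, and the averaging kernel preserves a constant image, as expected.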

Pooling Layer.
The pooling layer follows the convolutional layer and subsamples the pixels to reduce computational effort without impairing the individual properties of the activation maps. The pooling layer's purpose is to combine related characteristics into one [40]. For subsampling, the minimum sampling rate, called the Nyquist rate, must be respected:

f_s ≥ 2 f_max,

where f_s is the sampling frequency and f_max is the maximum frequency present in the signal.
The pooling layer has three major types, namely, maximum pooling, minimum pooling, and average pooling. In this technique, maximum pooling is implemented to carry the dominant features from the original image. In the pooling layer, the filter size is 2 × 2 and the stride length is 2. The output dimensions of the pooling layer are

((n_h − f)/s + 1) × ((n_w − f)/s + 1) × n_c,

where n_h is the activation map height, n_w is the activation map width, n_c is the number of channels in the activation map, f is the filter size, and s is the stride length.

3.3.3. ReLU Layer. The rectified linear unit (ReLU), given by f(x) = max(0, x), has become popular in recent years alongside the more traditional sigmoids, the nonlinear functions used in neural networks.
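Max pooling with a 2 × 2 filter and stride 2, together with the output-size relation (n − f)/s + 1 per spatial dimension, can be sketched as follows on a toy 4 × 4 activation map:

```python
import numpy as np

def max_pool(A, f=2, s=2):
    """Max pooling of a 2-D activation map with filter size f and stride s."""
    n_h, n_w = A.shape
    out_h = (n_h - f) // s + 1              # (n_h - f)/s + 1
    out_w = (n_w - f) // s + 1              # (n_w - f)/s + 1
    P = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            P[i, j] = A[i * s:i * s + f, j * s:j * s + f].max()
    return P

A = np.array([[1, 3, 2, 4],
              [5, 6, 1, 0],
              [7, 2, 9, 8],
              [0, 1, 3, 4]], dtype=float)
print(max_pool(A))                          # the four 2x2 block maxima: 6, 4, 7, 9
```

Each 2 × 2 block is reduced to its maximum, halving both spatial dimensions while keeping the dominant activations.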

(i) For all x ≥ 0, the result equals the x portion of the max function; (ii) for all x < 0, the result equals the 0 portion of the max function. Because the presence of anomalies in medical images is nonlinear, this layer is in charge of nonlinear data conversion [41]. To avoid overfitting, some of the neurons are dropped out, a regularization method; the applied dropout value is 0.6.
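A minimal sketch of the ReLU activation and dropout with p = 0.6, as described above (the "inverted dropout" rescaling shown here is a common convention, not stated in the paper):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def dropout(x, p=0.6, rng=None):
    """Inverted dropout: zero roughly a fraction p of activations during training
    and rescale the survivors by 1/(1-p) so the expected activation is unchanged."""
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))                  # negatives clamp to 0; positives pass through
d = dropout(np.ones(8))         # each entry is either 0 or 1/(1 - 0.6) = 2.5
```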

Flattening Layer.
Flattening converts a pooled feature map's 2D array into one long continuous linear array. Figure 5 illustrates the function of the flattening layer.
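In code, flattening is a simple reshape; for a toy pooled feature map:

```python
import numpy as np

pooled = np.arange(12).reshape(3, 4)   # a toy 3x4 pooled feature map
flat = pooled.flatten()                # 2-D array -> one long linear array
print(flat.shape)                      # (12,)
```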

Fully Connected (FC) Layer.
A fully connected layer is an example of a feed-forward layer. The fully connected layers are the network's final tiers.
A bias vector is added after the input has been multiplied by a weight matrix in the FC layer, as shown in Figure 6 [42].
One or more FC layers follow the convolution and downsampling layers. Every neuron in an FC layer is linked to every neuron in the next layer. This layer collects all of the characteristics obtained by the previous layers across the image to discover the larger patterns [43]. The characteristics are combined in the final fully connected layer, which is used to classify images in classification tasks; this fully connected layer feeds six neurons in the softmax layer.
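A fully connected layer followed by a six-way softmax, as described above, can be sketched as follows. The input dimension here is a toy size; the actual Tumnet layer sizes are those listed in Table 1, and the weights are random placeholders, not trained parameters.

```python
import numpy as np

def fully_connected(x, W, b):
    """FC layer: multiply the input by a weight matrix and add a bias vector."""
    return W @ x + b

def softmax(z):
    """Convert FC outputs to class probabilities (shifted for numerical stability)."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.random(8)                      # toy flattened feature vector
W = rng.random((6, 8))                 # 6 output neurons feeding the softmax layer
b = np.zeros(6)
p = softmax(fully_connected(x, W, b))
print(p.shape, round(float(p.sum()), 6))   # (6,) 1.0
```

The softmax output is a probability vector over the six neurons, summing to 1.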
3.3.6. Activation Function Analysis. Figure 7 illustrates the stage-1 feature maps of the Tumnet model, displaying the feature maps of convolution layer 1, ReLU layer 1, and pooling layer 1; the strongest activation channel of convolution layer 1 is also compared with the input image slice. Figure 8 depicts the activation outputs of convolution layer 5 and ReLU layer 5, along with the strongest activation channel of layer 4. Channels in the deeper layers learn complex elements like fissures and gyri, while channels in the early layers learn only simple features like color and boundaries [44].
Strong positive activations are represented by white pixels, and strong negative activations by black pixels [45]. A colormap has been assigned to the feature maps to enhance the visibility of the features. Feature map regions that are gray do not activate strongly for the input image [46]. A white pixel in a feature map indicates that the corresponding input image features are carried into the resulting feature map effectively. The figure shows the strongest activation channel of convolution layer 4; from the feature map, it is clear that the tumor region is activated well at this deeper layer, indicating that strong activations can be achieved there with this architecture. The pseudocode of the model is given in Pseudocode 1.

Experimental Outcomes
This work demonstrates the results of segmenting a brain lesion from fused MRI and CT images of size 227 × 227 pixels (images downloaded from http://www.med.harvard.edu) [47]. The experiment was conducted in MATLAB 2019a on a 64-bit system with 8 GB RAM. The images are fused using pixel-level fusion, namely, the averaging method, which carries the dominant features from the original images. Image fusion of brain tumors is helpful for neurophysicians when planning radiotherapy or postoperative radiotherapy [48]. Following that, the fused pictures are trained using the CNN approach of the Tumnet model (which learns features from the provided dataset), comprising convolution, pooling, and feature extraction (the brain tumor component), with an input layer, hidden layers, and an output layer. If a tumor exists, its extent is retrieved from the brain image using a simple threshold method. Structural similarity (SSIM) for image fusion, as well as sensitivity, specificity, entropy, standard deviation, and variance, are calculated as performance measures.

(i) Structural similarity (image fusion)
It is a measure of the structural similarity between two images, in which one image is the reference and the other is compared against it [49]. SSIM is found by

SSIM(x, y) = (2 μ_x μ_y + c_1)(2 σ_xy + c_2) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2)),

where μ_x, μ_y, σ_x², σ_y², and σ_xy are the means, variances, and covariance of the two images, with c_1 = (k_1 L)² and c_2 = (k_2 L)².

(iii) Specificity

This measures how well an algorithm correctly identifies a disease that is not there:

Specificity = TN / (TN + FP),   (12)
where TP (true positive) counts segmented pixels appropriately stated as positive; FP (false positive), pixels inaccurately stated as positive; TN (true negative), pixels appropriately stated as negative; and FN (false negative), pixels inaccurately stated as negative.
(iv) Entropy

Entropy measures the information content of an image, describing how much randomness or uncertainty it contains. The higher the quality of an image, the more detail it holds; the higher the entropy, the more detailed the image.
Table 2: Filter kernels in the convolution layer.
Table 2 lists a 3 × 3 low pass (averaging) filter, (1/9)[1 1 1; 1 1 1; 1 1 1], and a high pass filter.

Variance

A random variable's variance indicates how far it diverges from its mean value. The variance is the average of the squared differences between the individual values and the expected value, which implies that it is always nonnegative.
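The performance measures above can be sketched as follows. The pixel counts used in the example are hypothetical, not taken from the paper's tables, and the SSIM shown is a single-window (global) variant; library implementations typically use a sliding window.

```python
import numpy as np

def sensitivity(tp, fn):
    """TP / (TP + FN): how well actual tumor pixels are found."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """TN / (TN + FP): how well healthy pixels are rejected."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def image_entropy(img, levels=256):
    """Shannon entropy E = -sum_k P(k) log2 P(k) over the gray-level histogram."""
    p = np.bincount(img.ravel(), minlength=levels) / img.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def ssim_global(x, y, L=255, k1=0.01, k2=0.03):
    """Single-window (global) SSIM between two equal-size grayscale images."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my, vx, vy = x.mean(), y.mean(), x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

print(sensitivity(96, 4))              # 0.96 for these hypothetical counts
print(specificity(950, 10))            # ~0.99
img = np.array([[0, 0, 255, 255],
                [0, 0, 255, 255]], dtype=np.uint8)
print(image_entropy(img))              # 1.0: two equally likely gray levels
print(float(img.var()))                # variance = mean squared deviation
a = np.tile(np.arange(8, dtype=np.uint8), (8, 1))
print(ssim_global(a, a))               # identical images score 1
```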
4.1. Discussion. Figure 9 depicts a sample of input MRI and CT images in an axial slice, with the MRI revealing the tumor as well as the outline of gray and white matter [36]. The fused image combines prominent MRI and CT features as a consequence of the pixel-level averaging approach, which is deemed simple because it operates on pixels directly. Other pixel-level fusion approaches, namely the maximum and minimum methods, are contrasted with this fusion method. Fusion is followed by classification using advanced techniques, namely a CNN combined with the small-kernel concept; this combination constitutes the novelty of the proposed method. This colearning of different sets of features from the various modalities is an added advantage for the convolutional neural network, which in turn extracts features from the fused multimodal image. As far as carcinoma is concerned, an accurate diagnosis is vital, since even one residual cell can multiply into many [50]. This is achieved by using multimodal images for feature extraction, followed by multimodal tumor classification (presence/absence of a tumor).
This kind of fusion-based CNN can be applied to patients who undergo both MRI and CT scans, as well as patients receiving radiotherapy after tumor surgery; during radiotherapy, fusion-based classification is important for detecting any remaining tumor cells. Figure 8 shows the tumor outline extracted from the fused image with the CNN method. The extraction is implemented with a simple threshold method, in which objects with a solidity value greater than 0.7 and an area greater than 100 pixels are considered to be the tumor region [21]. Only these two parameters need to be changed in the simple threshold technique to extract tumor outlines.
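A sketch of the simple threshold extraction follows: binarize the image, then keep only connected regions above an area cutoff. The paper uses a 100-pixel area cutoff plus a solidity > 0.7 check; in this toy sketch the area cutoff is scaled down and the solidity test is omitted.

```python
import numpy as np

def extract_regions(img, thresh=128, min_area=4):
    """Binarize, label 4-connected components, and keep regions >= min_area pixels
    (the paper uses min_area = 100 and additionally requires solidity > 0.7)."""
    binary = img > thresh
    labels = np.zeros(binary.shape, dtype=int)
    current = 0
    for i in range(binary.shape[0]):
        for j in range(binary.shape[1]):
            if binary[i, j] and labels[i, j] == 0:
                current += 1                     # start a new connected component
                stack = [(i, j)]
                labels[i, j] = current
                while stack:                     # flood fill the component
                    r, c = stack.pop()
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < binary.shape[0] and 0 <= cc < binary.shape[1]
                                and binary[rr, cc] and labels[rr, cc] == 0):
                            labels[rr, cc] = current
                            stack.append((rr, cc))
    keep = [k for k in range(1, current + 1) if (labels == k).sum() >= min_area]
    return np.isin(labels, keep)

img = np.zeros((8, 8), dtype=np.uint8)
img[2:5, 2:5] = 200                    # a 9-pixel "tumor" blob
img[7, 7] = 200                        # a 1-pixel noise speck
mask = extract_regions(img)
print(int(mask.sum()))                 # 9: the blob survives, the speck is dropped
```

The area filter removes small bright specks that would otherwise be mistaken for tumor tissue.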
Although the goal of brain lesion detection is to capture active tumorous tissue as well as tumor regions that have spread, this study focused on the localization and detection of active tumorous tissue [51]. The detected tumor region is then excised, and the tumor area in mm² is determined, as shown in Table 3; the extracted tumor image is shown in Figure 10. Table 4 lists the image fusion performance metrics obtained by combining the averaging, minimum, and maximum pixel-level fusion methods with a basic threshold and CNN [52]. When the procedures are compared, the averaging method produces higher values for both SSIM tissue and SSIM bone. SSIM tissue is determined by comparing the input MRI image with the fused image, whereas SSIM bone is determined by comparing the input CT image with the fused image; this metric conveys how well the tissue and bone structures are carried over into the fused image, with higher values indicating more detail in the merged image. Tables 5, 6, and 7 show the analysis of fusion metrics for the average, maximum, and minimum methods, including the minimum and maximum ranges of the metric values, among which the average method comparatively outperforms. Apart from the gold-standard SSIM metric, parameters such as the fused image's standard deviation and entropy are also determined and shown in Table 7. The standard deviation conveys the deviation of pixel values from the mean, and the entropy ranges between 1 and 8 for a 0-255 grayscale fused image.
Table 8 compares the brain tumor image segmentation metrics of IFST techniques for the same set of images. Accuracy, sensitivity, specificity, standard deviation, and entropy were calculated; the accuracy ranges from 60% to 90% for the IFST method. The next important metrics are sensitivity and specificity, which are high for the proposed method at around 96% and 95% on average, respectively, as shown in Table 8. This is because the pixel average is carried to the output, whereas in the other pixel-level fusion methods there is a chance of missing the tumor, since the minimum or maximum pixel value may fall where the tumor may or may not be present.
The simulation process begins with the fusion step, using the pixel-level fusion approach. The average fusion technique is applied first, and fusion metrics are obtained for 154 sets of tumor images: SSIM bone, SSIM tissue, entropy of the fused and tumor images, mean of the fused and tumor images, standard deviation of the fused and tumor images, variance, sensitivity, specificity, and accuracy. SSIM tissue ranges from 44% to 99% and SSIM bone from 61% to 98%, whereas both the maximum and minimum methods vary from 1% to 98%. Although there is little difference among the maximum values of the three methods, it should be noted that the minimum values of the maximum and minimum methods fall to a very low 1%. Apart from second-order advanced statistical parameters, first-order statistical parameters are also obtained for the three methods; the normalized values of sensitivity, specificity, and accuracy reach their maximum at 1. Among these parameters, SSIM tissue and bone convey the true performance of the proposed algorithm, and Table 7 shows that the average method performs well on average. The fusion-based convolutional network model is preferred for specific cases, such as patients asked to take both MRI and CT after radiotherapy treatment [53].
Considering the average values of the methods Average-ST (simple threshold), Max-ST, and Min-ST, the proposed method holds good for the given set of 154 tumor images, and this can be proved for more images as well. The Tumnet model is implemented along with the average image fusion method. Tables 8 and 3 illustrate that the proposed method has a higher range of performance on average for the available tumor images; performance metrics are given for both fusion and segmentation. The tumor area is calculated using the formula A = T × 0.264, where T is the segmented tumor size in pixels and A is the area in mm².
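The pixel-to-area conversion A = T × 0.264 is a one-liner; the 500-pixel count below is a hypothetical example, not a value from the paper's tables.

```python
def tumor_area_mm2(pixel_count, scale=0.264):
    """Convert a segmented tumor pixel count T to area in mm^2: A = T * 0.264."""
    return pixel_count * scale

print(tumor_area_mm2(500))   # about 132 mm^2 for a 500-pixel tumor
```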
Classification accuracy is observed to increase to 98%, and the loss function decreases to almost 0 for the given set of images, as shown in Figure 11. A comparative analysis against recent references for classification accuracy is given in Table 9.
Tumnet, the proposed model, shows the highest testing accuracy of 96% on the Kaggle dataset [33], as shown explicitly in Table 10. The Tumnet model thus outperforms existing models such as VGG-19, AlexNet, GoogLeNet, and ConvNet on the classification dataset considered. Among these models, VGG-19 and AlexNet show low-level variation from the proposed model, at 94% and 82%, respectively, whereas GoogLeNet and ConvNet show high-level variation, at 78% and 67%, respectively. The Tumnet model thus proves its robustness across different datasets.
Tumnet attains an accuracy of 98% for multimodal images: MRI, CT, and fused MRI-CT. A few methods shown in Table 11 outperform the Tumnet model, but they are limited to single-modality MRI images, whereas Tumnet shows its high potential for diversified images. Tumnet's strength lies in its ability to handle multimodal data, achieving a notable 98% accuracy. The results underscore the importance of model architecture, dataset characteristics, and multimodal approaches in achieving high accuracy in brain tumor detection.

Conclusion
This research work proposes the Tumnet deep learning model for the categorization of brain tumors from MRI, CT, and fused MRI-CT slices. The model comprises 11 layers, including convolution, pooling, and activation layers, with a small 3 × 3 kernel for the convolution layers. The proposed system was evaluated with the MedHarvard database of different tumors, including meningioma and sarcoma, and the performance of the 3 × 3 kernel architecture was compared with larger-filter architectures. The results showed that the proposed method achieves high accuracy, sensitivity, and specificity in detecting brain tumors in both multimodal and single-modal MRI/CT images. Hence, the proposed approach has the capacity to assist physicians in accurately diagnosing and treating brain tumors. In the future, the Tumnet model can be extended to other imaging databases by applying the 3 × 3 kernel to other standard models, which may enable accurate classification and decision support for oncologists.

1.1. Image Segmentation Methods. Image segmentation techniques can be classified according to the segmentation method and the processing required to reach the objective of extracting features. They are:

(i) Simple threshold method
(ii) Edge detection-based segmentation
(iii) Region growing and splitting technique
(iv) Cluster model
(v) Watershed segmentation
(vi) Artificial neural network-based segmentation

Figure 2: Structure of the brain tumor detection model.

k_1 and k_2 are small constants of values 0.01 and 0.03, and L is the pixel's dynamic range (8 bits for a 0-255 grayscale image).

(ii) Sensitivity

This measures how well an algorithm correctly identifies a disease that is present:

Sensitivity = TP / (TP + FN).   (11)

(iii) Specificity

Figure 7: Feature maps of stage 1 of the Tumnet model.
Figure 8: Activation outputs of convolution layer 5 and ReLU layer 5.
Entropy is computed as E = −Σ_k P(k) log₂ P(k), where I is the original image, P(k) is the probability of the value k appearing in image I, and L is the number of gray levels.

(v) Standard deviation

The standard deviation describes how far the values in a dataset depart from the average. One technique to assess contrast is to compute the standard deviation of the pixel values in an image.

Figure 9: MRI and CT brain images of size 256 × 256 pixels and the fused image. (a) Input MRI image. (b) Input CT image. (c) Fused MRI-CT image using the average method.

Figure 11: Classification accuracy and loss function of the Tumnet model.

Table 1: Learnable parameters of the Tumnet model.

Table 3: Tumor area in mm².

Table 6: Maximum method analysis.

Table 5: Average method analysis.
Images are sliced axially, sagittally, and coronally; axial slices are usually preferred since they contain most parts of the brain. The images belong to different patients having the abovementioned abnormalities in their brain images, which implies that the proposed model holds for a wide variety of cases.

Table 9: Comparative analysis of tumor classification accuracy.

Table 10: Comparison of the testing accuracy of the Tumnet model with other existing models.

Table 11: Comparison of Tumnet model accuracy with other existing models on MedHarvard images. Columns: CNN model; year; number of samples from MedHarvard; image modality (single-modal/multimodal); accuracy (%).