An Ensemble of Deep Learning Enabled Brain Stroke Classification Model in Magnetic Resonance Images

Brain stroke is a major cause of death worldwide, and early identification is needed to reduce the mortality rate. Magnetic resonance imaging (MRI) is a commonly available imaging modality used to diagnose brain stroke. Presently, machine learning (ML) and deep learning (DL) models are widely utilized for disease detection and classification. Among the available approaches, convolutional neural network (CNN) models have been widely used for computer vision and image processing tasks such as ImageNet classification, facial detection, and digit classification. In this article, a novel computer aided diagnosis (CAD) based brain stroke detection and classification (CAD-BSDC) model is developed for MRI images. The proposed CAD-BSDC technique aims to classify the provided MR brain image as normal or abnormal. The CAD-BSDC technique involves different subprocesses, namely preprocessing, feature extraction, and classification. First, the input image undergoes preprocessing using the adaptive thresholding (AT) technique to improve the image quality. Next, an ensemble of feature extractors, namely the MobileNet, CapsuleNet, and EfficientNet models, is used. In addition, the hyperparameters of the deep learning models are tuned using the improved dragonfly optimization (IDFO) algorithm. Moreover, a satin bowerbird optimization (SBO) based stacked autoencoder (SAE) is used for the classification of brain stroke. The design of an optimal SAE using the SBO algorithm constitutes the novelty of the work. The performance of the presented technique was validated on a benchmark dataset that includes T2-weighted MR brain images collected in the axial plane with a size of 256 × 256. The simulation outcomes indicated the promising efficiency of the proposed CAD-BSDC technique over recent state-of-the-art approaches in terms of various performance measures.


Introduction
Strokes are the third most common cause of death around the world, as per the report of the World Health Organization (WHO). Ischemic strokes, caused by a disturbance in the blood supply to the brain, are the most common type and account for around 87% of cases. Measuring lesion volume and location can assist diagnosis and guide treatment decisions [1]. Moreover, lesion classification plays a significant role in cognitive neuroscience research. This frequently includes an anatomical analysis, in which a brain area is associated with a neurological deficit, and requires manual examination of massive stroke image databases. Hence, an automated methodology for segmenting ischemic lesions in brain images is greatly needed [2,3]. Ischemic stroke lesions undergo several developmental stages. The moment of partial or total loss of blood supply to the affected brain areas is called the onset of the stroke and marks the beginning of the hyperacute stage. At onset, the affected area is separated into a core of infarcted tissue and a surrounding penumbra of underperfused, but possibly salvageable, tissue that is partially supplied by collateral blood flow [4]. The developmental phase of the stroke is related to the amount of cell death and the reconstruction mechanisms, which affect the visibility of the stroke area in magnetic resonance imaging (MRI), mostly through the migration of water molecules [5].
Brain imaging methodologies, namely computed tomography (CT) and MRI, are very helpful for physicians in starting the early screening of patients [6]. There are also several other imaging modalities for analyzing the brain, including diffuse optical imaging, X-ray imaging, positron emission tomography, magnetoencephalography, and functional MRI [7]. However, these imaging techniques require well-trained operators and involve higher operating costs, so many of them may not be available in every hospital and clinic. Image classification is broadly utilized in medical imaging systems [8]. However, the outcomes of the classification method must be close to those of manual diagnosis.
This study introduces a novel computer aided diagnosis (CAD) based brain stroke detection and classification (CAD-BSDC) model for MRI images. The proposed CAD-BSDC technique involves preprocessing using the adaptive thresholding (AT) technique to improve the image quality. In addition, an ensemble of feature extractors, namely the MobileNet, CapsuleNet, and EfficientNet models, is used. The hyperparameters of the deep learning models are tuned using the improved dragonfly optimization (IDFO) algorithm. Furthermore, a satin bowerbird optimization (SBO) based stacked autoencoder (SAE) is used to classify the MR brain image as normal or abnormal. The experimental analysis of the CAD-BSDC technique is carried out on a benchmark dataset that comprises T2-weighted MR brain images.

Literature Review
Currently, deep learning (DL) methods are widely utilized for classification since they compute features automatically within the convolution layers of the deep network [9]. The major benefit of DL methods is that they outperform traditional methodologies for image classification. Several DL methodologies exist, such as deep belief networks (DBN), recurrent neural networks (RNNs), and long short-term memory (LSTM). Among these methods, the convolutional neural network (CNN) has been widely employed in medical image processing and computer vision challenges such as house number digit classification, ImageNet, patch classification from medical images, and face recognition [10].
Nishio et al. [11] developed and evaluated an automated acute ischemic stroke (AIS) detection method that includes two-phase DL models. The two-phase method was then applied to AIS detection on the testing set. To evaluate the detection outcomes, a board-certified radiologist assessed the head CT images of the testing set with and without the help of the detection system [12]. Hilbert et al. [13] examined DL methods to build models that directly predict good reperfusion after endovascular treatment (EVT) and good functional outcomes using CT images. This model does not need image annotation and is faster to compute. The study compared DL to ML methods that use conventional radiological image biomarkers. Pan et al. [14] investigated a new method based mainly on DL-ResNet for detecting infarct cores on non-contrast CT images and enhancing the performance of acute ischemic stroke diagnosis. They consecutively enrolled first-episode ischemic stroke patients confirmed by magnetic resonance diffusion weighted imaging (MR-DWI). A decision curve analysis (DCA) model was then utilized to analyze the value of this technique in clinical settings.
Zhang et al. [15] introduced a DL method that leverages MRI diffusion series for classifying time since stroke (TSS) based on a clinically validated threshold. The study also presented an intradomain task-adaptive transfer learning technique that involves training the model on a simple clinical task (stroke recognition) and then refining it with distinct binary thresholds of TSS. Wang et al. [16] developed and evaluated a DL based method to assist the selection of appropriate patients with acute ischemic stroke for endovascular treatment based on 3D pseudo-continuous arterial spin labeling (pCASL). The DL method and six ML methods were trained using 10-fold cross-validation (CV).

The Proposed Model
In this study, a new CAD-BSDC model has been developed to classify MRI images as normal or abnormal. The CAD-BSDC technique involves different subprocesses, namely AT based preprocessing, ensemble feature extraction, IDFO-based hyperparameter tuning, SAE-based classification, and SBO-based parameter tuning. Figure 1 illustrates the overall process of the CAD-BSDC technique.
3.1. Image Preprocessing Using AT Technique. At the first stage, the AT technique is applied to the MRI images to remove noise and enhance the quality. It is an effective method for determining the affected regions by means of thresholding. In the AT technique, the distribution of pixel intensities in the MRI image is examined and a threshold value is chosen. Here, the input MRI image is denoted as g(x, y), T denotes the threshold value, and the final image is denoted as f(x, y). It can be mathematically defined as follows [17]:
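As a minimal sketch of the thresholding idea described here, the snippet below shows a commonly used binary form, where f(x, y) is 1 when g(x, y) exceeds the threshold and 0 otherwise, together with a local-mean adaptive variant. The exact rule used by the authors is not reproduced in this text, so the function names, the `block` window size, and the `offset` parameter are illustrative assumptions only.

```python
import numpy as np

def global_threshold(g, T):
    """Binary output f(x, y): 1 where g(x, y) > T, else 0 (assumed common form)."""
    return (np.asarray(g) > T).astype(np.uint8)

def adaptive_threshold(g, block=3, offset=0.0):
    """Compare each pixel with the mean of its block x block neighborhood.

    `block` and `offset` are hypothetical parameters for illustration.
    """
    g = np.asarray(g, dtype=float)
    pad = block // 2
    padded = np.pad(g, pad, mode="edge")  # replicate border pixels
    f = np.zeros(g.shape, dtype=np.uint8)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            local_T = padded[x:x + block, y:y + block].mean() - offset
            f[x, y] = 1 if g[x, y] > local_T else 0
    return f
```

The adaptive variant chooses a separate threshold per pixel from the local intensity distribution, which is one plausible reading of "adaptive" in this context; a library routine could replace the explicit loops in practice.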

Ensemble of Feature Extraction Approaches.
During the feature extraction process, an ensemble of feature extractors, namely the MobileNet, CapsuleNet, and EfficientNet models, is used. The CNN is a type of DL model and has been widely utilized for images [18]. In recent times, DL has been widely utilized in the analysis of several medical conditions. Several researchers have also analyzed skin diseases using DL. A DL model has several linked layers with distinct weights and activation functions. A fundamental DL technique involves convolution, pooling, and fully connected layers. Many activation functions have been utilized for adjusting the weights. In this case, EfficientNetB3 was utilized for glaucoma detection. EfficientNetB3 is a recent, cost-efficient, and robust technique established by scaling three parameters, namely depth, width, and resolution [19]. An EfficientNetB3 model with noisy-student weights was utilized in scenarios I and III for the transfer learning (TL) procedure, whereas the "isi-call_ef3_weights" weights were utilized as pretrained weights in scenarios II and IV. GlobalAveragePooling2D layers were added in all scenarios to generalize the optimum model, which decreases the number of parameters. In addition, the rectified linear unit (ReLU) activation function was utilized with three dense and two dropout layers. The output layer has several output units for multiclass classification using the softmax activation function. Figure 2 demonstrates the structure of CapsNet.
MobileNet [20] has a smaller architecture, lower computation, and higher precision, which makes it suitable for mobile terminals and embedded devices. Based on depthwise separable convolutions, MobileNets utilize two global hyperparameters to keep a balance between efficiency and accuracy. The basic concept of MobileNet is the decomposition of the convolutional kernel. By using depthwise separable convolution, the standard convolution is decomposed into a depthwise convolution and a pointwise convolution with 1 × 1 convolutional kernels.
The depthwise convolutional filter performs a convolution on each channel, and a 1 × 1 convolution is utilized to combine the outputs of the depthwise convolutional layer. In this technique, N denotes the number of standard convolutional kernels. The standard convolution filters and combines the inputs into a new set of outputs in one step, whereas the depthwise separable convolution splits this into two layers, one for filtering and another for combining. MobileNetV2 introduces new components with an inverted residual structure.
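To make the saving from this decomposition concrete, the following sketch counts the weights of a standard convolution and of its depthwise separable counterpart. The symbols `k` (kernel size), `m` (input channels), and `n` (output channels) are illustrative names, and bias terms are ignored for simplicity.

```python
def standard_conv_params(k, m, n):
    """Weights in a standard k x k convolution mapping m channels to n channels."""
    return k * k * m * n

def depthwise_separable_params(k, m, n):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * m   # one k x k filter per input channel
    pointwise = m * n       # 1 x 1 kernels that merge the channels
    return depthwise + pointwise

def reduction_ratio(k, m, n):
    """Separable cost divided by standard cost; equals 1/n + 1/k**2."""
    return depthwise_separable_params(k, m, n) / standard_conv_params(k, m, n)
```

For example, with a 3 × 3 kernel, 32 input channels, and 64 output channels, the standard layer needs 18432 weights while the separable version needs 2336, roughly an eighth of the original.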
To compensate for the shortcomings of the CNN, a network framework named CapsNet was presented [21]. CapsNet is a deep network technique built from capsules. A capsule comprises a set of neurons. The activations of the neurons signify the features of parts of an object. Each capsule is responsible for determining a single part of the object, and all the capsules jointly define the entire structure of the object. In contrast to other DNNs (for instance, DBN), this framework preserves object parts and spatial information. Similar to the CNN, CapsNet comprises a multi-layer network.

Hyperparameter Tuning Using IDFO Algorithm.
To optimally adjust the hyperparameters of the DL models, the IDFO algorithm is applied. The DFO method was introduced by Mirjalili at Griffith University in 2016 [22]. This method is a swarm intelligence (SI) based metaheuristic approach inspired by the dynamic and static behaviors of dragonflies in nature. There are two primary phases of optimization: exploitation and exploration. These two phases are modelled by dragonflies either statically or dynamically searching for food or avoiding the enemy. Two further behaviors are added to the three fundamental behaviors in the DFO algorithm: moving toward the food and avoiding the enemy. Thus, as each individual moves toward the food source (equation (5)), it needs to avoid the enemy simultaneously (equation (6)).

Journal of Healthcare Engineering

Here, X indicates the current position of the individual, and X_j indicates the current position of the jth neighboring individual. N denotes the number of neighboring individuals, and V_j denotes the velocity of the jth neighboring individual. X^+ and X^- denote the positions of the food source and the enemy, respectively [23]. The overall steps of the DFO algorithm are given in Algorithm 1.
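The five behaviors can be sketched as follows. The equations themselves are not reproduced in this text, so the snippet uses the standard forms from the dragonfly algorithm literature (separation, alignment, cohesion, attraction to food, and distraction from the enemy) on the assumption that they match the omitted ones. The function and argument names are illustrative.

```python
import numpy as np

def dragonfly_behaviors(X, neighbors_X, neighbors_V, food, enemy):
    """Return the five behavior vectors for one dragonfly.

    X           : position of the current individual, shape (d,)
    neighbors_X : positions of the N neighbors, shape (N, d)
    neighbors_V : velocities (steps) of the N neighbors, shape (N, d)
    food, enemy : positions X^+ and X^-, shape (d,)
    """
    N = len(neighbors_X)
    S = -np.sum(X - neighbors_X, axis=0)      # separation
    A = np.sum(neighbors_V, axis=0) / N       # alignment
    C = np.sum(neighbors_X, axis=0) / N - X   # cohesion
    F = food - X                              # attraction to food
    E = enemy + X                             # distraction from enemy
    return S, A, C, F, E
```

The sketch assumes at least one neighbor; an implementation would need a separate rule, such as a random walk, for an individual with no neighbors.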
To update the positions of the artificial dragonflies in the search space and simulate their motion, two vectors are taken into consideration: position (X) and step (ΔX). The step vector, which acts as the velocity, shows the direction of dragonfly motion (equation (7)). After the step vector is estimated, the position vector is updated (equation (8)). Here, f, e, w, and t represent the food factor, enemy factor, inertia coefficient, and iteration number, respectively, and s, a, and c indicate the separation, alignment, and cohesion coefficients, respectively. These coefficients and factors enable the implementation of exploitative and exploratory behaviors. In the IDFO algorithm, the traditional DFO algorithm is integrated with the flower authorization algorithm, and the value range [S_min, S_max] is set. To effectively avoid the situation in which the dragonflies gather together in the first phase, a uniform distribution is utilized to perform random initialization on all the dimensions [24]. After this preprocessing, the two procedures are merged to guide the dragonflies toward an optimal location.
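A minimal sketch of the step and position updates, plus the uniform random initialization, is given below. It assumes the standard dragonfly update in which the new step is the weighted sum of the five behavior vectors plus the inertia-weighted previous step. The helper names and the `rng` argument are illustrative, and the coupling with the second algorithm mentioned in the text is not modelled.

```python
import numpy as np

def init_positions(n_agents, lower, upper, rng):
    """Uniform random initialization of every dimension within [lower, upper]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return rng.uniform(lower, upper, size=(n_agents, lower.size))

def update_dragonfly(X, dX, S, A, C, F, E, s, a, c, f, e, w):
    """One iteration: compute the new step vector, then move the position."""
    dX_next = (s * S + a * A + c * C + f * F + e * E) + w * dX
    X_next = X + dX_next
    return X_next, dX_next
```

In practice the coefficients s, a, c, f, e, and w are usually varied over the iterations to shift from exploration to exploitation; fixed values are used here only to keep the example short.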

Image Classification Using SBO-SAE Model.
Finally, the SBO-SAE model is employed for the classification of MRI images. The SAE is developed based on the concept of the autoencoder (AE). In the SAE model, the encoding parts of several AEs are stacked together, i.e., the input of the first AE layer is the actual data and the input of each subsequent layer is the hidden-layer output of the preceding one. Lastly, a classification model is appended to the network [25]. The training process of the SAE model involves pretraining and inverse fine-tuning. It makes use of a large quantity of unlabeled data for unsupervised learning, extracts the features independently, and utilizes the labelled data for inverse fine-tuning of the network. To boost the performance of the SAE technique, the weight and bias values are optimally chosen by the SBO algorithm.
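The stacked structure can be illustrated with a short forward pass: each encoder feeds its hidden output to the next one, and a classifier sits on top. This is only a sketch under stated assumptions; the sigmoid activation, the softmax classifier, and all names are illustrative choices that are not taken from the paper, and training (pretraining, fine-tuning, or SBO-based weight selection) is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sae_forward(x, encoder_layers, classifier):
    """Pass a batch through stacked encoders, then the appended classifier.

    encoder_layers : list of (W, b) pairs, one per stacked encoder
    classifier     : (W, b) pair of the final classification layer
    """
    h = x
    for W, b in encoder_layers:
        h = sigmoid(h @ W + b)             # hidden output feeds the next encoder
    Wc, bc = classifier
    return softmax(h @ Wc + bc)            # class probabilities per sample
```

The weights and biases passed in as `encoder_layers` and `classifier` are exactly the quantities that an optimizer such as SBO would search over.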
The SBO technique begins by generating an initial uniformly random population that contains a set of bower positions [26]. Each position (pop(i).Pos) is defined by the parameters that are to be optimized, as written in equation (6). Note that the values of the initial population lie between the predefined lower and upper limits of the parameters being optimized.
Similar to the artificial bee colony (ABC) algorithm, the probability of attracting a male/female (Prob_i) to a bower is calculated as follows.
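The probability equation is not reproduced in this text. The sketch below uses the form commonly given for SBO, in which a cost value is first mapped to a fitness and each bower's probability is its share of the total fitness; this is assumed to match the omitted equation. The function names are illustrative.

```python
import numpy as np

def bower_fitness(cost):
    """Map cost values to fitness: 1 / (1 + cost) if cost >= 0, else 1 + |cost|."""
    cost = np.asarray(cost, dtype=float)
    # abs() keeps the unused branch of np.where free of division by zero
    return np.where(cost >= 0, 1.0 / (1.0 + np.abs(cost)), 1.0 + np.abs(cost))

def bower_probabilities(cost):
    """Probability of each bower: its fitness divided by the total fitness."""
    fit = bower_fitness(cost)
    return fit / fit.sum()
```

Lower cost therefore yields a higher selection probability, which is what the later roulette wheel step relies on.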
As in other evolutionary optimizers, elitism is utilized to store the best solution(s) at each iteration of the optimization procedure. In the mating season, males, like other birds, use their instincts to build and decorate the bower. Notably, older and more experienced males attract more attention from others to their bowers. In other words, these bowers have higher fitness than the other bowers. During the SBO process, the position of the best bower built by a bird is taken as the elite of the kth iteration (x_elite,k), which has the maximum fitness and is capable of affecting the other positions. In each iteration, a new change at any bower is computed based on the corresponding equation. It is worth mentioning that roulette wheel selection is utilized for picking bowers with higher probability (x_jk). In SBO, the parameter β_k defines the step size toward the selected target bower; it is calculated for each variable and modified based on the corresponding equation. Random changes are applied to x_ik with a specific probability, where a normal distribution (N) is utilized with a mean of x_ik^old and a variance of σ^2, as stated in the corresponding equation.
Finally, at the end of each cycle, the old population and the population obtained from the changes described above are evaluated, combined, and sorted, and a new population is created. The pseudocode of the SBO algorithm is given in Algorithm 2.
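The bower update described above can be sketched as three small helpers: roulette wheel selection of a target bower, the move toward the midpoint of the target and the elite, and the random change drawn from a normal distribution. The update equations are not reproduced in this text, so the forms below follow the commonly published SBO formulation and are assumptions; in particular, the step `alpha / (1 + p_target)` plays the role of the step parameter, and `alpha`, `sigma`, and `p_mut` are illustrative names.

```python
import numpy as np

def roulette_wheel(prob, rng):
    """Pick a bower index with probability proportional to `prob`."""
    return int(rng.choice(len(prob), p=prob))

def update_bower(x_old, x_target, x_elite, p_target, alpha):
    """Move x_old toward the midpoint of the target and elite bowers."""
    step = alpha / (1.0 + p_target)   # smaller step for more attractive targets
    return x_old + step * ((x_target + x_elite) / 2.0 - x_old)

def mutate(x, sigma, p_mut, rng):
    """With probability p_mut per variable, resample it from N(x, sigma^2)."""
    mask = rng.random(x.shape) < p_mut
    noise = rng.normal(0.0, sigma, size=x.shape)
    return np.where(mask, x + noise, x)
```

After these changes, the old and new populations would be combined, sorted by fitness, and truncated to the original size, with the best bower kept as the elite.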
The SBO approach derives a fitness function (FF) for attaining enhanced classification efficiency. It defines a positive integer to represent the quality of a candidate solution. In this case, the classification error rate, which is to be minimized, is regarded as the FF, as provided in equation (10).
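As a small illustration of this fitness function, the sketch below computes the classification error rate as the percentage of misclassified samples. Equation (10) is not reproduced in this text, so the percentage form and the function name are assumptions.

```python
def classification_error_rate(y_true, y_pred):
    """Percentage of samples whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return 100.0 * wrong / len(y_true)
```

An optimizer would evaluate this value for each candidate set of SAE weights and biases and prefer the candidates with the lowest error.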

Experimental Validation
The performance validation of the CAD-BSDC technique is carried out using the benchmark dataset [27], which contains MRI images under six distinct classes. The details of the dataset are given in Table 1. Figure 3 shows sample MRI images. Figure 4 shows the confusion matrices offered by the CAD-BSDC technique on the classification of brain stroke. The figure shows that the CAD-BSDC technique has effectively identified the distinct classes of brain stroke. The results show that the CAD-BSDC technique has achieved improved classification results. For instance, with 500 epochs, the CAD-BSDC technique has resulted in sens_y, spec_y, accu_y, prec_n, F_score, and MCC of 94.71%, 99%, 98.31%, 94.63%, 94.55%, and 93.61%, respectively. With 1000 epochs, the CAD-BSDC technique has resulted in sens_y, spec_y, accu_y, prec_n, F_score, and MCC of 93.35%, 98.83%, 98.13%, 94.55%, 93.82%, and 92.78%, respectively. Furthermore, with 1500 epochs, the CAD-BSDC technique has resulted in sens_y, spec_y, accu_y, prec_n, F_score, and MCC of 94.63%, 99%, 98.31%, 94.71%, 94.44%, and 93.58%, respectively. Finally, with 2000 epochs, the CAD-BSDC technique has resulted in sens_y, spec_y, accu_y, prec_n, F_score, and MCC of 95.28%, 99.16%, 98.69%, 96.76%, 95.75%, and 95.13%, respectively.
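For reference, the reported measures can be derived from the entries of a confusion matrix as sketched below, using the standard definitions for a single class treated as positive against the rest. The function name and the example counts are hypothetical and are not taken from the paper's results.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts.

    Assumes every denominator is non-zero.
    """
    sens = tp / (tp + fn)                       # sensitivity (recall)
    spec = tn / (tn + fp)                       # specificity
    accu = (tp + tn) / (tp + fp + tn + fn)      # accuracy
    prec = tp / (tp + fp)                       # precision
    f_score = 2 * prec * sens / (prec + sens)   # harmonic mean of prec and sens
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )                                           # Matthews correlation coefficient
    return {"sens": sens, "spec": spec, "accu": accu,
            "prec": prec, "f_score": f_score, "mcc": mcc}
```

For a multiclass problem, these values would typically be computed per class in a one-versus-rest manner and then averaged; whether the paper averages in this way is not stated in this text.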

To confirm the improvements of the CAD-BSDC technique, a comprehensive comparison study is presented in Table 4 [28]. Figure 6 shows the comparative accuracy analysis of the CAD-BSDC technique with recent methods on the test dataset. The figure demonstrates that the FODPSO-SVM technique has produced the weakest outcomes with the lowest accuracy. The SURF-DT and FODPSO-RF techniques have obtained slightly higher accuracy values. Next, the EM-PSORF and EM-PSOSVM techniques have reached moderately improved accuracy values. Though the SIFT-DT technique has reached a near-optimal accuracy of 97.25%, the CAD-BSDC technique has achieved the maximum accuracy of 98.36%. Figure 7 shows the comparative sens_y, spec_y, and F_measure analysis of the CAD-BSDC technique with current methodologies on the test dataset. The figure illustrates that the FODPSO-SVM approach has produced the weakest outcomes with the lowest values of sens_y, spec_y, and F_measure. The SURF-DT and FODPSO-RF methodologies have attained slightly improved values of sens_y, spec_y, and F_measure. Then, the EM-PSORF and EM-PSOSVM algorithms have reached better sens_y, spec_y, and F_measure values. Although the SIFT-DT model has reached near-optimal sens_y, spec_y, and F_measure of 91.04%, 98.23%, and 91.91%, the CAD-BSDC model has attained the maximum sens_y, spec_y, and F_measure of 94.49%, 99%, and 96.64%.
The abovementioned tables and figures demonstrate that the CAD-BSDC technique has shown superior performance over the other techniques.

Conclusion
In this study, a new CAD-BSDC model has been developed to classify MRI images as normal or abnormal. The CAD-BSDC technique involves different subprocesses, namely AT based preprocessing, ensemble feature extraction, IDFO-based hyperparameter tuning, SAE based classification, and SBO based parameter tuning. The experimental analysis of the CAD-BSDC technique was carried out on a benchmark dataset that includes T2-weighted MR brain images. The simulation outcomes indicated the promising efficiency of the proposed CAD-BSDC technique over recent state-of-the-art approaches in terms of various performance measures. Thus, the CAD-BSDC technique can be deployed in a real-time environment to aid physicians. As a future extension, the classification performance of the CAD-BSDC technique can be enhanced by the use of DL-based segmentation approaches.

Data Availability
Data sharing is not applicable to this article as no datasets were generated during the current study.

Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.

Consent
Not applicable.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.