Early Diagnosis of Tuberculosis Using Deep Learning Approach for IOT Based Healthcare Applications

In the modern world, tuberculosis (TB) is regarded as a serious health issue with a high rate of mortality. TB can be cured completely when it is diagnosed early. One tool used for this purpose is the chest X-ray (CXR), which is used to screen for active TB. In this work, an enhanced deep learning (DL) model is implemented for automatic TB detection. The work comprises four phases: preprocessing, segmentation, feature extraction, and optimized classification. Initially, the CXR image is preprocessed and segmented using adaptive fuzzy C-means (AFCM) clustering. Then, several features are extracted. Finally, these features are given to the DL classifier, a Deep Belief Network (DBN). To improve the classification accuracy and to optimize the DBN, a metaheuristic optimization algorithm, Adaptive Monarch Butterfly Optimization (AMBO), is used. Here, the Deep Belief Network with Adaptive Monarch Butterfly Optimization (DBN-AMBO) is used for enhancing the accuracy, reducing the error function, and optimizing the weighting parameters. The overall implementation is carried out on the Python platform. The performance of the DBN-AMBO was evaluated on the Montgomery County (MC) and Shenzhen China (SC) datasets and compared with other approaches on the basis of several metrics.


Introduction
TB (tuberculosis) is regarded as one of the greatest threats to humans, and it is considered the fifth major cause of death [1]. It is caused by Mycobacterium and it generally affects the lungs. Further, this bacterium can affect other organs such as the brain and kidneys. There are two types of TB: active and latent. In active TB, the infection shows symptoms and spreads easily, whereas in latent TB, the bacterium stays in the body and does not show any symptoms. Nearly one-fourth of the population has latent TB. The bacterium is transferred from person to person through the air during inhalation. When proper treatment is given, TB can be cured and prevented using antimicrobial drugs [2]. According to the 2020 annual report of the WHO (World Health Organization), about 11 million people tested positive for TB. Countries such as China, Indonesia, and India are seriously affected by TB. In addition, HIV-affected people have died because of TB. Early detection of TB can prevent the further spread of the disease [3]. The WHO suggests rapid fundamental tests for TB patients. Hence, automatic CAD (Computer Aided Diagnosis) tools are used for the efficient treatment of TB.
Owing to advances in computer vision and digital techniques, several CAD techniques have come into use recently. With these advances, TB can be detected quickly, and further transmission can be prevented when the disease is identified early. CAD has the ability to speed up mass screening in areas where TB is spreading. Traditional examination of chest radiographs requires a highly skilled professional, is time-consuming, and is prone to error. Imaging modalities such as CXR (chest X-ray) and CT (computed tomography) are generally used to detect and screen for TB. However, CT is not preferred because of its cost and high radiation dosage, so chest X-rays are mainly used for detecting and diagnosing TB. CXR is a fast, low-cost, and effective tool for early diagnosis [4]. Existing research works are based on machine learning (ML) and deep learning (DL) models. ML models depend on hand-crafted features, whereas DL models do not. Some of the ML classifiers used in existing research are EBT (ensemble bagged trees) [5], RF (Random Forest) [6], and SVM (Support Vector Machine) [7]. Certain research works attained excellent results with DL models, which makes DL an effective approach for medical analysis. In particular, the DL model CNN (Convolutional Neural Network) has shown better results in TB classification [8][9][10][11].
The major contributions of the proposed work are as follows: (i) a fully automated, optimized deep learning model is introduced for TB segmentation and classification; (ii) segmentation is carried out by adaptive fuzzy C-means (AFCM) clustering, and features such as texture and shape features are extracted; (iii) to improve the accuracy, optimize the weighting parameters, and reduce the error function, a Deep Belief Network (DBN) with Adaptive Monarch Butterfly Optimization (AMBO) is used. The rest of the article is organized as follows: Section 2 reviews recent related works on TB detection; Section 3 explains the developed TB classification model; Section 4 discusses the evaluated results; and Section 5 concludes the work.

Related Works
Some of the recent related works based on TB classification are listed as follows.
Rahman et al. [12] presented TL (transfer learning) with deep CNNs for the automatic detection of TB using chest radiographs. The robustness of several CNN approaches was examined for TB classification. Among the nine models, CheXNet attained better results. The results were evaluated with and without segmentation, and the classifier with segmentation obtained better outcomes. Win et al. [13] used a hybridized feature learning model to screen for TB automatically. DeepLabv3 was used for lung segmentation. Then, PSO (Particle Swarm Optimization) was used for selecting features, which were given as input to an optimized SVM. This classifier was used to distinguish normal from TB images. Sahlol et al. [14] used a CNN with AEO (Artificial Ecosystem Optimization) for TB detection. Initially, the input image was segmented using MobileNet, and feature selection was carried out by AEO. With this optimization, 50,000 features were reduced to 19 and 29 features, and the images were classified as TB and non-TB.
Pavani et al. [15] presented a new automated model for quick detection of pulmonary TB. At first, preprocessing was carried out and the Chan-Vese active contour was used for segmenting the images. Several features were extracted, and the pertinent features were taken for classification by the NB classifier. Rahman et al. [16] integrated DL, a pretrained model, and an XGBoost classifier to facilitate the quick diagnosis of TB. The major aim of that work was to enhance the recall and specificity of TB classification; furthermore, it minimized the training time. Ayaz et al. [17] used a new TB detection model that integrated hand-crafted features with a CNN via an ensemble model. Initially, the images were normalized and given to the feature extractor. GF and pretrained models were used to extract hand-crafted and deep features. Two benchmark datasets were used to obtain better ROC results. Dasanayaka and Dissanayake [18] used several CNN models for automatic TB detection with preprocessing and augmentation approaches. The parameters of the DL models were optimized using a genetic algorithm. This model attained a detection accuracy of 97.1%. Even though these approaches obtained good results, they came with some challenges: the optimization techniques used suffered from local optima, and the error rate was higher. Hence, a robust model is essential for TB detection. Tuberculosis (TB) is listed among the top 10 infectious disease-related causes of death. In a related study, convolutional neural networks (CNN) are used to compare the effectiveness of two approaches for detecting pulmonary tuberculosis from patient chest X-ray images. The combination of different image preparation techniques that produces the highest accuracy is examined. In addition, a hybrid strategy combining the first statistical computer-aided detection method with neural networks was also investigated.
Simulations were run on 406 normal images and 394 abnormal images. The simulations demonstrate that a cropped region of interest combined with contrast enhancement produces superior outcomes. Even better outcomes are obtained when the images are further enhanced using the hybrid technique [19][20][21][22][23][24][25][26][27][28].

Proposed Methodology
Recently, TB classification has been carried out using CXR images. The major processes involved in classification are explained in this section. The Shenzhen China (SC) and Montgomery County (MC) datasets are used. Initially, the image is resized and noise is removed using the Wiener filter (WF). Then, the image is segmented using a clustering technique called AFCM. Then, GF, shape, texture, and HOG features are extracted. Finally, the DL model DBN-AMBO is used to classify the image as normal or TB. Figure 1 delineates the framework of the proposed TB classification model.

Preprocessing. Preprocessing is the initial phase in image processing.
This process removes noise and eliminates unnecessary information. Initially, the input images are resized to 512 × 512 pixels, and the resized image is used for further processing. Then, Wiener filtering (WF) is used to remove the noise in the resized image. This filter eliminates additive noise and removes blurriness at the same time. WF is optimal in terms of MSE (mean square error) and is a linear estimate of the original image. The Fourier transform (FT) representation of the WF is given in [3].
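The denoising step can be illustrated with a small sketch. Note that this is not the frequency-domain form referenced in [3]: it swaps in the common spatial-domain (local-statistics) variant of the Wiener filter, written in plain Python on a nested list. The function name `local_wiener`, the 3 × 3 window, and the noise estimate (mean of the local variances) are assumptions made for illustration only.

```python
def local_wiener(image, size=3, noise=None):
    """Adaptive (local-statistics) Wiener filter on a 2D list of floats."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    means = [[0.0] * cols for _ in range(rows)]
    variances = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window, clamping indices at the image border.
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            mean = sum(window) / len(window)
            means[r][c] = mean
            variances[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    if noise is None:
        # Assumption: estimate the noise power as the average local variance.
        noise = sum(sum(row) for row in variances) / (rows * cols)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            var = variances[r][c]
            denom = max(var, noise)
            # Flat regions (var <= noise) collapse to the local mean.
            gain = max(var - noise, 0.0) / denom if denom > 0 else 0.0
            out[r][c] = means[r][c] + gain * (image[r][c] - means[r][c])
    return out


# Hypothetical 5 x 5 test image: flat background with one noisy spike.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 9.0
smoothed = local_wiener(spike)
```

In this toy input the isolated spike is attenuated while the flat corners are left at zero. A real pipeline would first resize the CXR to 512 × 512, which is omitted here.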

Segmentation.
The preprocessed image is segmented using AFCM. The dataset of $m$ samples is $V = \{v_1, v_2, \ldots, v_m\}$, and it is partitioned into $c_j$ clusters. The objective function of FCM is given in [7], where $u_{lp}$ and $d_{lp}$ are the membership degree and the Euclidean distance between the $l$th sample and the $p$th cluster centre, with $d_{lp} = \lVert y_l - v_p \rVert$. The membership degree matrix must satisfy the standard FCM conditions: each membership lies in $[0, 1]$, and the memberships of each sample sum to one over the clusters. According to the Lagrange multiplier method, $U$ and $V$ are computed using the expressions in [5]. However, standard FCM has the following limitations: it needs an initial membership matrix, and the number of clusters must be declared in advance. It is highly sensitive to the membership matrix and the number of clusters; therefore, FCM generates unstable outcomes. In AFCM, the sample density $\rho_j$ and the density rate $c$ are initialized. The parameter $c$ is used for adjusting the potential cluster centres. Here, $\rho_j$ is defined in terms of the indicator function $\chi$, where
$$\chi(y) = \begin{cases} 1, & y < 0 \\ 0, & \text{elsewhere,} \end{cases}$$
$D_{lp}$ is the Euclidean distance between the $l$th sample and the $p$th cluster centre, and $D_{co}$ is the cutoff distance. Hence, by using this clustering, the TB images are segmented efficiently.
Computational Intelligence and Neuroscience
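As a rough illustration of the clustering idea, the following is a minimal sketch of plain fuzzy C-means on scalar pixel intensities. It does not implement the adaptive density-based initialization of AFCM. The fuzzifier value of 2, the evenly spaced initial centres, and the toy intensity list are assumptions, not values taken from this work.

```python
def fuzzy_c_means(values, n_clusters=2, fuzzifier=2.0, n_iter=50):
    """Plain FCM on a list of scalar intensities (illustrative only)."""
    lo, hi = min(values), max(values)
    # Assumption: spread the initial centres evenly over the intensity range.
    centres = [lo + (hi - lo) * (p + 0.5) / n_clusters for p in range(n_clusters)]
    power = 2.0 / (fuzzifier - 1.0)
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for y in values:
            dists = [abs(y - v) for v in centres]
            if min(dists) == 0.0:
                # Sample sits exactly on a centre: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                # Standard membership update; each row sums to one.
                row = [1.0 / sum((d_p / d_k) ** power for d_k in dists)
                       for d_p in dists]
            memberships.append(row)
        for p in range(n_clusters):
            # Centre update: membership-weighted mean of the samples.
            weights = [row[p] ** fuzzifier for row in memberships]
            centres[p] = sum(w * y for w, y in zip(weights, values)) / sum(weights)
    return centres, memberships


# Hypothetical intensities: three dark and three bright pixels.
pixels = [0.10, 0.12, 0.09, 0.90, 0.88, 0.93]
centres, memberships = fuzzy_c_means(pixels)
labels = [row.index(max(row)) for row in memberships]
```

Hard labels are obtained here by taking the cluster with the largest membership for each pixel. For an image, the same routine would be applied to the flattened pixel list and the labels reshaped back to the image grid.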

Feature Extraction.
Features represent the images in scalar or vector form. A classifier cannot operate on an image directly; instead, features such as texture and shape features are given as input to the network. In this work, feature extraction is carried out using GF, shape features, and HOG. Each technique is explained in the following subsections.

Gaussian Function (GF). The 2D GF is defined as a function of the frequency component, and it is represented as in [10], where $\theta$ and $\omega$ are the orientation and frequency of the GF. In the GF, the Gaussian is convolved within the Gaussian window, the FT is managed using $\alpha$, and $\mu$ is the GF's centre frequency. The FT of $g(x, y)$ is then expressed in terms of $\sigma_x$ and $\sigma_y$, which are the constant distances from the Gaussian properties, with $a = a' \cos\theta$. Several filters are obtained according to the value of $\theta$. The value of $\theta$ is varied from $45^\circ$ to obtain texture features.
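The description above (an orientation, a frequency, and a Gaussian window) resembles what is commonly called a Gabor-type filter. Under that reading, which is an interpretation rather than something the text states, a bank of oriented kernels could be sketched as below. The kernel size, the values of `omega`, `sigma_x`, and `sigma_y`, and the set of four orientations are hypothetical choices.

```python
import math


def gabor_kernel(size=9, theta_deg=45.0, omega=0.25, sigma_x=2.0, sigma_y=2.0):
    """Real part of a Gabor-type kernel: Gaussian envelope times a cosine."""
    theta = math.radians(theta_deg)
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
            carrier = math.cos(2.0 * math.pi * omega * xr)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Assumed orientation set: multiples of 45 degrees.
bank = {angle: gabor_kernel(theta_deg=angle) for angle in (0, 45, 90, 135)}
```

Texture features would then typically be summary statistics (for example, mean and variance) of the image responses to each kernel in the bank. That step is not shown here.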

Shape Features. The following features are extracted [4]: perimeter, area, roundness, eccentricity, major and minor axes, ratio of elongation, solidity, and equivalent diameter.
(1) Perimeter (P). It is the number of pixels on the border of the object.
(2) Area (A). It is the space occupied by the object on the surface of the plane.
(3) Roundness (R). It measures how closely the shape of the object approaches that of a circle.
(4) Eccentricity (E). This parameter describes the departure of the object from roundness, and it ranges from 0 to 1.
(5) Major and minor axes. These are the axes of the imaginary ellipse that bounds the object.
(6) Ratio of elongation ($R_e$). It is the ratio of the lengths of the major and minor axes, where $x$ and $y$ are the lengths of the minor and major axes.
(7) Solidity. It is the ratio of the area (A) of the binary image to the convex hull area.
(8) Equivalent diameter. It is defined as the diameter of the circle with the same area as the object.
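A few of the shape features listed above can be sketched for a binary mask as follows. The formulas used are the common textbook forms, roundness $4\pi A / P^2$ and equivalent diameter $\sqrt{4A/\pi}$, and are assumptions here because the exact expressions from [4] are not reproduced in the text. The perimeter is approximated by the number of exposed pixel edges rather than by a count of border pixels, which is also an assumption.

```python
import math


def shape_features(mask):
    """Area, perimeter, roundness, and equivalent diameter of a 0/1 mask."""
    rows, cols = len(mask), len(mask[0])
    area = sum(sum(row) for row in mask)
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = rr < 0 or rr >= rows or cc < 0 or cc >= cols
                if outside or not mask[rr][cc]:
                    perimeter += 1  # one exposed pixel edge
    roundness = 4.0 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    equivalent_diameter = math.sqrt(4.0 * area / math.pi)
    return {
        "area": area,
        "perimeter": perimeter,
        "roundness": roundness,
        "equivalent_diameter": equivalent_diameter,
    }


# Hypothetical mask: a 4 x 4 square object inside a 6 x 6 image.
square = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)]
          for r in range(6)]
features = shape_features(square)
```

For the square, the roundness evaluates to $\pi/4$, the expected value for a square under this definition. Eccentricity, axis lengths, and solidity need an ellipse fit and a convex hull and are left out of the sketch.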

HOG.
The HOG descriptor is similar to the scale-invariant feature transform (SIFT) descriptor, and it is obtained in four stages.
They are gradient computation, HOG by cells, contrast normalization, and obtaining the HOG descriptors. The HOG features indicate the presence of TB in the lungs. The gradient at pixel $(a, b)$ of an image $I_j$ is obtained by convolving a mask with the original image $I(a, b)$, and the gradient magnitude and direction are then computed at every pixel $(a, b)$. Finally, all three feature types are integrated and given to the classifier. In total, 55 features are extracted.
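The first three HOG stages described above (gradients, per-cell histogram, contrast normalization) can be sketched for a single cell as follows. The centred-difference mask, the nine unsigned orientation bins over 0 to 180 degrees, and the L2 normalization are the usual HOG defaults and are assumed here, since the text does not specify them.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted orientation histogram of one cell (2D list)."""
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for a in range(1, rows - 1):
        for b in range(1, cols - 1):
            # Centred differences, i.e. convolution with the mask [-1, 0, 1].
            gx = cell[a][b + 1] - cell[a][b - 1]
            gy = cell[a + 1][b] - cell[a - 1][b]
            magnitude = math.hypot(gx, gy)
            direction = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(direction // bin_width) % n_bins] += magnitude
    return hist


def l2_normalize(hist, eps=1e-9):
    """Contrast normalization of a histogram by its L2 norm."""
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / (norm + eps) for h in hist]


# Hypothetical 4 x 4 cells: intensity ramps along columns and along rows.
horizontal_ramp = [[float(c) for c in range(4)] for _ in range(4)]
vertical_ramp = [[float(r) for _ in range(4)] for r in range(4)]
hist_h = cell_histogram(horizontal_ramp)
hist_v = cell_histogram(vertical_ramp)
```

A full descriptor would concatenate the normalized histograms of all cells. Border pixels are skipped in this sketch for simplicity.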

Classification.
The extracted features are provided as input to the DL classifier DBN. It has input, hidden, and output layers. There is a deep interconnection between the input and hidden neurons. The interconnections among hidden and visible neurons are exclusive and symmetric. Since the output of the neurons in the Boltzmann network is stochastic, it is probabilistic. The output is expressed in Equation (14), and the sigmoid function is expressed with $s_p$ as a pseudotemperature. Figure 2 depicts the structure of the DBN, in which feature extraction is carried out by the RBM (Restricted Boltzmann Machine) and classification by the MLP (Multilayer Perceptron). The Boltzmann machine energy for a binary state is modelled mathematically, where $u_l$ is the weight of every neuron and $\varphi$ is the bias. The energy expressions based on the combination of hidden and visible neurons $(g, h)$ are provided in Equations (18)-(20),
where $g_c$ is the binary state of $c$, $W_m$ is the binary state of the $m$th hidden unit, and $k_c$ and $W_c$ are the biases provided to the network. For every pair of hidden and visible vectors $(\vec{h}_i, \vec{g})$, the probability assigned by the RBM is defined. This approach uses CD (contrastive divergence), and its procedure is as follows: training samples are selected and given to the visible neurons; then, the probability of the hidden neurons is found and the hidden state is determined; finally, the weights are updated, where $\alpha$ is the learning rate and $\phi^{+}$ and $-\phi^{-}$ are the positive and negative gradients. The error of every neuron in $m$ is then computed, where $\vec{N}_s$ and $\vec{R}_s$ are the input and output vectors. Although the DBN has an optimized arrangement for classification, the number of layers is determined experimentally, which degrades its accuracy. Therefore, the DBN is hybridized with AMBO to enhance the efficiency of the system. Figure 2 indicates the classification of TB using DBN-AMBO. In MBO [29], the butterflies positioned at land 1 and land 2 are known as subpopulation $SP_1$ and subpopulation $SP_2$. Their sizes are computed on the basis of the parameter $s$ as $MS_1 = \lceil s \times MS \rceil$ and $MS_2 = MS - MS_1$. The individual butterflies in $SP_1$ are updated by the migration operator, which is represented as $z^{t+1}_{k,l} = z^{t}_{r1,l}$ (Equation (24)), where $z^{t}_{r1,l}$ is the $l$th element of $z_{r1}$ and the random position $r1$ is chosen from $SP_1$. When $r \le s$, $z^{t+1}_{k,l}$ is produced using Equation (24). When $r > s$, $z^{t+1}_{k,l}$ is produced using $z^{t+1}_{k,l} = z^{t}_{r2,l}$, where $z^{t}_{r2,l}$ is the $l$th element of $z_{r2}$ and the random position $r2$ is chosen from $SP_2$. For every component of butterfly $m$, when $rand \le s$, the update is $z^{t+1}_{m,l} = z^{t}_{best,l}$, where $z^{t+1}_{m,l}$ is the $l$th element of $z_m$ at $t + 1$ and $z^{t}_{best,l}$ is the $l$th element of the best butterfly. When $rand > s$, the update is $z^{t+1}_{m,l} = z^{t}_{r3,l}$, where $z^{t}_{r3,l}$ is the $l$th element of $z_{r3}$. When, in addition, $rand > ARB$, the component is further updated, where $ARB$ is the adjusting rate of the butterfly, $az_m$ is the walking step of the butterfly, and $\delta$ is a weighting factor.
However, this optimization suffers from slow convergence and can become trapped in local optima. Hence, a crossover operator is introduced to improve the accuracy and to reduce the error. In AMBO, the new individual $z^{t+1}_{v}$ is generated using $C_o$ and $c$, which are the crossover mutation and reflection coefficients. These two parameters improve the performance of TB detection.
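The subpopulation split and the migration step of basic MBO described above can be sketched as follows. This covers only the migration operator of plain MBO, not the adjusting operator or the crossover step of AMBO. The migration period of 1.2 follows the commonly used MBO setting and is an assumption here, as are the function names, the toy population, and the assumption that the population is already sorted by fitness before the split.

```python
import math
import random


def split_population(population, s):
    """Split into SP1 (size ceil(s * MS)) and SP2 (the remainder)."""
    ms1 = math.ceil(s * len(population))
    return population[:ms1], population[ms1:]


def migrate(sp1, sp2, s, period=1.2, rng=random):
    """Build a new SP1: each element is copied from a random donor."""
    new_sp1 = []
    for _ in sp1:
        child = []
        for l in range(len(sp1[0])):
            r = rng.random() * period
            # r <= s: donor from SP1; otherwise donor from SP2 (must be non-empty).
            donor = rng.choice(sp1) if r <= s else rng.choice(sp2)
            child.append(donor[l])
        new_sp1.append(child)
    return new_sp1


# Hypothetical population: six butterflies with three weights each.
rng = random.Random(0)
population = [[float(10 * i + l) for l in range(3)] for i in range(6)]
sp1, sp2 = split_population(population, 5 / 12)
new_sp1 = migrate(sp1, sp2, 5 / 12, rng=rng)
```

In a DBN setting, each butterfly would encode a candidate set of network weights and the fitness would be the classification error. That coupling is not shown here.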

Results and Discussions
The analysis and validation of the DBN-AMBO are presented in this section. The overall evaluation is carried out on a system with an Intel Core i5 CPU at 3.0 GHz and 8 GB of RAM, using the Python 3.6 platform.
The efficiency of the proposed TB classifier is compared against traditional approaches such as RNN (Recurrent Neural Network), CNN, GAN (Generative Adversarial Network), and DBN. Furthermore, to test the efficiency of the optimization technique, the proposed model is compared with several other optimization techniques.

Dataset Details. The Shenzhen China (SC) dataset [30] was obtained from Shenzhen No. 3 Hospital. It comprises 662 frontal CXR images, of which 335 are TB-positive and 327 are TB-negative. This dataset has images of people of all age groups; the images are in PNG format and are labelled 0 for non-TB and 1 for TB. The resolution varies from 998 × 1130 to 3001 × 3001.
The Montgomery County (MC) dataset of frontal CXR images was provided by the Department of Health and Human Services of the USA. It comprises 138 frontal CXR images, of which 58 are TB-positive and 80 are TB-negative. These images are in PNG format and their resolution is 4892 × 4020 [31][32][33].

Performance Measures.
It is essential to evaluate classification performance in image classification studies; otherwise, the assessment remains incomplete. Several performance metrics are used for classification. In this work, Accuracy (A), Recall (R), Specificity (S), FNR (False Negative Rate), Precision (P), and NPV (Negative Predictive Value) are measured.
The introduced DBN-AMBO is evaluated using true positives ($T_p$, correctly classified as TB), false positives ($F_p$, wrongly classified as TB), true negatives ($T_n$, correctly classified as normal), and false negatives ($F_n$, TB cases missed).

Accuracy.
Accuracy is one measure for assessing classification models. Informally, accuracy is the fraction of predictions that the model gets right: $A = (T_p + T_n)/(T_p + T_n + F_p + F_n)$.

Recall.
Recall is determined by dividing the number of positive samples that were correctly identified as positive by the total number of positive samples: $R = T_p/(T_p + F_n)$. It gauges the model's capacity to identify positive samples; the higher the recall, the more positive samples are found.

Specificity.
Specificity is the proportion of actual negatives that the model correctly predicts as negative: $S = T_n/(T_n + F_p)$.

Precision.
Precision, the degree to which the model's positive predictions are correct, is another measure of the model's performance. It is calculated by dividing the number of true positives by the total number of positive predictions (i.e., the number of true positives plus the number of false positives): $P = T_p/(T_p + F_p)$.
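The six measures listed in this section follow directly from the four confusion-matrix counts. The short sketch below uses the standard definitions. The function name and the example counts are hypothetical and are not results from this work.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),          # sensitivity
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "fnr": fn / (fn + tp),             # equals 1 - recall
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=9, fn=1)
```

Note that FNR and recall always sum to one, so a low FNR is equivalent to a high recall.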
Figure 3 illustrates the qualitative analysis of the experimental results on the MC dataset. It shows the preprocessed, segmented, and classified images. Table 1 presents the overall performance of the proposed DBN-AMBO against various DL techniques; CNN, RNN, GAN, and DBN are used for comparison. The table confirms that the proposed DBN-AMBO approach obtained better performance in all performance metrics. Specifically, the value of NPV is found to be very low; hence, it is efficiently utilized in TB classification.
Table 2 presents the overall performance of the DBN-AMBO against recent optimization techniques on the two datasets. DBN-AMBO is compared with DBN-EPO (Emperor Penguin Optimization), DBN-BOA (Butterfly Optimization Algorithm), and DBN-MBO. The accuracy of DBN-AMBO is 9.2%, 5.2%, and 3.1% higher than that of DBN-BOA, DBN-EPO, and DBN-MBO, respectively. The CM (confusion matrix) of the two datasets for TB classification is provided in Figure 4. This matrix is used for evaluating the classification efficiency of the proposed DBN-AMBO; it compares the predicted and true labels in a classification. In the normal case, both datasets provided better outcomes. DBN-AMBO takes only 0.426 s to complete the implementation process. Table 4 compares the performance with recently published works. Compared with these models, the proposed DBN-AMBO attained better results.
This is due to the optimal weight selection by AMBO. This optimizer reduces the loss function and enhances the accuracy.

Conclusion
TB is a bacterial infectious disease, and many countries are affected by it. Hence, each and every TB-positive case should be cured. In this work, a hybrid model was introduced for TB classification using CXR images. The work comprises four major phases. Initially, the images were preprocessed and segmented. Then, 55 features were extracted and given as input to the classifier. Finally, TB was classified using DBN-AMBO. The overall evaluations were carried out on the MC and SC datasets. Of the two datasets, the SC dataset obtained the better accuracy of 0.992. This methodology is expected to assist radiologists in TB classification. Hence, 99% accuracy was achieved by the proposed work in comparison with other state-of-the-art approaches. In the future, CXR images can be used for TB-affected people who are also affected by COVID-19 and pneumonia.
Data Availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.