Lung Cancer Diagnosis Based on an ANN Optimized by Improved TEO Algorithm

A quarter of all cancer deaths are due to lung cancer. Studies show that early diagnosis and treatment are the most effective way to increase patient life expectancy. In this paper, an automatic and optimized computer-aided detection method is proposed for lung cancer. The method first applies a preprocessing step for normalizing and denoising the input images. Afterward, Kapur entropy maximization is performed along with mathematical morphology for lung area segmentation. Then, 19 GLCM features are extracted from the segmented images. The higher-priority features are then selected to decrease system complexity. Feature selection is based on a new optimization design, called Improved Thermal Exchange Optimization (ITEO), which is designed to improve accuracy and convergence. The images are finally classified as healthy or cancerous by an artificial neural network optimized by ITEO. Simulation results are compared with some well-known approaches and show the superiority of the suggested method: the proposed method achieves 92.27% accuracy, the highest value among the compared methods.


Introduction
The proliferation of lung diseases in today's industrialized societies doubles the need for modern methods of accurate and early diagnosis. Among lung diseases, lung cancer is still recognized as one of the most dangerous cancers. Cancer means the abnormal growth, and sometimes proliferation, of cells in the body. All cancers have an uncontrolled growth pattern and a tendency to detach from the source and metastasize [1]. A normal lung cell may become a lung cancer cell for no apparent reason, but in most cases, the transformation is the result of repeated exposure to carcinogens such as alcohol and tobacco. The appearance and function of cancer cells differ from those of normal cells. A mutation or change occurs in the DNA or genetic material of the cell [2]. DNA is responsible for controlling the appearance and function of cells. When a cell's DNA changes, that cell differentiates from the healthy cells next to it and no longer behaves like the body's normal cells.
This altered cell separates from its neighboring cells and does not know when it should stop growing and die [3]. In other words, the altered cell does not follow the internal commands and signals that govern other cells and acts arbitrarily instead of coordinating with them [4]. A quarter of all cancer deaths are due to lung cancer. Even in the best case, about 80% of patients survive no more than five years after being diagnosed with this type of cancer [5]. Based on a survey by the American Cancer Society (ACS), lung cancer is the second most prevalent cancer in the United States in both men and women [6]. Based on ACS figures, approximately 228,820 new cases and approximately 135,720 lung cancer deaths will occur [5]. Figure 1 shows the statistics for the leading cancers and their death counts in 2019 based on American Cancer Society lung cancer screening data [7]. Air pollution due to the industrialization of cities, tobacco use, and genetic factors are the main causes of these diseases [8]. Early diagnosis of lung disease has a major impact on the possibility of definitive treatment. Major diagnostic methods for lung cancer include radiographic imaging and computed tomography, biopsy, bronchoscopy, and examination of cells in the sputum. Among these, CT scan imaging is widely used as a superior diagnostic method: the doctor examines the images for possible nodules. A pulmonary nodule is a small, round, opaque mass that forms inside the lung tissue [9,10]. In other words, nodules are spherical radiographic opacities less than three centimeters in diameter.
Formerly, lung diseases were diagnosed by experts' visual assessment alone, without the use of computer science. Recently, however, thanks to imaging techniques based on computer science and artificial intelligence, the diagnosis can be made more precisely. In most of these methods, after capturing the images from the patient, different image processing steps are performed for tumor diagnosis [11].

Image Preprocessing
The images analyzed for validation in this study are collected from the Lung CT-Diagnosis database provided by the Cancer Imaging Archive [12]. This archive contains a collection of publicly available medical images for different cancers [13], including contrast-enhanced CT scan images stored in the DICOM format. The database covers 61 patients, from whom 4682 different images have been acquired.
After image acquisition, min-max normalization is applied to scale the acquired images between 0 and 1. This study uses a 250 × 250 scale for this purpose. Considering a grayscale image I whose intensities lie in the range [a, b], the normalized image I* is obtained by the following formulation:

I* = (I − a) × (b_new − a_new) / (b − a) + a_new,

where a and b describe the minimum and maximum intensity values of the grayscale image, and a_new and b_new describe the corresponding intensity values for the normalized image [14]. Also, since all CT scan images contain some visual noise, a denoising filter is needed. Noise in a CT scan image arises from different factors and includes several types, such as Gaussian noise, shot noise, Poisson noise, speckle noise, and salt-and-pepper noise [15]. Noise hides slight details of the CT scan image, so a noise-removal tool is required before processing the CT scan images.
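Since the paper gives no code, the min-max normalization above can be sketched in Python as follows; the function name `min_max_normalize` and the list-of-lists image representation are illustrative choices, not the authors' implementation:

```python
def min_max_normalize(image, a_new=0.0, b_new=1.0):
    """Rescale pixel intensities from their observed range [a, b] to [a_new, b_new]."""
    flat = [p for row in image for p in row]
    a, b = min(flat), max(flat)
    scale = (b_new - a_new) / (b - a)
    return [[(p - a) * scale + a_new for p in row] for row in image]
```

With the default target range [0, 1], the darkest pixel maps to 0 and the brightest to 1, matching the scaling described in the text.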
One of the popular noise reduction techniques for CT scan images is mean filtering. The mean (average) filter averages each pixel with its neighbors [16]: it calculates the sum of all the pixels in the filter window, divides it by the total number of pixels [4], and then replaces the value of the center pixel with the calculated average. The resulting value for each indexed pixel (i, j) is determined as follows:

y(i, j) = (1 / (m × n)) Σ_{(k,l) ∈ W(i,j)} f(k, l),     (2)

where W(i, j) is the m × n filter window centered at (i, j) and f is the noisy input image.
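A minimal Python sketch of the mean filter follows (not the authors' code); edge pixels here average only their in-bounds neighbors, one of several possible border policies:

```python
def mean_filter(image, k=3):
    """Replace each pixel with the average of its k x k neighborhood.
    Edge pixels average only the neighbors that fall inside the image."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [image[ii][jj]
                    for ii in range(max(0, i - r), min(h, i + r + 1))
                    for jj in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out
```

For a 3 × 3 window the center pixel is replaced by the average of nine values, as in equation (2).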

Image Segmentation
One of the most important issues in image processing is the identification and separation of an image into its main components. Image segmentation determines the success or eventual failure of image analysis methods, and, due to its wide application, it remains an active research field. The accuracy of this step in fields such as medicine is very important for preserving and protecting human life. Thresholding is one of the most convenient methods for image segmentation [17]. By applying a thresholding method to a grayscale image, a binary image is obtained that delineates the boundaries of the objects in the image with appropriate accuracy. The lower the threshold, the more regions are detected, and the more sensitive the results are to noise and unrelated image features [18]. On the other hand, a high threshold may miss weak targets or parts of them. Since the main purpose of a detection system is to find all relevant regions, the threshold must be chosen to achieve this goal [19].
There are several methods for this task in image processing, and some separation methods are designed for specific images [20]. One way to select the appropriate threshold is trial and error, in which different threshold values are tried and the resulting image is judged by the viewer. The simplest type of separation, called global separation, is based on the image histogram. The input is a gray or color image [21], and the output is a black-and-white (binary) image. In local methods, the threshold value at any point in the image is defined based on the local properties of the image in the neighborhood of that pixel. In this paper, Kapur thresholding has been used. Assume that the gray levels in an image with N pixels and L gray levels are in the range [0, 1, . . . , L − 1]. The Kapur method obtains the thresholds by maximizing Kapur entropy computed from the histogram of gray levels; in the case of two-level segmentation (a single threshold t), the objective is

H(t) = H_0(t) + H_1(t),

such that

H_0(t) = −Σ_{i=0}^{t−1} (p_i / ω_0) ln(p_i / ω_0),  ω_0 = Σ_{i=0}^{t−1} p_i,
H_1(t) = −Σ_{i=t}^{L−1} (p_i / ω_1) ln(p_i / ω_1),  ω_1 = 1 − ω_0,

where p_i is the normalized histogram count of gray level i. The optimal threshold is the gray level t that maximizes this function. For correct diagnosis of the tumors, the cancerous region should be detected with high precision. As mentioned before, image thresholding has been used for identifying the cancerous region. However, mathematical morphology is needed to further filter the areas detected by thresholding [22]. Mathematical morphology is a technique for processing and analyzing signals and images whose basic idea is to analyze geometric information by probing an image with a small geometric pattern called a structuring element.
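The Kapur criterion above can be maximized by an exhaustive search over all candidate thresholds. A minimal Python sketch (illustrative, not the paper's implementation; in the paper the search is later handed to the ITEO optimizer):

```python
import math

def kapur_threshold(hist):
    """Return the threshold t maximizing Kapur's entropy H(t) = H0(t) + H1(t)
    over a gray-level histogram (list of counts)."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_h = 1, float("-inf")
    for t in range(1, len(p)):
        w0 = sum(p[:t])
        w1 = 1.0 - w0
        if w0 <= 0 or w1 <= 0:
            continue
        h0 = -sum(pi / w0 * math.log(pi / w0) for pi in p[:t] if pi > 0)
        h1 = -sum(pi / w1 * math.log(pi / w1) for pi in p[t:] if pi > 0)
        if h0 + h1 > best_h:
            best_h, best_t = h0 + h1, t
    return best_t
```

On a strongly bimodal histogram the maximizing threshold falls between the two modes, which is exactly the behavior the segmentation step relies on.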
This study uses three popular morphological techniques: opening, closing, and hole filling. The region (hole) filling operator is based on complementation, intersection, and dilation and is achieved by the following equation:

X_k = (X_{k−1} ⊕ B) ∩ A^c,  k = 1, 2, 3, . . . ,

where A indicates the set of boundaries, A^c its complement, B defines the structuring element, and X_0 is a point inside the boundary. This operator terminates when X_k = X_{k−1}. The mathematical opening is the second operator used. The opening of set A by structuring element B is obtained by eroding A with B and then dilating the result with B, that is,

A ∘ B = (A ⊖ B) ⊕ B.

The key goal of the mathematical opening is to remove minor blemishes in the area that can mislead the diagnosis of lung cancer.
Finally, the mathematical closing operator is used to smooth the contours, fuse narrow breaks, eliminate small holes, and fill gaps in the contour and long thin gulfs. This operation is formulated as follows:

A • B = (A ⊕ B) ⊖ B.
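The opening and closing operators can be sketched from binary erosion and dilation. The following Python sketch is illustrative only: it assumes a binary (0/1) list-of-lists image, a 3 × 3 cross structuring element, and zero padding at the borders, none of which the paper specifies:

```python
SE = ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1))  # 3x3 cross structuring element

def dilate(img):
    """Binary dilation: a pixel becomes 1 if any SE neighbor is 1."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= i + di < h and 0 <= j + dj < w and img[i + di][j + dj]
                     for di, dj in SE)) for j in range(w)] for i in range(h)]

def erode(img):
    """Binary erosion: a pixel stays 1 only if every SE neighbor (0-padded) is 1."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= i + di < h and 0 <= j + dj < w and img[i + di][j + dj]
                     for di, dj in SE)) for j in range(w)] for i in range(h)]

def opening(img):   # A o B = (A erode B) dilate B: removes small specks
    return dilate(erode(img))

def closing(img):   # A . B = (A dilate B) erode B: fills small holes
    return erode(dilate(img))
```

Opening deletes isolated foreground specks while closing fills isolated background holes, the two clean-up effects the segmentation step needs.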

Feature Extraction
The purpose of feature extraction is to make raw data more usable for subsequent statistical processing. Feature extraction is a very common process in different types of data processing, such as image and audio processing. It means selecting features that can describe the image with little information. These features must be such that their set describes each image uniquely: if a set of these attributes is the same for two samples, no classifier will be able to distinguish the two samples in the classification stage. The main reasons for extracting features from images are image simplification, reduced processing time and memory, and increased accuracy and efficiency. Thus, feature extraction is a process in which data are mapped from a high-dimensional space to a lower-dimensional one. This mapping can be linear (such as principal component analysis) or nonlinear. Selecting these features requires examining the data properties, and extracting them requires applying preprocessing operations and various filters to the image to turn it into the desired information. In this study, GLCM features have been used to extract information from the lung cancer images.
The GLCM method is one of the most efficient techniques for extracting texture from medical images. This matrix is a square matrix with dimensions N × N, where N is the number of gray levels in the image. Each element of this matrix counts the number of pixel pairs that have the corresponding pair of gray levels and are separated by a certain distance in a certain direction. After calculating the matrix, different parameters of the image texture can be extracted from it. In this study, this technique was used to extract texture features from lung tumor images. The utilized features are explained in the following.
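A co-occurrence matrix for a given offset can be built directly from the definition. This Python sketch is illustrative (the paper does not state its offset, symmetry, or normalization choices; symmetric counting and normalization to probabilities are assumed here):

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True, normalize=True):
    """Gray-level co-occurrence matrix for offset (dy, dx).
    image: list of lists of integer gray levels in [0, levels)."""
    h, w = len(image), len(image[0])
    m = [[0.0] * levels for _ in range(levels)]
    for i in range(h):
        for j in range(w):
            ii, jj = i + dy, j + dx
            if 0 <= ii < h and 0 <= jj < w:
                m[image[i][j]][image[ii][jj]] += 1
                if symmetric:
                    m[image[ii][jj]][image[i][j]] += 1
    if normalize:
        s = sum(sum(row) for row in m)
        if s:
            m = [[v / s for v in row] for row in m]
    return m
```

With normalization, entry p(i, j) becomes the probability of observing gray levels i and j at the chosen offset, which is the form the feature formulas below consume.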

Contrast.
The contrast feature measures the intensity difference between a pixel and its neighbor over the image. This feature is achieved by the following equation:

Contrast = Σ_{i,j} (i − j)^2 p(i, j),

where p(i, j) is the (i, j) entry of the normalized GLCM.

Correlation.
The correlation feature describes the spatial dependency among the pixels. This feature can be mathematically given as follows:

Correlation = Σ_{i,j} (i − μ_i)(j − μ_j) p(i, j) / (σ_i σ_j),

where μ_i, μ_j and σ_i, σ_j are the means and standard deviations of the row and column marginals of p.

Homogeneity.
Homogeneity is a local uniformity feature that distinguishes textured from nontextured regions over single or multiple intervals. This feature is achieved by the following:

Homogeneity = Σ_{i,j} p(i, j) / (1 + |i − j|).

Energy.
The energy feature measures the number of repeated pixel pairs. This feature is mathematically obtained by the following equation:

Energy = Σ_{i,j} p(i, j)^2.

Entropy.
The entropy feature indicates the randomness (disorder) of the image texture, based on the following equation:

Entropy = −Σ_{i,j} p(i, j) ln p(i, j).
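The five features above can be computed together from one normalized GLCM. A minimal Python sketch (illustrative; the degenerate zero-variance case for correlation is handled with an assumed convention of returning 1.0):

```python
import math

def glcm_features(p):
    """Contrast, correlation, homogeneity, energy, and entropy from a
    normalized GLCM p (square list of lists summing to 1)."""
    n = len(p)
    idx = range(n)
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx)
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in idx for j in idx)
    return {
        "contrast": sum((i - j) ** 2 * p[i][j] for i in idx for j in idx),
        "correlation": cov / math.sqrt(var_i * var_j) if var_i and var_j else 1.0,
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx),
        "energy": sum(p[i][j] ** 2 for i in idx for j in idx),
        "entropy": -sum(p[i][j] * math.log(p[i][j])
                        for i in idx for j in idx if p[i][j] > 0),
    }
```

A perfectly diagonal GLCM (all mass on i = j) gives zero contrast, correlation 1, and homogeneity 1, which is a quick sanity check on the formulas.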

Feature Selection
Some of the features extracted from the images are important and crucial for classification. Meanwhile, features that individually carry little notable information can still have high potential when combined with other features, while others may carry no useful information at all. This issue can be addressed in different ways; in this paper, an optimization-based methodology is proposed for this purpose. The optimization objective for feature selection is the classification accuracy obtained with a candidate feature subset:

fitness = (TP + TN) / (TP + TN + FP + FN),

where TP signifies the true positives, TN the true negatives, FN the false negatives, and FP the false positives.
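The wrapper-style selection can be sketched as a fitness function plus a binary mask applied to the feature vector; the helper names below are hypothetical, and in the paper the mask search itself is carried out by ITEO:

```python
def accuracy_fitness(tp, tn, fp, fn):
    """Fitness of a candidate feature subset: classification accuracy."""
    return (tp + tn) / (tp + tn + fp + fn)

def apply_mask(features, mask):
    """Keep only the features whose bit in the binary mask is 1."""
    return [f for f, m in zip(features, mask) if m]
```

Each candidate mask is scored by training the classifier on the masked features and plugging the resulting confusion-matrix counts into `accuracy_fitness`.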

Improved Thermal Exchange Optimization Algorithm
In Thermal Exchange Optimization (TEO), the temperature of an object represents an individual's position; the objects are grouped and their temperatures are exchanged, so the new temperatures represent their updated positions [23].

Newton's Law of Cooling.
In the seventeenth century, the English scientist Isaac Newton studied the cooling of objects. His experiments showed that the cooling rate is approximately proportional to the temperature difference between the heated object and the environment. This fact is written as a differential relation:

dQ/dt = α A (T_b − T_s),

where Q describes the heat, A signifies the area of the body surface that transmits heat, T_b defines the body temperature, T_s represents the ambient temperature, and α determines the heat transfer coefficient, which depends on the geometry of the object, the surface state, the heat transfer mode, and other factors.
The heat lost in time dt, α × A × (T_s − T) dt, equals the change in stored heat as the temperature falls by dT, i.e.,

ρ c V dT = α A (T_s − T) dt,

where V defines the volume (m³), ρ describes the density (kg/m³), and c signifies the specific heat (J/kg/K). Therefore,

T = T_s + (T_M − T_s) e^{−(α A / ρ c V) t},

where T_M is the starting high temperature. The above equation is valid when

β = α A / (ρ c V)

is constant, i.e., where c is constant. Therefore, the equation can finally be rewritten as follows:

T = T_s + (T_M − T_s) e^{−β t}.

Inspiration.
During the TEO algorithm, some individuals are considered cooling objects and the remaining individuals are considered the environment; the roles are then reversed. The simulation of the TEO algorithm proceeds as follows. The first step is initialization: the initial temperature of each object is defined in an m-dimensional solution space as

T_i^0 = T^min + rand · (T^max − T^min),  i = 1, 2, . . . , n,

where T_i^0 signifies the initial solution vector of object number i, rand describes a random vector with components in the range [0, 1], and T^min and T^max describe the minimum and maximum limits of the decision variables.
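Newton's cooling law and the population initialization above translate directly into code. A small Python sketch (function names are illustrative):

```python
import math
import random

def cooling_temperature(t, t_env, t_start, beta):
    """Newton's law of cooling: T(t) = T_env + (T_M - T_env) * exp(-beta * t)."""
    return t_env + (t_start - t_env) * math.exp(-beta * t)

def init_population(n, t_min, t_max, seed=1):
    """Random initial object temperatures (candidate solutions) within bounds."""
    rng = random.Random(seed)
    dim = len(t_min)
    return [[t_min[d] + rng.random() * (t_max[d] - t_min[d]) for d in range(dim)]
            for _ in range(n)]
```

At t = 0 the body is at its starting temperature T_M, and as t grows the temperature decays exponentially toward the ambient value, which is the behavior TEO borrows for its position update.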
After initializing the objects, the value of the objective function is evaluated for all individuals. During the process, some of the historically best T vectors are stored in a memory called Thermal Memory (TM) so that their positions can improve the algorithm's efficiency at no extra computational cost. The best values selected into TM are added to the population, and an equal number of the worst individuals are eliminated. The individuals are then divided into two equal groups; Figure 1 shows this division into heat-transfer pairs of environment and cooling objects. For example, T_1 is an environment object for the cooling object T_{(n/2)+1}, and vice versa.
During the process, an object with a lower c exchanges temperature more slowly. Therefore, the value of c for each object is set as follows:

c = cost(object) / cost(worst object),

so that better (lower-cost) objects update their temperature more gradually.
Time is another parameter that is important in this algorithm.
This parameter is related to the iteration number and can be formulated by the following equation:

t = current iteration / maximum number of iterations.

Generally, an important ability of metaheuristics is escaping from local optima. To support this, the environmental temperature is altered by the following equation:

T_i^{env} = (1 − (c_1 + c_2 (1 − t)) · rand) · T_i′^{env},

where c_1 and c_2 represent control variables and T_i′^{env} describes the earlier temperature of the environment object, which is modified to T_i^{env}. Based on the previous models, the new temperatures of the objects are updated by the following equation:

T_i^{new} = T_i^{env} + (T_i^{old} − T_i^{env}) e^{−c t}.

Another parameter of the algorithm is Pr ∈ (0, 1), which states whether a component of a cooling object should be changed or not.
Each agent's Pr is compared with R(i) (i = 1, 2, . . . , n), a uniformly distributed random value in the range [0, 1]. If R(i) < Pr, one dimension of agent i is chosen at random and its value is regenerated by the following:

T_{i,j} = T_j^{min} + rand · (T_j^{max} − T_j^{min}),

where T_{i,j} signifies the j-th variable of the i-th agent, and T_j^{min} and T_j^{max} represent the lower and upper bounds of the j-th variable, respectively. To keep the structure of the agents unchanged, only one dimension is altered.
Finally, the stopping criteria are checked, and the algorithm terminates once they are met.
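Putting the steps above together, the following is a minimal Python sketch of a TEO-style loop. It omits the Thermal Memory for brevity, and the exact pairing, perturbation, and clamping details are assumptions where the text leaves them open:

```python
import math
import random

def teo(objective, t_min, t_max, n=20, iters=100, c1=1.0, c2=1.0, pr=0.3, seed=0):
    """Minimal Thermal Exchange Optimization sketch (minimization, no thermal memory)."""
    rng = random.Random(seed)
    dim = len(t_min)
    # Initialization: T_i = T_min + rand * (T_max - T_min)
    pop = [[t_min[d] + rng.random() * (t_max[d] - t_min[d]) for d in range(dim)]
           for _ in range(n)]
    best = min(pop, key=objective)
    for it in range(1, iters + 1):
        pop.sort(key=objective)              # better half pairs with the worse half
        cost = [objective(x) for x in pop]
        worst = cost[-1] if cost[-1] != 0 else 1.0
        t = it / iters                       # dimensionless "time"
        half = n // 2
        new_pop = []
        for i in range(n):
            env = pop[(i + half) % n]        # opposite-group partner
            c = cost[i] / worst              # low-cost objects exchange heat slowly
            # environment perturbation helps escape local optima
            env = [(1 - (c1 + c2 * (1 - t)) * rng.random()) * e for e in env]
            cand = [env[d] + (pop[i][d] - env[d]) * math.exp(-c * t)
                    for d in range(dim)]
            if rng.random() < pr:            # regenerate one random component
                d = rng.randrange(dim)
                cand[d] = t_min[d] + rng.random() * (t_max[d] - t_min[d])
            cand = [min(max(v, t_min[d]), t_max[d]) for d, v in enumerate(cand)]
            new_pop.append(cand)
        pop = new_pop
        cand_best = min(pop, key=objective)
        if objective(cand_best) < objective(best):
            best = cand_best
    return best
```

The tracked `best` never worsens across iterations, so the returned solution is at least as good as the best random initial temperature.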

Improved Thermal Exchange Optimization (ITEO)
The basic Thermal Exchange Optimization algorithm suffers from some disadvantages, such as instability and premature convergence.
This motivates a modified version of the TEO algorithm that mitigates these drawbacks as far as possible.
The first modification is the use of Lévy flight (LF), a mechanism commonly employed in metaheuristic algorithms to overcome premature convergence [24]. In this mechanism, a random walk is used to properly adjust the local search, mathematically represented as follows:

LF(w) = w × A / |B|^{1/τ},

σ_A = [Γ(1 + τ) sin(π τ / 2) / (Γ((1 + τ)/2) τ 2^{(τ−1)/2})]^{1/τ},

where A ∼ N(0, σ_A²), B ∼ N(0, σ_B²) with σ_B = 1, τ signifies the Lévy index, located in the range [0, 2] (here, τ = 1.5 [25]), Γ(·) represents the Gamma function, and w defines the step size. With these equations, the updating formulation of the TEO algorithm adds a Lévy step to each updated position:

T_i^{new} = T_i^{new} + LF(w).

The second modification is the use of a chaos mechanism to improve convergence speed. Here, the Singer function is used for the chaotic modification [26,27]. With this mechanism, rnd is updated as follows:

rnd_{k+1} = 1.07 × (7.86 rnd_k − 23.31 rnd_k² + 28.75 rnd_k³ − 13.302875 rnd_k⁴),

where rnd_0 ∈ [0, 1].
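The two ITEO ingredients can be sketched as follows; the Mantegna-style Lévy step and the standard Singer map coefficients are assumed here, since the paper does not spell out its exact constants beyond τ = 1.5:

```python
import math
import random

def levy_step(dim, tau=1.5, w=0.01, seed=0):
    """Mantegna-style Levy flight step with index tau."""
    rng = random.Random(seed)
    sigma = (math.gamma(1 + tau) * math.sin(math.pi * tau / 2)
             / (math.gamma((1 + tau) / 2) * tau * 2 ** ((tau - 1) / 2))) ** (1 / tau)
    return [w * rng.gauss(0, sigma) / abs(rng.gauss(0, 1)) ** (1 / tau)
            for _ in range(dim)]

def singer(rnd):
    """Singer chaotic map: maps rnd_k in (0, 1) to rnd_{k+1} in (0, 1)."""
    return 1.07 * (7.86 * rnd - 23.31 * rnd**2
                   + 28.75 * rnd**3 - 13.302875 * rnd**4)
```

In the modified update, `levy_step` supplies the occasional long jump that lets agents leave local optima, while `singer` replaces the uniform random number with a deterministic chaotic sequence.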

Algorithm Validation.
The simulations are run on a Core(TM) i7-4720HQ at 1.60 GHz with 8 GB RAM, in the MATLAB 2017b environment. To prove the effectiveness of the suggested ITEO algorithm, it is applied to several benchmark functions, i.e., Ackley, Rastrigin, Sphere, and Rosenbrock, and the results are compared with some well-known and recent optimization techniques: the Locust Swarm Optimization Algorithm (LSO) [28], Crow Search Algorithm (CSA) [29], Multi-Verse Optimizer (MVO) [30], Search and Rescue (SAR) algorithm [31], and the basic Thermal Exchange Optimization (TEO) [10]. Table 1 tabulates the parameter settings of the studied algorithms.
To clarify the studied benchmark functions, the equations and the boundaries have been given in Table 2. During the simulation, all of the algorithms run 40 times independently for all of the benchmark functions and the maximum iteration for all of the algorithms is considered 500. Table 3 tabulates the results of the analysis of the algorithms on the benchmark functions based on four measurement indicators including minimum, maximum, mean, and standard deviation (std).
As can be observed, the suggested ITEO attains the minimum value on all four indicators, which indicates the lowest error across all metrics. The minimum value of the "minimum" indicator shows that the proposed ITEO algorithm has the highest precision among the compared methods, and its minimum "std" value shows its higher consistency compared with the other methods.

Classification
For the final diagnosis based on the features, a classifier is needed. In this study, a newly optimized version of the Artificial Neural Network (ANN) is used [33]. The ANN is a relatively new methodology with high efficiency in different classification applications. The ANN establishes a proper relationship between input and output data and has low sensitivity to errors [34]. With correct training, this method can classify medical images accurately without explicit mathematical modeling [35,36]. A popular and simple type of ANN is the multilayer perceptron (MLP) neural network. An MLP is a mathematical model of the natural brain [34]: a model with a number of weights and biases connected so as to mimic brain performance. The popular method for error minimization in MLPs is backpropagation (BP). In BP, the values of the weights and biases are adjusted to minimize the error between the output value and the desired value [37], using the Gradient Descent (GD) algorithm for minimization. One significant problem of the GD algorithm is that it is easily trapped in local minima.
Table 2: The equations and the boundaries of the studied benchmarks in the analysis.

The output of each layer in the network is as follows:

s_j = Σ_i w_{ij} α_i + b_j,

where α_i indicates the i-th input variable, b_j signifies the bias of the j-th neuron, and w_{ij} describes the connection weight between α_i and the j-th hidden neuron.
After that, an activation function is applied to produce the neuron outputs. This research employs the sigmoid function for this purpose:

f(s_j) = 1 / (1 + e^{−s_j}),

and the output layer gives

y = Σ_j w_j f(s_j) + b_o,

where w_j is the weight from the j-th hidden neuron to the output and b_o is the output bias. The ANN computes the mean square error (MSE) between the desired and observed outputs, i.e.,

MSE = (1/n) Σ_{i=1}^{n} (y_i − d_i)²,

where n describes the number of samples in the training data collection, and y_i and d_i represent the observed and desired values, respectively. As mentioned earlier, the GD algorithm for minimizing the MSE has some problems. Therefore, here the suggested Improved Thermal Exchange Optimization is used to minimize the MSE and optimize the classifier. Figure 2 illustrates this idea.
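The forward pass and the MSE objective being minimized can be sketched in a few lines of Python (illustrative only; a single hidden layer and one scalar output are assumed, and in the paper the weights and biases would be the variables handed to ITEO):

```python
import math

def sigmoid(s):
    """Logistic activation f(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer: s_j = sum_i w_ij * x_i + b_j, sigmoid-activated,
    then a linear output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(col, x)) + b)
              for col, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def mse(y, d):
    """Mean square error between observed outputs y and desired outputs d."""
    return sum((yi - di) ** 2 for yi, di in zip(y, d)) / len(y)
```

An optimizer such as ITEO would flatten all weights and biases into one vector and search for the vector minimizing `mse` over the training set.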

Results and Discussion
As mentioned before, the main idea of this study is to propose a pipeline methodology for lung cancer diagnosis. The method starts with preprocessing to enhance the quality of the original image. After image preprocessing, Kapur-based image thresholding is performed to segment the lung areas. Afterward, optimal features are selected from the GLCM features based on the improved version of the Thermal Exchange Optimization algorithm. Finally, an MLP classifier is optimized by the introduced Improved Thermal Exchange Optimization algorithm. The method is validated on the Lung CT-Diagnosis dataset collected by the Cancer Imaging Archive [12]. Figure 3 shows an example of the image segmentation. The total number of images is 4682. The method is simulated in the MATLAB 2017b environment with the following configuration: a Core i7 laptop with 16 GB RAM and a CPU@2.6 GHz processor. Table 4 indicates the GLCM results for the first 20 images from the Lung CT-Diagnosis database. As can be observed from Table 4, nineteen GLCM features are employed for the feature extraction (Table 5).

Simulation Results.
After feature selection based on the Improved Thermal Exchange Optimization, the optimum features obtained by the suggested ITEO methodology are evaluated and shown in Table 6. The optimum threshold achieved by ITEO has been used to select the features; the best cost-function value obtained is 0.75. To test the final efficacy of the suggested technique, three measurement indicators, sensitivity, precision, and accuracy, are analyzed. The mathematical formulations of the indicators are given below:

Sensitivity = TP / (TP + FN),
Precision = TP / (TP + FP),
Accuracy = (TP + TN) / (TP + TN + FP + FN),

where TN is the true negatives, TP the true positives, FN the false negatives, and FP the false positives.
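The three indicators follow directly from the confusion-matrix counts; a small Python sketch (the function name is illustrative):

```python
def metrics(tp, tn, fp, fn):
    """Sensitivity, precision, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```

For example, 90 true positives, 85 true negatives, 10 false positives, and 15 false negatives give a precision of 0.9 and an accuracy of 0.875.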
To verify the higher efficiency of the proposed method, a comparison analysis of the method has been applied toward some state-of-the-art algorithms, including Kavitha's [38], Kumar's [39], and Lin's [40], applied to the Lung CT-Diagnosis database.
The comparison results are illustrated in a bar chart in Figure 4.
As can be observed from Figure 4, the proposed method has the best precision for the Lung CT-Diagnosis database, with Lin's, Kumar's, and Kavitha's methods in the next ranks. The results also demonstrate the efficiency of the proposed method.

Conclusions
The main purpose of this study is to propose an optimal pipeline for precise lung cancer diagnosis based on several combined approaches. The method starts by applying a preprocessing step based on min-max normalization of the input data and an average filter for denoising the input image. Afterward, Kapur entropy maximization along with mathematical morphology is used for segmentation of the lung area. Then, 19 GLCM features are extracted from the segmented images, and the features with higher priority are selected based on a new optimization design.
The new design, called the Improved Thermal Exchange Optimization (ITEO) algorithm, was designed and employed to optimize the feature selection step, with improved accuracy and convergence ability as shown in the validation stage. Finally, the images were classified into healthy or cancerous cases using an artificial neural network optimized by ITEO. The simulation was compared with several state-of-the-art methods, including Lin's, Kumar's, and Kavitha's, and the results showed that the proposed method, with 92.27% accuracy, 96.4% sensitivity, and 97.61% specificity, has the highest efficiency among the compared state-of-the-art methods. In future work, we will use convolutional features of the lung cancer images to achieve even higher accuracy.

Data Availability
The data that support the findings are available in the Lung CT-Diagnosis database of lung cancer images at https://doi.org/10.7937/K9/TCIA.2015.A6V7JIWX.

Conflicts of Interest
The authors declare no conflicts of interest.