A novel framework based on deep learning and ANOVA feature selection method for diagnosis of COVID-19 cases from chest X-ray Images

The new coronavirus (known as COVID-19) was first identified in Wuhan and quickly spread worldwide, wreaking havoc on the economy and people's everyday lives. Fever, cough, sore throat, headache, exhaustion, muscular aches, and difficulty breathing are all typical symptoms of COVID-19. A reliable detection technique is needed to identify affected individuals, care for them in the early stages of COVID-19, and reduce the virus's transmission. The most accessible method for COVID-19 identification is RT-PCR; however, due to the time it requires and its false-negative results, alternative options must be sought. Indeed, compared to RT-PCR, chest CT scans and chest X-ray images provide superior results. Because of the scarcity and high cost of CT scan equipment, X-ray images are preferable for screening. In this paper, a pre-trained network, DenseNet169, was employed to extract features from X-ray images. Features were then chosen by the ANOVA feature selection method to reduce computation and time complexity while overcoming the curse of dimensionality and improving predictive accuracy. Finally, the selected features were classified by XGBoost. The ChestX-ray8 dataset was employed to train and evaluate the proposed method, which reached 98.72% accuracy for two-class classification (COVID-19, healthy) and 92% accuracy for three-class classification (COVID-19, healthy, pneumonia).

equipment that is not readily available; it also produces a large number of false-negative results and takes at least 12 hours, which is inconvenient given that positive COVID-19 patients should be identified and followed up on as soon as possible [7], [8]. A chest CT scan is another option for detecting the disease and is more accurate than RT-PCR; for instance, 75% of negative RT-PCR samples had positive results on chest CT scans [9]. CT scans, however, have several drawbacks, including image acquisition time, cost, and CT equipment availability [10].
When compared to CT scans, X-ray images are less expensive and more easily available [11].
As a result, this research focuses solely on the use of X-ray imaging as a screening tool for COVID-19 patients.
Researchers discovered that COVID-19 patients' lungs contain visual markings such as ground-glass opacities (hazy, darker areas) that may distinguish COVID-19 infected individuals from non-infected patients [12], [13]. However, due to the limited availability of experts, time constraints, and the irreversible consequences of misdiagnosis [4], it is crucial to find an alternative approach that yields faster and more reliable outcomes. Technological advances facilitate disease diagnosis; in particular, the widespread use of AI [14], especially machine learning and deep learning, has proven extremely constructive, and researchers have made significant use of AI and deep learning in various medical areas [15]-[19]. The CNN architecture is one of the most prominent deep learning techniques in the medical imaging field, with outstanding results [20].
Pre-trained neural networks, one of the most recent techniques, are used in this paper. Using easily accessible pre-trained models, the proposed method extracts features from X-ray images. In the second phase, we utilize a feature selection method to acquire an appropriate number of features for classification. Finally, we use the XGBoost classifier to classify the selected features. The rest of the paper is organized as follows: Section 2 describes related works. Section 3 presents the materials and methods. In section 4, the experimental results are reported and analyzed. Finally, section 5 presents a summary of the findings and conclusions.

Related Works
Researchers worldwide are now trying to fight against COVID-19, and the combination of radiological imaging and deep learning has made significant progress in this direction. Wang et al. [6] developed COVID-Net, a deep model for COVID-19 detection that categorized normal, non-COVID pneumonia, and COVID-19 classes with 92.4% accuracy. Apostolopoulos et al. [21] applied transfer learning and employed COVID-19, healthy, and pneumonia X-ray images to develop their model. Ozturk et al. [4] proposed a deep network built on the DarkNet model; it contains 17 convolution layers and uses the Leaky ReLU activation function. This model was 98.08% accurate for binary classes and 87.02% accurate for multi-class cases. Nasiri and Hasani [22] employed DenseNet169 to extract features from X-ray images and used XGBoost for classification; they achieved 98.24% and 89.70% accuracy in binary and multi-class classification, respectively. Sethy et al. [23] devised a method combining deep features with a support vector machine (SVM) for detecting coronavirus-infected individuals from X-ray images, examining SVM for COVID-19 detection. The authors of [33] used publicly available datasets to build a dataset of 5,000 chest X-rays; a board-certified radiologist identified images showing the presence of COVID-19, and four prominent convolutional neural networks were trained with transfer learning to detect COVID-19 disease.

Materials and Methods
The proposed method employs the DenseNet169 deep neural network, as well as feature selection and the XGBoost algorithm, which will be discussed in the following section.

DenseNet169
A CNN's overall architecture is composed of two core parts: a feature extractor and a classifier.
Convolution and pooling layers are the two essential layers of a CNN architecture. Each node in the convolution layer extracts features from the input images by performing a convolution operation on the input nodes. The pooling layer abstracts features by averaging or taking the maximum value of input nodes [34], [35]. DenseNet is a highly supervised network that contains a 5-layer dense block with a growth rate of k = 4 and the standard ResNet structure. Each layer's output in a DenseNet dense block includes the output of all previous layers, incorporating both low-level and high-level features of the input image, making it suitable for object detection [36]. The ILSVRC 2012 classification dataset, which was used to train DenseNet, contains 1,000 classes and 1.2 million images. The dataset images were cropped to $224 \times 224$ before being used as input for DenseNet. DenseNet presented a new connectivity pattern that introduced direct connections from any layer to all following layers to further improve information flow across layers [37]. In DenseNet, the $l$-th layer takes all feature maps $x_0, x_1, x_2, \ldots, x_{l-1}$ from the preceding layers as input, as described by Equation (1):

$x_l = H_l([x_0, x_1, x_2, \ldots, x_{l-1}])$  (1)
where $H_l(\cdot)$ concatenates the feature maps $x_0, x_1, x_2, \ldots, x_{l-1}$ into a single tensor. To keep the feature-map size constant, each side of the inputs is zero-padded by one pixel for convolutional layers with kernel size $3 \times 3$. DenseNet employs $1 \times 1$ convolution and $2 \times 2$ average pooling as transition layers between adjoining dense blocks. A global average pooling is performed at the end of the last dense block, followed by a Softmax classifier. In the three dense blocks, the feature-map sizes are $32 \times 32$, $16 \times 16$, and $8 \times 8$, respectively. On five distinct competitive benchmarks, this architecture reached state-of-the-art accuracy for object recognition [34], [37].
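The dense connectivity of Equation (1) can be sketched in a few lines. This is a minimal illustration only: $H_l$ is stood in for by a random linear map followed by ReLU, whereas a real dense block uses BN-ReLU-Conv composites.

```python
import numpy as np

def dense_block(x0, num_layers, growth_rate, rng):
    """Each layer receives the concatenation [x0, x1, ..., x_{l-1}] of all
    preceding feature maps and emits `growth_rate` new channels."""
    features = [x0]
    for _ in range(num_layers):
        concat = np.concatenate(features, axis=-1)      # [x0, ..., x_{l-1}]
        w = rng.standard_normal((concat.shape[-1], growth_rate))
        x_l = np.maximum(concat @ w, 0.0)               # stand-in for H_l(.)
        features.append(x_l)
    return np.concatenate(features, axis=-1)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 16))   # 8 spatial positions, 16 input channels
out = dense_block(x0, num_layers=5, growth_rate=4, rng=rng)
print(out.shape)                    # → (8, 36): channels grow by k=4 per layer
```

The output channel count, 16 + 5 × 4 = 36, shows how a dense block with growth rate k = 4 accumulates the feature maps of every preceding layer.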

Analysis of Variance Feature Selection
New issues develop as a result of the creation of large datasets. Consequently, reliable and unique feature selection approaches are required [38]. Feature selection can assist with data visualization and understanding, as well as minimizing measurement and storage needs, training and utilization times, and overcoming the curse of dimensionality to enhance prediction performance [39], [40]. Analysis of variance (ANOVA) is a well-known statistical approach for comparing several independent means [41]. The ANOVA approach ranks features by calculating the ratio of variances between and within groups [42].
The ratio indicates how strongly a feature is linked to the group variable. Equation (2) gives the F value of a feature as the ratio of the variance between groups (also known as Mean Square Between, MSB) to the variance within groups (also known as Mean Square Within, MSW), which can be calculated as Equation (3) and Equation (4), respectively. The degrees of freedom for MSB and MSW are K − 1 and N − K, respectively, where K is the number of groups and N is the number of samples.
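The MSB/MSW ratio can be computed directly per feature. The sketch below mirrors what sklearn's `f_classif` computes; it is an illustration, not the paper's implementation.

```python
import numpy as np

def anova_f(X, y):
    """ANOVA F-statistic per feature: MSB (df = K-1) over MSW (df = N-K)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    groups = np.unique(y)
    N, K = X.shape[0], len(groups)
    grand_mean = X.mean(axis=0)
    ssb = np.zeros(X.shape[1])   # sum of squares between groups
    ssw = np.zeros(X.shape[1])   # sum of squares within groups
    for g in groups:
        Xg = X[y == g]
        ssb += len(Xg) * (Xg.mean(axis=0) - grand_mean) ** 2
        ssw += ((Xg - Xg.mean(axis=0)) ** 2).sum(axis=0)
    return (ssb / (K - 1)) / (ssw / (N - K))

# Feature 0 separates the two groups; feature 1 is noise, so feature 0
# should receive the larger F value and rank first.
X = np.array([[0.0, 5.0], [0.1, 3.0], [1.0, 4.0], [1.1, 6.0]])
y = np.array([0, 0, 1, 1])
F = anova_f(X, y)
print(F[0] > F[1])   # → True
```

Ranking features by F and keeping the top k is exactly the selection step used later in the proposed method.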

Extreme Gradient Boosting (XGBoost)
Chen and Guestrin proposed an efficient and scalable variation of the Gradient Boosting algorithm called Extreme Gradient Boosting (XGBoost). XGBoost has been widely employed by data scientists recently and has achieved strong results in a wide range of machine learning competitions [44], [45]. XGBoost differs from GBDT in several ways. First, the GBDT algorithm employs only a first-order Taylor expansion, whereas XGBoost expands the loss function to second order. Second, the objective function uses regularization to prevent overfitting and reduce the method's complexity [46], [47]. Third, XGBoost is extremely adaptable, allowing users to define their own optimization objectives and evaluation criteria. Moreover, by setting class weights and using AUC as an evaluation criterion, the XGBoost classifier can handle unbalanced training data efficiently.
In summary, XGBoost is a scalable and flexible tree-boosting model that can handle sparse data, enhance algorithm speed, and minimize computing time and memory for large-scale data [48].
Formally, the XGBoost algorithm can be described as follows: given a training dataset of $n$ samples $\{(x_i, y_i)\}_{i=1}^{n}$, the model predicts $\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$, where each $f_k$ is a regression tree, and is trained by minimizing a regularized objective.
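Writing out the standard formulation from Chen and Guestrin (this is the general XGBoost objective, not notation taken from this paper): the regularized objective and its second-order Taylor expansion at iteration $t$ are

```latex
% Regularized objective over K additive trees f_k
\mathcal{L} = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k),
\qquad \Omega(f) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^2,

% Second-order approximation around the previous prediction \hat{y}_i^{(t-1)}
\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n}\Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big)
  + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t),

% with first- and second-order gradients of the loss
g_i = \partial_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big), \qquad
h_i = \partial^2_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big),
```

where $T$ is the number of leaves and $w$ the leaf weights. The $h_i$ term is the second-order information that distinguishes XGBoost from first-order GBDT, and $\Omega(f)$ is the regularization that controls overfitting.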

Proposed Method
In this study, pre-processing was applied to the dataset, including label encoding for the classes and normalization of the images, so that less redundant data is given as input to the network. Deep learning, a sub-field of machine learning deeply influenced by the brain's structure, has continued to demonstrate excellent results in medical image processing, as in many other areas, in recent years [29]. ImageNet is a dataset of millions of images organized into 1,000 categories. Several pre-trained models trained on this dataset were then applied. DenseNet169 had the best performance among those models, so it was selected as the feature extractor in the proposed method. The X-ray dataset images were scaled to a fixed size of $224 \times 224$ pixels, which is the DenseNet169 input size.
The final layer of the DenseNet169 network, which was used to predict ImageNet labels, was removed. Global average pooling, a pooling method designed to substitute for fully connected layers in classical CNNs, was added as the final layer of the network. One benefit of global average pooling is that it has no parameters to adjust, so no training is needed for this layer. Additionally, because global average pooling sums up spatial information, it is more robust to spatial translations of the input [50]. The X-ray images were fed to the network to extract features from DenseNet169, yielding 1664 features per image.
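The effect of global average pooling can be illustrated with random activations. The $7 \times 7$ final spatial grid used below is the usual DenseNet value for $224 \times 224$ inputs, assumed here for illustration only.

```python
import numpy as np

# Global average pooling collapses each H x W feature map to a single value,
# so an (H, W, C) activation volume becomes a C-dimensional feature vector.
# DenseNet169's final activation has C = 1664 channels, matching the 1664
# features used in this paper.
rng = np.random.default_rng(0)
activations = rng.standard_normal((7, 7, 1664))   # assumed 7x7 final grid
features = activations.mean(axis=(0, 1))          # global average pooling
print(features.shape)                             # → (1664,)
```

Because the pooling is a parameter-free mean over the spatial axes, the 1664-dimensional descriptor requires no additional training, as noted above.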
When a learning model is given many features and few samples, it is likely to overfit, causing its performance to degrade. Among researchers, feature selection is a widely used strategy for reducing dimensionality [51]. To reduce classification time and increase classifier performance, the ANOVA feature selection method was employed to reduce the number of features. A range of 50 to 500 features was searched to select the best number of features for classification (using the validation set). Finally, the selected features were given to XGBoost for COVID-19 detection. Figure 1 shows the general framework of the proposed method.
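A minimal sketch of the selection-plus-classification stage follows. Synthetic vectors stand in for DenseNet169 features, and sklearn's GradientBoostingClassifier stands in for XGBoost so the sketch has no dependency on the xgboost package; both substitutions are assumptions, not the paper's code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split

# 1664-dimensional synthetic vectors stand in for DenseNet169 features.
X, y = make_classification(n_samples=200, n_features=1664,
                           n_informative=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# ANOVA selection: keep the 275 highest-F features, as in the paper.
selector = SelectKBest(f_classif, k=275).fit(X_tr, y_tr)

# Boosted-tree classifier on the selected features (XGBoost stand-in).
clf = GradientBoostingClassifier(random_state=0).fit(
    selector.transform(X_tr), y_tr)
acc = clf.score(selector.transform(X_te), y_te)
print(selector.transform(X_tr).shape[1], round(acc, 2))
```

Fitting the selector on the training split only, then transforming both splits, avoids leaking validation information into the feature-selection step.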

Results and Discussion
Several performance metrics, including precision, recall, specificity, F1-Score, and accuracy, were utilized to evaluate several deep learning models against the proposed methodology, because accuracy alone cannot assess a model's usefulness [52]. Accuracy is the ratio of the number of correctly predicted samples to the total number of samples and can be calculated with Equation (7).

$\text{Accuracy} = \dfrac{TP + TN}{\text{Total}}$  (7)

Precision is the proportion of true positives to the total number of predicted positives (true positives plus false positives). A model with low precision is prone to a high false-positive rate. Precision can be calculated using Equation (8):

$\text{Precision} = \dfrac{TP}{TP + FP}$  (8)

Recall, or sensitivity, is the number of true positives divided by the sum of true positives and false negatives. When there is a large cost associated with false negatives, recall is the statistic used to pick the optimal model. Recall can be computed using Equation (9):

$\text{Recall} = \dfrac{TP}{TP + FN}$  (9)

Specificity is the proportion of true negatives to the sum of true negatives and false positives. Specificity can be determined using Equation (10):

$\text{Specificity} = \dfrac{TN}{TN + FP}$  (10)

F1-Score combines precision and recall, so both false positives and false negatives are included when calculating this score. It is not as straightforward to interpret as accuracy; however, F1-Score is generally more valuable than accuracy, particularly for imbalanced classification problems. Equation (11) can be used to calculate the F1-Score.
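All of these metrics follow directly from confusion-matrix counts; the counts below are hypothetical and for illustration only, not the paper's results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, specificity, and F1-Score
    from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, specificity, f1

# Hypothetical counts: 45 true positives, 50 true negatives,
# 2 false positives, 3 false negatives.
acc, prec, rec, spec, f1 = classification_metrics(tp=45, tn=50, fp=2, fn=3)
print(round(acc, 3), round(rec, 3))   # → 0.95 0.938
```

With these counts, the low false-positive count yields high precision and specificity, while the three false negatives pull recall below accuracy, illustrating why the metrics are reported separately.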

$F1\text{-}Score = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$  (11)

In this study, the dataset collected by Ozturk et al. [4] has been employed; it was gathered from two distinct sources and includes COVID-19, No-findings, and Pneumonia classes, as shown in Figure 2. The 5-fold cross-validation had an average accuracy of 98.72%, and the confusion matrix was computed for each fold and overlapped, as shown in Figure 3. The overlapping confusion matrix is generated by summing the confusion-matrix entries obtained in all folds; the goal is to get a sense of the model's general patterns [4]. It shows that the proposed architecture correctly identified COVID-19 and No-findings with 100% and 98.43% accuracy, respectively; in other words, the proposed method performs better at detecting true-positive samples. The precision, recall, specificity, and F1-Score values achieved are 99.21%, 93.35%, 100%, and 97.87%, respectively. Table 1 compares the proposed method with Ozturk et al. [4] and Nasiri and Hasani [22] in terms of accuracy, precision, recall, specificity, and F1-Score for each fold and the average over all folds; Nasiri and Hasani [22] had better results than Ozturk et al. [4], and the proposed method outperforms both on all metrics except recall.
In the multi-class problem, 80% of the X-ray image dataset was used for training and the remaining 20% for evaluating the proposed architecture. ANOVA was used to select 275 of the 1664 features as the ideal number for classification; as a consequence, almost 84% of the features were removed, and the XGBoost classification process was substantially sped up while performance improved. The validation-set accuracy reached 92%, and the confusion matrix is illustrated in Figure 4. As in the binary-class problem, this confusion matrix indicates that the proposed method had a stronger result in finding COVID-19 than No-findings and Pneumonia. Precision, recall, specificity, and F1-Score values of 94.07%, 88.46%, 100%, and 92.42% were reached, respectively. Table 2 compares the proposed approach to Ozturk et al. [4] and Nasiri and Hasani [22] in terms of validation-set accuracy, precision, recall, specificity, and F1-Score. The proposed method was applied to ten pre-trained networks for both the binary and multi-class problems. As shown in Table 3, the average 5-fold cross-validation accuracy was used to compare approaches in the two-class problem, whereas the best fold accuracy was used for the multi-class problem. DenseNet169 outperforms the other pre-trained networks in both the binary and multi-class problems. Additionally, gradient-based class activation mapping (Grad-CAM) [54] was used to represent the decision area on a heatmap. Figure 4 illustrates the heatmaps for three COVID-19 cases, confirming that the proposed method extracted correct features for COVID-19 detection and that the model concentrates mostly on the lung area.
Radiologists might use these heatmaps to evaluate the chest area more accurately.

Conclusion
In this paper, a dataset gathered from two distinct sources has been used. Accuracies of 98.72% and 92% were obtained for the two-class and multi-class classification problems, respectively. Table 4 shows that the proposed approach outperforms most of the existing deep learning-based models in terms of accuracy.
However, it should be emphasized that the findings in Table 4 were derived from different datasets and different experimental setups.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Funding Statement
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Code Availability
The source code of the proposed method required to reproduce the predictions and results is available at https://github.com/seyyedalialavi2000/COVID-19-detection