Prediction of COVID-19 with Computed Tomography Images using Hybrid Learning Techniques

Reverse Transcription Polymerase Chain Reaction (RT-PCR), which is used for diagnosing COVID-19, has been found to give a low detection rate during the early stages of infection. Radiological analysis of CT images has given a higher prediction rate than the RT-PCR technique. In this paper, hybrid learning models are used to classify COVID-19 CT images, Community-Acquired Pneumonia (CAP) CT images, and normal CT images with high specificity and sensitivity. The proposed system is compared with various machine learning classifiers and other deep learning classifiers. The outcome of this study is also compared with other recent studies on COVID-19 classification. The proposed model is found to outperform the compared approaches, with an accuracy of 96.69%, sensitivity of 96%, and specificity of 98%.


Introduction
The COVID-19 virus, believed to have originated from the Rhinolophus bat, was transmitted to human beings in December 2019. Wuhan city's Huanan Seafood Market was the nerve center of the COVID-19 outbreak, which spread rapidly all around the world [1] and was eventually declared a pandemic by the World Health Organization (WHO) in March 2020 [2]. COVID-19-infected individuals have experienced severe acute respiratory disorders, fever, continuous coughing, and other infections. The mortality rate of this pandemic reached its peak in a short span of time. Early detection of the COVID-19 virus is the best way to reduce mortality. The CT scan images of COVID-19-affected individuals show distinctive characteristics like patchy multifocal consolidation, ground-glass opacities, interlobular cavitation, lobular septum thickening, and clear indication of fibrotic lesions, peribronchovascular, pleural effusion, and thoracic lymphadenopathy. The evolution of consolidation and ground-glass opacities in a COVID-19-affected patient from symptom commencement to the next 31 days is delineated in Figure 1 [2][3][4]. RT-PCR is known to be the standard testing tool but has produced false negative results at the early stages in recent studies [5,6]. Studies have also postulated the importance of CT scan images for screening COVID-19 with better specificity and sensitivity [7].
The characteristics of COVID-19 are similar to those of other viral pneumonia [4]. Yet with the help of deep learning techniques, one can precisely predict the differences between types of viral pneumonia. The main differences between pneumonia caused by different types of viruses, including the Respiratory Syncytial Virus (RSV) and Human Metapneumovirus (HMPV), in terms of ground-glass opacity (GGO), consolidation, and pleural effusion are depicted in Table 1. In Table 1, +++ denotes that 50% of the lung area is involved and + denotes that 10% of the lung area is involved.
The large number of available CT scan images has opened up a research area for start-up companies and researchers. The techniques proposed by researchers aid radiologists and physicians in fast and early prediction of the disease.
RT-PCR, which is used for diagnosing COVID-19, has a few limitations. The test kits are not sufficiently available, testing is time-consuming, and the sensitivity of testing varies. Thus, using CT scan images for screening COVID-19 is important. CT scan images expose patchy ground-glass opacities, which are hazy white spots in the lungs and the primary sign of COVID-19. In a recent study [8] with 1,014 patients, deep learning technique was able to predict 888 of 1,014 positive cases using CT scan images of suspected COVID-19 patients, while RT-PCR was able to predict only 601 of 1,014 positive cases. The results have shown that CT scan images were able to diagnose COVID-19 effectively, thus saving more lives. The mortality rates for different CoV viruses are discussed in Table 2. There is little knowledge of what the future of the outbreak will be. There are different manifestations of COVID-19, as discussed in a study [9]. In a study [10], it was found that CT scans had a high sensitivity when diagnosing COVID-19. A CT scan of the chest is considered to be an important tool for COVID-19 detection in endemic regions. As a result of the sensitivity and specificity of CT scans, a clinical detection threshold based upon ideal CT scan imaging manifestations is now utilized in China. So, CT scan images act as a better alternative to RT-PCR testing. Thus, chest CT scan images can be utilized as a primary resource for detecting COVID-19 in endemic regions which lack access to testing kits. This also takes less time, thereby saving the radiologist's time for carrying out further treatment. The following conclusions were drawn from the studies mentioned above:
In Table 3, the deep learning techniques that were applied to images are presented. The accuracy of each work is shown along with the classification method that was used.
Among the works listed in Table 3, the highest accuracy reported for a model built on CT images is 94.52%. It is also seen that most of the models are built for X-ray images [17][18][19][20][21][22][23][24][25][26]. Studies have shown instances where a patient's chest X-ray showed no traces of lung nodules, which were later identified using CT scans [13,15]. CT images therefore play a major role in detecting COVID-19 infection. For these reasons, a hybrid learning model is proposed which takes CT images and classifies them as COVID-19, CAP, or Normal using machine learning and deep learning techniques. Figure 2 shows the overall progression of the proposed hybrid learning model. The CT scan input images are collected from various sources like Google Images, RSNA, and GitHub, so they differ in resolution, size, and many other features. All the CT scan input images are therefore preprocessed to standardize them and are then given to the pretrained deep learning models for feature extraction. The extracted features are then given to machine learning classification models. The pretrained deep learning models used in the proposed work are VGG-16, Resnet50, InceptionV3, and AlexNet. The machine learning models used in the proposed work are Support Vector Machine (SVM), Random Forest, Decision Tree, Naive Bayes, and K-Nearest Neighbour (KNN). Figure 3 shows the progression of image processing.

Image Processing.
Histogram equalization is applied to enhance the quality of the image without losing its important features. The histograms of the original and equalized images are shown in Figure 4. The Wiener filter is used to remove noise from the image while preserving the fine details and edges of the lungs. The filter size is chosen to be 4 × 4 in order to prevent the image from being over-smoothed. The Wiener filter estimates the variance and mean from the local neighbourhood of each pixel. It then constructs a pixel-wise linear filter using Eq (1).
$$WF(i, j) = \mu + \frac{\sigma^2 - v^2}{\sigma^2}\,\bigl(O(i, j) - \mu\bigr) \tag{1}$$
where $WF(i, j)$ denotes the pixel at position $(i, j)$ in the filtered image and $O(i, j)$ denotes the pixel at position $(i, j)$ in the original image. $\mu$ and $\sigma^2$ are the mean and variance of the local adjacent pixels, respectively, and $v^2$ is the noise variance. Images are then resized to focus on a specific area of interest in order to extract its features.

Table 3 (rows for CT images): study | image type | accuracy | classification method
[12] | CT | 85% | DenseNet
Vruddhi Shah [13] | CT | 94.52% | VGG-19
He X [14] | CT | 94% | Self-trans model
Michael J. Horry [15] | CT | 84% | Fine-tuned VGG-19
Song Ying [16] | CT | 93% | Deep CNN
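The adaptive Wiener filtering step described under Image Processing can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `wiener_filter`, the border handling (windows are clipped at the image edge), and the fallback noise estimate (the average of the local variances) are all assumptions made here.

```python
def wiener_filter(image, size=4, noise=None):
    """Adaptive Wiener filter on a 2D list of floats (illustrative sketch).

    Each output pixel is mu + gain * (pixel - mu), where mu and the local
    variance come from a size x size neighbourhood around the pixel.
    """
    rows, cols = len(image), len(image[0])
    half_lo = (size - 1) // 2  # window extent above / left of the pixel
    half_hi = size // 2        # window extent below / right of the pixel
    means = [[0.0] * cols for _ in range(rows)]
    variances = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Collect the local neighbourhood, clipped at the image border.
            window = [image[a][b]
                      for a in range(max(0, i - half_lo), min(rows, i + half_hi + 1))
                      for b in range(max(0, j - half_lo), min(cols, j + half_hi + 1))]
            mu = sum(window) / len(window)
            means[i][j] = mu
            variances[i][j] = sum((p - mu) ** 2 for p in window) / len(window)
    if noise is None:
        # Assumption: estimate the noise variance as the mean local variance.
        noise = sum(map(sum, variances)) / (rows * cols)
    filtered = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            var = variances[i][j]
            # The gain is clamped at zero, so flat regions fall back to the local mean.
            gain = max(var - noise, 0.0) / var if var > 0 else 0.0
            filtered[i][j] = means[i][j] + gain * (image[i][j] - means[i][j])
    return filtered
```

With `noise=0.0` the gain is 1 wherever the local variance is positive, so the image passes through unchanged; with a very large noise variance every pixel is replaced by its local mean. Real images would be filtered between these two extremes.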

Feature Extraction.
Feature extraction is achieved using pretrained CNN models such as VGG-16, Resnet50, InceptionV3, and AlexNet. CNN models are purpose-built for image classification. An image is viewed as an array of pixels whose dimensions depend upon the resolution of the image. These CNN models consist of a series of convolutional and pooling layers. The features are extracted by the convolutional layers. The convolution operation is applied to a region of an image, sampling the values of the pixels in that particular region and converting them into a single value. This convolution operation is defined in Eq (2) and Figure 5.
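$$E(i, j) = \sum_{a} \sum_{b} I(a, b)\, F(i - a, j - b) \tag{2}$$
where $E(i, j)$ is the value of the pixel at $(i, j)$ after the convolution operation; $I(a, b)$ is the value of the pixel at $(a, b)$ in the input matrix; $F(i - a, j - b)$ is the value at $(i - a, j - b)$ in the filter (kernel) matrix; and $K$ is the kernel size, that is, the size of the filter matrix. The sums run over all $(a, b)$ for which $(i - a, j - b)$ lies inside the $K \times K$ kernel.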

Disease Markers
The output size of the convolution layer is given in Eq (3).
$$M = \frac{I - F + 2P}{S} + 1 \tag{3}$$
where $M$ is the size of the output matrix, $I$ is the size of the input matrix, $F$ is the size of the convolution filter, $P$ is the padding, and $S$ is the stride value for the convolution operation. The max-pooling layer performs dimensionality reduction.
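The convolution and its output size can be checked with a small pure-Python sketch. The function names `conv_output_size` and `conv2d` are illustrative, square inputs and kernels are assumed, and the sliding window below does not flip the kernel (the cross-correlation form that CNN libraries commonly use), which differs from the flipped-index form of Eq (2) only by a reflection of the kernel.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Output size M = (I - F + 2P) / S + 1, using floor division."""
    return (input_size - filter_size + 2 * padding) // stride + 1


def conv2d(image, kernel, padding=0, stride=1):
    """Slide a square kernel over a square image (both given as 2D lists)."""
    n, k = len(image), len(kernel)
    size = n + 2 * padding
    # Zero-pad the input on every side.
    padded = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            padded[i + padding][j + padding] = image[i][j]
    out = conv_output_size(n, k, padding, stride)
    result = []
    for i in range(out):
        row = []
        for j in range(out):
            # Weighted sum of the k x k region that starts at (i*stride, j*stride).
            total = 0.0
            for a in range(k):
                for b in range(k):
                    total += padded[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        result.append(row)
    return result
```

For instance, a 227 × 227 input with an 11 × 11 filter, no padding, and stride 4 gives `conv_output_size(227, 11, 0, 4) == 55`, and a 3 × 3 filter with padding 1 and stride 1 preserves the input size.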

This layer downsamples the feature map while retaining its most prominent activations. It performs a max operation by finding the maximum-valued neuron in a particular region of the output from the previous layer, as given in Eq (4) and Figure 5.
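$$P(i, j) = \max_{(a, b) \in R_{i,j}} E(a, b), \qquad 0 \le a, b < M \tag{4}$$
where $P(i, j)$ is the value of the pixel at $(i, j)$ after the pooling operation is performed; $E(a, b)$ is the value of the pixel at $(a, b)$ in the preceding layer's output; $R_{i,j}$ is the pooling region that maps to position $(i, j)$; and $M$ is the size of the previous layer's output grid. The output size of the max-pooling layer is given in Eq (5).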
$$N = \frac{M - F}{S} + 1 \tag{5}$$
where $N$ is the size of the output matrix, $M$ is the size of the previous layer's matrix, $F$ is the size of the pooling filter, and $S$ is the stride value, the same as was chosen for the convolution operation. ReLU acts as the activation for the convolutional and max-pooling layers, as given in Eq (6),
$$\mathrm{ReLU}(x) = \max(0, x) \tag{6}$$
where $x$ is the input value provided to activate the neuron. Thus, all the parameters which were extracted from the series of convolution and pooling operations from all the pretrained models that were used for feature extraction only are shown in
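The ReLU activation and the max-pooling step, together with the pooling output size, can be illustrated with the following pure-Python sketch. The helper names and the 2 × 2 window with stride 2 are assumptions chosen for the example, not values taken from the paper.

```python
def relu(x):
    """ReLU activation: negative inputs become zero."""
    return max(0.0, x)


def pool_output_size(input_size, filter_size, stride):
    """Output size N = (M - F) / S + 1, using floor division."""
    return (input_size - filter_size) // stride + 1


def max_pool2d(feature_map, filter_size=2, stride=2):
    """Keep the maximum of each filter_size x filter_size region of a square map."""
    m = len(feature_map)
    n = pool_output_size(m, filter_size, stride)
    return [[max(feature_map[i * stride + a][j * stride + b]
                 for a in range(filter_size)
                 for b in range(filter_size))
             for j in range(n)]
            for i in range(n)]


def activate(feature_map):
    """Apply ReLU element-wise to a 2D feature map."""
    return [[relu(value) for value in row] for row in feature_map]
```

Applying `max_pool2d(activate(feature_map))` to a 4 × 4 map halves each dimension, and `pool_output_size(55, 3, 2)` gives 27, matching the formula for a 3 × 3 pooling window with stride 2.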

Classification.
Classification refers to a predictive modelling problem where a class label is predicted for an input image. The classification is performed using traditional machine learning classifiers after removing the fully connected layers from the pretrained deep learning models. The extracted features are used for the final classification with Support Vector Machine (SVM), Decision Tree, Naive Bayes, K-Nearest Neighbour (KNN), and Random Forest. In SVM, the input values are plotted in an n-dimensional space, and the optimal hyperplane that separates the classes is found. In Random Forest, a large number of decision trees are built to operate as an ensemble in which every tree predicts a class label, and the class that receives the most votes is chosen as the predicted label. In Decision Tree, each node acts as a splitting criterion and the branches lead to a final node (leaf node) that provides the output. Naive Bayes is a conditional probability model which uses Bayes' theorem for classification. KNN is a nonparametric classifier which classifies an image based on its k nearest neighbours.
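As one concrete instance of the classifiers listed above, the KNN decision rule can be sketched in a few lines of plain Python. The two-dimensional feature vectors in the usage note are made up for illustration; in the proposed pipeline the features would come from a pretrained CNN with its fully connected layers removed. The function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Euclidean distance from the query to every training feature vector.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest training samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

For example, with training vectors `[(0.9, 0.8), (0.8, 0.9), (0.1, 0.9), (0.2, 0.8), (0.1, 0.1), (0.2, 0.2)]` labelled `["COVID-19", "COVID-19", "CAP", "CAP", "Normal", "Normal"]`, a query at `(0.85, 0.85)` is assigned to the COVID-19 class because two of its three nearest neighbours carry that label.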

Results and Discussion
In this section, datasets that have been utilized for carrying out the experiments are discussed. Further, the comparative analysis of results is discussed.

Data Formulation.
The dataset used here contains CT scan images for COVID-19 (including both symptomatic and asymptomatic cases), CAP, and normal chest CT scans. The images were gathered from multiple sources in order to train the model precisely. The data collected from the different sources are shown in Table 5. The scanning schemes used to acquire the images are diverse, so the model is exposed to a wide variety of images. Image preprocessing has been applied to standardize the dataset. Approximately 500 CT scan images were obtained for each class to keep the data balanced. The images were split for training, validation, and testing purposes, as shown in Table 6. The project was conducted on Windows ...

Table 5 (surviving entry): Google Images, GitHub; 500 images.

... analysis, the fully connected layers of the CNN models were removed, and the prediction was performed with machine learning models, forming hybrid learning models. This showed that hybrid learning models such as AlexNet+SVM and AlexNet+Random Forest yielded better results than the other models.

Figure 9: Images that were tested as negative by RT-PCR were actually positive cases and were correctly predicted as positive by the proposed work.

Figure 6 shows the colormap images for COVID-19-affected CT scan images which were correctly classified by AlexNet+SVM and AlexNet+Random Forest. Figure 7 shows the correctly classified CAP images, and Figure 8 shows the correctly classified normal CT scan images. The images in Figures 6-8 show the infected region in the CT scan images that are classified as CAP or COVID-19. The normal CT scan image does not have any infected region marked in the image. The COVID-19 image shows an infected region in the left lower lobe. The infected region is identified using the Jet colormap and Turbo heat map available in Python.

To compare this work with RT-PCR, 12 sample images of 3 patients are taken to test the model. All the images in Figure 9 are classified correctly by AlexNet+SVM and AlexNet+Random Forest, although RT-PCR had reported them as negative. The infected regions are also shown in the images using the colormap function available in Python.
The metrics used to analyse the different models are discussed below. F1-score, precision, and recall are defined in Eq (7), Eq (8), and Eq (9). The accuracy of a model shows what fraction of the images are classified correctly. The precision of a model is the fraction of images predicted as positive that are actually positive. The recall of a model is the fraction of actually positive images that are predicted as positive. The F1-score combines precision and recall into a single balanced (harmonic) mean.
$$F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{7}$$
where precision and recall are defined in Eq (8) and Eq (9). These values are calculated from a Confusion matrix that is built using the test data images.
$$Precision = \frac{T_p}{T_p + F_p} \tag{8}$$
where $T_p$ is the number of images observed as positive and predicted as positive and $F_p$ is the number of images observed as negative and predicted as positive.
$$Recall = \frac{T_p}{T_p + F_n} \tag{9}$$
where $T_p$ is the number of images observed as positive and predicted as positive and $F_n$ is the number of images observed as positive and predicted as negative. Recall is also called sensitivity. The specificity is defined in Eq (10).
$$Specificity = \frac{T_n}{T_n + F_p} \tag{10}$$
where $T_n$ is the number of images observed as negative and predicted as negative and $F_p$ is the number of images observed as negative and predicted as positive.
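For a three-class problem such as COVID-19, CAP, and Normal, these definitions are usually applied per class in a one-vs-rest fashion. The sketch below assumes that convention and a confusion matrix indexed as `confusion[true][predicted]`; the function name `class_metrics` and the numbers in the usage note are illustrative and are not results from this study.

```python
def class_metrics(confusion, cls):
    """One-vs-rest precision, recall, specificity, and F1-score for class `cls`.

    confusion[r][c] is the number of images whose true class is r and whose
    predicted class is c. No guard against empty classes is included.
    """
    n = len(confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp                          # true cls, predicted other
    fp = sum(confusion[r][cls] for r in range(n)) - tp     # true other, predicted cls
    tn = sum(map(sum, confusion)) - tp - fn - fp           # everything else
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}
```

With the made-up matrix `[[90, 6, 4], [5, 92, 3], [5, 2, 93]]`, class 0 has 90 true positives, 10 false negatives, 10 false positives, and 190 true negatives, giving a precision and recall of 0.90 and a specificity of 0.95.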
The accuracy for all the models can be calculated by Eq (11).
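$$Accuracy = \frac{T_p + T_n}{T_p + T_n + F_p + F_n} \tag{11}$$
where $T_n$ is the number of images observed as negative and predicted as negative. The Root Mean Square Error (RMSE) value for all the images can be evaluated using Eq (12).
$$RMSE = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left(y_j - y'_j\right)^2} \tag{12}$$
where $y_j$ is the actual value, $y'_j$ is the predicted value, and $n$ is the total number of images. The Mean Absolute Error (MAE) can be calculated as in [26]:
$$MAE = \frac{1}{n} \sum_{j=1}^{n} \left|y_j - y'_j\right| \tag{13}$$
where $y_j$ is the actual value, $y'_j$ is the predicted value, and $n$ is the total number of images.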
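The accuracy, RMSE, and MAE definitions translate directly into code. In the sketch below the class labels are encoded as the integers 0, 1, and 2, which is an assumption made for the example (the paper does not state its label encoding), and the function names are illustrative.

```python
import math


def accuracy_from_counts(tp, tn, fp, fn):
    """Accuracy as the share of correct decisions among all decisions."""
    return (tp + tn) / (tp + tn + fp + fn)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n
```

For the made-up label sequences `actual = [0, 1, 2, 2, 1, 0]` and `predicted = [0, 1, 2, 1, 1, 2]`, two of the six predictions are wrong, with absolute errors of 1 and 2, so the MAE is 3/6 = 0.5 and the RMSE is the square root of 5/6. Because RMSE and MAE depend on the numeric distance between labels, their values change with the chosen encoding.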
The Confusion matrix is often used to analyse the performance of classification models by comparing the predicted class labels for the test images against their known class labels. The classification report is used to evaluate the quality of the class labels predicted by the classification models. The Confusion matrix and classification report for the models built using conventional machine learning classifiers are presented in Table 7. Random Forest produced the best results among the machine learning classifiers, with a precision of 0.95, recall of 0.96, and specificity of 0.97. The Confusion matrix and classification report for the models constructed using deep learning techniques are shown in Table 8. Among these, AlexNet produced the best prediction outcomes, with a precision of 0.94, recall of 0.94, and specificity of 0.97. The Confusion matrix and classification report for the proposed hybrid learning models are presented in Tables 9-13. The proposed models performed better than the other classifiers. AlexNet+SVM produced the best results, with a precision of 0.96, recall of 0.96, and specificity of 0.98 when tested on 333 test images. Resnet50+Random Forest also produced good outcomes, with precision, recall, and specificity of 0.95, 0.95, and 0.97, respectively. Feature extraction keeps only the features that are necessary for training and removes features that are not vital for the classification task. This also makes training and testing of the classification models faster.
The outcomes of the models when trained for images before preprocessing and after preprocessing are also compared. There is a visible difference in results and it shows the significance of preprocessing the images. This analysis report is presented in Table 14. When comparing the outcome with studies that are featured in Table 3, the presented hybrid learning models have produced better results.
Thus, the classification model for predicting COVID-19 has been constructed in a robust way, and it helps in quicker prediction of COVID-19. The AlexNet model takes 13 minutes 25 seconds for training and 6 minutes 38 seconds for testing. The VGG-16 model takes 20 minutes 43 seconds for training and 12 minutes 30 seconds for testing. The InceptionV3 model takes 34 minutes 12 seconds for training and 20 minutes 12 seconds for testing. The Resnet50 model takes 43 minutes for training and 21 minutes for testing. As the model gets deeper, it takes more time to train and test on the images; that is, the time taken to run the models increases with the number of layers. RT-PCR, which is used as the standard reference, takes 1-2 days in India to confirm whether a patient is infected by COVID-19. Compared to RT-PCR, the present model gives a quicker prediction and can aid radiologists in carrying out further treatment and procedures. The accuracy of this model is also quite promising when compared with RT-PCR. The CT scan images that were tested as negative by RT-PCR are also correctly predicted by these models. Often, medical images are unclear, with lesions and tissues captured in the CT scan images in ways that can impede the prediction task. In order to overcome these difficulties, various image preprocessing techniques were applied. The image preprocessing techniques that are incorporated have an impact on the accuracies and results. These techniques provide better-resolution, higher-quality images for carrying out the prediction. In conclusion, the models presented in this study have produced better results in terms of accuracy and quicker prediction, even in the early stages. In short, the proposed work can be used in this global public health emergency, which requires immediate attention.

Conclusion
Early detection of COVID-19 is vital for treating and isolating patients in order to avoid the spread of the virus. RT-PCR is regarded as the standard technique, but it is reported that chest CT could be used as a rapid and reliable approach for screening COVID-19. The proposed hybrid learning models are able to detect COVID-19 from chest CT scan images with an accuracy of 96.69%, sensitivity of 96%, and specificity of 98% for the AlexNet+SVM model. Even though there is overlap in the patterns of abnormalities in CAP- and COVID-19-affected CT scans, these models perform well, with high accuracy, sensitivity, and specificity, using multisource data assimilation. Finally, reliable models are proposed to distinguish COVID-19 and CAP from CT scan images.

Ethical Approval
This study does not involve human participants, and hence, ethical approval is not required.

Conflicts of Interest
On behalf of all the authors, the corresponding author states that there is no conflict of interest.

Authors' Contributions
V.P performed the supervision, project administration, writing, reviewing, and editing. V.N performed the data curation and formal analysis and wrote the original draft. S.J.S.R performed the conceptualization, investigation, and methodology.