A Diagnostic Model of Breast Cancer Based on Digital Mammogram Images Using Machine Learning Techniques


Introduction
Breast cancer is one of the most frequently recorded cancers and a leading cause of morbidity and, in some cases, death among women around the world. Recent statistics show that one in 8 women in the USA and one in 10 women in Europe are affected by breast cancer [1][2][3]. Hence, breast cancer is a major public health problem, and early detection is the best strategy for fighting it. Mammography is the most common and precise tool that can be employed for the early detection of breast cancer [2,4,5]. The preprocessing step of mammogram images is crucial. As stated in [6], the objective of the Breast Imaging-Reporting and Data System (BI-RADS) of the American College of Radiology (ACR) is to provide a consistent classification system for describing mammographic breast density. Different research efforts aim either to detect breast lesions using computer-aided detection (CADe) systems or to interpret mammograms using computer-aided diagnosis (CADx) systems. These systems work as a supplement to the radiologists' evaluation. In general, there are four steps to be followed in the development of a CAD system for diagnosing suspicious regions in mammograms. The first step is preprocessing, which prepares noise-free mammograms for the following steps. The second is the identification of regions of interest (ROIs) for selecting the desired mammogram information. The third step is the extraction and optimum selection of features from the identified ROIs. Finally, the classification step decides whether a mammogram is normal or abnormal, where an abnormal mammogram is either malignant or benign [7,8].
As mentioned above, many recent works have proposed CAD systems for breast cancer diagnosis using different feature extraction techniques, such as wavelet processing and statistical methods, and different classification methods, such as machine learning, neural networks, and deep learning. Recently, AI based on machine learning has become a promising and attractive approach for classification purposes (see the related works summarized in Table 1). Most of the related works propose a classification methodology for benign versus malignant cases, while the classification of normal versus abnormal cases is not clearly addressed. This work aims to explore a flexible and effective machine learning method for breast cancer diagnosis covering both normal/abnormal and benign/malignant classifications, with systematic steps and identified methods and techniques. The mammography dataset of the Mammographic Image Analysis Society (MIAS) database [9] is chosen. The proposed CAD system consists of the following steps. The first step, the preprocessing of the MIAS dataset images, is carried out by applying different image processing techniques such as noise removal, artifact and label suppression, and image enhancement. A 2D median filter is used for removing noise from a mammogram image; then, a morphological operation is carried out for suppressing artifacts and labels; and finally, the contrast of the mammography images is enhanced. The second step, the segmentation, is carried out by removing the pectoral muscle using the seeded region growing (SRG) algorithm [10][11][12]. Subsequently, the region of interest (ROI) is identified and extracted from the result of the segmentation.
In the third step, the feature extraction, several features are extracted from the ROI using different feature classes: the first-order statistical class, the second-order statistical class based on the gray level co-occurrence matrix (GLCM), the fractal dimension class, the shape class, and the wavelet class. After that, the optimized feature selection step picks the optimum and effective subset of features, i.e., those that show a clear effect on the classification accuracy. This is performed using the Sequential Forward Selection (SFS) technique. Finally, each mammogram is classified as either normal or cancerous, and a cancerous case is further classified as benign or malignant. More details on these steps are presented and discussed in Section 3. The main contributions of this work can be summarized as follows:
(i) A machine-learning-based method for diagnosing breast cancer from mammography images is introduced, illustrated in systematic steps, studied, and investigated.
(ii) The MIAS dataset of mammography images is used after preprocessing it in three substeps: noise removal using a 2D median filter, thresholding and contrast enhancement, and pectoral muscle removal using the SRG technique.
(iii) 307 features are extracted using different statistical and wavelet feature classes, and only the 21 features with an effective impact on the accuracy are chosen during the training phase.
(iv) The SVM algorithm is used for classification at two levels, one for normal versus abnormal classification and the other for benign versus malignant classification.
(v) A CAD system with a GUI is implemented for flexible manipulation of mammograms.
(vi) The proposed method reaches 100% accuracy for the normal versus abnormal classification of mammograms.
The rest of this paper is organized as follows: Section 2 surveys related works; Section 3 presents and discusses the proposed system; Section 4 demonstrates the practical implementation of the proposed CAD system and presents and discusses the experiments and results. Finally, Section 5 summarizes the conclusions of this work.

Literature Review and Related Works
Numerous approaches have been proposed for mammogram diagnosis. In general, they can be grouped into statistical-based methods [13,14], wavelet-based methods [15][16][17], Markovian-based models [18], machine-learning-based methods [19], etc. Numerous studies have been published on computer-aided breast cancer diagnosis. The author of [20] presents an overview of the recent advances in the field of CAD breast cancer diagnosis based on mammogram image analysis. The work in [21] outlines procedures that have been suggested for analyzing histopathology images of breast cancer.
In the development of a CAD system for breast cancer, distinguishing the ROIs is an important step that poses a key challenge. Researchers have introduced many suggestions for segmenting the breast tissue and muscle regions according to density and texture variations [22]. The proposal in [23] uses Bayesian techniques with a Markov random field to partition mammogram images into three distinct regions: the pectoral muscle, the fatty region, and the fibroglandular region. Other approaches use the LBP, K-means, SVM, and GLCM algorithms for identifying the ROIs in mammogram images, as in [24][25][26]. The proposals presented in [2][3][4] introduce adaptive thresholding methods based on multiresolution analysis for detecting suspicious lesions in mammogram images. For the feature extraction and classification steps, there are many different research suggestions; an automatic CAD detection/classification system for suspicious lesions was presented in [27]. In [28], SVM classification was applied to develop a classification algorithm for breast masses. The author in [29] presents a different machine learning technique for classifying breast cancer as malignant using cytological images of fine-needle aspirates. The work presented in [30] proposes an automated breast cancer diagnosis system that employs the GVF-Snake segmentation method, wavelet-based feature extraction, and fuzzy-based classification. Hybrid optimization-based feature selection for mammogram images and hybrid transfer learning for detecting breast masses in mammograms are presented in [31,32]. The authors in [33] proposed a novel computer-aided diagnosis (CAD) system based on one of the regional deep learning techniques, an ROI-based convolutional neural network, for the simultaneous detection and classification of breast masses in digital mammograms.
The research in [34,35] develops a CAD system that employs temporal analysis to improve radiologists' performance. From the previous studies in the field of breast cancer detection and classification, it is clear that a CAD system is an attractive approach that can lead to good results in diagnosing breast cancer. The systematic review in [36] provides a comprehensive description and analysis of existing CAD systems that make use of machine learning techniques, as well as an assessment of where they currently stand in relation to various categorization schemes and mammography image modalities. All CAD phases, including preprocessing, segmentation, feature extraction, feature selection, and classification, were covered in that systematic review, which also outlined suggestions for future studies and identified research gaps. Table 1 presents some related works; their shortcomings are discussed in the following paragraph. The table covers different proposed methods with different considerations. Some works use machine learning methods such as K-means, fuzzy C-means, and neural networks (NNs) and compare them with the SVM, while others use multiclassifiers, artificial neural networks (ANNs), or convolutional neural networks (CNNs). In addition, many different datasets and different feature extraction and selection techniques are used. Overall, the SVM classifier has proved its practical effectiveness, and the wavelet and GLCM feature extraction classes perform well. Moreover, no clear methodology is identified for normal versus abnormal classification. Hence, we decided to carry out effective preprocessing of the MIAS dataset, which was the only dataset available to us when we started the work in 2018. The preprocessing of the MIAS mammograms is carried out based on the work proposed in [10]. Then, for

The Proposed CAD System Methodology
An abstract view of the processes of the proposed CAD system is shown in Figure 1. The flow of the suggested CAD system starts by preprocessing the mammogram images to remove noise. The next step, image segmentation, detects and identifies the ROIs of the mammogram images. Then, the ROIs are used in the feature extraction step. After that, the Sequential Forward Selection (SFS) technique is used for selecting the important and relevant features. The final step, the classification, is applied to the mammogram features to classify mammograms as either normal or abnormal, with further classification of abnormal images as benign or malignant. A detailed description of these steps is presented in the following subsections.

Digital Mammography Image Dataset.
As an initial step, before running the processes of the proposed CAD system for breast cancer diagnosis, the data are collected.

Data Collection.
We obtain a well-known digital mammogram dataset, the mini-MIAS database, from a known data acquisition society. The obtained images include left and right breast images of breasts that are fatty, fatty-glandular, and dense-glandular. The three basic cases of the obtained mammogram images are malignant, benign, and normal, and the abnormalities are further categorized into five groups as follows:
(1) Circumscribed masses
(2) Spiculated masses
(3) Ill-defined masses
(4) Architectural distortion
(5) Asymmetry
This well-known, publicly available database from the MIAS, UK [9], contains 322 digital mammogram images (161 pairs) at a resolution of 50 μm in portable gray map (PGM) format, with associated truth data. The MIAS data are used as the standard inputs to the proposed CAD system in this work. The images are associated with annotation labels.

The First Step: The Preprocessing of Database.
Medical images such as mammograms are naturally hard to understand or interpret; therefore, preprocessing is required to enhance the quality of the images by removing noise and to obtain better results in the segmentation step. Image preprocessing includes noise removal, artifact suppression, and background separation [10].

Noise Removal.
The majority of the obtained mammography images contain digitization noise, such as straight lines, which is filtered using a two-dimensional (2D) median filter with a 3-by-3 neighborhood. Each output pixel contains the median value of the 3-by-3 area surrounding the corresponding input pixel. Artifacts, labels, and unwanted image borders are removed separately using morphological operations. However, the images' edges are set to zeros. A mammography image with the digitization noise present is shown in Figure 2(a), and the same image with the noise removed is shown in Figure 2(b).
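The 3-by-3 median filtering described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the MATLAB implementation used in this work; the function name `median_filter_3x3` and the zero padding outside the image are assumptions of the sketch.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3-by-3 median filter to a 2D list of gray levels.

    Pixels outside the image are treated as 0 (zero padding), which is an
    assumption of this sketch.
    """
    rows, cols = len(image), len(image[0])
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    window.append(image[rr][cc] if inside else 0)
            output[r][c] = median(window)
    return output


# A flat 5x5 patch with one bright "digitization noise" pixel in the middle.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
filtered = median_filter_3x3(noisy)
```

In this toy patch the isolated bright pixel is replaced by the surrounding gray level, while the corner pixels drop to 0 because five of their nine window values come from the zero padding.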

Artifact Suppression and Background Separation.
Using thresholding and morphological techniques, shadow artifacts in the mammography images, such as wedges and labels, are eliminated. Figure 2(a) displays a mammography image with a shadow artifact, taken from the MIAS mammogram database. Through manual inspection of all the acquired mammography images, a global threshold with a value of 18 (normalized value) is found to be the most suitable threshold for converting the grayscale images into binary [0, 1] format. After the grayscale mammogram images are converted to binary, the binary images are subjected to morphological operations such as dilation, erosion, opening, and closing, as shown in Figure 2(c) for the image in Figure 2(a). The breast profile region is likewise separated from the background during artifact, wedge, and label suppression (see Figure 2(d)).
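To make the thresholding and morphological steps concrete, the following plain-Python sketch binarizes a small synthetic image and applies a morphological opening (erosion followed by dilation) with a 3-by-3 structuring element. The helper names, the toy image, and the choice to clip windows at the image border are all assumptions of the sketch; only the threshold value 18 comes from the text.

```python
def to_binary(image, threshold):
    """Global threshold: 1 where the gray level exceeds the threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def _morph(binary, combine):
    """Apply `combine` (min or max) over each 3x3 window clipped to the image."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [binary[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, rows))
                      for cc in range(max(c - 1, 0), min(c + 2, cols))]
            out[r][c] = combine(window)
    return out


def erode(binary):
    return _morph(binary, min)


def dilate(binary):
    return _morph(binary, max)


def opening(binary):
    """Erosion followed by dilation; removes objects smaller than the window."""
    return dilate(erode(binary))


# Synthetic 6x7 image: dark background (5), a 4x4 "breast" block (120)
# touching the left border, and a one-pixel bright "label" (200).
image = [[5] * 7 for _ in range(6)]
for r in range(1, 5):
    for c in range(0, 4):
        image[r][c] = 120
image[0][6] = 200

binary = to_binary(image, 18)
opened = opening(binary)
```

The opening removes the isolated label pixel and restores the 4x4 block, which mirrors how small artifacts are suppressed while the breast profile is kept.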

The Image Segmentation Step.
The main objective of image segmentation is to identify and extract the ROI, i.e., the pectoral muscle, whose characteristics are similar to those of a tumor in the breast profile, if one exists.
To obtain the best segmentation outcomes, we perform the following steps:
(1) Prior to applying seeded region growing (SRG), it is necessary to determine the breast orientation in each mammography image. The binary image in Figure 2(c) is used to determine the breast profile orientation (left or right) using an automated approach. The binary image is cropped from top to bottom and from left to right so that the breast profile touches the image's four boundaries (top, left, right, and bottom). In the cropped binary image, the sums of the binary values in the first five columns and in the last five columns are then computed. The breast orientation is determined by simply comparing these two sums: if the first sum is larger than the last, the breast is right-oriented; otherwise, it is left-oriented.
(2) The contrast of the breast profile images is enhanced using the image adjustment and stretching-limits techniques in MATLAB.
(3) Then, the image segmentation is carried out via the SRG algorithm for extracting the pectoral muscle from a processed mammogram image. For more details on the SRG algorithm, see [11,12].
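The orientation test in step (1) and the region-growing idea in step (3) can be sketched as follows. This is a simplified illustration in plain Python, not the implementation of [11,12]: the function names are hypothetical, and the growing rule used here (4-connected neighbors whose intensity lies within a fixed tolerance of the seed value) is an assumed, simplified homogeneity criterion.

```python
from collections import deque


def breast_orientation(binary, n_cols=5):
    """Compare the sums of the first and last n_cols columns of a cropped
    binary breast profile, following the rule stated in the text."""
    first = sum(sum(row[:n_cols]) for row in binary)
    last = sum(sum(row[-n_cols:]) for row in binary)
    return "right" if first > last else "left"


def seeded_region_growing(image, seed, tolerance):
    """Grow a 4-connected region from `seed` (row, col).

    A neighbor joins the region when its intensity differs from the seed
    intensity by at most `tolerance` (a simplification assumed here).
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= rr < rows and 0 <= cc < cols
                    and (rr, cc) not in region
                    and abs(image[rr][cc] - seed_value) <= tolerance):
                region.add((rr, cc))
                queue.append((rr, cc))
    return region


# Cropped binary profile whose mass of ones lies in the first columns.
profile = [[1] * 8 + [0] * 4 for _ in range(4)]
orientation = breast_orientation(profile)

# Toy image with a bright triangular "pectoral muscle" in the top-left corner.
toy = [
    [200, 200, 198, 80],
    [200, 195, 80, 80],
    [201, 80, 80, 80],
    [80, 80, 80, 80],
]
muscle = seeded_region_growing(toy, seed=(0, 0), tolerance=20)
```

In practice the seed would be placed according to the detected orientation, for example in the upper corner where the pectoral muscle appears; that placement rule is not specified here and is left as an assumption.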

The Region of Interest (ROI) Selection.
Features need to be calculated from the abnormal region (benign or malignant) of the breast profile only, without including the other insignificant parts of the breast tissue. Therefore, features cannot be calculated directly from the segmented mammogram images, because using the whole segmented image would bias the detection results. For this reason, the ground truth data provided along with the mammography dataset are used for identifying and extracting the ROIs [9]. An ROI window size of 64 × 64 pixels is selected, which is adequate for enclosing the majority of benign and malignant abnormalities [23,24]. So, the proposed CAD system uses an ROI block of size 64 × 64 pixels. Figure 4 shows examples of ROIs within three different mammogram images: Figure 4(a) is an example of a benign ROI, Figure 4(b) is an example of a malignant ROI, and Figure 4(c) is an example of a normal ROI.
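A 64 × 64 ROI block can be cut around an annotated abnormality center as sketched below. The function `crop_roi` is hypothetical, and two details are assumptions of the sketch: the center is given directly as a (row, column) index into the image array, and the window is shifted inward when it would otherwise cross the image border.

```python
def crop_roi(image, center_row, center_col, size=64):
    """Return a size-by-size block centered (as far as possible) on the
    given pixel, shifted inward so that it stays inside the image."""
    rows, cols = len(image), len(image[0])
    if rows < size or cols < size:
        raise ValueError("image is smaller than the ROI window")
    half = size // 2
    top = min(max(center_row - half, 0), rows - size)
    left = min(max(center_col - half, 0), cols - size)
    return [row[left:left + size] for row in image[top:top + size]]


# Synthetic 128x128 image whose pixel value encodes its own position.
image = [[r * 1000 + c for c in range(128)] for r in range(128)]

roi_center = crop_roi(image, 64, 64)   # fully inside the image
roi_border = crop_roi(image, 100, 20)  # window shifted to stay inside
```

If the annotation uses a different coordinate origin than the image array, the center would need to be converted before calling such a function.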

The Feature Extraction Step.
Conventional mammogram images are highly textured and complicated to manipulate, which makes them difficult to interpret. So, digital mammogram images are used in the feature extraction phase of the proposed system because of their ability to visualize masses.
This is because the spatial resolution of digital mammogram images obtained from X-ray imaging is on the order of microns. Consequently, extracting features from digital mammograms is necessary to enhance the accuracy and reliability of the diagnosis. Figure 5 shows the five classes of feature extraction methods that are applied in this work.
These five main classes of feature extraction are explained in the following subsections.

The Features of First-Order Statistics Class.
The first-order statistics feature class offers various statistical properties of the image's intensity histogram. First-order statistics depend only on the intensity values of the individual pixels. Table 2 lists the 18 first-order textural features that can be extracted from an ROI [48,49].
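As an illustration of first-order statistics, the sketch below computes a handful of histogram-based descriptors from an ROI. The five features shown (mean, variance, skewness, kurtosis, and entropy) are a commonly used subset chosen for this example; they are not claimed to match the 18 features of Table 2, and the function name is hypothetical.

```python
import math
from collections import Counter


def first_order_features(roi):
    """Compute a few first-order (histogram-based) features of a 2D ROI."""
    pixels = [p for row in roi for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    if std > 0:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    else:
        skewness = kurtosis = 0.0  # flat region: higher moments undefined
    histogram = Counter(pixels)
    entropy = -sum((k / n) * math.log2(k / n) for k in histogram.values())
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


features = first_order_features([[0, 0], [2, 2]])
```

Because these values depend only on the gray-level histogram, shuffling the pixel positions inside the ROI leaves all of them unchanged, which is what distinguishes them from the second-order features.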

The Features of Second-Order Statistics Class.
The most commonly used second-order statistical features are the GLCM features, a well-established statistical tool for extracting second-order texture information from images. Table 3 presents a list of the GLCM features [50][51][52].
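The following sketch shows how a gray level co-occurrence matrix is built and how a few texture descriptors are derived from it. It is a plain-Python illustration with hypothetical function names; the single offset (one pixel to the right), the non-symmetric counting, and the three features shown are choices made for the example and are not claimed to match Table 3.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for the pixel offset (dr, dc).

    Entry [i][j] is the relative frequency of a pixel with gray level i
    having a neighbor with gray level j at the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                counts[image[r][c]][image[rr][cc]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    idx = range(len(p))
    contrast = sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx)
    energy = sum(p[i][j] ** 2 for i in idx for j in idx)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i in idx for j in idx)
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}


# Small 4-level test image.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
texture = glcm_features(p)
```

A real ROI would first be quantized to a manageable number of gray levels, and several offsets and directions are usually combined; both choices are left open here.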

The Features of the Shape Class.
The shape feature class can be employed to characterize the shape of the ROI in an image. Many shape features can be extracted from mammogram images; the quantitative analysis of these shape properties allows us to distinguish malignant cells from benign cells as well as from normal cells. Eight shape features are extracted.
The fractal dimension, D, is defined as the exponent that relates the number of self-similar pieces, N, to the magnification factor, 1/R, into which a figure may be divided [54]:
\[ D = \frac{\log N}{\log (1/R)} \]
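As a worked example of this definition, the fractal dimension of an exactly self-similar figure follows from N = (1/R)^D, i.e., D = log N / log(1/R). The short sketch below evaluates this for three textbook shapes; the function name is hypothetical, and estimating the dimension of a real ROI would require a numerical method such as box counting, which is not shown here.

```python
import math


def similarity_dimension(n_pieces, magnification):
    """D = log(N) / log(1/R), with `magnification` standing for 1/R."""
    return math.log(n_pieces) / math.log(magnification)


koch = similarity_dimension(4, 3)        # Koch curve: 4 pieces at 1/3 scale
sierpinski = similarity_dimension(3, 2)  # Sierpinski triangle: 3 pieces at 1/2 scale
square = similarity_dimension(4, 2)      # filled square: 4 pieces at 1/2 scale
```

The filled square gives the ordinary dimension 2, while the Koch curve and the Sierpinski triangle give non-integer values of about 1.26 and 1.58, respectively.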

The Wavelet Features' Class.
The wavelet transform is one of the most commonly used techniques for signal and image processing and representation. Typically, the wavelet transformation of an image results in two types of coefficients: the detail coefficients and the approximation coefficients. The detail coefficients represent the high-frequency components of the input image, while the approximation coefficients represent its low-frequency components. The wavelet transform is well suited to extracting feature vectors. In this numerical signature, the feature vector represents masses and calcifications through a smaller-scale image with a minimal number of values, which depends on the resolution of the input image [55,56]. In this work, the wavelet transform is used for detecting and segmenting masses and calcifications by evaluating the use of the Haar and Daubechies wavelets for characterizing breast lesions, i.e., masses and calcifications. These two wavelet transforms can characterize the difference between masses and calcifications over the gray levels. Figure 6 demonstrates the wavelet processing of an image: a low-pass filter passes the low-frequency components, resulting in the approximation coefficients of the input image, while a high-pass filter passes the high-frequency components, resulting in the detail coefficients.
An example of applying the Haar wavelet transform with two levels of decomposition is presented in Figure 7. The approximation coefficients convey the overall appearance of the image, while the detail coefficients in the horizontal, diagonal, and vertical directions give the image its distinguishing details.
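A single level of the 2D Haar decomposition can be written directly on 2-by-2 pixel blocks, as in the plain-Python sketch below. The function name is hypothetical, the orthonormal scaling (division by 2) is one common convention, and the naming of the horizontal and vertical detail sub-bands varies between libraries, so the labels used here are an assumption.

```python
def haar_dwt2(image):
    """One level of the 2D Haar transform on an image with even sides.

    Returns (approximation, horizontal, vertical, diagonal) sub-bands,
    each half the size of the input in both directions.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even")
    approx, horizontal, vertical, diagonal = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2)  # low-pass in both directions
            h_row.append((tl + tr - bl - br) / 2)  # change between rows
            v_row.append((tl - tr + bl - br) / 2)  # change between columns
            d_row.append((tl - tr - bl + br) / 2)  # diagonal change
        approx.append(a_row)
        horizontal.append(h_row)
        vertical.append(v_row)
        diagonal.append(d_row)
    return approx, horizontal, vertical, diagonal


# Two decomposition levels: the second level is applied to the approximation.
flat = [[8] * 4 for _ in range(4)]
a1, h1, v1, d1 = haar_dwt2(flat)
a2, h2, v2, d2 = haar_dwt2(a1)

# A block with a vertical edge only excites the column-difference sub-band.
edge = [[0, 4], [0, 4]]
ae, he, ve, de = haar_dwt2(edge)
```

Wavelet features are then typically computed as summary statistics (for example energy or mean) of each sub-band; the exact features used in this work are not reproduced here.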

The Important Feature Selection Step.
The performance of the image classifier depends critically on how accurately the features are extracted and selected. Selecting the optimum subset of effective features from the whole feature set provides the best classifier accuracy. Roughly, there are three categories of feature selection methods: filter methods, wrapper methods, and embedded methods. In this work, the Sequential Forward Selection (SFS) technique, which belongs to the wrapper methods, is used for selecting the important features from the extracted features [57,58]. In the SFS method, the feature set starts empty and grows sequentially: at each step, each remaining candidate feature is added in turn, the model is trained with the enlarged set, and the candidate giving the best criterion value is kept. The process stops when adding further features no longer reduces the criterion.
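The greedy loop behind SFS can be sketched as follows. This is a generic illustration in plain Python: the function name is hypothetical, and the toy criterion stands in for what would in practice be the classifier's error measured on held-out data.

```python
def sequential_forward_selection(n_features, criterion):
    """Greedy forward selection that minimizes `criterion(subset)`.

    Starting from an empty set, the feature that lowers the criterion the
    most is added at each step; the search stops as soon as no remaining
    feature reduces the criterion any further.
    """
    selected = []
    best_score = float("inf")
    remaining = set(range(n_features))
    while remaining:
        scores = {f: criterion(selected + [f]) for f in remaining}
        candidate = min(scores, key=lambda f: (scores[f], f))
        if scores[candidate] >= best_score:
            break  # no improvement: stop growing the feature set
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = scores[candidate]
    return selected, best_score


# Toy criterion (hypothetical): only features 1 and 3 reduce the "error",
# and every added feature carries a small complexity penalty.
useful = {1: 0.5, 3: 0.3}


def toy_error(subset):
    return 1.0 - sum(useful.get(f, 0.0) for f in subset) + 0.01 * len(subset)


chosen, error = sequential_forward_selection(6, toy_error)
```

Because every candidate subset requires retraining the model, wrapper methods such as SFS are more expensive than filter methods, but they select features with respect to the classifier that is actually used.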

The Classification Step.
The final stage of the CAD system is the classification, which categorizes a mammogram as either normal or abnormal, and an abnormal mammogram as malignant or benign. The classes are predicted from the features using the Support Vector Machine (SVM) algorithm. The SVM was established by the machine learning community [59]. The SVM is a classifier defined by a separating hyperplane. The hyperplane is chosen to maximize the margin, i.e., the distance from the hyperplane to the nearest data points on each side, which are termed support vectors [60]. The SVM can be extended to nonlinearly separable data by applying a kernel function to the data to make them linearly separable [61]. An approach based on a wavelet SVM was discussed in [62]. More details on the SVM and its application to the diagnosis of breast tumors were discussed in [63,64].
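As a small illustration of a kernel function, the sketch below evaluates the Gaussian radial basis function (RBF) kernel, which the experiments later in this paper report using. The function name and the parameter name `gamma` (an inverse-width parameter) are notational assumptions of the sketch.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)   # identical points
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)  # squared distance 2
```

The kernel equals 1 for identical feature vectors and decays toward 0 as they move apart; a larger `gamma` gives a narrower Gaussian and therefore a more flexible decision boundary.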

Computer Experiment Study and Investigation
This section presents the experimental results as well as the implementation of the proposed computer-aided detection and diagnosis (CADD) system. Different experiments have been carried out in the sequential stages followed by the CADD system. The next subsections provide the experimental results of the preprocessing, image segmentation, feature extraction and selection, and classification stages. Then, the implementation of the proposed CADD system in its working environment, the MATLAB software, is presented, discussed, and tested.

The Preprocessing.
The preprocessing is carried out by passing a mammogram image through the nonlinear median filter for noise removal; then, the filtered image is binarized and separated from the background. Finally, only the breast profile is taken as the output of this phase. Figure 2 shows the outputs of these consecutive operations applied to the input mammogram image. The input image is shown in Figure 2(a), and the filtered image after noise removal is presented in Figure 2(b), which shows a significant improvement in clarity over the input image. Figure 2(c) demonstrates the background separation of the image. Finally, the breast profile of the filtered image after the separation of the background is presented in Figure 2(d).
The output of this stage, the breast profile, is applied as input to the segmentation step for extracting the ROI.

The Image Segmentation.
The objective of the segmentation phase is to identify and extract the pectoral muscle, i.e., the ROI. Initially, the contrast of the breast profile produced by the preprocessing step is enhanced to obtain the best segmentation results. Then, the image segmentation is carried out by applying the SRG algorithm for extracting the pectoral muscle from the processed mammogram image. Figure 8(a) shows the input image after contrast enhancement with the identified seed point, while Figure 8(b) presents the extracted pectoral muscle.

The Feature Extraction.
In this phase, the ROI is fed as input to the five classes of feature extraction presented in Figure 5. The output of the feature extraction is 307 features extracted from each ROI. These features comprise 83 statistical features and 224 wavelet features. From these features, a small number of optimum and effective features are selected by the next step, the feature selection, to be used in the final stage, the classification.
A snapshot of sample values from the matrix of first- and second-order statistical features is shown in Figure 9. Figure 10 shows a portion of the wavelet feature matrix values extracted in the feature experiments. Note that each row in Figures 9 and 10 represents a sample image, and the columns contain the values of the extracted features.
Feature selection is examined with exhaustive experiments carried out in the training and testing phases, using around 70% and 30% of the samples, respectively. More explanation of the training and testing sample ratios and results is given in the next subsection, the classification. The exhaustive experiments for selecting the effective features follow the steps illustrated in Figure 11. As a result of these experiments, 5 features are selected from the total of 83 statistical features, and 16 features are selected from the total of 224 wavelet features. The selected features show their importance through the training phase; i.e., a total of 21 effective features are selected during the training phase. These 21 features are then used in the final version of the proposed CADD system in the testing phase.

The Classification.
A well-known classification algorithm, the SVM, is used for the classification stage of the proposed CADD system. Since the SVM is a binary classifier and the task involves three classes, two SVM classifiers are used in this work, as indicated in Figure 12.
In the operation of these two SVM classifiers, as illustrated in Figure 12, the first SVM classifier identifies whether the input data, i.e., the features, indicate cancer or no cancer, i.e., the abnormal class or the normal class. Then, based on the class produced by the first classifier, the second SVM classifier is either activated or left inactive. If the first classifier indicates a normal case, the two outputs of the second classifier are 0. Otherwise, the second SVM classifier is activated to decide on the type of cancer, either the malignant class or the benign class.
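The control flow of the two-level scheme can be sketched independently of the classifiers themselves. In the plain-Python sketch below, `is_abnormal` and `is_malignant` are hypothetical callables standing in for the two trained SVMs, and the threshold-based stubs exist only to exercise the logic; returning a class label instead of two zero outputs is also a simplification of the sketch.

```python
def two_level_diagnosis(features, is_abnormal, is_malignant):
    """Cascade of two binary classifiers.

    The second classifier is consulted only when the first one flags the
    sample as abnormal; otherwise the sample is reported as normal.
    """
    if not is_abnormal(features):
        return "normal"
    return "malignant" if is_malignant(features) else "benign"


# Stand-in classifiers (hypothetical): simple thresholds on two features.
second_level_calls = []


def stub_first(features):
    return features[0] > 0.5


def stub_second(features):
    second_level_calls.append(features)
    return features[1] > 0.5


result_normal = two_level_diagnosis([0.1, 0.9], stub_first, stub_second)
calls_after_normal = len(second_level_calls)
result_malignant = two_level_diagnosis([0.9, 0.9], stub_first, stub_second)
result_benign = two_level_diagnosis([0.9, 0.1], stub_first, stub_second)
```

One consequence of this design is that an error made by the first classifier cannot be corrected by the second one, so the overall result depends strongly on the first-level accuracy.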

Performance Evaluation of the SVM Classifier.
To implement the SVM classifier, the features selected from the mammogram images are separated into two distinct sets, the training set and the testing set. Table 4 shows the total number of samples, 324, which are separated into two main groups: around 70% of the samples are used for the training set, and the remainder, around 30%, are used for the testing set. Across these two sets, there are 208 samples of the normal class and 116 samples of the abnormal class, of which 49 samples belong to the malignant class and 67 samples belong to the benign class. The training accuracy of the SVM classifier can be tuned by regulating the error parameter. Our proposed system uses a nonlinear SVM with the RBF (Gaussian) kernel. The RBF kernel parameter controls the width of the Gaussian and also needs to be optimized. Therefore, two SVM hyperparameters must be determined to construct an optimum classifier that balances its generalization and memorization capabilities. The testing results of the two SVM classifiers are presented for discussion in Tables 5 and 6 as two binary confusion matrices. From these two confusion matrices, the performance is computed by calculating the Sensitivity, Specificity, and Accuracy values of the two SVM classifiers according to the following equations:
\[ \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \]
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. The results of the first SVM classifier, in the testing of the normal and abnormal classes, show that its Sensitivity, Specificity, and Accuracy are all 100%, which is an excellent performance. The second SVM classifier, in the testing of the malignant and benign classes, has a Sensitivity of 90%, a Specificity of 85.714%, and an Accuracy of 87.1%; this is due to the small number of images available for diagnosing the malignant and benign cancer types.
Nevertheless, the second SVM classifier still provides high performance. Looking at the results presented in Table 6, it can be seen that the malignant class performs worse than all other classes; this may be because the malignant class is trained and tested with the smallest number of samples, 37 in training and 12 in testing. So, the performance of the second classifier could be improved by using an extended dataset of mammogram images, which may become available soon. The suggested system has shown its capability for diagnosing breast cancer with the MIAS dataset. However, this dataset consists of only a few hundred images, so there is room for improving the accuracy of the suggested system. Regarding the dataset size, first, the number of available MIAS database images can be increased by applying an augmentation process, i.e., constructing many variants of a given image by applying augmentation operations such as rotations by different angles and flipping. Second, other, larger datasets can be used, including the many different datasets available on the Internet. Regarding the system model, another option is to use a deep learning model based on convolutional neural networks, including pretrained convolutional neural networks.
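The three performance measures can be checked numerically with a few lines of Python. The sketch uses the standard definitions Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP), and Accuracy = (TP+TN)/(TP+TN+FP+FN). The counts in the example are hypothetical: they are chosen only because they are consistent with the percentages reported for the second classifier, they are not taken from Table 6, and treating the malignant class as the positive class is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts consistent with 90%, 85.714%, and 87.1%.
sens, spec, acc = binary_metrics(tp=9, fn=1, tn=18, fp=3)
```

With so few test samples, a single additional misclassification changes each measure by several percentage points, which is one way to see why the second-level results are sensitive to the dataset size.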
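The augmentation operations mentioned above, flipping and rotation, can be sketched for images stored as nested lists. The function names are hypothetical, and restricting the rotations to multiples of 90 degrees is a simplification of the sketch; arbitrary angles would require interpolation.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def augment(image):
    """Return the original image plus two flips and three rotations."""
    variants = [image, flip_horizontal(image), flip_vertical(image)]
    rotated = image
    for _ in range(3):  # 90, 180, and 270 degrees
        rotated = rotate_90_clockwise(rotated)
        variants.append(rotated)
    return variants


sample = [[1, 2], [3, 4]]
augmented = augment(sample)
```

Augmented copies should be generated only from training images, so that variants of the same mammogram do not appear in both the training and the testing sets.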

Comparative Study and Limitations.
This section compares the algorithm implemented in this work with related works to assess how well it performs. The comparison is based on the accuracy metric. The compared studies use the MIAS database as well as other databases, with SVM classifiers and other classifiers.
According to the related works in Table 7, especially those that used a small number of images from the MIAS dataset with GLCM feature extraction, the proposed method gives good accuracy compared to the methods that used the KNN [39] and SVM [39] classifiers. Other methods used a large number of images from many datasets along with MIAS for training the model. The authors in [40,41,45,47] trained their methods with different classes of features and different classification algorithms, and they outperform our proposed method in terms of accuracy owing to the large numbers of images and the different datasets. Our proposed method is characterized by three properties. First, it uses two classifiers and is trained to classify mammograms as normal/abnormal and benign/malignant, whereas the related works only train and classify cancer as benign versus malignant. Second, our suggested method reaches 100% accuracy in classifying normal and abnormal mammograms. Third, it is implemented as a computer program with a GUI. However, the second-level classifier of the suggested method, benign versus malignant, reaches only 87.1% accuracy.
This is due to the small number of images used. Hence, to improve the accuracy of the second classifier, one can retrain the suggested model with more mammograms, either by using an additional dataset or by extending the existing images using augmentation methods.

The Implementation of the Proposed Computer-Aided Detection and Diagnosis (CADD) System.
This work is finalized by implementing an application of the proposed CADD system in the MATLAB software environment, version R2013a, on the Windows 8 operating system, employing MATLAB's image processing and statistical tools. All experiments are run on a Dell laptop with a 2.4 GHz Intel Core i7 CPU and 12 GB of RAM. Figure 13 illustrates the graphical user interface of the proposed CADD system. Figure 14 shows the relationship between the different files used in the development of the proposed CADD system application. A dashed circular shape indicates that there are additional functions for that component/file. These files/components and their operations are summarized as follows:
(i) Image reading.m: this file allows the user to load the mammographic image to be diagnosed by the proposed CADD system from a specific file directory.
(ii) Image preprocessing.m: this file improves the quality of the input image and removes the noise. It includes the following MATLAB files:
    (i) Binary.m, which eliminates the noise and converts the image into a binary image.
    (ii) Breast.m, which removes the labels and identifies the orientation of the image.
Figure 13: The proposed CADD system application.

Conclusions
This work has proposed a CAD system for breast cancer identification and diagnosis from mammography images. The proposed method passes through ordered, systematic steps for manipulating the mammography images: identifying the ROI, extracting and optimizing the features, and then classifying breast cancer using a machine learning algorithm, the SVM. The experimental results of the proposed method have shown its effectiveness, especially for the normal versus abnormal classification, which reaches 100% accuracy. After investigating the proposed method in the training and testing phases, a CAD software system with a GUI is implemented using the proposed method. This CAD tool can serve as an efficient tool for the early identification and diagnosis of breast cancer from mammogram images. However, the accuracy of the second-level classifier reaches only 87% in classifying the cancer class as either benign or malignant.
This is due to the small number of images in the available MIAS dataset, especially mammogram images with tumors. So, we suggest several directions for future work. One is to extend the size of the MIAS dataset using augmentation. Another is to use a different, larger dataset. Moreover, researchers may use a suitable deep learning model such as convolutional neural networks.

Data Availability
The mini-MIAS image dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.