Classification of Region of Interest in Mammograms Using Dual Contourlet Transform and Improved KNN



Introduction
Breast cancer ranks second as a cause of death among women worldwide and has become a major public health problem [1]. According to American Cancer Society statistics, an estimated 246,660 new breast cancer cases and an estimated 40,450 breast cancer deaths were expected among women in the US during 2016, and breast cancer remains the most dangerous malignant tumor for women. The data show that breast cancer incidence rates have increased slightly, whereas breast cancer death rates have declined by 36% from their peak as a result of improvements in early detection and treatment [2]. Consequently, early detection and diagnosis of breast cancer have become both a difficult problem and an active topic of current international research.
Mammography, as the most effective tool available, has been widely used in early breast cancer detection [3][4][5][6][7]. However, the growing number of mammograms, especially the large number of normal cases, increases the reading burden on radiologists and may cause subtle abnormalities to be missed. Consequently, Computer-Aided Diagnosis (CAD) is particularly significant for providing a second opinion and reducing false positive and false negative rates. Over recent decades, many researchers have demonstrated its effectiveness in breast cancer diagnosis.
CAD methods for distinguishing normal from abnormal or benign from malignant cases have been investigated using many different techniques [8,9]. These classification techniques can be divided into two categories: image analysis with segmentation of the lesion areas [10][11][12][13][14][15][16] and image analysis without segmentation [17][18][19][20][21][22][23]. Wei et al. [10] proposed a content-based mammogram retrieval system together with a similarity measure scheme; the study was tested on the Digital Database for Screening Mammography (DDSM) dataset, and the experimental results demonstrated that round-shape masses were the most discriminative when using Zernike moments and that round-shape, circumscribed-margin masses achieved the highest precision among all mass types. Mustra and Grgic [11] presented a new method for breast skin-air interface detection and pectoral muscle detection based on a selected Region of Interest (ROI); this approach was used to solve segmentation in very low contrast pectoral muscle areas. Pereira et al. [12] put forward a method for overcoming the limitation of analyzing only Craniocaudal (CC) and Mediolateral Oblique (MLO) views; an artifact removal algorithm and multiple thresholding were used for mass preprocessing and segmentation, and the approach was tested on the DDSM database. Agrawal et al. [13] proposed a method for automatic mass detection that does not remove the pectoral muscles. Firstly, they segmented the mass using saliency; secondly, different features of the segmented regions were extracted; then they detected the mass with a Support Vector Machine (SVM). The experiment was carried out on the MIAS database, and the results showed the effectiveness of the proposed method. Zhang et al. [14] focused on identifying the optimal segmentor from an ensemble mix of weak segmentors; the results showed that the selected segmentor achieved higher segmentation success rates in most cases. Anitha and Peter [15] proposed a new method to identify and segment suspicious masses using a modified transition rule. Adaptive global thresholding was used to obtain the rough region; then the initial seed point and the modified transition rule were used to segment the mass. This approach yielded promising results when evaluated on 70 mass mammograms from the mini-MIAS database. Dong et al. [16] presented a novel automatic segmentation and classification method based on the DDSM and MIAS databases, and the experimental results verified the effectiveness of this approach.
The methods mentioned above have contributed substantially to CAD of breast cancer. At the same time, methods without segmentation also play an important role. Campanini et al. [17] exploited all the information available in the image instead of extracting features from the ROI; an SVM was then used to decide whether each area was suspect, and finally a voting strategy over an ensemble of experts was applied to obtain the final suspect regions. The presented system obtained impressive results when tested on the DDSM database. Rashed et al. [18] used a fractional amount of the biggest wavelet coefficients in a multilevel decomposition and achieved remarkably high efficiency in distinguishing between benign and malignant tumors. Reyad et al. [19] studied the effect of different features used in a CAD system for classification of masses; these features included Local Binary Pattern (LBP), statistical measures, and multiresolution features. The results showed that using both statistical and LBP features increased the accuracy to 98.63%, and the contourlet-based features achieved a classification accuracy of 98.43%. Tai et al. [20] studied the local texture characteristics and the discrete photometric distribution of each ROI and used stepwise linear discriminant analysis to classify abnormal regions; the results revealed that the proposed system obtained satisfactory detection performance. Orozco et al. [21] presented a CAD system to classify lung nodules in CT images based on supervised extraction of the ROI; experimental results showed that this method helped reduce the complexity of classification by omitting the segmentation stage. Pak et al. [22] used ROI feature extraction based on the Nonsubsampled Contourlet Transform (NSCT) and Super Resolution (SR); the AdaBoost algorithm was then used to classify the lesions and determine the probability of benign and malignant. Beura et al. [23] applied the Gray Level Cooccurrence Matrix (GLCM) to all the detailed wavelet coefficients of the ROI and then classified the breast tissues as normal, benign, or malignant using a Back Propagation Neural Network (BPNN).
Based on the discussion above, it can be concluded that breast cancer analysis with segmentation has achieved a certain degree of success; moreover, the segmentation results directly affect the classification accuracy. Mammogram analysis without segmentation, however, can also obtain high accuracy, and it helps reduce the complexity of classification by skipping the segmentation stage. We proposed a new structure, the Dual Contourlet Transform (Dual-CT), in our previous work [24]. To our knowledge, there is no previous research using Dual-CT-based features in digital mammogram analysis. In this paper, we extract Dual-CT-based features of the ROI and develop a new classification method based on these features and an improved K-nearest neighbors (KNN) classifier. Firstly, we identify the ROI manually according to the gold standard. Secondly, Dual-CT is used to decompose the ROI, and a series of features are extracted from the Dual-CT coefficients. Finally, the improved KNN is employed to classify the mammograms as normal or abnormal and as malignant or benign.
The rest of the paper is organized as follows. Section 2 briefly describes the wavelet transform, contourlet transform, and Dual-CT and gives a short introduction to KNN; the database and preprocessing are described in Section 3. In Section 4, feature extraction and feature analysis are presented, and the achieved results are discussed. Conclusions and future work are presented in Section 5.

Materials and Image Preprocessing
2.1. Wavelet Transform. The wavelet, proposed by J. Morlet, is widely used in many areas [25,26]. Wavelets are short basis functions that are used to represent other functions. The transform is implemented by iterations of discrete-time filters. The basis function is called the mother wavelet, and a family of functions can be generated by translations and dilations of this basis function. It is well localized in both time and frequency simultaneously.
The wavelet basis function can be described as follows. Firstly, we define the scale parameter $a = a_0^{j}$ and the translation parameter $b = k b_0 a_0^{j}$, where $j, k \in \mathbb{Z}$, $a_0 > 1$, and $b_0 > 0$, so that
$$\psi_{j,k}(t) = a_0^{-j/2}\, \psi\left(a_0^{-j} t - k b_0\right).$$
In general, the discrete wavelet transform of $f(t)$ can be defined as
$$W_f(j, k) = \int_{-\infty}^{+\infty} f(t)\, \overline{\psi_{j,k}(t)}\, dt.$$
When two-dimensional (2D) wavelet decomposition is applied to an image, each level yields four subbands: one low frequency subband and three high frequency subbands. The low frequency subband is then decomposed further. The low frequency subband contains the coarse information of the original image, while the edges and other detail information are distributed in the high frequency subbands. Figure 1 shows the decomposition of DWT. The wavelet has the following properties:
1. Multiresolution: it can represent images by successive approximation, from coarse to fine resolutions.
2. Localization: the basis elements of the separable wavelet are localized in both the spatial and frequency domains.
3. Critical sampling: the wavelet can form a basis or a frame with small redundancy.
For more details about wavelet analysis, refer to [27].
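The 2D decomposition described above can be illustrated with a minimal sketch. It assumes the Haar wavelet (the source does not state which wavelet filter was used) and an image whose side lengths are divisible by $2^{\text{levels}}$; the function names `haar_dwt2` and `haar_wavedec2` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2D Haar DWT (assumed filter; even height and width required)."""
    img = np.asarray(img, dtype=float)
    s = np.sqrt(2.0)
    # Filter and downsample along the horizontal direction (pairs of columns).
    lo = (img[:, 0::2] + img[:, 1::2]) / s
    hi = (img[:, 0::2] - img[:, 1::2]) / s
    # Filter and downsample along the vertical direction (pairs of rows).
    ll = (lo[0::2, :] + lo[1::2, :]) / s   # low frequency (coarse) subband
    lh = (lo[0::2, :] - lo[1::2, :]) / s   # detail subband (naming conventions vary)
    hl = (hi[0::2, :] + hi[1::2, :]) / s   # detail subband
    hh = (hi[0::2, :] - hi[1::2, :]) / s   # diagonal detail subband
    return ll, (lh, hl, hh)

def haar_wavedec2(img, levels=3):
    """Repeatedly decompose the low frequency subband, as in the text."""
    details = []
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, d = haar_dwt2(ll)
        details.append(d)
    return ll, details
```

For a 128 × 128 ROI and three levels, this sketch would return a 16 × 16 coarse subband plus three triples of detail subbands of sizes 64 × 64, 32 × 32, and 16 × 16.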

Contourlet Transform.
Contourlet transform [28] was proposed as a new image representation approach that improves on the wavelet. It is a "true" 2D image representation scheme, and it can capture the intrinsic geometrical structure of the original image. The contourlet combines the Laplacian pyramid (LP) and directional filter banks (DFB) for multiresolution and multidirectional decomposition. The LP is first used to capture the discontinuous points; next, the DFB is used to link discontinuous points into linear structures. Figures 2 and 3 show the decomposition structure of CT and the frequency spectrum decomposition of DFB, respectively. The LP iteratively decomposes a 2D image into low-pass and bandpass subbands, and the bandpass subbands are fed into the DFB to capture the directional information. By iterating this scheme on the low-pass subband, the contourlet coefficients are finally obtained. The contourlet has the following advantages over the wavelet:
1. Directionality: the DFB contains basis elements oriented in many directions, more than the three directions offered by the wavelet.
2. Anisotropy: the contourlet contains basis elements with various elongated shapes and different aspect ratios, so it can capture smooth contours in images.
More details about contourlet can be found in [29].
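As a rough illustration of the LP stage only (the DFB stage is not shown), the following sketch splits an image into a low-pass subband and a bandpass residual. It assumes a crude 2 × 2 block-average low-pass filter and pixel-repetition upsampling, which are simplifications and not the filters used in the contourlet literature; `lp_analyze` and `lp_synthesize` are hypothetical names.

```python
import numpy as np

def _upsample2(coarse):
    # Upsample by repeating every pixel twice in each direction.
    return np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)

def lp_analyze(img):
    """One Laplacian pyramid level (assumed block-average filter; even sizes required)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    # Low-pass subband: mean of each 2x2 block (filtering plus downsampling).
    coarse = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # Bandpass subband: what the upsampled coarse image fails to predict.
    bandpass = img - _upsample2(coarse)
    return coarse, bandpass

def lp_synthesize(coarse, bandpass):
    """Invert lp_analyze exactly."""
    return _upsample2(coarse) + bandpass
```

In a contourlet-style scheme, `bandpass` would be passed to a directional filter bank and `lp_analyze` would be applied again to `coarse`.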

Dual Contourlet Transform.
The Dual Contourlet Transform is developed as an improvement over the contourlet. It is constructed by cascading the dual LP and the DFB; the dual LP is used to reduce the spectrum aliasing caused by downsampling in the LP. The DFB involves basis functions oriented at any power-of-two number of directions with flexible aspect ratios. Figure 4 shows the decomposition structure of Dual-CT.
Besides the properties of the contourlet, Dual-CT offers approximate shift invariance and phase information, which are very important in image processing.

K Nearest Neighbor (KNN).
K Nearest Neighbor (KNN) [30] was proposed by Cover and Hart in 1968, and it is one of the simplest machine learning algorithms. It is an extension of the simple nearest neighbor rule. KNN classifies an unknown sample by the "vote" of its K nearest neighbors rather than by the single nearest neighbor.
The main steps of KNN implementation are as follows:
(1) Assess similarity: calculate the similarity between the test sample and each sample of the training set. In general, the similarity can be measured by Euclidean distance, Manhattan distance, Jaccard similarity coefficient, correlation coefficient, and so on. Among these, Euclidean distance is the most widely used. For a given test feature sample $\mathrm{Test}(x_1, x_2, \ldots, x_n)$ and training set features $\mathrm{Train}_i(y_{i1}, y_{i2}, \ldots, y_{in})$, $i = 1, 2, \ldots, m$, the Euclidean distance is calculated as
$$d_i = \sqrt{\sum_{j=1}^{n} \left(x_j - y_{ij}\right)^2},$$
where $n$ is the number of features, $m$ is the number of training samples, and $d_i$ is the Euclidean distance between the test sample and the $i$th sample of the training set.
(2) Find neighbors: sort the distances in ascending order and find the $K$ nearest neighbors. The choice of $K$ directly affects the classification result. As shown in Figure 6, the class assigned to the test sample changes with the value of $K$. Candidate values of $K$ such as 3, 5, and 7 can be used, or $K$ can be chosen by experience.
(3) Vote and classify: according to the votes for each category, the test sample is assigned to one class.
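The three steps above can be sketched as follows. This is a minimal illustration of the basic KNN with Euclidean distance and majority voting, not the authors' implementation; the function name `knn_predict` and the tie-breaking behavior (the label encountered first among the nearest neighbors wins) are assumptions of this sketch.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, test_x, k=3):
    """Classify one test sample with the basic KNN rule."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    # Step 1: Euclidean distance between the test sample and every training sample.
    d = np.sqrt(((train_x - test_x) ** 2).sum(axis=1))
    # Step 2: indices of the k smallest distances, in ascending order.
    nearest = np.argsort(d, kind="stable")[:k]
    # Step 3: majority vote among the labels of the k nearest neighbors.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

With two well-separated clusters labeled "normal" and "abnormal", a test point close to one cluster receives that cluster's label for K = 3.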

Mammogram Database and Preprocessing
3.1. Database. A mammogram is obtained by compressing the breast between two acrylic plates while X-rays pass through it. In previous studies, MIAS [31] has been widely used in mammography analysis because it is freely available [13,16,32,33]. In this work, we choose the same dataset as other researchers. Another reason is that the cases in MIAS are labeled by expert radiologists based on experience and biopsy. The MIAS mammograms are selected from the United Kingdom National Breast Screening Program; the database contains 161 pairs of films. Every image is 1024 × 1024 pixels, and the images cover normal and abnormal cases. The center coordinates and approximate radius (in pixels) of each abnormality are given by experts. Each mediolateral oblique view is available for research purposes. The summary of the MIAS digital mammograms is listed in Table 1. Some mammograms, such as "mdb 239" and "mdb 249," contain two lesion areas, so there are 324 samples in total.
information can be seen as noise in the process of classification. Instead of segmenting the lesion areas, we manually apply a cropping operation to the original image, and 324 ROIs are then extracted with a size of 128 × 128 pixels. The center of ROI is selected according to the given center of the

Region of Interest (ROI) Extraction
Accuracy takes all cases into account and is the most commonly used indicator; it reflects how precise the predicted results are.
The ROC curve is used to evaluate the predictive accuracy of the proposed model. It indicates the relation between sensitivity and specificity. The area under the ROC curve (AUC) is one of the best measures for comparing classifiers on two-class problems. If the ROC curve rises quickly towards the upper left corner of the graph, the test method performs better. An AUC close to 1.0 indicates that the diagnostic test is reliable; conversely, an area close to 0.5 indicates an unreliable test result.
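As an illustration of these indicators, the sketch below computes accuracy, sensitivity, and specificity from the confusion matrix counts using their standard definitions, and estimates the AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties count one half). The function names are hypothetical, and the sketch assumes that both classes are present in the labels.

```python
import numpy as np

def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for a two-class problem."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == positive) & (y_pred == positive)))
    tn = int(np.sum((y_true != positive) & (y_pred != positive)))
    fp = int(np.sum((y_true != positive) & (y_pred == positive)))
    fn = int(np.sum((y_true == positive) & (y_pred != positive)))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

def auc_score(y_true, scores, positive=1):
    """AUC via pairwise comparison of positive and negative scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == positive]
    neg = scores[y_true != positive]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())
```

A classifier whose scores separate the two classes perfectly gives an AUC of 1.0 under this estimate, while identical scores for every sample give 0.5.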

The Proposed System
In this work, a new classification method for mammograms is proposed. The procedure of the proposed system is summarized in Figure 8.
The ROI is first decomposed by the proposed Dual-CT to obtain the Dual-CT coefficients; secondly, the directional subband coefficients are used for feature extraction. After investigation and analysis, nine features, including mean, smoothness, and others, are found to be effective. These nine features are described as follows.
For the given ROI, $z_i$ is the $i$th gray value of the ROI, $p(z_i)$ is the gray level histogram, and $L$ is the number of gray levels.
(1) Mean: it reflects the average gray level of an image:
$$m = \sum_{i=0}^{L-1} z_i\, p(z_i).$$
(2) Standard deviation: it reflects the degree of deviation between the whole image and the mean:
$$\sigma = \sqrt{\sum_{i=0}^{L-1} (z_i - m)^2\, p(z_i)}.$$
(3) Smoothness: its practical significance is similar to that of the variance $\sigma^2$:
$$R = 1 - \frac{1}{1 + \sigma^2}.$$
(4) Skewness: it reflects the deviation trend between the whole gray level and the mean; the gray deviation caused by a minority of extreme values is captured by this index:
$$\mu_3 = \sum_{i=0}^{L-1} (z_i - m)^3\, p(z_i).$$
(5) Uniformity:
$$U = \sum_{i=0}^{L-1} p^2(z_i).$$
(6) Entropy: it is often used to measure the randomness of the gray value distribution; the greater the randomness, the larger the entropy value:
$$e = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i).$$
(7) Contrast: it is used to measure the image definition; the deeper the texture, the larger the contrast:
$$C = \frac{\sigma}{(\alpha_4)^{n}}, \qquad \alpha_4 = \frac{\mu_4}{\sigma^4},$$
where $\mu_4 = \sum_{i=0}^{L-1} (z_i - m)^4\, p(z_i)$ is the fourth central moment and $n$ is a positive number. Experimentally, $n$ is set as 1/4.
For a given image $f(x, y)$, $P(i, j)$ is the normalized Gray Level Cooccurrence Matrix (GLCM) of $f(x, y)$, where $M$ and $N$ are the size of $f(x, y)$.
image. These selected features include standard deviation, uniformity, entropy, and correlation. Figure 9 shows the feature value of 10 normal ROIs and 10 abnormal ROIs.
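The histogram-based features (1) to (7) can be sketched as below. The formulas used are the standard histogram texture descriptors and a Tamura-style contrast with exponent 1/4; they are assumptions of this sketch, since the source equations are not legible here. The function name `histogram_features` is hypothetical, and the input is assumed to hold non-negative integer gray values smaller than `levels`.

```python
import numpy as np

def histogram_features(region, levels=256):
    """Statistical texture features from the gray level histogram of a region."""
    values = np.asarray(region).ravel().astype(int)
    z = np.arange(levels, dtype=float)                  # gray levels z_i
    p = np.bincount(values, minlength=levels) / values.size  # histogram p(z_i)
    mean = (z * p).sum()
    var = ((z - mean) ** 2 * p).sum()
    sd = np.sqrt(var)
    mu3 = ((z - mean) ** 3 * p).sum()                   # third central moment
    mu4 = ((z - mean) ** 4 * p).sum()                   # fourth central moment
    nz = p[p > 0]                                       # skip empty bins in the log
    return {
        "mean": mean,
        "sd": sd,
        "smoothness": 1.0 - 1.0 / (1.0 + var),
        "skewness": mu3,
        "uniformity": (p ** 2).sum(),
        "entropy": float(-(nz * np.log2(nz)).sum()),
        # Assumed contrast: sd / kurtosis**(1/4); defined as 0 for a flat region.
        "contrast": sd / (mu4 / var ** 2) ** 0.25 if var > 0 else 0.0,
    }
```

For a region with two equally frequent gray levels, this sketch gives a uniformity of 0.5 and an entropy of 1 bit, while a perfectly flat region gives a uniformity of 1 and an entropy of 0.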
It can be seen that the standard deviation of normal images is stable and low, while that of abnormal images is high and fluctuates sharply. This means that the gray scale of the normal image changes smoothly, whereas the emergence of a lesion area changes the gray level distribution markedly. For uniformity and correlation, the normal image achieves higher values than the abnormal one, which indicates that local similarity is higher in normal samples than in abnormal samples. The entropy of the normal image is lower than that of the abnormal image because the gray level distribution of the abnormal image is more random, while that of the normal image is regular.
Figure 9 indicates that the feature of smoothness is significant to distinguish these two types.
The same procedure is applied to the malignant and benign ROIs. We first select 10 malignant and 10 benign samples; we then compute the same four features of the selected ROIs. Figure 10 shows the feature values of 10 benign ROIs and 10 malignant ROIs. From Figure 10, we can see that the features of benign and malignant ROIs differ clearly. The uniformity and correlation values of benign ROIs are larger than those of malignant ROIs, while the standard deviation and entropy values of benign ROIs are smaller than those of malignant ROIs. These differences are related to the gray level distribution of the lesion areas: the gray level of a benign lesion changes regularly, whereas the gray level of a malignant lesion changes irregularly.
As can be seen, these selected features are useful for classifying normal and abnormal ROIs and benign and malignant ROIs. In the following, we use these features for classification and analyze the experimental results.

Classification Results.
In order to verify the effectiveness of the proposed method, we compare it with state-of-the-art methods. As for the number of decomposition levels, previous research suggests that three levels of decomposition for feature extraction often give better classification results, so we choose three levels in this paper. The main steps are as follows: firstly, the 324 ROIs are extracted manually; secondly, Dual-CT, contourlet transform, and wavelet transform are used to decompose the extracted ROIs, with the decomposition level set as [4,4,4] based on experiment, and we obtain the multiresolution coefficients: Dual-CT has $2^4$ directions at each scale for each tree, and contourlet has $2^4$ directions at each scale. Features are then extracted from these coefficients to form the feature database. Finally, the features are fed into the improved KNN for classification.

The Improved KNN.
The basic KNN can be described in three steps: computing distances, finding the $K$ nearest neighbors, and classification. In this work, we use an improved KNN to increase the classification accuracy. The implementation of the improved KNN is as follows.
1. Compute distance: for each test sample, we calculate the Euclidean distance between the features of the test sample and all the other entries in the feature database.
2. Find the $K$ nearest samples: sort the distances in ascending order and take the first $K$ samples. In our experiment, we set $K$ to 3 and 5 by experience.
3. Classify: in the basic KNN, the test sample is assigned directly to the class with the most votes. To increase the classification accuracy, we improve this step.
For a test sample $X$, $C_j$ represents the category, and $X_j$ and $n_j$ are the samples and the number of samples among the $K$ neighbors that belong to $C_j$, respectively; we then define $T(C_j, X)$ as the credibility of $X$ with respect to category $C_j$.

The smaller $T(C_j, X)$ is, the greater the possibility that $X$ belongs to $C_j$. If $T(C_j, X)$ is equal to 0, there is no doubt that $X$ belongs to $C_j$.
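The idea of replacing the plain vote with a per-class credibility score can be sketched as follows. The definition of the credibility used in this work is not legible in this section, so the sketch substitutes a purely hypothetical stand-in: the mean Euclidean distance from the test sample to the neighbors of each class among the $K$ nearest. This stand-in shares only the two stated properties (smaller means more likely, and 0 means the test sample coincides with its neighbors of that class); it should not be read as the authors' formula. The function name `improved_knn_predict` is also hypothetical.

```python
import numpy as np

def improved_knn_predict(train_x, train_y, test_x, k=3):
    """KNN with a per-class score instead of a plain vote (hypothetical score)."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    test_x = np.asarray(test_x, dtype=float)
    # Steps 1 and 2: distances and the k nearest neighbors.
    d = np.sqrt(((train_x - test_x) ** 2).sum(axis=1))
    nearest = np.argsort(d, kind="stable")[:k]
    # Step 3: score every class that appears among the neighbors.
    scores = {}
    for label in np.unique(train_y[nearest]):
        members = nearest[train_y[nearest] == label]   # neighbors in this class
        # ASSUMPTION: stand-in credibility = mean distance to those neighbors.
        scores[str(label)] = float(d[members].mean())
    best = min(scores, key=scores.get)                 # smallest score wins
    return best, scores
```

Under this stand-in, a test sample lying very close to a single abnormal neighbor can be labeled abnormal even when two more distant normal neighbors would win a plain majority vote.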
The following experiments demonstrate the effectiveness of the improved KNN in classification.

Classification between Normal and Abnormal.
In this section, a total of 324 ROIs extracted from the MIAS database are used, including 206 normal areas and 118 abnormal areas. Table 3 shows the classification accuracy of different methods.
For Table 3, we analyze the classification performance from the following two aspects.
1. In terms of the KNN classifier, Dual-CT-based features generally perform better than contourlet-based and wavelet-based features. Especially for the abnormal cases, the accuracy of Dual-CT is on average 15% higher than that of the contourlet and wavelet. For the correlation index, Dual-CT achieves an accuracy of up to 80.51% in abnormal cases, whereas the accuracies of contourlet and wavelet are 63.56% and 64.41%, respectively.
2. In terms of the improved KNN, the classification performance is improved overall. Especially for the abnormal cases, the accuracy increases markedly.
All in all, the classification accuracy for normal cases is higher than that for abnormal cases. Dual-CT domain features perform the best of the three multiresolution domains; contourlet-based features perform slightly better than wavelet-based features. The improved KNN helps improve the classification performance.
Table 4 shows the classification accuracy of the nine extracted features based on the KNN and improved KNN classifiers. The classification accuracy reaches 94.14% for the entropy feature using the improved KNN classifier, and the average classification accuracy is about 93%. The best classification performance of KNN is achieved by the correlation index with an accuracy of 83.02%, and the improved KNN raises this accuracy to 93.83%.
KNN classifier achieves better performance. We can see that a higher AUC of 0.95 has been obtained on average. This again demonstrates the superiority and robustness of our method over the others.

Classification between Benign and Malignant.
There are 118 abnormal cases in total, including 52 malignant cases and 66 benign cases. In this section, we classify the two types with the proposed method. The classification results are listed in Table 5.
From Table 5, we can see that the best classification accuracy is achieved by the Dual-CT features using the improved KNN classifier. The basic KNN classifier is rather weak at distinguishing benign from malignant cases and cannot provide reliable classification accuracy; with the improved KNN classifier, the classification accuracy increases significantly, by almost 15 percentage points. The best performance is achieved by the standard deviation and contrast in the Dual-CT domain, with accuracies of 100% and 96.15% for the benign and malignant cases, respectively.
To further confirm the effectiveness of our method, we calculate the classification accuracy based on the above simulation results and list it in Table 6. The higher the classification accuracy, the better the performance of the method. Table 6 shows that the optimal value is obtained by the improved KNN classifier with Dual-CT domain features (improvements of about 6% to 20%). In terms of features, wavelet-based features perform the worst, contourlet-based features perform slightly better, and Dual-CT-based features perform the best. In terms of classifiers, the improved KNN clearly increases the classification accuracy compared with KNN.
Figure 12 shows the AUC comparison between the KNN classifier and the improved KNN classifier based on multiresolution features. In most instances, Dual-CT-based features achieve better performance than CT-based and wavelet-based features. The AUC is 0.95, 0.91, and 0.88 for the improved KNN with Dual-CT domain features, contourlet domain features, and wavelet domain features, respectively. This indicates that the proposed method can detect benign and malignant lesions with high probability, and it will help reduce the number of biopsies for benign lesions.

Compared with State-of-the-Art Methods.
In the previous section, we demonstrated that the proposed Dual-CT-based features with the improved KNN classifier provide better performance than with the traditional KNN classifier. Here, we compare the proposed method with state-of-the-art methods reported in the literature in terms of accuracy and AUC. Table 7 shows the comparison, listing the database and classification technique of each method. It can be seen that the proposed method obtains better diagnostic performance. It is also comparable with [7]: although [7] reaches a higher accuracy, we choose the point where the classification accuracy rate is higher and the number of features is fewer.

Discussions.
The promising results obtained suggest the following:
1. The Dual-CT-based features perform better than contourlet-based features and wavelet-based features. This is consistent with expectations, since Dual-CT possesses both approximate shift invariance and higher directional selectivity than the contourlet and wavelet. Dual-CT is able to capture the anisotropic structures and multidimensional features of mammograms. The wavelet lacks shift invariance and has poor directional selectivity; the contourlet performs a little better than the wavelet because of its better directional selectivity.
2. The improved KNN classifier performs better than the traditional KNN classifier in terms of classification performance. This should be attributed to the improved discrimination process. In the improved KNN, we take the number of samples in each category into consideration. In the MIAS database, there are 206 normal cases and 118 abnormal cases; the traditional KNN directly assigns the test sample to one of the two classes according to the $K$ nearest neighbor samples. Because the number of normal cases is about double that of the abnormal cases, normal cases are more likely than abnormal cases to be selected among the $K$ nearest neighbors, which leads to misclassification. The introduction of credibility solves this problem and improves the classification performance.
3. The normal and benign cases achieve better performance than the abnormal and malignant cases. This may be because the normal cases have relatively homogeneous texture; in contrast, the abnormal cases include many conditions such as microcalcification, circumscribed mass, spiculated mass, architectural distortion, and others. The same reasoning applies to the benign and malignant cases.

Conclusion
In this work, a new method of digital mammogram analysis and classification is proposed. Firstly, the ROI is cropped manually from the MIAS database according to the gold standard. Secondly, Dual-CT, contourlet, and wavelet transforms are used to decompose each cropped ROI separately. The directional subbands from each decomposition level are used to extract features. Then the improved KNN and traditional KNN are employed to distinguish normal from abnormal and malignant from benign. We analyze the classification accuracy and AUC of each method quantitatively. The experimental results suggest that the Dual-CT-based features perform better than contourlet and wavelet transform features, and that the improved KNN performs better than the traditional KNN. For instance, the accuracy for abnormal cases based on the entropy feature reaches 80.51%, while the accuracies achieved by the contourlet and wavelet transforms are 63.56% and 64.41%, respectively; for classification of benign and malignant, the accuracy of the Dual-CT-based features using the improved KNN is 95.76%, which is 20 percent higher than that of the traditional KNN. Moreover, the proposed method is comparable with state-of-the-art methods reported in the recent literature in terms of accuracy and AUC. Dual-CT-based features are used here for the first time to analyze mammograms, and the improved KNN is used to help improve the diagnosis of breast cancer. These positive results demonstrate the potential of Dual-CT-based features and the improved KNN in the analysis and classification of biomedical data. In the future, we will try to extend the proposed method, with appropriate changes, to other medical images.

Figure 1: The decomposition structure of DWT.

Figure 2: The decomposition structure of CT: the LP, followed by the DFB.

Figure 4: The decomposition structure of Dual-CT.

Figure 5: The decomposition structure of dual tree LP.

Figure 8: Framework of the proposed system.

Figure 10: Comparison of the feature value in the ROIs between benign and malignant.

Figure 11: The comparison of AUC tested between normal and abnormal.

Figure 12: The comparison of AUC tested between benign and malignant.

Table 1: Summary of MIAS digital mammogram.

Table 2: The concept of confusion matrix.
Feature Extraction. Image texture is an important feature for representing an image; different types of image possess different textures. Previous studies [18,22,36] have shown that combining texture features with multiresolution transform domain features can help improve the classification accuracy. In this work, features are extracted from the multiresolution domain based on the ROI. Firstly, the extracted

Table 3: Classification performance (%) using KNN and improved KNN tested between normal and abnormal. SD represents standard deviation; bold font number indicates the best performance in each class.

Table 4: Classification accuracy (%) of different methods tested between normal and abnormal. SD represents standard deviation; bold font number indicates the best performance in each class.

Table 5: Classification performance (%) using KNN and improved KNN tested between benign and malignant. SD represents standard deviation; bold font number indicates the best performance in each class.

Table 6: Classification accuracy (%) of different methods tested between benign and malignant. SD represents standard deviation; bold font number indicates the best performance in each class.

Table 7: Classification performance of different methods. N versus A represents normal versus abnormal; B versus M represents benign versus malignant.