We propose a regularization-based approach for case-adaptive classification in computer-aided diagnosis (CAD) of breast cancer. The goal is to improve the classification accuracy on a query case by making use of a set of similar cases retrieved from an existing library of known cases. In the proposed approach, a prior is first derived from a traditional CAD classifier (which is typically pre-trained offline on a set of training cases). It is then used together with the retrieved similar cases to obtain an adaptive classifier for the query case. We consider two different forms for the regularization prior: one is fixed for all query cases and the other is allowed to vary with different query cases. In the experiments, the proposed approach is demonstrated on a dataset of 1,006 clinical cases. The results show that it could achieve a significant improvement in numerical efficiency (by about an order of magnitude) compared with a previously proposed case-adaptive approach while maintaining similar (or better) improvement in classification accuracy; it could also adapt faster in performance with a small number of retrieved cases. Measured by the area under the ROC curve (AUC), the regularization-based approach achieved AUC = 0.8215, compared with AUC = 0.7329 for the baseline classifier (

Clustered microcalcifications (MCs) can be an important early sign of breast cancer in women. MCs are calcium deposits of very small dimension and appear as granular bright spots in a mammogram (e.g., Figure

A mammogram image in CC view (a) and clustered microcalcifications in magnified view (b).

In CADx, a pattern classifier is typically first pretrained on a set of existing cases and subsequently applied to predict the likelihood that a given lesion is malignant or benign. For this purpose, many different machine-learning methods have been investigated, for example, [

Recently, we have been exploring a case-adaptive approach to boost the performance of a CADx classifier by using retrieved similar cases [

Based on this prior success, in this work, we further develop this case-adaptive classification approach by using a regularized adaptive classifier. One drawback of the adaptive classifier in [

Regularization techniques are often used in machine learning to deal with ill-posed problems or to prevent over-fitting by an underlying model. They usually assume the form of a penalty to the complexity of the model, such as

The rest of the paper is organized as follows: The development of the adaptive classification schemes with regularization is given in Section

The problem we consider can be stated as follows: for a given query lesion

Consider a linear classifier of the form:

In practice, the unknown vector

Now, consider a query lesion

Observe that the objective function in (

To address this problem, we propose a regularized approach for designing the adaptive classifier, as illustrated in Figure

Diagram of retrieval-driven case-adaptive classification with regularization.

Let vector

Thus, we can rewrite the modified objective

The second term in (

Note that the Hessian matrix

To further simplify the computational complexity of the adaptive classifier, in this study we assume that the components of

The constant

In this study, the weighting coefficient for a retrieved case

In our experiments, the Newton-Raphson algorithm was used for optimization of the adaptive classifier in (

In (

To determine the mean vector

To determine the adaptive vectors

In this study, we use a dataset as in our previous work [

To demonstrate the proposed approach for case-adaptive classification, we used the following setting in our experiments. The dataset of all 1,006 cases was first randomly divided into three subsets, denoted by

To avoid any potential bias, the training set

To evaluate the classification performance, we conducted a receiver operating characteristic (ROC) analysis, which is now routinely used for performance evaluation in classification tasks. An ROC curve is a plot of the classification sensitivity (i.e., true positive fraction) as the ordinate versus one minus the specificity (i.e., false positive fraction) as the abscissa; for a given classifier, it is obtained by continuously varying the threshold associated with its decision function. The area under an ROC curve (denoted by AUC) is used as a summary measure of overall diagnostic performance. A larger AUC means better classification performance. In our experiments the ROCKIT program [
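To make the AUC measure concrete, the following is a minimal sketch of the nonparametric (Mann-Whitney) AUC estimate: the fraction of malignant/benign pairs in which the malignant case receives the higher classifier output, with ties counted as one half. This is an illustration only and is not the procedure implemented in ROCKIT; the function name `empirical_auc` and the 0/1 label coding (1 = malignant) are assumptions made for this example.

```python
import numpy as np

def empirical_auc(scores, labels):
    """Nonparametric AUC: P(score_pos > score_neg) + 0.5 * P(tie).

    scores: classifier outputs, larger means more likely malignant.
    labels: 1 (malignant) or 0 (benign); this coding is an assumption.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]      # outputs for malignant cases
    neg = scores[~labels]     # outputs for benign cases
    # Compare every malignant case with every benign case.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))
```

Each threshold on the decision function yields one (false positive fraction, true positive fraction) point; the estimate above equals the area under the piecewise-linear curve through those points.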

To remove the effect of case distributions, we applied a bootstrapping methodology for testing the performance of the classifiers. A total of 2,000 bootstrap sample sets were used [
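A bootstrap evaluation of this kind can be sketched as follows. The sketch resamples the test cases with replacement and recomputes a nonparametric AUC on each resample; the exact resampling scheme used in the study is not reproduced here, so the choice to resample (score, label) pairs, the skipping of single-class resamples, and the names `pairwise_auc` and `bootstrap_auc` are all assumptions made for illustration.

```python
import numpy as np

def pairwise_auc(scores, labels):
    # Nonparametric AUC over all (positive, negative) pairs; ties count 0.5.
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

def bootstrap_auc(scores, labels, n_boot=2000, seed=0):
    """Mean and standard deviation of AUC over bootstrap resamples."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, size=n)   # sample cases with replacement
        lab = labels[idx]
        if lab.all() or not lab.any():
            continue                       # AUC undefined with one class only
        aucs.append(pairwise_auc(scores[idx], lab))
    aucs = np.array(aucs)
    return aucs.mean(), aucs.std(ddof=1)
```

When two classifiers are compared, applying the same resampled indices to both keeps the comparison paired.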

In our evaluation, the proposed adaptive approach was compared against the following different classifiers: (1) the baseline classifier (

In Figure

Classification performance (AUC) achieved by the case-adaptive classifiers with/without regularization (Ada-Reg-Non, Ada-Reg-Uni and Ada-Reg-Var). The number of retrieved cases

From Figure

Furthermore, from Figure

The respective effects of retrieved cases and regularization can be illuminated by examining the results achieved by Ada-Reg-Non, that is, when no regularization was used in the adaptive classifier. With

Furthermore, the regularized adaptive classifiers Ada-Reg-Uni and Ada-Reg-Var could also outperform the baseline classifier LR-all (

The rationale behind the proposed regularization-based approach for adaptive classification is to use a prior to regularize the adaptive classifier in order to prevent it from over-fitting to the retrieved cases. As can be seen from (

Effect of regularization prior in adaptive classifier Ada-Reg-Uni with parameter

In Figure

Execution time for classifying all test cases in

In this work, we investigated a regularization-based approach for case-adaptive classification of microcalcification lesions in mammograms. In a previously developed adaptive approach, a set of retrieved cases was used in conjunction with the training cases of a baseline classifier to retrain an adaptive classifier. Here, we instead derived a prior from the baseline classifier and used it as a regularization term in the adaptive classifier. This prior was used together with the retrieved cases from a reference library to optimize the classification on a query case. Our goal was to reduce the numerical complexity associated with online training of the adaptive classifier. We explored two different forms for the regularization prior: one is invariant across query cases and the other is allowed to vary with the features of the query case. We demonstrated the proposed regularization approach on a dataset of 1,006 cases. The results show that it could achieve a significant improvement in numerical efficiency (around 10 times in execution speed) while maintaining similar (or better) improvement in classification accuracy compared to the previous non-regularized approach. The regularization approach was also observed to adapt faster in performance with a small number of retrieved cases.

We used the Newton-Raphson method to solve the optimization problems associated with the regularized adaptive classifiers. In particular, consider the objective function in (

Substituting (

The gradient and Hessian matrix of

The Newton-Raphson update is computed iteratively as
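As a rough illustration of a Newton-Raphson iteration for a regularized adaptive classifier, the sketch below fits a weighted logistic regression model with a quadratic penalty that pulls the weight vector toward a prior mean. The specific objective is an assumption made for this example (labels coded 0/1, an isotropic penalty with a single strength `lam`, and per-case weights `beta`), and the names `fit_regularized_logistic`, `w0`, `lam` and `beta` are hypothetical; the formulation used in this paper may differ in its details.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_regularized_logistic(X, y, beta, w0, lam, n_iter=25, tol=1e-8):
    """Minimize (assumed objective)
         J(w) = -sum_i beta_i [y_i log s_i + (1 - y_i) log(1 - s_i)]
                + (lam / 2) * ||w - w0||^2,   with s_i = sigmoid(x_i . w),
    by Newton-Raphson, starting from the prior mean w0.

    X: (n, d) features of retrieved cases; y: (n,) labels in {0, 1};
    beta: (n,) nonnegative case weights; w0: (d,) prior mean; lam > 0.
    """
    w = w0.copy()
    d = len(w)
    for _ in range(n_iter):
        s = sigmoid(X @ w)
        # Gradient: weighted data term plus pull toward the prior mean.
        grad = X.T @ (beta * (s - y)) + lam * (w - w0)
        # Hessian: X^T diag(beta * s * (1 - s)) X + lam * I (positive definite).
        hess = X.T @ (X * (beta * s * (1.0 - s))[:, None]) + lam * np.eye(d)
        step = np.linalg.solve(hess, grad)
        w = w - step                      # Newton-Raphson update
        if np.linalg.norm(step) < tol:
            break
    return w
```

Because the penalty adds `lam` to every diagonal entry of the Hessian, the linear system stays well conditioned even when only a handful of retrieved cases are available. If the Hessian is approximated as diagonal, the solve reduces to an elementwise division.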

R. M. Nishikawa is a shareholder in Hologic, Inc. (Bedford, MA) and has also received research support and royalties. This work was supported in part by NIH/NIBIB (Grant No. R01EB009905).