We propose a novel classification framework to distinguish individuals with Alzheimer’s disease (AD) or mild cognitive impairment (MCI) from normal controls (NC). The proposed method combines three different types of features from structural MR images: gray-matter volume, gray-level cooccurrence matrix features, and Gabor features. Together, these features capture both the 2D and 3D information of the brain, and the experimental results show that multifeature fusion achieves better performance. We also analyze the correlation among the combined features and improve the SVM-RFE algorithm with a covariance method. Comparison experiments on the public Alzheimer’s Disease Neuroimaging Initiative (ADNI) database demonstrate the effectiveness of the proposed method and indicate that the multifeature combination outperforms single-feature methods. The proposed feature selection algorithm effectively extracts an optimal feature subset, which improves the classification performance.
In recent years, Alzheimer’s disease has become a common neurodegenerative brain disease in elderly people. According to a report published by Alzheimer’s Disease International, there are around 44 million dementia patients worldwide, and the number will reach 76 million by 2030 and 135 million by 2050. Among these patients, Alzheimer’s disease (AD) patients account for 50% to 75% [
At present, machine learning and pattern classification methods have been widely utilized in developing a computer-aided brain disease diagnosis system with neuroimages such as Magnetic Resonance Imaging (MRI) [
Several types of features can be extracted from the structural MRI of whole brain, such as intensities or gray-matter densities [
Texture analysis can capture subtle changes in body tissue; therefore, it has been widely used to extract texture features for AD diagnosis. Oliveira et al. [
Moreover, the number of feature dimensions in neuroimaging is commonly higher than the number of samples. To reduce the resulting risk of overfitting, feature selection is necessary. Common feature selection algorithms can be divided into three categories: filters, wrappers, and embedded approaches. To be more specific, filter methods select subsets of features as a preprocessing step, independently of the classifier, for example with principal components analysis (PCA) [
In this paper, we propose a novel classification framework to distinguish individuals with Alzheimer’s disease (AD) or mild cognitive impairment (MCI) from normal controls (NC). Firstly, we combine voxel-based morphometry (VBM) and texture analysis to extract more discriminative features. To be more specific, VBM analysis captures the 3D information of the brain and texture analysis captures the 2D information, so fusing the two kinds of features can achieve better performance. Secondly, we adapt SVM-RFE with a covariance method to select a robust feature subset, which addresses the overfitting problem introduced by feature fusion. The proposed method is evaluated on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database and shows better performance than the comparison methods. The rest of the paper is organized as follows: in Section
In this section, the proposed framework will be described in detail. An overview of the proposed classification method is illustrated in Figure
Schematic diagram illustrating the proposed AD and MCI classification framework.
We use the ADNI pipeline as introduced in Jack et al. [
In this paper, the schematic diagram of the proposed AD and MCI classification framework is shown in Figure
Generally speaking, traditional spatial normalization changes the density values of an image. These changes lead to local errors in the subsequent brain tissue segmentation, such as confusing brain tissue with nonbrain tissue, which reduces the accuracy of the analysis. In order to solve this problem, Good et al. [
After spatial normalization, the images are segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). Because brain tissue segmentation is based on voxel brightness, the segmented tissue classes are affected by smooth brightness variations across the image (nonuniform brightness), so current brain tissue segmentation methods include a correction for nonuniform image brightness.
Spatial smoothing is a filtering process applied to the tissue images segmented in the previous step. A Gaussian kernel in normalized space is usually convolved with the image data, and the full width at half maximum (FWHM) of the Gaussian kernel typically ranges from 4 mm to 10 mm. In general, smoothing eliminates subtle registration errors and improves the signal-to-noise ratio. However, spatial smoothing has some other effects on VBM. For example, it makes the voxel-based analysis result comparable to a result based on a region of interest (ROI). Consequently, after spatial smoothing every voxel contains the mean concentration of GM in its neighborhood, which is the so-called “gray-matter density.” Notice that this is very different from the biological cell packing density. According to the central limit theorem, smoothing also makes the data more normally distributed, which improves the effectiveness of the subsequent statistical analysis.
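As a minimal sketch of the smoothing step described above, the snippet below converts a full width at half maximum (FWHM) into the standard deviation of a Gaussian and builds a normalized 1D kernel; a 3D smooth can apply such a kernel along each axis in turn. The function names, the 1.5 mm voxel size, and the truncation at three standard deviations are illustrative assumptions, not settings reported in the paper.

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel_1d(fwhm_mm, voxel_mm, truncate=3.0):
    """Sampled 1D Gaussian kernel, normalized to sum to 1.

    voxel_mm and truncate are assumptions made for this sketch.
    """
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    radius = int(math.ceil(truncate * sigma_vox))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(signal, kernel):
    """Convolve a 1D signal with the kernel, replicating edge values."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

sigma_mm = fwhm_to_sigma(8.0)                   # roughly 3.40 mm for an 8 mm FWHM
kernel = gaussian_kernel_1d(8.0, voxel_mm=1.5)  # hypothetical 1.5 mm voxels
```

Because the kernel sums to 1, smoothing leaves a constant region unchanged and only redistributes intensity between neighboring voxels.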
In general, VBM is used for the quantitative analysis of gray matter. Therefore, we use the GM images for statistical analysis. At present, a commonly used VBM statistical analysis method is based on the general linear model (GLM) [
After feature extraction, the morphometric features and texture features are combined linearly. Unfortunately, some of these features are ineffective, irrelevant, or redundant for classification. Therefore, feature selection is essential for selecting an optimal feature subset that improves the classification performance.
SVM-RFE is a supervised, iterative backward elimination method. In each iteration, an SVM is first trained to obtain the optimal hyperplane. Then, a ranking score is computed for each feature from its weight in the hyperplane parameters. Finally, the feature with the minimum score is removed from the feature set, which ends the iteration. If two or more features remain, the loop continues until only one feature is left. Because the least important feature is removed first, listing the features in reverse order of elimination yields a ranking in descending order of importance.
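The elimination loop can be sketched as follows. This is a toy illustration under stated assumptions rather than the authors' implementation: the linear SVM is trained with plain batch subgradient descent on the regularized hinge loss, the ranking score is taken as the squared weight, and the hyperparameters (`lam`, `lr`, `epochs`) and the four-sample dataset are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Toy linear SVM: batch subgradient descent on the L2-regularized
    hinge loss. Labels must be -1 or +1. Hyperparameters are assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe_ranking(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        scores = [wj * wj for wj in w]  # ranking score: squared weight
        worst = min(range(len(scores)), key=scores.__getitem__)
        eliminated.append(remaining.pop(worst))
    # The last survivor ranks first; earlier eliminations rank lower.
    return remaining + eliminated[::-1]

# Hypothetical data: feature 0 is strongly informative, feature 1 is
# unrelated to the label, and feature 2 is weakly informative.
X = [[2.0, 1.0, 0.5], [2.0, -1.0, 0.5], [-2.0, 1.0, -0.5], [-2.0, -1.0, -0.5]]
y = [1, 1, -1, -1]
ranking = svm_rfe_ranking(X, y)  # [0, 2, 1]
```

Retraining after every removal is what distinguishes this procedure from ranking the weights of a single fit: the score of a feature can change once a correlated feature has been dropped.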
However, according to Guyon’s experiments in the literature, the results are similar whether a linear or a nonlinear kernel is used. The higher the ranking score, the more the feature contributes to classification training. Based on this idea, SVM-RFE finally obtains a feature sequence in descending order of importance. According to this descending list, we can define a set with
It can be seen that the covariance matrix is symmetric and reflects the correlation between each pair of variables: a positive value indicates that the two variables are positively correlated, and a negative value indicates a negative correlation. It also reflects the importance and redundancy of the variables. A small diagonal element (the variance of a variable) indicates that the variable is likely to be a secondary variable, and the magnitude of an off-diagonal element corresponds to the degree of redundancy between two variables.
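To make these properties concrete, the sketch below computes a sample covariance matrix for a small feature table and looks up the candidate feature with the largest absolute covariance to a given feature. The helper `most_related`, the use of absolute covariance as the measure of relatedness, and the data are assumptions made for illustration; the paper's exact selection rule is not reproduced here.

```python
def covariance_matrix(X):
    """Sample covariance matrix of the columns of X (rows are samples)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for a in range(d):
        for b in range(a, d):
            s = sum((row[a] - means[a]) * (row[b] - means[b]) for row in X)
            cov[a][b] = cov[b][a] = s / (n - 1)  # symmetric by construction
    return cov

def most_related(cov, feature, candidates):
    """Hypothetical helper: the candidate with the largest absolute
    covariance to `feature` (one possible reading of 'related')."""
    return max(candidates, key=lambda j: abs(cov[feature][j]))

# Hypothetical data: column 1 is exactly twice column 0 (fully redundant),
# and column 2 is only weakly related to column 0.
X = [[1.0, 2.0, 4.0], [2.0, 4.0, 1.0], [3.0, 6.0, 3.0], [4.0, 8.0, 2.0]]
cov = covariance_matrix(X)
related = most_related(cov, 0, [1, 2])  # feature 1
```

Here the diagonal entries are the variances of the three columns, and the large off-diagonal entry between columns 0 and 1 flags them as redundant.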
The SVM-RFE with covariance scheme.
To be more specific, the subset is initially empty, and the sequential forward selection (SFS) process iteratively selects one feature to add to it. The selected feature is either the highest ranked feature among the unselected features of the ordered feature set or a feature related to the highest ranked feature according to the covariance matrix of the features. In other words, in the process of SFS, a parameter
Several parameters need to be defined in the proposed method. In the texture feature extraction stage, gray-level cooccurrence matrices are computed at spatial distances from 1 to 5 pixels and in directions 0°, 45°, 90°, and 135°, which gives a total of 20 gray-level cooccurrence matrices; 11 second-order statistics, including contrast, are then extracted from each matrix. For the Gabor filters, the window size is 3 × 3, the selected frequencies are 0.5, 0.25, 0.125, and 0.1, and the selected directions are 0°, 45°, 90°, 180°, 225°, and 315°. A total of 32 filters are constructed, and the gray-level mean and gray-level variance are calculated from each filter response. At the stage of morphological feature extraction, spatial normalization is performed with VBM-DARTEL, and the image is registered to the MNI space. An 8 mm FWHM Gaussian kernel is used to smooth the image, and then the GM images are used for statistical analysis.
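The gray-level cooccurrence settings above can be illustrated with a short sketch that builds one cooccurrence matrix and computes its contrast. The symmetric, normalized form of the matrix, the number of gray levels, and the 4 × 4 test image are assumptions made for illustration; the paper does not state these details.

```python
# (row step, column step) per unit distance for each direction in degrees
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, distance, angle, levels):
    """Symmetric, normalized gray-level cooccurrence matrix (assumed form)."""
    dr, dc = OFFSETS[angle]
    dr, dc = dr * distance, dc * distance
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both orders
    total = sum(map(sum, counts))
    if total == 0:  # offset larger than the image: no pixel pairs
        return [[0.0] * levels for _ in range(levels)]
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Contrast: sum over (i, j) of (i - j)^2 * p[i][j]."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))

# Five distances times four directions give the 20 matrices in the text.
settings = [(d, a) for d in range(1, 6) for a in (0, 45, 90, 135)]

# Hypothetical 4 x 4 image quantized to 4 gray levels.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image, distance=1, angle=0, levels=4)
c = contrast(p)  # 7/12 for this image
```

In practice one matrix would be computed per entry of `settings` on each 2D slice, and the 11 statistics would be extracted from every matrix; only contrast is shown here.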
The feature selection is implemented after the feature normalization. We let
The proposed method has been compared with five other methods in order to evaluate its effectiveness. Due to the limited number of samples, the leave-one-out method is used to search for the optimal kernel function parameter pair, and then 10-fold cross validation is used to classify the sample data. Three random assignments were performed to guarantee random sample partitioning without sample bias. That is to say, the samples in every experiment in this article are randomly partitioned three times, and 10-fold cross validation is performed on each partition to ensure the fairness of the results. Moreover, grid search was used to optimize the parameters of each model. In the experiment, SPM and REST were used to extract the morphological features, and a shogun toolbox (
We obtained all data in our experiments from the public ADNI database. The sample images are T1-weighted MR images acquired on 1.5 T scanners. A total of 170 subjects were enrolled, including 54 patients with Alzheimer’s disease (AD), 58 patients with mild cognitive impairment (MCI), and 58 normal controls (NC). The demographic statistics of these samples are shown in Table
Basic information of the subjects.
 | AD (n = 54) | MCI (n = 58) | NC (n = 58) |
---|---|---|---|
Gender (male/female) | 22/32 | 32/26 | 30/28 |
Age (mean ± SD) | | | |
MMSE (mean ± SD) | | | |
(
According to the statistical information in the table, the sex ratio is roughly balanced across the AD, MCI, and NC groups. The mean age of the three groups is about 75 years, and the MMSE score of the AD group is lower than that of the MCI group, which in turn is lower than that of the NC group.
Information of the significant clusters (AD-NC).
Cluster | Number of voxels | Peak MNI coordinates (x, y, z) | Peak MNI coordinate region |
---|---|---|---|
Cluster 1 | 330608 | −30 −10.5 −19.5 | Hippocampus |
Cluster 2 | 1565 | 1.5 −99 −7.5 | Calcarine L |
Cluster 3 | 2315 | −16.5 −84 −34.5 | Cerebelum Crus2 L |
Cluster 4 | 105 | 0 −40.5 −54 | Medulla |
Cluster 5 | 890 | 1.5 −18 −30 | Pons |
Cluster 6 | 2622 | 40.5 −87 −34.5 | Cerebelum Crus1 R |
Cluster 7 | 61 | 4.5 −94.5 28.5 | Occipital Lobe |
Cluster 8 | 1354 | 21 −25.5 73.5 | Frontal Lobe |
Significant GM difference in AD relative to NC.
Information of the significant clusters (MCI-NC).
Cluster | Number of voxels | Peak MNI coordinates (x, y, z) | Peak MNI coordinate region |
---|---|---|---|
Cluster 1 | 1999 | 48 −51 −46.5 | Cerebellar Tonsil |
Cluster 2 | 9712 | −10.5 −103.5 −9 | Lingual Gyrus |
Cluster 3 | 173 | −13.5 −45 −40.5 | Cerebelum_9_L |
Cluster 4 | 107 | 10.5 −61.5 −39 | Uvula |
Cluster 5 | 25 | 3 −45 −34.5 | Vermis_10 |
Cluster 6 | 232 | 18 51 −3 | Frontal Lobe |
Cluster 7 | 52 | −19.5 42 −3 | Anterior Cingulate |
Cluster 8 | 370 | −1.5 −63 −4.5 | Culmen of Vermis |
Significant GM difference in MCI relative to NC.
Information of the significant clusters (AD-MCI).
Cluster | Number of voxels | Peak MNI coordinates (x, y, z) | Peak MNI coordinate region |
---|---|---|---|
Cluster 1 | 24 | 1.5 −15 −25.5 | Pons |
Cluster 2 | 206 | −15 25.5 −4.5 | Frontal Lobe |
Cluster 3 | 58 | 37.5 −43.5 6 | Temporal Lobe |
Cluster 4 | 72 | 12 25.5 10.5 | Sub-Lobar |
Cluster 5 | 42 | 27 19.5 18 | Sub-Gyral |
Cluster 6 | 25 | 30 −7.5 27 | Extranuclear |
Cluster 7 | 69 | 18 −31.5 28.5 | Cingulate Gyrus |
Cluster 8 | 43 | −12 −25.5 16.5 | Pulvinar |
Significant GM difference in AD relative to MCI.
In order to test the effectiveness of the proposed SVM-RFE with covariance method, several SVM classifiers with different parameters and datasets are used to identify the disease. Moreover, the corresponding accuracy for each feature subset was calculated during the process of forward feature selection. The experiment also investigates the effect of
Effect of the number of the related features (AD-NC).
ACC (%) | | | | | | |
---|---|---|---|---|---|---|
Test set | 88.2 | | 91.0 | 88.2 | 88.2 | 85.1 |
Training set | 82.1 | | 88.4 | 86.7 | 97.3 | 86.7 |
Based on the experiment results, the parameter
Accuracies obtained by feature selection process (AD-NC).
Effect of the number of the related features (MCI-NC).
ACC (%) | | | | | | |
---|---|---|---|---|---|---|
Test set | 88.7 | | 91.7 | 91.7 | 94.3 | 91.7 |
Training set | 90.5 | | 93.7 | 93.7 | 93.7 | 92.5 |
For this reason, the parameter
Accuracies obtained by feature selection process (MCI-NC).
Effect of the number of the related features (AD-MCI).
ACC (%) | | | | | | |
---|---|---|---|---|---|---|
Test set | 85.5 | 88.2 | | 88.2 | 88.2 | 88.2 |
Training set | 87.5 | 92.2 | | 92.2 | 91 | 91 |
Based on the results of our experiment, we set
Accuracies obtained by feature selection process (AD-MCI).
Effect of the number of the related features (3-way).
ACC (%) | | | | | | |
---|---|---|---|---|---|---|
Test set | 71.1 | 78.9 | 80.8 | | 76.0 | 75.0 |
Training set | 73.7 | 80.0 | 81.7 | | 83.0 | 78.9 |
For this reason, the parameter is
Accuracies obtained by feature selection process (3-way).
In conclusion, the classification performance in the above four experiments has proved the effectiveness of our method. Even though there is no optimal procedure to assign the parameter of SVM1 (Figure
Accuracy (ACC) = (TP + TN)/(TP + TN + FP + FN). Sensitivity (SEN) = TP/(TP + FN). Specificity (SPEC) = TN/(TN + FP). Positive predictive value (PPV) = TP/(TP + FP). Negative predictive value (NPV) = TN/(FN + TN).
The accuracy (ACC) is the most direct metric for comparison between methods. Sensitivity (SEN), specificity (SPEC), positive predictive value (PPV), and negative predictive value (NPV) describe how well a diagnostic test captures the true presence or absence of the disease. Together, these indexes describe the accuracy and error rate of a recognition method. The higher the ACC, SEN, and SPEC values, the lower the error rate of the recognition method; PPV and NPV give the proportion of positive and negative predictions that are correct, and they depend on the prevalence of the disease in the sample.
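As a worked example of the five definitions, the sketch below evaluates them on one set of confusion counts. The counts are not reported in the paper; they are inferred here under the assumption that AD is the positive class and that all 54 AD and 58 NC subjects are scored, in which case they reproduce the texture-feature AD-NC row of the feature-type table.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """ACC, SEN, SPEC, PPV, and NPV from confusion-matrix counts."""
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": tp / (tp + fn),
        "SPEC": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (fn + tn),
    }

# Inferred (not reported) counts: 41 of 54 AD and 47 of 58 NC correct.
metrics = diagnostic_metrics(tp=41, tn=47, fp=11, fn=13)
percent = {name: round(100.0 * value, 2) for name, value in metrics.items()}
# percent: ACC 78.57, SEN 75.93, SPEC 81.03, PPV 78.85, NPV 78.33
```

Sensitivity and specificity are each computed within one true class, whereas PPV and NPV mix the two classes, which is why they shift when the class proportions change.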
Classification accuracy with different type of features.
Feature type | ACC (%) | SEN (%) | SPEC (%) | PPV (%) | NPV (%) |
---|---|---|---|---|---|
AD-NC | | | | | |
Texture feature | 78.57 | 75.93 | 81.03 | 78.85 | 78.33 |
Morphological feature | 79.46 | 74.07 | 84.48 | 81.36 | 77.78 |
Feature combination | | | | | |
MCI-NC | | | | | |
Texture feature | 83.33 | 77.78 | 88.89 | 87.50 | 80.00 |
Morphological feature | 63.88 | 55.56 | 65.00 | 66.67 | 61.90 |
Feature combination | | | | | |
AD-MCI | | | | | |
Texture feature | 76.47 | 94.44 | 61.11 | 70.83 | 91.67 |
Morphological feature | 70.59 | 66.67 | 77.78 | 75 | 70 |
Feature combination | | | | | |
3-way | | | | | |
Texture feature | 73.08 | X | X | X | X |
Morphological feature | 63.46 | X | X | X | X |
Feature combination | | | | | |
Classification performance of all comparison methods.
Method | ACC (%) | SEN (%) | SPEC (%) | PPV (%) | NPV (%) |
---|---|---|---|---|---|
AD-NC | | | | | |
Without feature selection | 85.71 | 79.63 | 91.38 | 89.58 | 82.81 |
PCA | 86.71 | 83.33 | 87.93 | 85.64 | 85.26 |
Multikernel SVM | 88.39 | 85.19 | 91.38 | 90.20 | 86.89 |
Proposed method | | | | | |
MCI-NC | | | | | |
Without feature selection | 86.11 | 77.78 | 94.44 | 93.33 | 80.95 |
PCA | 86.11 | 85.71 | 86.67 | 90.00 | 81.25 |
Multikernel SVM | 91.67 | 90.47 | 93.33 | 95.00 | 87.50 |
Proposed method | | | | | |
AD-MCI | | | | | |
Without feature selection | 79.44 | 88.89 | 72.22 | 76.19 | 86.67 |
PCA | 73.53 | 81.25 | 66.67 | 68.42 | 80.00 |
Multikernel SVM | 79.41 | 87.50 | 72.22 | 73.68 | 86.67 |
Proposed method | | | | | |
3-way | | | | | |
Without feature selection | 75.00 | X | X | X | X |
PCA | 69.23 | X | X | X | X |
Multikernel SVM | 79.41 | X | X | X | X |
Proposed method | | | | | |
According to the experimental result presented above, the proposed method achieves better results than the current two mainstream AD detection methods [
In this paper, a novel feature fusion method is proposed to improve the classification accuracy of AD, MCI, and NC. Firstly, we preprocessed the structural MR images of the subjects and then extracted morphometric features and texture features. By combining these two kinds of features linearly and using the combined feature set in classification experiments, we found that the combination of morphometric and texture features outperforms either feature type used alone. Based on this, we proposed a new feature selection algorithm that improves on SVM-RFE: by combining SVM-RFE with covariance, an optimal feature subset is obtained from the feature selection process. Finally, we performed several comparison experiments on the public ADNI database using the optimal subset; the results demonstrate the effectiveness of the proposed method in improving classification performance.
The proposed method effectively improves the detection accuracy of AD and MCI, but it still has drawbacks. In future work, we will improve the method in the following aspects: firstly, we will try to optimize the process of obtaining the parameter
The authors declare that there are no conflicts of interest regarding the publication of this article.
This work was supported by the NSFC-Guangdong Joint Fund (Grant no. U1401257), National Natural Science Foundation of China (Grant no. 61300090, no. 61133016, and no. 61272527), Science and Technology Plan Projects in Sichuan Province (Grant no. 2014JY0172), and the opening project of Guangdong Provincial Key Laboratory of Electronic Information Products Reliability Technology (Grant no. 2013A061401003).