Towards Effective Classification of aMCI Based on Resting-State Multiscale Brain Features and Machine Learning Approaches

School of Informatics, Xiamen University, Xiamen 361000, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361000, China; The First Affiliated Hospital of Xiamen University, Xiamen 361000, China; Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518000, China; Artificial Intelligence Research Center, Peng Cheng Laboratory, Shenzhen 518000, China


Introduction
Industry 4.0, characterized by the increasing intelligence of the manufacturing industry, is profoundly transforming all walks of life. Smart healthcare, which adopts various Industry 4.0 concepts, is a field full of opportunities and challenges [1,2]. As a whole, smart healthcare consists of three parts: the smart hospital system, the family health system, and the regional health system [3]. Among them, the core work of a smart hospital system is to collect, store, and process patients' health status and medical information [4]. Furthermore, imaging diagnosis that combines medical information with intelligent algorithms can uncover the risk of disease, alert doctors in time, and assist them in making clinical decisions, which makes it an essential ingredient of smart healthcare [5].
Nowadays, brain-related diseases are considered one of the most severe problems facing the healthcare system. Alzheimer's disease (AD), which frequently occurs in the elderly population, is accompanied by cognitive decline and noncognitive mental symptoms [6,7]. Unfortunately, there are no specific drugs or treatment protocols for AD [8]. Moreover, amnesic mild cognitive impairment (aMCI), conceptualized as an episodic memory disorder, is highly likely to progress to AD [9]. In practice, numerous studies have shown that resting-state functional magnetic resonance imaging (rs-fMRI), a noninvasive imaging technology that indirectly reflects neural activity in the brain, has been widely employed in the classification of brain-related diseases [10,11]. Accordingly, an efficient and reliable system for detecting aMCI is conducive to screening individuals at high risk of developing AD. It is worth noting that smart healthcare development depends heavily on data, whereas rs-fMRI data for aMCI pose enormous challenges because the samples are limited and the dimensionality is high [12].
In this work, the correlation between the time series of the standard brain regions is calculated using the Pearson correlation coefficient, thus constructing the brain functional network (BFN) that reflects the interaction between the nodes. Notably, existing methods use only the local features of the BFN as the classifier input while ignoring its structural features. To address this issue, we develop an adaptive structure feature generation strategy (ASFGS) based on the Laplacian matrix and a sparse autoencoder to improve classification performance and reduce data redundancy. Concurrently, we present a multiscale local feature detection strategy (MLFDS) to overcome the low utilization of local features of the BFN. Afterwards, the multiscale features, comprising structural features and multiscale local features, are fused to further improve the classification accuracy of aMCI. A support vector machine with a radial basis function kernel (RBF-SVM), which is suited to learning from small data sets, is used to evaluate the performance of the proposed algorithms. Finally, we employ the leave-one-out cross-validation strategy to mitigate overfitting of the classifier.
Accordingly, the contributions of our work are as follows. First, we present the ASFGS algorithm to obtain the structural features of the BFN, improve detection accuracy, and reduce data redundancy. Then, we develop the MLFDS algorithm to excavate the local features of the BFN at multiple scales. Finally, the multiscale features of the BFN obtained from the ASFGS and MLFDS algorithms are concatenated to further improve the classification accuracy of aMCI. The results show that the accuracy (ACC) and the area under the curve (AUC) reach about 86.57% and 86.36%, respectively, outperforming the state-of-the-art methods. This suggests that our work substantially improves the detection performance of the aMCI system, providing a new perspective for the construction of intelligent imaging diagnosis systems in smart healthcare.
The rest of the paper is structured as follows. Section 2 reviews related work on feature extraction and classification of aMCI based on rs-fMRI data. Section 3 presents the materials and methods of the aMCI detection system. Section 4 reports and analyzes the experimental results. Section 5 concludes the paper.

Related Works
The recent development and combination of machine learning, statistical algorithms, and neuroimaging technology offer a new perspective for designing an intelligent imaging diagnosis system, which is a crucial step toward smart health. Such a system mainly comprises a data generation module, a data preprocessing module, a feature learning module, a classifier training module, and a feedback module [13]. With the development of intelligent imaging diagnosis technologies, brain networks constructed from rs-fMRI using the Pearson correlation coefficient can be employed to estimate the mechanisms of information processing and mental expression in the brain, which further supports their effectiveness in assisting diagnosis [14,15]. Nevertheless, owing to the limited and high-dimensional data, little is known about whether multiscale features of the BFN can improve the classification performance of an aMCI system. Accordingly, the structural features and multiscale local features that we develop are the main innovation of this work. They support timely intervention for and treatment of individuals at risk of brain-related disease.
Numerous BFN studies have focused on using rs-fMRI to excavate effective features of aMCI. For example, altered rich-club patterns in the BFN have been reported in [16], indicating that the altered patterns in overlapping nodes can serve as potential features in aMCI classification. Moreover, changes in the architecture of the BFN relative to the healthy control (HC) group have been reported, which is conducive to understanding the mechanism of aMCI and searching for biomarkers [17]. Through the two-sample t-test, several neuroimaging biomarkers have also been identified in the aMCI group, providing a novel basis for designing interventions before the onset of disease [18]. The preceding studies, restricted by the amount of data, rely on statistical testing methods to extract effective features and explore the neural mechanisms of aMCI. Furthermore, they do not address the development of an intelligent imaging diagnosis system based on machine learning technology, which is designed to help doctors improve the detection efficiency of aMCI.
With the rapid development of machine learning technology, disease-related feature extraction and classification algorithms have become a research hot spot. However, because aMCI data are limited, feature selection is first conducted to reduce redundant information, and the selected features are then used as the classifier input to improve classification performance. An ACC of about 69.00% is obtained when the significant regional signals resulting from brain pathway activities are used as the input of a support vector machine (SVM) classifier, providing new opportunities for understanding the disrupted patterns caused by disease [19]. Similarly, the significant features of the BFN identified by the two-sample t-test are used to evaluate the classification performance of aMCI in [20], yielding an ACC of about 79.10% with an SVM classifier. Moreover, an ACC of about 75.35% is obtained when the altered signals of low-frequency fluctuation are used as the input of an SVM classifier [21]. The existing literature extracts local features through statistical methods, ignoring the multiscale local features and the structural features between nodes of the BFN.
In the present work, we present an ASFGS algorithm that uses the Laplacian matrix and a sparse autoencoder to obtain the structural features of the BFN. Concurrently, we develop an MLFDS algorithm to overcome the low utilization of local features of the BFN. Finally, all the features generated above are concatenated to improve the classification performance of the aMCI system.

Materials and Methods
3.1. Overview of the aMCI Detection System. The key objective of our work is to design a reliable aMCI detection system that copes with the commonly redundant information in rs-fMRI data, as shown in Figure 1. To achieve this objective, the proposed aMCI detection system consists of multiple components. First, the BFN is constructed using the Pearson correlation coefficient. Then, the BFN is used as the input of the proposed ASFGS and MLFDS algorithms to extract features at multiple scales. Furthermore, to evaluate the validity of the proposed algorithms, the RBF-SVM classifier is employed. Finally, the abnormal brain regions and the classification results are sent to the doctor in the result feedback component.

3.2. Data Preprocessing.
We utilize the public dataset from the second phase of the Alzheimer's Disease Neuroimaging Initiative (ADNI-2), downloaded from http://adni.loni.usc.edu/, to validate the effectiveness of the proposed aMCI detection system. The data are preprocessed using the widely adopted Resting-State fMRI Data Analysis Toolkit plus (RESTplus) toolbox [22]. All subjects are required to lie flat, not think, and not move their heads during the scan. Notably, several data sets do not meet the basic requirements of this work because of excessive head movement, insufficient data reliability, or poor image registration quality. The specific requirements can be found in [18,23]. Hence, we end up with data on 33 individuals with aMCI and 34 healthy controls (HCs) in this work.

3.3. BFN Construction.
To better extract the time signals of the corresponding brain regions, the anatomical automatic labeling (AAL) template is employed to segment the whole brain into 90 regions of interest (ROIs) [24]. Following this, a Pearson correlation coefficient matrix, denoted as $P_{ij}$, is calculated for each subject from the time series of the standard brain regions. Hence, we obtain a BFN of 90 × 90 dimensions for each subject, and the Pearson correlation coefficient matrix is calculated by

$$P_{ij} = \frac{\operatorname{cov}(x_i, x_j)}{\sigma_{x_i}\,\sigma_{x_j}}, \qquad (1)$$

where $x_i$ and $x_j$ represent the extracted ROI signals of brain regions $i$ and $j$. The indices $i$ and $j$ in $P_{ij}$ are the brain region positions in the AAL template, and each entry of $P_{ij}$ is the value of the Pearson correlation coefficient. Moreover, $\operatorname{cov}(x_i, x_j)$ represents the covariance of $x_i$ and $x_j$. Likewise, $\sigma_{x_i}$ and $\sigma_{x_j}$ represent the standard deviations of $x_i$ and $x_j$.
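As a minimal sketch of the construction in (1), the snippet below builds a 90 × 90 correlation matrix from ROI time series with NumPy. The function name `build_bfn`, the synthetic signals, and the choice of 140 time points are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def build_bfn(roi_signals: np.ndarray) -> np.ndarray:
    """Return the ROI-by-ROI Pearson correlation matrix.

    roi_signals has shape (n_timepoints, n_rois); each column is one ROI time series.
    """
    # np.corrcoef treats rows as variables by default, so rowvar=False is required here
    return np.corrcoef(roi_signals, rowvar=False)

# Synthetic stand-in for one subject: 140 time points (assumed), 90 AAL regions
rng = np.random.default_rng(0)
signals = rng.standard_normal((140, 90))
bfn = build_bfn(signals)
print(bfn.shape)
```

Each entry of `bfn` equals the covariance of two ROI signals divided by the product of their standard deviations, which is the definition in (1).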

3.4. Adaptive Structure Feature Generation Strategy (ASFGS).
We develop an ASFGS algorithm for extracting the structural features of the BFN, with the aim of improving the detection performance of the aMCI system. The ASFGS algorithm is mainly composed of two components, the rough feature extraction module and the accurate feature extraction module, as shown in Figure 2.
Considering that Laplacian Eigenmaps (LE) maintain and reflect the local relationships between data to some extent, we base the rough feature extraction module on this concept to extract the structural features of the BFN. A brief description of the LE algorithm is as follows [25]:

Step 1. Given a set of data $S = \{s_1, s_2, \cdots, s_n\}$, where $s_i$ represents the $i$th input sample, the dimension of each sample is denoted by $q$, and $n$ represents the number of samples. In general, a Gaussian kernel function is employed to construct an undirected graph matrix that depicts the adjacency relationship between the data, which can be denoted as $M_{ij}$. Notably, the degree matrix $D$ collects the sum of each column or row of $M_{ij}$ on its diagonal, as in (3). Then, the Laplacian matrix, denoted as $L$, is obtained by (4):

$$D_{ii} = \sum_{j} M_{ij}, \qquad (3)$$

$$L = D - M. \qquad (4)$$

Step 2. Since the Laplacian matrix is a positive semidefinite matrix, it can be further expressed as

$$y^{T} L y = \frac{1}{2} \sum_{i,j} m_{ij}\,(y_i - y_j)^2. \qquad (5)$$

To maintain the adjacency relation between two data points, the problem is converted to a minimization: the larger $m_{ij}$ is, the higher the similarity between $y_i$ and $y_j$. Thus, it is now reduced to

$$y_{\mathrm{opt}} = \arg\min_{y^{T} D y = 1} y^{T} L y. \qquad (6)$$

Wireless Communications and Mobile Computing
Step 3. Ultimately, by employing the Lagrange multiplier method, it can be approximately converted into the generalized eigenvalue problem

$$L y = \lambda D y. \qquad (7)$$

Nevertheless, the objective of the LE algorithm is to reduce the dimension of data features, and the retained dimension is determined by the number of smallest nonzero eigenvalues of the matrix, which does not meet the requirements of structural feature extraction of the BFN in this paper. Fortunately, the LE algorithm plays a significant role in maintaining the relations between sample points after dimensionality reduction. Therefore, we present a rough feature extraction module that modifies the LE algorithm to focus on structural feature extraction of the BFN.
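The three LE steps can be sketched with NumPy as follows. This is a sketch of the standard LE procedure from [25], not of the modified rough feature extraction module; the function names, the toy data, and the kernel width of 5.0 are assumptions made for illustration.

```python
import numpy as np

def graph_laplacian(M):
    """Return the degree matrix D and the Laplacian L = D - M of a symmetric affinity matrix."""
    D = np.diag(M.sum(axis=1))
    return D, D - M

def laplacian_eigenmaps(M, n_components=1):
    """Solve L y = lambda D y and return the smallest nontrivial solutions."""
    D, L = graph_laplacian(M)
    d_inv_sqrt = 1.0 / np.sqrt(np.diag(D))
    # D^{-1/2} L D^{-1/2} is symmetric and shares its eigenvalues with the generalized problem
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    Y = d_inv_sqrt[:, None] * eigvecs          # y = D^{-1/2} u, so that y^T D y = 1
    # Index 0 is the trivial constant solution with eigenvalue close to 0, so it is skipped
    return eigvals[1:1 + n_components], Y[:, 1:1 + n_components]

# Toy data: 20 samples of dimension q = 5 (sizes and kernel width are arbitrary assumptions)
rng = np.random.default_rng(0)
S = rng.standard_normal((20, 5))
sq_dist = ((S[:, None, :] - S[None, :, :]) ** 2).sum(axis=-1)
M = np.exp(-sq_dist / 5.0)    # Gaussian kernel affinity
np.fill_diagonal(M, 0.0)      # no self-loops
D, L = graph_laplacian(M)
lam, Y = laplacian_eigenmaps(M, n_components=2)
```

Solving the symmetric normalized problem and mapping back avoids a dedicated generalized eigensolver while still satisfying the constraint $y^{T} D y = 1$ from (6).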
First of all, the minimization problem is constructed using the Laplacian matrix [25], as shown in (8), where $B$ represents the number of brain nodes. To better evaluate the interrelation between brain region signals in the BFN, we use the Pearson correlation coefficient matrix obtained by (1) to replace the Gaussian kernel function. It is worth noting that at this point $D$ represents the level of importance of the brain nodes. Considering the limitation of the small data set, we reduce the dimension of the BFN from 90 × 90 to 90 × 1 while maintaining the inherent correlation between the brain nodes of the BFN. Let $(\lambda_f, y_f)$ denote the solution to (7). The rough feature extraction module is then given by (9), where $h$ represents the number of nonzero eigenvalues of the 90 × 90 matrix of the BFN. To summarize, the optimal result of (8) can be obtained by (9). The novel aspect of our work is that we present the rough feature extraction module to extract the structural features of the BFN under the condition of a small data set.

[Figure: framework of the MLFDS algorithm. Labels: aMCI and HC inputs; mask resulting from the two-sample t-test; VC-MG strategy; MMDGS strategy; mask of 0.001 < p < 0.05; mask of p < 0.001; multiscale local features; concatenation; local features after fusion.]

However, the generated vector still contains redundant information, such as the randomness of the values of unrelated brain regions. Based on this, the accurate feature extraction module uses a sparse autoencoder to further extract the structural features of the BFN. The sparse autoencoder is an unsupervised machine learning algorithm that constrains the hidden-layer neurons to some extent and continuously adjusts its parameters according to the error between the output and input of the model [26]. Besides, the sparse autoencoder can represent linear or nonlinear transformations by using different transfer functions in the coding process. The advantages of using the sparse autoencoder are as follows. First, it can reduce the dimensionality of the original data while maintaining the nonlinear structural features of the data. Second, the feature dimension extracted by the sparse autoencoder is not fixed, and the structure of the input data can be obtained by adjusting parameters to minimize the model error. Finally, brain regions are either inhibited or activated in the resting state, and the neurons in the hidden layer of the sparse autoencoder are likewise sparsely active, so the sparse autoencoder matches the way the human brain actually works. Accordingly, we present an accurate feature extraction module to further extract the structural features and reduce redundant information.
The accurate feature extraction module consists of a hidden layer, and the transfer functions of the encoder and decoder are nonlinear. First, the cost function of the sparse autoencoder with a sparsity constraint on the hidden layer is given by (10) [27,28], where $\mu$ represents the number of training data, $x^{(i)}$ represents the input data, $y^{(i)}$ represents the label of the data, $\lambda$ represents the weight decay parameter, $\beta$ represents the tuning parameter, $s_2$ represents the number of neurons in the hidden layer, $\mathrm{KL}(\cdot)$ represents relative entropy, $W$ represents the weight coefficient of the encoder, $L$ represents the number of layers, and $b$ represents the bias coefficient. Here, $s_2$ is set to the average node degree over all data, which represents the average activity level of the brain nodes, as in (11). That is, we substitute the number of active brain regions for the number of neurons in the hidden layer, where $r$ represents the number of brain nodes.
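For orientation, the sketch below evaluates a commonly used sparse autoencoder cost with the three terms named above: reconstruction error, weight decay, and a KL sparsity penalty. It is an assumed standard form, not necessarily identical to the paper's equation (10). The sigmoid transfer function, the target sparsity `rho`, the parameter values, and all variable names are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_divergence(rho, rho_hat):
    """Relative entropy KL(rho || rho_hat) between Bernoulli distributions, elementwise."""
    return rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))

def sparse_autoencoder_cost(X, W1, b1, W2, b2, lam, beta, rho):
    """Reconstruction error + weight decay + sparsity penalty for one hidden layer.

    X: (mu, n_in) inputs; W1: (n_in, s2) encoder weights; W2: (s2, n_in) decoder weights.
    """
    mu = X.shape[0]
    hidden = sigmoid(X @ W1 + b1)        # hidden activations, shape (mu, s2)
    output = sigmoid(hidden @ W2 + b2)   # reconstruction, shape (mu, n_in)
    reconstruction = 0.5 * np.sum((output - X) ** 2) / mu
    weight_decay = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    rho_hat = hidden.mean(axis=0)        # average activation of each hidden neuron
    sparsity = beta * np.sum(kl_divergence(rho, rho_hat))
    return reconstruction + weight_decay + sparsity

# Assumed sizes: 67 subjects, 90-dimensional input, s2 = 12 hidden neurons
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, (67, 90))
W1, b1 = rng.normal(0.0, 0.1, (90, 12)), np.zeros(12)
W2, b2 = rng.normal(0.0, 0.1, (12, 90)), np.zeros(90)
cost = sparse_autoencoder_cost(X, W1, b1, W2, b2, lam=1e-3, beta=3.0, rho=0.1)
```

In practice the parameters would be updated by backpropagation to minimize this cost; the snippet only evaluates it once.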
In order to minimize the error between the output and input of the sparse autoencoder, the backpropagation algorithm is employed to update the model parameters. Owing to the limited number of data, we further employ the two-sample t-test to obtain the structural features with significant differences ($p < 0.05$) in the $R \times 1$ generated vector of the BFN, where $R$ represents the number of neurons in the hidden layer.

3.5. Multiscale Local Feature Detection Strategy (MLFDS).
We develop an MLFDS algorithm to overcome the low utilization of local features of the BFN. The algorithm is based on maximizing the mean difference between classes and minimizing the intraclass variance. The framework of the MLFDS algorithm, shown in Figure 3, is mainly composed of two parts: mask generation based on the variation coefficient (VC-MG) and the minimal mean difference generation strategy (MMDGS).
At present, the two-sample t-test is employed to obtain the nodes of the BFN with a significant difference ($p < 0.05$) in mean value between classes, which is conducive to removing redundant information. However, before the two-sample t-test is performed, the results of the homogeneity analysis cannot determine whether the variance of each class of data is large or small. Besides, the latest methods use only the mask of $p < 0.001$ to extract local features of the BFN, ignoring potentially useful features such as those under the mask of $0.001 < p < 0.05$ [20,21]. Consequently, the MLFDS algorithm is presented to address this issue.
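The two masks discussed above can be sketched with an element-wise two-sample t-test over stacks of connectivity matrices. The snippet uses `scipy.stats.ttest_ind` on synthetic data; the function name `t_test_masks`, the 10 × 10 matrix size, and the injected group difference are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

def t_test_masks(group_a, group_b, strict=0.001, loose=0.05):
    """Element-wise two-sample t-test between two stacks of connectivity matrices.

    group_a, group_b: arrays of shape (n_subjects, n_rois, n_rois).
    Returns the p-value matrix, the mask p < strict, and the mask strict < p < loose.
    """
    _, p = stats.ttest_ind(group_a, group_b, axis=0)
    strict_mask = p < strict
    band_mask = (p > strict) & (p < loose)
    return p, strict_mask, band_mask

# Synthetic groups of 33 and 34 subjects with small 10 x 10 matrices (sizes assumed)
rng = np.random.default_rng(0)
group_a = rng.standard_normal((33, 10, 10))
group_b = rng.standard_normal((34, 10, 10))
group_b[:, 2, 5] += 5.0   # inject one strongly altered connection
p, strict_mask, band_mask = t_test_masks(group_a, group_b)
```

The two masks are disjoint by construction, so features selected under $p < 0.001$ and under $0.001 < p < 0.05$ never overlap.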
Step 1. We first calculate the variation coefficient of the BFN in each of the two groups to generate a mask for each group. The value at a position in the mask is 0 if the variation coefficient there is greater than the average value; otherwise, it is 1, meaning that the variation coefficient is lower than the average value of the variation coefficient. Notably, we use the median of the variation coefficient as the average value to avoid the influence of extreme values. Then, we intersect the masks of the two groups, and the resulting mask is denoted as $\mathrm{Mask}_i$, where $i$ represents the number of executions of Step 1 (VC-MG).
Step 2. We calculate the average value of $p$ at the positions selected by $\mathrm{Mask}_i$. Concurrently, we subtract the variation coefficient matrices of the two classes at the positions selected by $\mathrm{Mask}_i$ and then calculate the average value $E_i$ of the result (MMDGS).
Step 3. Perform Step 1 and Step 2 $n$ times, with $E_i$ computed by (12), where $A_i$ represents the variation coefficient matrix of the aMCI group, $B_i$ represents the variation coefficient matrix of the HC group, $l_i$ represents the number of ones in $\mathrm{Mask}_i$, $\eta$ is constrained by (13), and $N$ is equal to 8100.
The principle of the MLFDS algorithm is to select positions with lower variance on the premise of an obvious difference in mean value between groups. Therefore, the $i$ corresponding to the lowest value of $E_i$ is employed as the threshold to extract the local features of the BFN under the mask of $0.001 < p < 0.05$. Finally, the multiscale local features resulting from the mask of $0.001 < p < 0.05$ and the mask of $p < 0.001$ are concatenated in this work to improve the detection performance.
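A single pass of the mask generation in Step 1 can be sketched as below. The snippet covers only the VC-MG idea (median-thresholded variation coefficient per group, followed by intersection); the iteration of Step 3 and the quantity $E_i$ are not reproduced because their defining equations are not available here. Function names and the synthetic data are assumptions.

```python
import numpy as np

def variation_coefficient(group):
    """Per-connection coefficient of variation (std / |mean|) across subjects."""
    mean = group.mean(axis=0)
    std = group.std(axis=0, ddof=1)
    return std / np.abs(mean)

def vc_mask(group_a, group_b):
    """Keep positions whose variation coefficient is at or below the median in both groups."""
    cv_a = variation_coefficient(group_a)
    cv_b = variation_coefficient(group_b)
    mask_a = cv_a <= np.median(cv_a)   # True where variability is low in group A
    mask_b = cv_b <= np.median(cv_b)   # True where variability is low in group B
    return mask_a & mask_b, cv_a, cv_b

# Synthetic groups of 33 and 34 subjects; nonzero means keep the ratio well defined
rng = np.random.default_rng(0)
group_a = rng.normal(0.5, 0.1, (33, 10, 10))
group_b = rng.normal(0.5, 0.1, (34, 10, 10))
mask, cv_a, cv_b = vc_mask(group_a, group_b)
```

Using the median rather than the mean as the threshold follows the text and makes the mask insensitive to a few extreme variation coefficients.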
3.6. Multiscale Feature Fusion. In order to improve the detection performance of the aMCI system, we concatenate the multiscale features resulting from the ASFGS and MLFDS algorithms. That is, if the dimensions of the two groups of features are $d_1$ and $d_2$, respectively, then the dimension of the concatenated feature is $d_1 + d_2$.
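The fusion step amounts to a column-wise concatenation of the two feature matrices. In the sketch below, the feature dimensions $d_1 = 4$ and $d_2 = 7$ and the placeholder values are assumptions chosen only to show the resulting shape.

```python
import numpy as np

n_subjects = 67
structural = np.zeros((n_subjects, 4))   # d1 = 4 structural features (placeholder values)
local = np.ones((n_subjects, 7))         # d2 = 7 multiscale local features (placeholder values)

# Concatenate along the feature axis so each subject keeps one row
fused = np.concatenate([structural, local], axis=1)
print(fused.shape)
```

Each row of `fused` is one subject's feature vector of length $d_1 + d_2$.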
3.7. Classification Using RBF-SVM Classifier. In view of the limited data, a suitable classifier is essential for assessing the validity of the features obtained from the proposed algorithms. Fortunately, numerous works on mild cognitive impairment (MCI) classification have shown that the RBF-SVM classifier has superior detection performance [19–21]. The following is a brief introduction to the RBF-SVM classifier. The essence of the SVM algorithm is to solve the optimization problem of an objective function [29]. The purpose of the Gaussian radial basis function is to map the data into a new space that is more favourable to classification [30]. In the SVM formulation, $C$ denotes the penalty coefficient, $x_i$ denotes the input data, $y_i$ denotes the label of the data, $w$ denotes the weight coefficient, $b$ denotes the bias coefficient, and $V_i$ denotes the relaxation variable. To prevent overfitting during classifier training, we utilize the leave-one-out cross-validation strategy in this work.
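A leave-one-out evaluation of an RBF-kernel SVM can be sketched with scikit-learn as follows. The hyperparameters (`C=1.0`, `gamma="scale"`), the feature standardization step, and the synthetic data are assumptions; the paper does not state its settings in this section.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import LeaveOneOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def loocv_rbf_svm(X, y, C=1.0, gamma="scale"):
    """Leave-one-out evaluation of an RBF-kernel SVM; returns (ACC, AUC, F1)."""
    predictions = np.empty_like(y)
    scores = np.empty(len(y), dtype=float)
    for train_idx, test_idx in LeaveOneOut().split(X):
        # A fresh model per fold, so the held-out subject never influences training
        model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=C, gamma=gamma))
        model.fit(X[train_idx], y[train_idx])
        predictions[test_idx] = model.predict(X[test_idx])
        scores[test_idx] = model.decision_function(X[test_idx])
    return accuracy_score(y, predictions), roc_auc_score(y, scores), f1_score(y, predictions)

# Synthetic stand-in: 33 "aMCI" (label 1) and 34 "HC" (label 0) subjects, 11 features
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (33, 11)), rng.normal(1.5, 1.0, (34, 11))])
y = np.array([1] * 33 + [0] * 34)
acc, auc, f1 = loocv_rbf_svm(X, y)
```

With 67 subjects, each fold trains on 66 subjects and tests on the remaining one, so the reported accuracy is the fraction of correctly classified held-out subjects.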

Experiment Results and Analysis
This work explores the detection power of aMCI system using multiscale features of BFN, which are derived from rs-fMRI data, for the automatic identification and classification of aMCI subjects from HCs. In the proposed detection system, we employ structural features using ASFGS algorithm and multiscale local features using MLFDS algorithm to train an RBF-SVM classifier for accurate discrimination of aMCI individuals.

The rough feature extraction module is first applied to extract the structural features of the BFN, maintaining the correlation between the brain nodes after dimensionality reduction from 90 × 90 to 90 × 1. Considering that the generated vector still contains redundant information, we develop an accurate feature extraction module to further extract the structural features and reduce the dimension from 90 × 1 to 12 × 1. Next, the two-sample t-test ($p < 0.05$) is applied to the 12 × 1 structural feature vector to reduce information redundancy under the condition of limited data. In the end, the structural features with obvious alteration are employed as the input of the RBF-SVM classifier. The classification results using the ASFGS algorithm are shown in […].
Our goal is to simulate the information processing pattern of the human brain to extract the structural information of the BFN, which further improves the detection performance of the aMCI system. Consequently, mathematical modelling of the BFN, which contains information about the interactions between brain regions, is implemented [33–35]. We apply the rough feature extraction module to extract the structural features; that is, the information of brain regions with higher correlation is maintained after dimensionality reduction. Next, the accurate feature extraction module is based on how the brain works in the resting state, in which some brain nodes are activated while others are suppressed. The number of neurons in the hidden layer of the sparse autoencoder is set to the average activity level of the brain nodes, while the choice of sparsity places some brain regions in the inhibited state and others in the activated state. Here, the value 12 is derived from (11), which represents the average activity level of the brain nodes.
The results show that a sparsity threshold between 0.4 and 0.5, as shown in Figure 5, yields the minimum reconstruction error, indicating that the number of activated brain regions is about 4 to 6. The activity level of the brain nodes (node degree) in the data ranges from 3 to 31, and the number of sparsely activated brain regions also falls within this range, suggesting that the brain working mechanism we simulate is meaningful to some extent.

The Performance Analysis of MLFDS Algorithm.
We develop an MLFDS algorithm to excavate the multiscale local features of the BFN. Specifically, we first present the VC-MG strategy to generate the mask, and then, we present the MMDGS strategy based on the mask to extract the multiscale local features.
We extract the local fusion features of the BFN using the MLFDS$_2$ algorithm to improve the detection performance of the aMCI system. The results show that five pairs of connected brain nodes with obvious alteration are found using the MLFDS$_1$ algorithm: (21, 72), (45, 46), (11, 61), (73, 76), and (74, 76), as shown in Figure 6. Furthermore, two pairs of connected brain nodes with obvious alteration are found using SLF: (63, 76) and (58, 64). The numbers in Figure 6 correspond to specific brain regions in the AAL template, which can be found in [36]. In Figure 6, the blue balls denote the brain nodes with obvious alteration, and the red lines show strong correlation between two brain nodes. SLF refers to the single local feature method (two-sample t-test, $p < 0.001$) in [20,21], MLFDS$_1$ refers to the local features obtained using the VC-MG and MMDGS strategies, and MLFDS$_2$ refers to the features obtained by fusing the multiscale local features.
As shown in Table 2, the MLFDS$_1$ algorithm achieves an ACC of about 79.10% with the RBF-SVM classifier, an improvement of 2.98 percentage points over the SLF algorithm. The AUC is 79.14%, which is 2.67 percentage points higher than that of the SLF algorithm, and the F1-score is about 77.42%, which is 1.66 percentage points higher. These results indicate that the MLFDS$_1$ algorithm is more effective than the state-of-the-art algorithm (SLF). This is because the MLFDS$_1$ algorithm follows the principle of maximizing the mean difference between classes and minimizing the intraclass variance, so that discriminant features for classification can be obtained to some extent. We further concatenate the multiscale local features generated by the MLFDS$_1$ algorithm and the SLF algorithm. The resulting MLFDS$_2$ algorithm provides an ACC of about 80.60% with the RBF-SVM classifier, which is 4.48 percentage points higher than the SLF algorithm. Moreover, it achieves about 84.22% in AUC, an improvement of 7.75 percentage points, and about 80.00% in F1-score, an improvement of 4.24 percentage points. This indicates that the concatenation of multiscale local features can greatly improve the detection performance of the aMCI system.

Performance Analysis of Fusion of Structural Features and Multiscale Local Features. In order to further improve the detection performance of the aMCI system, we concatenate the multiscale features of the BFN, comprising structural features and multiscale local features; the resulting method is referred to as ASFGS-MLFDS.
As shown in Table 3, the ASFGS-MLFDS algorithm provides an ACC of about 86.57% with the RBF-SVM classifier, which is 10.45 percentage points higher than the SLF algorithm. Besides, it provides an AUC of about 86.36%, an improvement of 9.89 percentage points over the SLF algorithm, and an F1-score of about 85.71%, an improvement of 9.95 percentage points. This shows that multiscale local features and structural features play complementary roles, which significantly improves the detection performance of the aMCI system and compensates for the low feature utilization rate under the condition of limited data.
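As a consistency check on the reported figures, the short script below subtracts each stated improvement from the corresponding score to recover the implied SLF baseline, and converts each ACC into a count of correctly classified subjects out of 67. The SLF baseline values are implied by the reported numbers rather than quoted from a table, and the dictionary layout is an illustrative assumption.

```python
# (reported score, reported improvement over SLF), both in percent
reported = {
    "MLFDS1": {"ACC": (79.10, 2.98), "AUC": (79.14, 2.67), "F1": (77.42, 1.66)},
    "MLFDS2": {"ACC": (80.60, 4.48), "AUC": (84.22, 7.75), "F1": (80.00, 4.24)},
    "ASFGS-MLFDS": {"ACC": (86.57, 10.45), "AUC": (86.36, 9.89), "F1": (85.71, 9.95)},
}

# Implied SLF baseline for every method and metric: score minus improvement
baselines = {
    method: {metric: round(score - gain, 2) for metric, (score, gain) in metrics.items()}
    for method, metrics in reported.items()
}

# Under leave-one-out with 67 subjects, ACC corresponds to a whole number of correct subjects
n_subjects = 67
accuracies = {"SLF": 76.12, "MLFDS1": 79.10, "MLFDS2": 80.60, "ASFGS-MLFDS": 86.57}
correct_counts = {name: round(acc / 100 * n_subjects) for name, acc in accuracies.items()}

for name, count in correct_counts.items():
    print(name, count, round(100 * count / n_subjects, 2))
```

All three methods imply the same SLF baseline (ACC 76.12%, AUC 76.47%, F1 75.76%), which confirms that the stated improvements are absolute differences in percentage points.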

Conclusions
In this paper, we develop an aMCI detection system. First, we present the ASFGS algorithm to extract the structural features of the BFN. Then, we present the MLFDS algorithm, which excavates the multiscale local features of the BFN and thus overcomes the low utilization of local features. Finally, the multiscale features of the BFN, comprising structural features and multiscale local features, are fused to further improve the detection performance of the aMCI system. Our work outperforms the state-of-the-art methods and offers new insights into meeting the accuracy requirements of an aMCI system. Accordingly, the ASFGS and MLFDS algorithms can be employed to detect brain diseases, providing new insights for the construction of intelligent imaging diagnosis systems. Future work includes introducing multimodality data to improve the detection performance of the aMCI system.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.