Study of functional brain network (FBN) based on functional magnetic resonance imaging (fMRI) has proved successful in depression disorder classification. One popular approach to construct FBN is Pearson correlation. However, it only captures pairwise relationship between brain regions, while it ignores the influence of other brain regions. Another common issue existing in many depression disorder classification methods is applying only single local feature extracted from constructed FBN. To address these issues, we develop a new method to classify fMRI data of patients with depression and healthy controls. First, we construct the FBN using a sparse low-rank model, which considers the relationship between two brain regions given all the other brain regions. Moreover, it can automatically remove weak relationship and retain the modular structure of FBN. Secondly, FBN are effectively measured by eight graph-based features from different aspects. Tested on fMRI data of 31 patients with depression and 29 healthy controls, our method achieves 95% accuracy, 96.77% sensitivity, and 93.10% specificity, which outperforms the Pearson correlation FBN and sparse FBN. In addition, the combination of graph-based features in our method further improves classification performance. Moreover, we explore the discriminative brain regions that contribute to depression disorder classification, which can help understand the pathogenesis of depression disorder.
National Natural Science Foundation of China: 61305018, 61432008, 61472423, 61532006
1. Introduction
As one of the most prevalent psychiatric disorders, depression disorder is typically characterized by persistent depressed mood, loss of motivation, and sleep abnormalities [1]. Moreover, it can lead to suicide at its worst. According to the World Health Organization, an estimated 350 million people of all ages suffer from depression disorder globally [2]. However, the diagnosis of depression disorder mainly depends on clinical symptoms, and its pathogenesis remains unclear [3]. Functional magnetic resonance imaging (fMRI) can contribute to the diagnosis and a better understanding of the pathogenesis of depression disorder [4, 5]. This brain imaging technique provides an effective tool to explore functional abnormalities of depression disorder [6].
A large number of fMRI studies have reported abnormal functional brain networks (FBN) in patients with depression [7, 8]. However, the models used to construct FBN suffer from several limitations. An FBN is a mathematical representation of the brain in which brain regions are nodes and the functional connectivities between pairs of brain regions are edges. Pearson correlation is the most commonly used model for constructing FBN, in which the functional connectivity (FC) value is estimated by the correlation coefficient between brain regions [9]. Connolly et al. use Pearson correlation to analyze the abnormal FC between subgenual anterior cingulate cortex and other brain regions in depressed adolescents [10]. However, Pearson correlation only captures pairwise information between brain regions without considering the influence of other brain regions. Independent Component Analysis (ICA) can also be used to construct FBN by grouping brain regions into latent components. The brain regions within the same component are believed to have strong FC, while the FC between different components is weak [11, 12]. Increased FC between the subgenual cingulate and the thalamus is detected in patients with depression by ICA [13]. The main drawbacks of ICA are that FC values are not accessible and that the components are difficult to interpret. Recent work imposes a sparsity prior on the models for constructing FBN, based on neurological findings that a brain region usually interacts directly with only a few other brain regions [14]. Huang et al. construct FBN by employing a sparsity prior in the estimation of the inverse covariance matrix [11]. Although this sparse representation model calculates FC between each pair of brain regions with consideration of all the other brain regions, the sparsity prior alone is not enough to describe the structure of FBN.
As the functional abnormalities of depression can be explored by FBN, many classification methods based on FBN are developed for depression disorder classification. Feature extraction plays a key role in the classification methods. FC in FBN can be directly used as a feature for depression disorder classification [15, 16]. Zeng et al. use multivariate pattern analysis to classify FC of patients with depression and FC of healthy controls. In addition, regional homogeneity and amplitude of low frequency fluctuations are also commonly used features for depression disorder classification [17, 18]. However, these features, which only consider the specific local changes of FBN, are not effective for classification. A more comprehensive feature extraction approach is needed for depression disorder classification.
To overcome these limitations in FBN construction and feature extraction, we propose a new method for depression disorder classification. In this paper, FBN is constructed by a sparse low-rank model and eight graph-based features are extracted for classification. The sparse low-rank model provides a better FBN than Pearson correlation or a simple sparse representation model for three reasons. First, FBN constructed by the sparse low-rank model considers the linear relationship between two brain regions given all the other brain regions, in contrast with the pairwise Pearson correlation. Secondly, imposing sparsity on FBN is interpretable because a brain region only directly interacts with a few other brain regions in neurological processes, which has been supported by some neurophysiological findings [14, 19]. Thirdly, the low-rank constraint encodes a modular structure in the FBN, which is closer to the real FBN [20, 21]. Sparse representation and dictionary learning can also be used as a classifier for fMRI data. Our previous work proposes a weighted discriminative dictionary learning (WDDL) method for disease classification [22]. WDDL represents each test sample using two class-specific dictionaries and assigns it to the class with the smaller representation error. In this work, by contrast, we investigate the effect of using a sparse low-rank model to construct FBN, which is part of feature extraction for classification.
Once the FBN is constructed by the sparse low-rank model, we extract eight graph-based features, which provide information about the entire network rather than specific local changes [23, 24]. The eight graph-based features cover functional segregation, functional integration, nodal centrality, and network resilience. We choose graph-based features to measure FBN for two reasons. First, graph-based features are effective in helping us understand the functional organization of networks ranging from cells [25] and tissues [26, 27] to whole ecosystems [28, 29]. Secondly, recent studies have shown that graph-based features, which measure topological properties of FBN, lead to good classification performance [30, 31].
In short, the main contributions of this paper are as follows: (1) FBN is constructed by sparse low-rank model, which can calculate the relationship between two brain regions given all the other brain regions. (2) We extract eight graph-based features, which can effectively characterize the FBN from different aspects. To our knowledge, this is the first study of depression disorder classification, which extracts graph-based features from sparse low-rank FBN. The experimental results show that both sparse low-rank FBN and the combination of graph-based features improve the classification performance. Generally, the promising classification result proves the effectiveness of our method. The overall procedure of our method is shown in Figure 1.
Figure 1: The schematic diagram of our method for depression disorder classification. SVM: support vector machine.
2. Methods
2.1. Participants, Data Acquisition, and Preprocessing
31 patients with depression (16 females, 15 males) and 29 age-, sex-, and education-matched healthy controls (15 females, 14 males) are recruited from the Department of Radiology, Guang’anmen Hospital of China Academy of Traditional Chinese Medicine. The average age of the patient group and the control group is 30.42 and 32.63, respectively. All subjects are right-handed native Chinese speakers. Written informed consent is obtained from all subjects. All the patients with depression are diagnosed according to the Structured Clinical Interview for DSM-IV, patient edition (SCID-I/P) [32], by experienced psychiatrists. They have no history of other neurological illness or head injury. Healthy controls are interviewed using the Structured Clinical Interview for DSM-IV, nonpatient edition (SCID-I/NP). They have no current or past depression disorder or other psychiatric disorders.
The fMRI measurements are performed on a General Electric (GE) Signa 1.5T echo speed superconducting MRI scanner. Functional images are acquired with an echo-planar imaging (EPI) sequence: repetition time (TR) = 2000 ms, echo time (TE) = 30 ms, flip angle = 90°, field of view (FOV) = 24 cm, matrix = 64 × 64, thickness = 3 mm, and slices = 41. Subjects are instructed not to think of anything and to keep their eyes closed but not fall asleep during the resting-state fMRI acquisition. For each subject, the fMRI scanning lasts for four minutes and twenty-eight seconds and 144 volumes are obtained.
The preprocessing of fMRI data is conducted using Statistical Parametric Mapping (SPM8, http://www.fil.ion.ucl.ac.uk/spm/software/spm8/), Resting-State fMRI Data Analysis Toolkit (REST, http://restfmri.net/forum/index.php), and Data Processing Assistant for Resting-State fMRI (DPARSF, http://www.restfmri.net/forum/taxonomy/term/36). The first 10 time points are discarded to allow for the subject’s adaptation to the scanning and for scanner calibration. The remaining images are first corrected for different slice acquisition timing and head motion. No subject is discarded for excessive head movement, since translation stays below 2.0 mm and rotation below 2.0° for all subjects. Next, the images are spatially normalized to the standard EPI template in SPM8 and resampled to a voxel size of 3 × 3 × 3 mm^{3}. After this, the images are smoothed with an isotropic Gaussian kernel (FWHM = 4 mm) and temporally band-pass filtered (0.01 Hz–0.08 Hz). To further reduce the effects of nuisance signals, regression of 6 head motion parameters, global mean signal, white matter signal, and cerebrospinal fluid signal is performed. Finally, we use the Automated Anatomical Labeling (AAL) atlas [33] to segment brain signals. The mean fMRI time series of 116 brain regions are obtained for further analysis. After preprocessing, the final number of volumes is 134, as 10 volumes are discarded from the 144 volumes. The dimensionality of the data matrix is 134 × 116 for each subject.
2.2. Construction of FBN
FBN is a mathematical representation of the system of brain, which is defined by a collection of nodes and edges [24, 34]. In this paper, nodes represent the brain regions obtained from AAL atlas. Edges linking two nodes represent the FC between the two corresponding brain regions. FC is defined as statistical dependency between spatially remote brain regions [35, 36]. A high correlation between the time series of the two brain regions reflects a high level of FC between them.
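As an illustration of the correlation-based view of FC described above, the following minimal sketch computes a Pearson correlation FC matrix from a time-series matrix whose columns are brain regions. It assumes NumPy; the function name `pearson_fbn` and the synthetic data are hypothetical stand-ins, not part of the original pipeline.

```python
import numpy as np

def pearson_fbn(X):
    """X: (t, m) array of regional time series; returns an (m, m) FC matrix."""
    fc = np.corrcoef(X, rowvar=False)  # columns (brain regions) are the variables
    np.fill_diagonal(fc, 0.0)          # drop self-connections
    return fc

# Synthetic stand-in for one subject: 134 time points, 116 AAL regions.
rng = np.random.default_rng(0)
X = rng.standard_normal((134, 116))
fc = pearson_fbn(X)
```

Each entry of `fc` depends only on the two regions involved, which is the pairwise limitation that motivates the sparse low-rank model.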
FBN has many inherent structures, some of which can guide the construction of a better FBN. Sparsity and modularity are two important structures of FBN, which can be exploited by adding constraints to the construction model. Sparsity means that a brain region only directly interacts with a few other brain regions in neurological processes [14, 19]. The sparsity prior can be used in FBN construction by adding an $\ell_0$-norm or $\ell_1$-norm constraint to the objective function. Modularity means that there exist node groups (communities) in the FBN [24]. The FC between nodes from the same group is dense, while the FC between nodes from different groups is sparse. It has been shown that combining sparse and low-rank constraints can describe the modularity of FBN [21]. Therefore, we use a sparse low-rank model to construct FBN in this paper. The reasons for choosing the sparse low-rank model for FBN construction are as follows: (1) the sparse low-rank model can construct FBN with both sparse and modular structure, which is verified in Results. (2) The classification performance can be improved by the sparse low-rank model, compared with the commonly used Pearson coefficient model and sparse representation model, as shown in Results.
The sparse low-rank model is used to construct FBN as follows. Assume we have $N$ subjects, each of which has $m$ brain regions. Let $X=[x_1,\ldots,x_m]\in\mathbb{R}^{t\times m}$ be the fMRI data matrix of a subject, where $t$ is the number of time points. For the time series $x_i$ of each brain region, we use the time series of all the other brain regions $X_i=[x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_m]\in\mathbb{R}^{t\times(m-1)}$ as a dictionary to represent this brain region with coding coefficient $a_i$, namely, $x_i=X_i a_i$.
The sparse low-rank FBN of the $n$th subject can be formulated as the following objective function:

$$J(A)=\arg\min_{A}\ \|X-XA\|_F^2+\lambda_1\|A\|_0+\lambda_2\operatorname{rank}(A), \tag{1}$$

where $A=[a_1,a_2,\ldots,a_m]$ is the coding coefficient matrix. The $j$th element of $a_i$ denotes the relationship between $x_i$ and $x_j$ given all the other $x$ in $X_i$. The matrix $A$ is therefore an FC matrix of subject $X$, in which the FC between two brain regions is calculated given all the other brain regions, in contrast with the pairwise Pearson correlation. This is another reason why we choose the sparse low-rank model to construct FBN. $\lambda_1$ and $\lambda_2$ are the regularization parameters that trade off the three terms. The first term is the data-fitting term, the second term is the sparsity constraint, and the last term is the low-rank constraint on the FC matrix $A$. With these two constraint terms, the constructed FBN is encouraged to have sparse and modular structure. As the two constraint terms are both nonconvex with respect to $A$, they are relaxed to the $\ell_1$-norm $\|A\|_1$ and the trace norm $\|A\|_*$, respectively. The objective function in (1) can then be written as follows:

$$J(A)=\arg\min_{A}\ \|X-XA\|_F^2+\lambda_1\|A\|_1+\lambda_2\|A\|_*, \tag{2}$$

where $\|A\|_1=\sum_i\sum_j|a_{ij}|$. The objective function can be optimized via a proximal method [37]. Once the optimal FC matrix $A$ is obtained, we replace $A$ with $\tilde{A}=(A+A^{T})/2$ to obtain a symmetric FC matrix. The replacement is based on the finding that asymmetry of the FC matrix does not contribute to the final classification performance [21]. In addition, all the diagonal elements of the FC matrix (self-connections) are set to zero.
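The relaxed objective, followed by symmetrization and removal of self-connections, can be sketched as below. This is a simplified illustration assuming NumPy, not the proximal solver of [37]: it takes a gradient step on the data-fitting term and then applies the two standard proximal operators (soft-thresholding for the $\ell_1$-norm, singular value thresholding for the trace norm) one after the other, which is only a heuristic for the sum of the two penalties. The function names, step size, iteration count, and the zeroing of the diagonal inside the loop are all assumptions.

```python
import numpy as np

def soft_threshold(A, tau):
    """Proximal operator of tau * ||A||_1 (elementwise shrinkage)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def singular_value_threshold(A, tau):
    """Proximal operator of tau * ||A||_* (shrink the singular values)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def sparse_low_rank_fbn(X, lam1, lam2, n_iter=200):
    """Heuristic sketch: X is (t, m); returns a symmetric (m, m) FC matrix."""
    m = X.shape[1]
    G = X.T @ X
    step = 1.0 / (2.0 * np.linalg.norm(G, 2))  # 1 / Lipschitz constant of the gradient
    A = np.zeros((m, m))
    for _ in range(n_iter):
        grad = 2.0 * (G @ A - G)               # gradient of ||X - XA||_F^2
        A = A - step * grad
        A = soft_threshold(A, step * lam1)
        A = singular_value_threshold(A, step * lam2)
        np.fill_diagonal(A, 0.0)               # x_i may not represent itself
    A = (A + A.T) / 2.0                        # symmetrize
    np.fill_diagonal(A, 0.0)                   # remove self-connections
    return A

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 8))          # synthetic data, illustration only
A_demo = sparse_low_rank_fbn(X_demo, lam1=0.5, lam2=0.5, n_iter=50)
```

Larger `lam1` removes more weak connections, while larger `lam2` pushes the matrix toward lower rank.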
2.3. Feature Extraction
To extract effective graph-based features from the constructed FBN, the original FC matrices are first converted to binary matrices by setting all the nonzero connectivity to one. In this paper, eight graph-based features are computed from the following four aspects: functional segregation, functional integration, nodal centrality, and network resilience [24].
2.3.1. Functional Segregation
Functional segregation measures how efficiently information is exchanged within interconnected groups of brain regions.
Clustering coefficient is defined as the fraction of pairs of a node's neighbors that are themselves connected, which describes the level of local neighborhood clustering of a network [38]. The clustering coefficient of node $i$ is defined as

$$C_i=\frac{2r_i}{k_i(k_i-1)}, \tag{3}$$

where $r_i$ is the number of triangles around node $i$ and $k_i$ is the degree of node $i$, which is described below.
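A small sketch of the binarization step and the clustering coefficient on a toy graph, in plain Python. The helper names and the convention of returning 0 for nodes with fewer than two neighbors are assumptions made for illustration.

```python
def binarize(A):
    """Set every nonzero connectivity value to one."""
    return [[1 if value != 0 else 0 for value in row] for row in A]

def clustering_coefficients(B):
    """B: symmetric 0/1 adjacency matrix (list of lists) with zero diagonal."""
    n = len(B)
    neighbors = [{j for j in range(n) if B[i][j]} for i in range(n)]
    coeffs = []
    for i in range(n):
        k = len(neighbors[i])
        if k < 2:
            coeffs.append(0.0)  # assumed convention when the ratio is undefined
            continue
        # r: number of links among the neighbors of i (triangles around i)
        r = sum(1 for a in neighbors[i] for b in neighbors[i]
                if a < b and B[a][b])
        coeffs.append(2.0 * r / (k * (k - 1)))
    return coeffs

# Toy weighted matrix: triangle 0-1-2 plus node 3 attached to node 2.
A_toy = [[0.0, 0.4, -0.2, 0.0],
         [0.4, 0.0, 0.7, 0.0],
         [-0.2, 0.7, 0.0, 0.1],
         [0.0, 0.0, 0.1, 0.0]]
B_toy = binarize(A_toy)
C_toy = clustering_coefficients(B_toy)
```

Nodes 0 and 1 sit in a closed triangle, whereas only one of the three neighbor pairs of node 2 is connected.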
Local efficiency describes how efficient the communication between the first neighbors of node $i$ is when the node is removed [39]. The local efficiency is the average inverse shortest path length between the direct neighbors of a node. It is defined as

$$E_{loc,i}=\frac{\sum_{j,h\in G_i,\,j\neq h}\left[d_{jh}(G_i)\right]^{-1}}{k_i(k_i-1)}, \tag{4}$$

where $G_i$ is the set of nodes that are neighbors of node $i$ and $d_{jh}(G_i)$ is the length of the shortest path between node $j$ and node $h$ that contains only direct neighbors of node $i$.
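The local efficiency can be sketched with a breadth-first search restricted to the neighbors of each node. This is an illustrative plain-Python version; treating unreachable neighbor pairs as contributing zero (inverse of an infinite distance) and returning 0 for nodes with fewer than two neighbors are assumed conventions.

```python
from collections import deque

def bfs_distances(adj, source, allowed):
    """Shortest path lengths from source using only nodes in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in allowed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def local_efficiency(B):
    """B: symmetric 0/1 adjacency matrix (list of lists) with zero diagonal."""
    n = len(B)
    adj = [{j for j in range(n) if B[i][j]} for i in range(n)]
    result = []
    for i in range(n):
        G_i = adj[i]                      # direct neighbors; node i itself is excluded
        k = len(G_i)
        if k < 2:
            result.append(0.0)
            continue
        total = 0.0
        for j in G_i:
            dist = bfs_distances(adj, j, G_i)
            total += sum(1.0 / d for h, d in dist.items() if h != j)
        result.append(total / (k * (k - 1)))
    return result

# Triangle 0-1-2 plus node 3 attached to node 2.
B_le = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
E_loc = local_efficiency(B_le)
```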
2.3.2. Functional Integration
Functional integration measures the ability of the brain to rapidly integrate information from distributed brain regions. Characteristic path length [40] and global efficiency [39] are the two most commonly used measures of functional integration. The characteristic path length is the average shortest path length between all pairs of nodes, and the global efficiency is the average inverse shortest path length. They are respectively defined as

$$L=\frac{1}{n}\sum_{i\in N}\frac{\sum_{j\in N,\,j\neq i}d_{ij}}{n-1},\qquad E=\frac{1}{n}\sum_{i\in N}\frac{\sum_{j\in N,\,j\neq i}d_{ij}^{-1}}{n-1}, \tag{5}$$

where $L$ and $E$ are the characteristic path length and global efficiency of the network, $n$ is the number of nodes in the network, $N$ is the set of all the nodes in the network, and $d_{ij}$ is the shortest path length between node $i$ and node $j$.
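Both global measures follow from all-pairs shortest path lengths, which for a binary graph can be obtained by breadth-first search. The sketch below is a plain-Python illustration; returning an infinite path length for a disconnected graph is an assumed convention.

```python
from collections import deque

def bfs_lengths(adj, source):
    """Shortest path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_length_and_efficiency(B):
    """Return (characteristic path length, global efficiency) of a binary graph."""
    n = len(B)
    adj = [[j for j in range(n) if B[i][j]] for i in range(n)]
    total_d, total_inv = 0.0, 0.0
    for i in range(n):
        dist = bfs_lengths(adj, i)
        for j in range(n):
            if j == i:
                continue
            if j in dist:
                total_d += dist[j]
                total_inv += 1.0 / dist[j]
            else:
                total_d = float("inf")    # unreachable pair: infinite distance
    pairs = n * (n - 1)
    return total_d / pairs, total_inv / pairs

# Path graph 0 - 1 - 2.
B_path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
L_path, E_path = path_length_and_efficiency(B_path)
```

For the three-node path, the distances are 1, 2, and 1, giving a characteristic path length of 4/3 and a global efficiency of 5/6.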
2.3.3. Nodal Centrality
Degree and betweenness centrality are used to measure the centrality of a node. Degree of a node is defined as the number of links connected to the node, which reflects the importance of a node. Degree of node $i$ is defined as

$$k_i=\sum_{j\in N}g_{ij}, \tag{6}$$

where $g_{ij}$ is the connection status between node $i$ and node $j$: $g_{ij}=1$ when link $(i,j)$ exists and $g_{ij}=0$ otherwise.
Betweenness centrality of a node is defined as the fraction of all shortest paths that pass through the node [41]:

$$b_i=\frac{1}{(n-1)(n-2)}\sum_{\substack{h,j\in N\\ h\neq j,\,h\neq i,\,j\neq i}}\frac{\rho_{hj}(i)}{\rho_{hj}}, \tag{7}$$

where $\rho_{hj}(i)$ is the number of shortest paths between node $h$ and node $j$ that pass through node $i$ and $\rho_{hj}$ is the number of all the shortest paths between node $h$ and node $j$.
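One way to compute betweenness centrality on a binary graph is Brandes' accumulation scheme, sketched below in plain Python. The choice of this algorithm is an assumption (the text does not state how the measure is computed); the sum runs over ordered node pairs and is scaled by $(n-1)(n-2)$ as in the definition.

```python
from collections import deque

def betweenness_centrality(B):
    """B: symmetric 0/1 adjacency matrix (list of lists) with zero diagonal."""
    n = len(B)
    adj = [[j for j in range(n) if B[i][j]] for i in range(n)]
    bc = [0.0] * n
    for s in range(n):
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = [0] * n
        sigma[s] = 1
        dist = [-1] * n
        dist[s] = 0
        preds = [[] for _ in range(n)]
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Back-propagate dependencies, starting from the farthest nodes.
        delta = [0.0] * n
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    if n < 3:
        return [0.0] * n
    scale = 1.0 / ((n - 1) * (n - 2))
    return [value * scale for value in bc]

# Path graph 0 - 1 - 2: every shortest path between 0 and 2 passes through 1.
b_path = betweenness_centrality([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```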
Participation coefficient assesses the diversity of intermodular interconnections of individual nodes. The participation coefficient of node $i$ is defined as

$$y_i=1-\sum_{m\in M}\left(\frac{k_i(m)}{k_i}\right)^2, \tag{8}$$

where $M$ is the set of modules and $k_i(m)$ is the number of links between $i$ and all nodes in module $m$.
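A minimal plain-Python sketch of the participation coefficient, assuming the module assignment of every node is already available (for example from a community detection step, which is not shown here). The argument name `modules` and the value 0 for isolated nodes are assumptions.

```python
def participation_coefficients(B, modules):
    """B: 0/1 adjacency matrix; modules[j]: module label of node j (assumed given)."""
    n = len(B)
    result = []
    for i in range(n):
        links_per_module = {}
        for j in range(n):
            if B[i][j]:
                label = modules[j]
                links_per_module[label] = links_per_module.get(label, 0) + 1
        k = sum(links_per_module.values())
        if k == 0:
            result.append(0.0)  # assumed convention for isolated nodes
            continue
        result.append(1.0 - sum((c / k) ** 2 for c in links_per_module.values()))
    return result

# Links 0-1, 0-2, 2-3; nodes 0 and 1 form one module, nodes 2 and 3 another.
B_pc = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
y_pc = participation_coefficients(B_pc, modules=[0, 0, 1, 1])
```

Nodes 0 and 2 split their links evenly between the two modules, while nodes 1 and 3 connect to a single module only.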
2.3.4. Network Resilience
Indirect measures of resilience quantify anatomical features that reflect network vulnerability to insult. Among these measures, a typical one is average neighbor degree [42]:

$$k_{nn,i}=\frac{\sum_{j\in N}g_{ij}k_j}{k_i}. \tag{9}$$
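Degree and average neighbor degree can both be read directly from the binary adjacency matrix, as in this short plain-Python sketch. The function names and the value 0 for isolated nodes are assumptions.

```python
def degrees(B):
    """Number of links attached to each node of a 0/1 adjacency matrix."""
    return [sum(row) for row in B]

def average_neighbor_degree(B):
    """Mean degree of the direct neighbors of each node."""
    n = len(B)
    k = degrees(B)
    return [sum(B[i][j] * k[j] for j in range(n)) / k[i] if k[i] else 0.0
            for i in range(n)]

# Triangle 0-1-2 plus node 3 attached to node 2.
B_and = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
k_and = degrees(B_and)
knn_and = average_neighbor_degree(B_and)
```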
Once we have obtained all the eight graph-based features, we concatenate them to construct the final feature vectors. Specifically, for each subject, the feature vector has a size of 698, which consists of 116 × 6 local measures and 2 global ones. The dimensionality of the feature matrix is 698 × 60, which consists of the feature vectors of all the subjects. As leave-one-out cross-validation (LOOCV) is used for classification, the training matrix dimensionality is 698 × 59 in each LOOCV fold.
2.4. Feature Selection
The goal of feature selection is to remove irrelevant or redundant features and retain discriminative features, which can lead to a better classification performance of the model. In this paper, we employ Fisher score to select useful features. Fisher score is used to describe the discriminatory power of a feature between two classes [30, 43]. Fisher score for each feature is defined as

$$FS=\frac{p_1(q_1-q)^2+p_2(q_2-q)^2}{p_1\sigma_1^2+p_2\sigma_2^2}, \tag{10}$$

where $p_1$ and $p_2$ are the numbers of samples in the two classes, $q_1$ and $\sigma_1^2$ are the feature mean value and variance of one class, $q_2$ and $\sigma_2^2$ are the feature mean value and variance of the other class, and $q$ is the feature mean value of all the samples.
A larger Fisher score indicates a more discriminative feature. We rank all the features in the training set based on Fisher score. Different feature sets can be obtained by selecting different number of ordered features. The final selected feature set is the one with the highest accuracy tested on the validation set, which is picked out from the training set.
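The Fisher score ranking can be sketched as follows, assuming NumPy. Note that this sketch stores samples in rows (samples × features), the transpose of the 698 × 59 training matrix described earlier; the use of the population variance and the small floor on the denominator are assumptions.

```python
import numpy as np

def fisher_scores(F, y):
    """F: (n_samples, n_features) array; y: binary labels (0 or 1)."""
    F = np.asarray(F, dtype=float)
    y = np.asarray(y)
    F1, F2 = F[y == 1], F[y == 0]
    p1, p2 = len(F1), len(F2)
    q = F.mean(axis=0)                                  # overall feature means
    numerator = p1 * (F1.mean(axis=0) - q) ** 2 + p2 * (F2.mean(axis=0) - q) ** 2
    denominator = p1 * F1.var(axis=0) + p2 * F2.var(axis=0)
    return numerator / np.maximum(denominator, 1e-12)   # guard against zero variance

# Toy data: feature 0 separates the classes, feature 1 does not.
F_toy = [[0.0, 1.0], [0.2, 0.0], [1.0, 1.0], [1.2, 0.0]]
y_toy = [0, 0, 1, 1]
scores = fisher_scores(F_toy, y_toy)
ranking = np.argsort(-scores)                           # most discriminative first
```

To keep the evaluation unbiased, the scores would be computed on the training samples of each LOOCV fold only, as the text describes.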
2.5. Classification
In this study, we employ support vector machine (SVM) [44–46] with a simple linear kernel to evaluate the classification performance of our method. This technique is widely used and works well in the field of medical imaging classification [21, 30, 47]. The SVM is implemented using LIBSVM toolbox [48] with default parameters (i.e., C=1). LOOCV is applied here due to limited sample size. One sample is picked out as the testing sample in turn and the rest of the samples are treated as training samples. In this paper, the following three quantitative measurements are used to validate the effectiveness of our method:

$$\text{Accuracy}=\frac{TP+TN}{TP+FN+TN+FP},\qquad \text{Sensitivity}=\frac{TP}{TP+FN},\qquad \text{Specificity}=\frac{TN}{TN+FP}, \tag{11}$$

where TP is the number of patients correctly classified, TN is the number of healthy controls correctly classified, FP is the number of healthy controls classified as patients, and FN is the number of patients classified as healthy controls.
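As a worked check of the three measurements, the short sketch below evaluates them for the counts implied by the percentages reported for the proposed method with 31 patients and 29 controls. The counts themselves (30 patients and 27 controls correctly classified) are inferred from those percentages and are not stated explicitly in the text.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Inferred counts: 30 of 31 patients and 27 of 29 controls classified correctly.
acc, sens, spec = classification_metrics(tp=30, tn=27, fp=2, fn=1)
summary = f"{100 * acc:.2f} {100 * sens:.2f} {100 * spec:.2f}"
# summary == "95.00 96.77 93.10"
```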
3. Results
3.1. Classification Performance
In this paper, to verify the effect of sparse low-rank FBN on classification performance, we conduct experiments on methods based on Pearson coefficient FBN and sparse FBN. Additionally, the methods with each single kind of features are also used for comparison, in order to evaluate the effect of combination of the eight graph-based features. Our method achieves the best classification performance compared with the contrast methods, with accuracy of 95%, sensitivity of 96.77%, and specificity of 93.10%. We can see that the results of our method are better than the methods based on Pearson coefficient FBN and sparse FBN, from Tables 1, 2, and 3. As shown in Table 1, our method performs better than the methods with any single kind of features. Besides, the results of different classifiers with sparse low-rank FBN are listed in Table 4. The parameters of all the classification methods are selected by LOOCV.
Table 1: Classification performance of our method (sparse low-rank FBN).

| Feature | NSF | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| CC | 8 | 83.33 | 80.65 | 86.21 |
| LE | 46 | 85.00 | 87.10 | 82.76 |
| CPL | - | 60.00 | 70.97 | 48.28 |
| GE | - | 60.00 | 70.97 | 48.28 |
| D | 22 | 83.33 | 80.65 | 86.21 |
| BC | 22 | 85.00 | 80.65 | 89.66 |
| PC | 20 | 83.33 | 83.87 | 82.76 |
| AND | 18 | 91.67 | 90.32 | 93.10 |
| Eight features | 12 | 95.00 | 96.77 | 93.10 |

NSF: number of selected features; CC: clustering coefficient; LE: local efficiency; CPL: characteristic path length; GE: global efficiency; D: degree; BC: betweenness centrality; PC: participation coefficient; and AND: average neighbor degree.
Table 2: Classification performance of sparse FBN.

| Feature | NSF | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| CC | 10 | 81.67 | 80.65 | 82.76 |
| LE | 10 | 83.33 | 77.42 | 89.66 |
| CPL | - | 55.00 | 45.16 | 65.52 |
| GE | - | 53.33 | 48.39 | 58.62 |
| D | 54 | 83.33 | 80.65 | 86.21 |
| BC | 113 | 73.33 | 77.42 | 68.97 |
| PC | 6 | 73.33 | 61.29 | 86.21 |
| AND | 10 | 68.33 | 74.19 | 62.07 |
| Eight features | 70 | 85.00 | 83.87 | 86.21 |
Table 3: Classification performance of Pearson correlation FBN.

| Feature | NSF | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| CC | 17 | 73.33 | 70.97 | 75.86 |
| LE | 23 | 78.33 | 77.42 | 79.31 |
| CPL | - | 55.00 | 58.06 | 51.72 |
| GE | - | 56.67 | 51.61 | 62.07 |
| D | 46 | 78.33 | 77.42 | 79.31 |
| BC | 8 | 78.33 | 83.87 | 72.41 |
| PC | 86 | 81.67 | 83.87 | 79.31 |
| AND | 1 | 70.00 | 61.29 | 79.31 |
| Eight features | 65 | 83.33 | 83.87 | 82.76 |
Table 4: Classification performance of the most commonly used classifiers.

| Classifier | NSF | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| NB | 15 | 88.33 | 87.10 | 89.66 |
| k-NN | 17 | 88.33 | 90.32 | 86.21 |
| LDA | 11 | 90.00 | 90.32 | 89.66 |
| SVM (RBF) | 11 | 90.00 | 93.55 | 86.21 |
| SVM (linear) | 12 | 95.00 | 96.77 | 93.10 |

NB: naive Bayes; k-NN: k-nearest neighbors; and LDA: linear discriminant analysis.
3.2. Effect of Regularization Parameters
The regularization parameters involved in the sparse low-rank model may significantly affect FBN construction and the classification performance. The optimal parameters are obtained from LOOCV. For our method, λ1 and λ2 are both varied over the range [0.1, 5] with an increment step of 0.1. The classification accuracy of our method with different sets of parameters is shown in Figure 2. We can see that the best classification accuracy is achieved when λ1 is 4.5 and λ2 is 2.8. Therefore, this set of parameters is selected for further analysis. λ1 and λ2 are the regularization parameters that trade off data-fitting, the sparsity constraint, and the low-rank constraint. Because both optimal values are nonzero, this set of parameters indicates that the combination of sparsity and low-rank constraints improves the classification performance. In addition, it can be observed that the classification performance is sensitive to the regularization parameters.
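The search grid described above can be enumerated as in this small sketch. Only the range and step come from the text; the variable names are hypothetical, and the evaluation of each pair by LOOCV is not shown.

```python
from itertools import product

# 0.1, 0.2, ..., 5.0; rounding avoids floating-point drift in the grid values.
grid_values = [round(0.1 * k, 1) for k in range(1, 51)]

# Every (lambda1, lambda2) pair that would be evaluated.
parameter_grid = list(product(grid_values, grid_values))
```

With 50 values per parameter, the grid contains 2500 candidate pairs, including the selected pair (4.5, 2.8).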
Figure 2: The classification accuracy with different sets of parameters.
3.3. Analysis of Sparse Low-Rank FBN
In this paper, FBN is constructed by sparse low-rank model. Figure 3 shows the FC matrix and topology structure of one patient with depression, which are constructed by sparse low-rank model, Pearson correlation model, and sparse representation model. The parameters used in the FBN shown in Figure 3 are optimally obtained from LOOCV. The parameters for sparse low-rank model ((a) and (b)) are 4.5 (λ1) and 2.8 (λ2). The threshold for Pearson correlation model ((c) and (d)) is 20%. The parameter for sparse representation model ((e) and (f)) is 3.2 (λ). It can be observed that the FC inferred by sparse representation model and sparse low-rank model can automatically remove some weak connections. Compared with sparse representation model, sparse low-rank model can lead to a clearer modular structure in the FBN. Moreover, the classification performance of methods based on sparse low-rank FBN is better than methods based on Pearson correlation FBN or sparse FBN, as mentioned in the last subsection.
Figure 3: FC matrix and topology structure of FBN from one patient with depression. (a) and (b) are the FC matrix and topology structure of sparse low-rank FBN, (c) and (d) are those of Pearson correlation FBN, and (e) and (f) are those of sparse FBN.
Furthermore, we use the modularity score [49] to evaluate the modularity of FBN constructed by the three models. Figure 4 shows the average modularity scores of FBN constructed by Pearson correlation model, sparse representation model, and sparse low-rank model with different thresholds. The modularity scores shown in Figure 4 are the average modularity scores of all the subjects. Different thresholds are used in the FBN to remove weak connections in varying degrees. And the thresholds are applied to the absolute value of connections in order to obtain valid modularity scores. The connection whose absolute value is less than a certain threshold is removed. We can see from Figure 4 that sparse low-rank model can lead to a clearer modular structure in the FBN for two reasons. (1) The peak value is obtained by sparse low-rank model, compared with Pearson correlation model and sparse representation model and (2) the area under the curve of sparse low-rank model is the largest among areas of the three models. And the largest area under the curve means the maximum sum of average modularity scores with different thresholds.
Figure 4: Average modularity scores of different FBN with different thresholds. PC: Pearson correlation model; SR: sparse representation model; and SLR: sparse low-rank model.
3.4. Number of Selected Features
After extracting the eight graph-based features, we obtain a feature vector with a size of 698 for each subject. Because of the high dimensionality of the feature vector, feature selection is essential to remove redundant features and improve the classification performance. Fisher score is used in this study to sort different dimensions of features based on the discriminatory power. We select different number of ordered features with max Fisher score to train and test the classifier. The number of selected features that resulted in the best classification performance is applied. The proportion of each kind of selected features in every LOOCV is shown in Figure 5.
Figure 5: Proportion of each kind of selected features in the three methods. PC: Pearson correlation model; SR: sparse representation model; and SLR: sparse low-rank model.
3.5. Discriminative Brain Regions
The selected graph-based features are related to specific brain regions, which contribute to the classification. These related brain regions are treated as discriminative brain regions of patients with depression compared with healthy controls, as shown in Figure 6. Specifically, we first use Fisher score to sort all the 698 dimensions of graph-based features in each LOOCV fold. Secondly, we use different sets with an increasing number of sorted features to train and test the classifier, and the number of features that results in the best performance is picked out. The selected features from the 116 × 6 local measures are related to specific brain regions. Finally, we count the number of times that each related brain region is selected. In total, 12 brain regions are picked out across all the LOOCV folds. The names of these brain regions and the number of times they are picked out are listed in Table 5. The discriminative brain regions include postcentral gyrus, paracentral lobule, posterior cingulate cortex, calcarine, orbital superior frontal gyrus, superior frontal gyrus, Heschl gyrus, superior occipital gyrus, amygdala, middle temporal gyrus, orbital inferior frontal gyrus, and insula.
Table 5: The discriminative brain regions of patients with depression and the number of times that they are picked out.

| Brain regions | NTPO | Related studies |
|---|---|---|
| Postcentral_L | 151 | Guo et al. [4] |
| Paracentral_Lobule_R | 121 | Kenny et al. [50] |
| Cingulum_Post_R | 118 | Zhu et al. [51] |
| Calcarine_R | 96 | Zhang et al. [57] |
| Frontal_Sup_Orb_L | 62 | Drevets et al. [58] |
| Frontal_Sup_R | 60 | Zhang et al. [57] |
| Heschl_L | 58 | Amico et al. [53] |
| Occipital_Sup_L | 49 | Zhang et al. [6] |
| Amygdala_R | 2 | Zhang et al. [6] |
| Temporal_Mid_L | 1 | Zhang et al. [57] |
| Frontal_Inf_Orb_L | 1 | Drevets et al. [58] |
| Insula_L | 1 | Liu et al. [3] |

NTPO: number of times that they are picked out.
Figure 6: The discriminative brain regions of patients with depression compared with healthy controls. The color bar indicates the index of displayed brain regions.
4. Discussion
In this study, the proposed method, using a sparse low-rank model and graph-based features, provides promising results for depression disorder classification. As shown in Table 1, our proposed method achieves the best classification performance, compared with using any single graph-based feature based on sparse low-rank FBN. We can see from Tables 1, 2, and 3 that our method performs better than Pearson correlation FBN and sparse FBN. In addition, the algorithm combining all the graph-based features outperforms the one with only one feature. Table 4 shows that the linear SVM used in our method is superior to other commonly used classifiers. The highest accuracy of our method demonstrates its capability of accurately discriminating patients with depression from healthy controls. The improvement in sensitivity indicates the superiority of the proposed method in identifying patients with depression based on fMRI data. This is important because misclassifying a patient as a healthy control may have severe consequences, such as missing the critical treatment period.
The FBN is constructed by the sparse low-rank model, which can automatically remove the weak connections and retain the modular structure. As illustrated in Figure 3, the sparse low-rank model yields a sparser connection matrix than the Pearson correlation model. However, the high sparsity of the sparse low-rank FBN does not harm the classification performance, as shown in Tables 1 and 3. On the contrary, the retained strong connections of the sparse low-rank FBN achieve higher classification performance. Compared with the sparse representation model, the sparse low-rank model captures a clearer modular structure, as shown in Figures 3 and 4, and modularity has been verified as an inherent property of FBN.
After constructing the FBN, we extract eight graph-based features to characterize the network and classify patients with depression and healthy controls. Because of the high dimensionality of extracted features, Fisher score algorithm is used to rank the features and select the feature set with best classification performance. We can see from Figure 5 that average neighbor degree is the most commonly selected feature in our method. However, degree and participation coefficient are the most commonly selected features in the method based on Pearson correlation FBN and sparse FBN, respectively. This finding suggests that the kind of the most effective feature is different for different methods. This is why we consider a variety of graph-based features.
The brain regions associated with the selected graph-based features are regarded as the discriminative brain regions for depression. As shown in Table 5, the discriminative brain regions are consistent with previous studies [4, 50], which further supports the effectiveness of our method. Most of the discriminative brain regions are located in the frontal lobe (paracentral lobule, superior frontal gyrus, orbital superior frontal gyrus, and orbital inferior frontal gyrus), occipital lobe (calcarine and orbital superior frontal gyrus), and temporal lobe (middle temporal gyrus and Heschl gyrus). The most commonly selected brain region in our method is the postcentral gyrus, which is the primary somatosensory cortex [4]. Another highly discriminative brain region is the posterior cingulate cortex, which has been reported to show abnormal FC in patients with depression [51]. Previous studies have indicated that the posterior cingulate cortex is important for the successful retrieval of self-relevant information [52]. The Heschl gyrus is a primary auditory cortex and a subregion of the superior temporal gyrus, which plays a key role in emotional processing and social cognition [53, 54]. The insula has been reported to be associated with abnormal interoception and pain processing in patients with depression [55]. In addition, the amygdala, an important area for processing threat and orchestrating a complex set of emotional and physiologic responses [56], is also detected as a discriminative brain region for depression in our study. These discriminative brain regions may help us better understand the pathogenesis of depression disorder.
5. Conclusion
In this paper, we develop a new method to classify fMRI data of patients with depression and healthy controls. First, to measure the relationship between two brain regions given all the other brain regions, we construct the FBN with a sparse low-rank model instead of the conventional Pearson correlation model. A further motivation is that the sparse low-rank model can describe the sparse and modular structure of FBN. Secondly, we extract eight graph-based features to characterize the FBN from different aspects. Thirdly, the Fisher score is used to rank the features and select the optimal feature subset. Finally, the selected features are input to an SVM for depression disorder classification. Experimental results demonstrate that the proposed method yields better classification performance than the conventional methods based on Pearson correlation FBN and sparse FBN. In addition, combining the graph-based features further improves the classification performance. These promising results indicate that our method could serve as an automatic tool to assist in the diagnosis of depression disorder.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants 61305018, 61432008, 61472423, and 61532006.
References
[1] Belmaker R. H., Agam G. Major depressive disorder.
[2] Reddy M. S. Depression – the global crisis.
[3] Liu Z., Xu C., Xu Y., Wang Y., Zhao B., Lv Y., Cao X., Zhang K., Du C. Decreased regional homogeneity in insula and cerebellum: a resting-state fMRI study in patients with major depression and subjects at high risk for major depression.
[4] Guo W.-B., Liu F., Xue Z.-M., Yu Y., Ma C.-Q., Tan C.-L., Sun X.-L., Chen J.-D., Liu Z.-N., Xiao C.-Q., Chen H.-F., Zhao J.-P. Abnormal neural activities in first-episode, treatment-naïve, short-illness-duration, and treatment-response patients with major depressive disorder: a resting-state fMRI study.
[5] Guo X., Liu B., Chen L., Chen G., Pan Y., Zhang J. Bayesian inference for functional dynamics exploring in fMRI data.
[6] Zhang B., Li M., Qin W., Demenescu L. R., Metzger C. D., Bogerts B., Yu C., Walter M. Altered functional connectivity density in major depressive disorder at rest.
[7] Veer I. M., Beckmann C. F., van Tol M.-J., Ferrarini L., Milles J., Veltman D. J., Aleman A., van Buchem M. A., van der Wee N. J., Rombouts S. A. R. B. Whole brain resting-state analysis reveals decreased functional connectivity in major depression.
[8] Mulders P. C., van Eijndhoven P. F., Schene A. H., Beckmann C. F., Tendolkar I. Resting-state functional connectivity in major depressive disorder: a review.
[9] Murrough J. W., Abdallah C. G., Anticevic A., Collins K. A., Geha P., Averill L. A., Schwartz J., DeWilde K. E., Averill C., Jia-Wei Yang G., Wong E., Tang C. Y., Krystal J. H., Iosifescu D. V., Charney D. S. Reduced global functional connectivity of the medial prefrontal cortex in major depressive disorder.
[10] Connolly C. G., Wu J., Ho T. C., Hoeft F., Wolkowitz O., Eisendrath S., Frank G., Hendren R., Max J. E., Paulus M. P., Tapert S. F., Banerjee D., Simmons A. N., Yang T. T. Resting-state functional connectivity of subgenual anterior cingulate cortex in depressed adolescents.
[11] Huang S., Li J., Sun L., Liu J., Wu T., Chen K., Fleisher A., Reiman E., Ye J. Learning brain connectivity of Alzheimer's disease from neuroimaging data.
[12] Zhong Y., Zheng G., Liu Y., Lu G. Independent component analysis of instantaneous power-based fMRI.
[13] Greicius M. D., Flores B. H., Menon V., Glover G. H., Solvason H. B., Kenna H., Reiss A. L., Schatzberg A. F. Resting-state functional connectivity in major depression: abnormally increased contributions from subgenual cingulate cortex and thalamus.
[14] Stam C. J., Jones B. F., Nolte G., Breakspear M., Scheltens P. Small-world networks and functional connectivity in Alzheimer's disease.
[15] Zeng L.-L., Shen H., Liu L., Wang L., Li B., Fang P., Zhou Z., Li Y., Hu D. Identifying major depression using whole-brain functional connectivity: a multivariate pattern analysis.
[16] Xi M., Sun J., Liu L., Fan F., Wu X. Cancer feature selection and classification using a binary quantum-behaved particle swarm optimization and support vector machine.
[17] Guo W.-B., Liu F., Chen J.-D., Gao K., Xue Z.-M., Xu X.-J., Wu R.-R., Tan C.-L., Sun X.-L., Liu Z.-N., Chen H.-F., Zhao J.-P. Abnormal neural activity of brain regions in treatment-resistant and treatment-sensitive major depressive disorder: a resting-state fMRI study.
[18] Liu J., Ren L., Womer F. Y., Wang J., Fan G., Jiang W., Blumberg H. P., Tang Y., Xu K., Wang F. Alterations in amplitude of low frequency fluctuation in treatment-naïve major depressive disorder measured with resting-state fMRI.
[19] Olshausen B. A., Field D. J. Emergence of simple-cell receptive field properties by learning a sparse code for natural images.
[20] Sporns O.
[21] Qiao L., Zhang H., Kim M., Teng S., Zhang L., Shen D. Estimating functional brain networks by incorporating a modularity prior.
[22] Wang X., Ren Y., Yang Y., Zhang W., Xiong N. N. A weighted discriminative dictionary learning method for depression disorder classification using fMRI data. Proceedings of the IEEE International Conferences on Big Data and Cloud Computing, October 2016, IEEE, pp. 618–623, doi:10.1109/bdcloud-socialcom-sustaincom.2016.97.
[23] Bullmore E., Sporns O. Complex brain networks: graph theoretical analysis of structural and functional systems.
[24] Rubinov M., Sporns O. Complex network measures of brain connectivity: uses and interpretations.
[25] Barabási A.-L., Oltvai Z. N. Network biology: understanding the cell's functional organization.
[26] Stožer A., Gosak M., Dolenšek J., Perc M., Marhl M., Rupnik M. S., Korošak D. Functional connectivity in islets of Langerhans from mouse pancreas tissue slices.
[27] Markovič R., Stožer A., Gosak M., Dolenšek J., Marhl M., Rupnik M. S. Progressive glucose stimulation of islet beta cells reveals a transition from segregated to integrated modular functional connectivity patterns.
[28] Green D. G., Sadedin S. Interactions matter – complexity in landscapes and ecosystems.
[29] Gosak M., Korošak D., Marhl M. Topologically determined optimal stochastic resonance responses of spatially embedded networks.
[30] Khazaee A., Ebrahimzadeh A., Babajani-Feremi A. Identifying patients with Alzheimer's disease using resting-state fMRI and graph theory.
[31] Khazaee A., Ebrahimzadeh A., Babajani-Feremi A. Application of advanced machine learning methods on resting-state fMRI network for identification of mild cognitive impairment and Alzheimer's disease.
[32] First M. B., Spitzer R. L., Gibbon M., Williams J. B.
[33] Tzourio-Mazoyer N., Landeau B., Papathanassiou D., Crivello F., Etard O., Delcroix N., Mazoyer B., Joliot M. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain.
[34] Smith S. M., Vidaurre D., Beckmann C. F., Glasser M. F., Jenkinson M., Miller K. L., Nichols T. E., Robinson E. C., Salimi-Khorshidi G., Woolrich M. W., Barch D. M., Uǧurbil K., Van Essen D. C. Functional connectomics from resting-state fMRI.
[35] Friston K. J., Frith C. D., Liddle P. F., Frackowiak R. S. J. Functional connectivity: the principal-component analysis of large (PET) data sets.
[36] Friston K. J. Functional and effective connectivity in neuroimaging: a synthesis.
[37] Combettes P. L., Pesquet J.-C. Proximal splitting methods in signal processing.
[38] Reijneveld J. C., Ponten S. C., Berendse H. W., Stam C. J. The application of graph theoretical analysis to complex networks in the brain.
[39] Latora V., Marchiori M. Efficient behavior of small-world networks.
[40] Watts D. J., Strogatz S. H. Collective dynamics of small-world networks.
[41] Kintali S. Betweenness centrality: algorithms and lower bounds. https://arxiv.org/abs/0809.1906
[42] Pastor-Satorras R., Vázquez A., Vespignani A. Dynamical and correlation properties of the internet.
[43] Duda R. O., Hart P. E., Stork D. G.
[44] Vapnik V.
[45] Tseng Y.-L., Lin K.-S., Jaw F.-S. Comparison of support-vector machine and sparse representation using a modified rule-based method for automated myocardial ischemia detection.
[46] Ma Y., Ding X., She Q., Luo Z., Potter T., Zhang Y. Classification of motor imagery EEG signals with support vector machines and particle swarm optimization.
[47] Liu F., Guo W., Fouche J.-P., Wang Y., Wang W., Ding J., Zeng L., Qiu C., Gong Q., Zhang W., Chen H. Multivariate classification of social anxiety disorder using whole brain functional connectivity.
[48] Chang C.-C., Lin C.-J. LIBSVM: a library for support vector machines.
[49] Newman M. E. J. Modularity and community structure in networks.
[50] Kenny E. R., O'Brien J. T., Cousins D. A., Richardson J., Thomas A. J., Firbank M. J., Blamire A. M. Functional connectivity in late-life depression using resting-state functional magnetic resonance imaging.
[51] Zhu X., Wang X., Xiao J., Liao J., Zhong M., Wang W., Yao S. Evidence of a dissociation pattern in resting-state default mode network connectivity in first-episode, treatment-naive major depression patients.
[52] Spreng R. N., Mar R. A., Kim A. S. N. The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis.
[53] Amico F., Carballedo A., Lisiecka D., Fagan A. J., Boyle G., Frodl T. Functional anomalies in healthy individuals with a first degree family history of major depressive disorder.
[54] Dziobek I., Preissler S., Grozdanovic Z., Heuser I., Heekeren H. R., Roepke S. Neuronal correlates of altered empathy and social cognition in borderline personality disorder.
[55] Mutschler I., Ball T., Wankerl J., Strigo I. A. Pain and emotion in the insular cortex: evidence for functional reorganization in major depression.
[56] Cullen K. R., Westlund M. K., Klimes-Dougan B., Mueller B. A., Houri A., Eberly L. E., Lim K. O. Abnormal amygdala resting-state functional connectivity in adolescent depression.
[57] Zhang J., Wang J., Wu Q., Kuang W., Huang X., He Y., Gong Q. Disrupted brain connectivity networks in drug-naive, first-episode major depressive disorder.
[58] Drevets W. C. Orbitofrontal cortex function and structure in depression.