Correct classification of neurons is a prerequisite for understanding neuronal function and characteristics. Existing classification techniques are usually based on structural characteristics and employ principal component analysis to reduce the feature dimension. In this work, we classify neurons based on neuronal morphology. A new feature selection method, the binary matrix shuffling filter, was applied to neuronal morphology classification. This method, implemented with a support vector machine, usually selects a small number of features, which eases interpretation. The retained features were used to build classification models with support vector classification and two other commonly used classifiers. Compared with the reference feature selection methods, the binary matrix shuffling filter showed the best performance and exhibited broad generalization ability in five random replications of the neuron datasets. In addition, the binary matrix shuffling filter was able to distinguish each neuron type from the other types correctly; for each neuron type, private features were also obtained.
Classifying neurons correctly is a prerequisite for accelerating the understanding of neuronal characteristics in the brain. It is therefore necessary to develop a uniform methodology for their classification. The existing classification techniques are usually based on structural functions and the numbers of dendrites to fit the models [
In the study by Li et al. [
Available feature selection methods fall into three categories. (i) Filter methods use inherent features of the datasets to rank variables, and their algorithmic complexity is low. However, redundancy is usually present among the selected features, which may result in low classification accuracy. Univariate filter methods include
In this work, a new feature selection method named BMSF was used. It not only overcame the overfitting problem in a high-dimensional search space but also took potential feature interactions into account during feature selection. Seven types of neurons with different characteristics and functions, namely, pyramidal neurons, Purkinje neurons, sensory neurons, motoneurons, bipolar interneurons, tripolar interneurons, and multipolar interneurons, were selected from the NeuroMorpho.org database (up to version 6.0), covering all the available species and brain regions. BMSF was used to reduce the features nonlinearly, and a support vector classification (SVC) model was built to classify neurons based on the retained morphological characteristics. SVM-RFE and rough set theory were used as reference feature selection methods for comparison with BMSF, while two other classifiers widely used in the pattern recognition field, the back propagation neural network (BPNN) and Naïve Bayes (NB), were employed to test the robustness of BMSF. A systematic classification of neurons would facilitate the understanding of neuronal structure and function.
Data sets used in this work were downloaded from the NeuroMorpho.org database [
Number of training and test samples for each neuronal type.
No. | Neuron type | Number of training samples | Number of test samples | Total |
---|---|---|---|---|
1 | Pyramidal | 3172 | 1586 | 4758 |
2 | Motoneuron | 298 | 149 | 447 |
3 | Sensory | 261 | 130 | 391 |
4 | Tripolar | 94 | 48 | 142 |
5 | Bipolar | 48 | 24 | 72 |
6 | Multipolar | 24 | 12 | 36 |
7 | Purkinje | 11 | 5 | 16 |
Total | | 3908 | 1954 | 5862 |
Dendritic cells in the NeuroMorpho.org database were cut into a series of compartments, and each compartment was characterized by an identification number, a type, and the spatial coordinates of the cylinder ending point, the radius value, and the identification number of the “parent.” Although the digital description constituted a completely accurate mapping of dendritic morphology, it bore little intuitive information [
The 43 morphological characteristics extracted by L-measure software and their descriptions.
Number | Abbr. | Morphological index | Description |
---|---|---|---|
1 | SS | Soma_surface | Somatic surface area |
2 | | | Total number of trees |
3 | | | Total number of bifurcations |
4 | | | Number of bifurcations plus terminations |
5 | | | Number of terminal tips of a neuron |
6 | NW | Neuronal_width | 95% of second principal component |
7 | NH | Neuronal_height | 95% of first principal component |
8 | ND | Neuronal_depth | 95% of third principal component |
9 | Ty | Type | Compartments are assigned to four different types: 1 = soma, 2 = axon, 3 = dendrites, and 4 = apical dendrites |
10 | Di | Diameter | Average branch diameter |
11 | Dp | Diameter_pow | Diameter of each compartment of the neuron raised to the power of 1.5 |
12 | Le | Length | Total arborization length |
13 | Su | Surface | Surface area of each compartment |
14 | SA | Section area | Total arborization surface area |
15 | Vo | Volume | Total internal volume of the arborization |
16 | ED | Euc distance | Maximum euclidean (straight) distance from soma to tips |
17 | PD | Path distance | Maximum path (along the tree) distance from soma to tips |
18 | BO | Branch_order | Maximum branch order number of bifurcations from soma to tips |
19 | Td | Terminal degree | Total number of tips each segment will terminate into |
20 | TS | Terminal segment | Number of compartments that comprise the terminal branch |
21 | Ta1 | Taper_1 | The change in diameter over path length between two critical points |
22 | Ta2 | Taper_2 | The ratio of the change in diameter to the initial diameter of two critical points. The initial diameter is usually larger |
23 | Bpl | Branch_path length | Summation of the individual compartment lengths that form a branch |
24 | Co | Contraction | Average contraction (the ratio between euclidean and path length calculated on each branch) |
25 | Fr | Fragmentation | Total number of reconstruction points |
26 | DR | Daughter_ratio | Ratio between the diameter of the bigger daughter and the smaller daughter of the current bifurcation |
27 | PDR | Parent-daughter_ratio | Ratio between the diameter of a daughter and its father for each critical point |
28 | Pa | Partition_asymmetry | Average over all bifurcations of the absolute value of ( |
29 | RP | Rall_power | Average over all bifurcations of the sum of the diameters of the two daughters, elevated to 1.5, divided by the diameter of the parent, and elevated to 1.5 |
30 | Pk | Pk | |
31 | Pc | Pk_classic | Rall power is set to 1.5 |
32 | Pk2 | Pk_2 | Rall power is set to 2 |
33 | Bal | Bif_ampl_local | Average over all bifurcations of the angle between the first two daughter compartments |
34 | Bar | Bif_ampl_remote | Average over all bifurcations of the angle between the following bifurcations or tips |
35 | Btl | Bif_tilt_local | The angles between the end of the parent branch and the initial part of the daughter branches at the bifurcation |
36 | Btr | Bif_tilt_remote | The angles between the previous node of the current bifurcating father and the daughter nodes |
37 | Btol | Bif_torque_local | Angle between the current plane of bifurcation and the previous plane of bifurcation |
38 | Btor | Bif_torque_remote | Angle between the current plane of bifurcation and the previous plane of bifurcation |
39 | Lpd | Last_parent_diam | Diameter of last bifurcation before the terminal tips |
40 | Dt | Diam_threshold | Diameter of first compartment after the terminal bifurcation leading to a terminal tip |
41 | HT | Hillman threshold | Computation of the weighted average diameter between 50% of father and 25% of daughter diameters of the terminal bifurcation |
42 | He | Helix | Helicity of the branches of the neuronal tree. It needs to be at least 3 compartments long to compute the helicity |
43 | FD | Fractal_dim | Fractal dimension metric of the branches in the dendrite trees |
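To make the compartment records described before the table concrete, the following minimal sketch parses a few records with the seven fields listed there (identification number, type, three coordinates, radius, and parent identifier) and derives two of the tabulated indices, the number of bifurcations and the number of terminal tips. The sample records, the function names, and the use of -1 as the parent of the root (a convention common in SWC files) are illustrative assumptions; this is not how L-measure itself is implemented.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class Compartment(NamedTuple):
    ident: int     # identification number
    kind: int      # 1 = soma, 2 = axon, 3 = dendrite, 4 = apical dendrite
    x: float       # coordinates of the cylinder ending point
    y: float
    z: float
    radius: float
    parent: int    # identification number of the parent (-1 for the root)


def parse_compartments(text: str) -> List[Compartment]:
    """Parse whitespace-separated records, skipping blank and '#' comment lines."""
    compartments = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ident, kind, x, y, z, radius, parent = line.split()
        compartments.append(
            Compartment(int(ident), int(kind), float(x), float(y),
                        float(z), float(radius), int(parent))
        )
    return compartments


def count_bifurcations_and_tips(compartments: List[Compartment]) -> Tuple[int, int]:
    """Count non-soma compartments with two or more children, and with none."""
    n_children = Counter(c.parent for c in compartments)
    neurites = [c for c in compartments if c.kind != 1]
    bifurcations = sum(1 for c in neurites if n_children[c.ident] >= 2)
    tips = sum(1 for c in neurites if n_children[c.ident] == 0)
    return bifurcations, tips


# Hypothetical toy neuron: a soma, one dendritic stem that splits in two,
# and one daughter branch that continues for one more compartment.
SAMPLE = """
# id type x y z radius parent
1 1 0.0 0.0 0.0 5.0 -1
2 3 0.0 10.0 0.0 1.0 1
3 3 -5.0 20.0 0.0 0.8 2
4 3 5.0 20.0 0.0 0.8 2
5 3 5.0 30.0 0.0 0.5 4
"""

compartments = parse_compartments(SAMPLE)
n_bifurcations, n_tips = count_bifurcations_and_tips(compartments)
```

For this toy tree the stem is the only bifurcation and two branches end in tips, which shows how summary indices condense the raw compartment list into a few interpretable numbers.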
Redundancy was considered likely among the 43 attributes. Feature selection saves computational time and storage and simplifies models when dealing with high-dimensional datasets, and it can also improve classification accuracy by removing redundant and irrelevant features.
For rapid and efficient selection of high-dimensional features, we have reported a novel method named binary matrix shuffling filter (BMSF) based on support vector classification (SVC). The method was successfully applied to the classification of nine cancer datasets and obtained excellent results [
Firstly, denoting the original training set as
Secondly, for each combination, there will be a reduced training set from the original training set according to the subscripts of those selected features, and classification accuracy can be obtained through tenfold cross validation. By repeating this process
Thirdly, taking the
Comparing the mean value of vectors
Detailed procedures can be found in our previous study [
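The core idea, scoring random feature combinations and then asking whether each feature tends to help, can be illustrated with a deliberately simplified sketch. It is not the published BMSF procedure: the column-balanced random matrix, the include-versus-exclude comparison of mean scores, and the names `shuffled_binary_matrix`, `filter_features`, and `toy_score` are assumptions made for illustration. In practice `score_subset` would return the tenfold cross-validation accuracy of an SVC trained on the chosen columns; a toy scoring function stands in for it here so that the example is self-contained.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence


def shuffled_binary_matrix(n_rows: int, n_features: int,
                           rng: random.Random) -> List[List[int]]:
    """Build a 0/1 matrix whose columns each hold (about) half 1s, shuffled.

    The balanced-column design is an assumption of this sketch.
    """
    if n_rows < 2:
        raise ValueError("n_rows must be at least 2")
    half = n_rows // 2
    columns = []
    for _ in range(n_features):
        column = [1] * half + [0] * (n_rows - half)
        rng.shuffle(column)
        columns.append(column)
    # Transpose so that each row is one candidate feature combination.
    return [[columns[j][i] for j in range(n_features)] for i in range(n_rows)]


def filter_features(n_features: int,
                    score_subset: Callable[[Sequence[int]], float],
                    n_rows: int = 200, seed: int = 0) -> List[int]:
    """Keep features whose presence raises the mean score of the combinations."""
    rng = random.Random(seed)
    matrix = shuffled_binary_matrix(n_rows, n_features, rng)
    # One score per row, e.g. a cross-validated accuracy on the reduced data.
    scores = [score_subset([j for j, bit in enumerate(row) if bit])
              for row in matrix]
    kept = []
    for j in range(n_features):
        with_j = [s for row, s in zip(matrix, scores) if row[j]]
        without_j = [s for row, s in zip(matrix, scores) if not row[j]]
        if mean(with_j) > mean(without_j):
            kept.append(j)
    return kept


def toy_score(subset: Sequence[int]) -> float:
    """Hypothetical stand-in for accuracy: features 0 and 3 help, 1 hurts."""
    chosen = set(subset)
    return 0.5 + 0.1 * (0 in chosen) + 0.1 * (3 in chosen) - 0.1 * (1 in chosen)


kept = filter_features(n_features=6, score_subset=toy_score)
```

Features with no real effect (2, 4, and 5 in the toy scorer) may or may not survive a single pass, because their include-versus-exclude difference is pure sampling noise; this is one reason a real filter needs repeated rounds or a stricter criterion.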
SVM-RFE is an application of recursive feature elimination (RFE) that uses the weight magnitude as the ranking criterion. It eliminates redundant features and yields more compact feature subsets. Features are eliminated according to a criterion related to their support of the discrimination function, and the support vector machine is retrained at each step. This method was first used successfully in gene feature selection and afterwards in bioinformatics, genomics, transcriptomics, and proteomics. For the technical details of the method, refer to the original study by Guyon et al. [
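The elimination loop itself is short, as the following minimal sketch suggests. It assumes a user-supplied `fit_weights` callable that retrains a linear model on the surviving features and returns one weight per feature; the fixed weights used in the demonstration are hypothetical and only mimic such a model. The feature abbreviations are taken from the table of morphological characteristics.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def recursive_feature_elimination(
    features: Sequence[str],
    fit_weights: Callable[[Sequence[str]], Dict[str, float]],
    n_keep: int,
) -> Tuple[List[str], List[str]]:
    """Repeatedly drop the feature with the smallest squared weight."""
    remaining = list(features)
    eliminated = []
    while len(remaining) > n_keep:
        # Retrain on the surviving features at every step.
        weights = fit_weights(remaining)
        worst = min(remaining, key=lambda name: weights[name] ** 2)
        remaining.remove(worst)
        eliminated.append(worst)
    return remaining, eliminated


# Hypothetical weights standing in for those of a retrained linear SVM.
TOY_WEIGHTS = {"SS": 0.9, "HT": -1.2, "DR": 0.1, "NH": -0.3}


def toy_fit_weights(names: Sequence[str]) -> Dict[str, float]:
    return {name: TOY_WEIGHTS[name] for name in names}


selected, eliminated = recursive_feature_elimination(
    ["SS", "HT", "DR", "NH"], toy_fit_weights, n_keep=2
)
```

Because the weights are recomputed after each removal, a feature that looks weak only in the presence of a correlated partner can regain importance once that partner is gone, which a single one-shot ranking would miss.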
Rough set theory, introduced by Pawlak [
Support vector classification, based on statistical learning theory, is widely used in the machine learning field [
BPNN is one of the most widely employed artificial neural network (ANN) models. The general structure of the network consists of an input layer, a variable number of hidden layers containing any number of nodes, and an output layer. The back propagation learning algorithm modifies the feed-forward connections between the input and hidden units and between the hidden and output units, adjusting the connection weights to minimize the error [
Naïve Bayes is a classification technique obtained by applying a relatively simple method to a training dataset [
Feature selection methods are applied to training sets to get optimal feature subsets. For each method, five sets of features were obtained. Table
Summary of selected features.
Feature selection method | Feature subset | Number of features | Selected features |
---|---|---|---|
SVM-RFE | I | 10 | SS, HT, DR, Bpl, NH, Btr, Bal, Su, SA, Lpd |
SVM-RFE | II | 13 | HT, RP, SS, Ta1, Btr, BO, Dp, Di, Td, Fr, DR, Bar, NH |
SVM-RFE | III | 12 | HT, FD, SS, DR, Btr, Dp, Di, Fr, BO, Td, Su, Ty |
SVM-RFE | IV | 14 | HT, RP, SS, Ta2, Btr, Di, Dp, Fr, BO, Td, SA, Vo, Ta1, TS |
SVM-RFE | V | 15 | HT, Lpd, SS, Bpl, Btr, Bal, NH, Ta1, Su, Di, SA, Vo, Fr, Ta2, Ty |
Rough set | I | 13 | |
Rough set | II | 13 | |
Rough set | III | 11 | |
Rough set | IV | 13 | |
Rough set | V | 13 | |
BMSF | I | 8 | |
BMSF | II | 6 | |
BMSF | III | 8 | |
BMSF | IV | 7 | |
BMSF | V | 8 | |
As shown in Table
To evaluate the performance of BMSF and compare it with SVM-RFE and rough set, three classifiers were employed to perform independent tests. Including the classification performance without feature selection, there were twelve classification accuracies. The average accuracies on five random datasets are presented in Table
Classification results with different classification models.
Feature selection methods | Naïve Bayes (%) | BPNN (%) | SVC (%) | Average (%) |
---|---|---|---|---|
All features | 61.35 ± 26.82 | 91.46 ± 1.22 | 97.10 ± 0.43 | 83.30 |
SVM-RFE | 30.78 ± 12.94 | 91.38 ± 0.83 | 93.29 ± 1.20 | 71.82 |
Rough set | 51.30 ± 3.59 | 92.75 ± 0.46 | 93.05 ± 1.45 | 79.03 |
BMSF | 70.53 ± 6.36 | 91.46 ± 1.45 | | 86.61 |
Average (%) | 50.87 | 91.86 | 94.73 |
The independent classification accuracy is the ratio of the total correctly classified samples to the total test samples. As shown in Table
The above independent accuracies indicated that BMSF has an excellent generalization ability and robustness on the three classifiers. We also calculated the average performance of each feature selection method on the three classifiers and the classification performance based on the three different feature selection methods. The results are listed in the last row and column of Table
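The averages reported in the classification results table can be reproduced from its entries. The short check below uses only values printed in that table: each row average is the mean over the three classifiers, and the column averages for Naïve Bayes and BPNN appear to be the mean over the three feature selection methods, excluding the "All features" row. The variable names are illustrative.

```python
# Mean accuracies (%) copied from the classification results table.
ACCURACY = {
    "All features": {"NB": 61.35, "BPNN": 91.46, "SVC": 97.10},
    "SVM-RFE": {"NB": 30.78, "BPNN": 91.38, "SVC": 93.29},
    "Rough set": {"NB": 51.30, "BPNN": 92.75, "SVC": 93.05},
}
# Only the Naive Bayes and BPNN entries of the BMSF row are used here.
BMSF_PARTIAL = {"NB": 70.53, "BPNN": 91.46}


def mean(values):
    return sum(values) / len(values)


# Last column: average of each row over the three classifiers.
row_average = {
    method: round(mean(list(by_classifier.values())), 2)
    for method, by_classifier in ACCURACY.items()
}

# Last row: average of each classifier over the three feature selection
# methods (SVM-RFE, rough set, BMSF), leaving out "All features".
column_average = {
    clf: round(
        mean([ACCURACY["SVM-RFE"][clf], ACCURACY["Rough set"][clf],
              BMSF_PARTIAL[clf]]),
        2,
    )
    for clf in ("NB", "BPNN")
}
```

The computed values agree with the printed row averages (83.30, 71.82, 79.03) and column averages (50.87, 91.86).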
As the datasets used in this work are unbalanced (as shown in Table
Breakdown of independent tests results of different models (%).
Classifier | FS method | Pyramidal | Motoneuron | Sensory | Tripolar | Bipolar | Multipolar | Purkinje |
---|---|---|---|---|---|---|---|---|
NB | All | 30.96 ± 1.96 | 18.24 ± 3.12 | 41.62 ± 5.47 | 61.26 ± 7.90 | 94.16 ± 4.74 | | 96.00 ± 8.94 |
NB | SVM-RFE | 29.22 ± 16.4 | 22.31 ± 3.26 | 29.80 ± 60.5 | 56.25 ± 9.66 | 92.50 ± 6.18 | 88.33 ± 21.73 | 96.0 ± 8.94 |
NB | Rough set | 52.38 ± 4.48 | 22.32 ± 3.08 | 39.20 ± 3.95 | | 94.16 ± 6.98 | 85.0 ± 10.87 | 96.0 ± 8.94 |
NB | BMSF | 77.26 ± 7.67 | 25.38 ± 3.73 | 38.93 ± 4.63 | 60.83 ± 4.0 | 90.83 ± 5.43 | 51.67 ± 21.57 | 92.0 ± 17.89 |
BPNN | All | 99.10 ± 0.75 | 82.46 ± 9.78 | 57.84 ± 19.64 | 42.94 ± 11.76 | 0.00 ± 0.00 | 0.00 ± 0.00 | 52.00 ± 48.17 |
BPNN | SVM-RFE | 99.12 ± 0.36 | 83.22 ± 18.44 | 45.24 ± 23.59 | 62.50 ± 9.64 | 15.84 ± 35.42 | 0.00 ± 0.00 | 80.0 ± 34.61 |
BPNN | Rough set | 99.08 ± 0.36 | 78.92 ± 4.37 | 71.80 ± 3.09 | 57.06 ± 12.1 | 0.00 ± 0.00 | 0.00 ± 0.00 | 76.0 ± 8.94 |
BPNN | BMSF | 98.42 ± 1.08 | 72.00 ± 6.01 | 66.16 ± 16.64 | 60.0 ± 18.47 | 14.16 ± 31.67 | 0.00 ± 0.00 | 76.0 ± 43.36 |
SVC | All | 99.56 ± 0.18 | 82.46 ± 6.95 | 93.69 ± 5.23 | 87.50 ± 6.07 | 97.5 ± 2.28 | 18.33 ± 17.07 | |
SVC | SVM-RFE | 99.55 ± 0.13 | 65.38 ± 4.58 | 69.66 ± 15.64 | 69.58 ± 7.00 | 72.5 ± 31.26 | 0.00 ± 0.00 | 88.0 ± 17.89 |
SVC | Rough set | 99.52 ± 0.11 | 77.54 ± 5.58 | 54.23 ± 15.09 | 67.08 ± 7.71 | 89.17 ± 5.59 | 0.00 ± 0.00 | 92.0 ± 10.95 |
SVC | BMSF | | | | 83.33 ± 7.37 | | 1.67 ± 3.73 | 92.0 ± 17.89 |
Among the seven neuron types, BMSF-SVC exhibited the best performance on pyramidal neurons, motoneurons, sensory neurons, and bipolar neurons. Although Naïve Bayes performed well on tripolar and multipolar neurons, it did not do very well on the other neuron types. The classification result for multipolar neurons was poor with BMSF-SVC; however, SVM-RFE and rough set also performed poorly on this type with SVC. We found that the predicted labels of multipolar neurons were almost the same as those of pyramidal neurons in all the models, which indicates that the unbalanced datasets affected the prediction of multipolar neurons.
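This effect of class imbalance is easy to reproduce with a toy calculation. The sketch below takes the pyramidal and multipolar test-set sizes from the sample-count table (1586 and 12) and assumes, purely for illustration, a classifier that labels every cell as pyramidal; the predictions and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all test samples that are classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true: Sequence[str],
                       y_pred: Sequence[str]) -> Dict[str, float]:
    """Fraction of each true class that is classified correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Test-set sizes from the paper; the all-pyramidal predictions are hypothetical.
y_true = ["pyramidal"] * 1586 + ["multipolar"] * 12
y_pred = ["pyramidal"] * len(y_true)

accuracy = overall_accuracy(y_true, y_pred)
by_class = per_class_accuracy(y_true, y_pred)
```

Here the overall accuracy stays above 99% although no multipolar cell is recognized, which is why a per-type breakdown is needed alongside the overall figure.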
To evaluate whether a certain feature subset is useful in identifying only a single cell type, the optimal model in this study (BMSF-SVC) was employed. For the seven neuron types, six hierarchy models were established, each of which was a binary classification problem. Because the datasets in this paper are imbalanced, accuracy and the Matthews correlation coefficient (MCC) were used to evaluate the established models, and recall was used to evaluate the classification performance for a single neuron type, as follows:
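A minimal sketch of these three measures, using their standard definitions in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), is given below. The function name and the example counts are hypothetical and are not taken from the paper's results.

```python
import math
from typing import Dict


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, MCC, and recall from binary confusion counts.

    accuracy = (TP + TN) / (TP + TN + FP + FN)
    recall   = TP / (TP + FN)
    MCC      = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when a marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "mcc": mcc, "recall": recall}


# Hypothetical confusion counts for one positive-versus-rest model.
metrics = binary_metrics(tp=6, tn=3, fp=1, fn=2)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the positive and negative classes differ greatly in size.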
Ability to distinguish one single cell type from others and the obtained private feature subsets by BMSF-SVC.
Positive versus negative cell type | Accuracy (%) | MCC (%) | Recall (%) | Private feature subsets |
---|---|---|---|---|
{Pyramidal} | 99.10 ± 0.12 | 97.05 ± 0.40 | 99.76 ± 0.10 | |
{Motoneuron} | 97.26 ± 1.44 | 94.3 ± 3.02 | 94.50 ± 5.21 | SS |
{Motoneuron} | | | | SS, NH, |
{Motoneuron} | | | | SS, NH, |
{Motoneuron} | | | | SS, NH, |
{Motoneuron} | | | | SS, NH, |
{Sensory} | 90.15 ± 1.24 | 80.62 ± 2.46 | 97.85 ± 1.38 | Pa |
{Sensory} | | | | Pa, SS, SA, Ta1, ND, Pk2, Btr, NW, Pk, Btl |
{Sensory} | | | | Pa, SS, SA, Ta1, ND, Ty, Co, Di, Btr |
{Sensory} | | | | Pa, SS, SA, Ta1, ND, Ty, Di |
{Sensory} | | | | Pa, SS, SA, ND, Ta1, Ty, Btr, NW, Lpd, Pk |
{Tripolar} | 99.16 ± 0.56 | 98.32 ± 1.12 | 99.17 ± 1.41 | NW |
{Tripolar} | | | | NW, SS, He, Pa, ND, |
{Tripolar} | | | | NW, SS, He |
{Tripolar} | | | | NW, SS, He |
{Tripolar} | | | | NW, SS, He, Pa, ND |
{Bipolar} | 96.95 ± 3.07 | 93.86 ± 6.24 | 95.83 ± 2.95 | |
{Multipolar} | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | DR |
{Multipolar} | | | | DR |
{Multipolar} | | | | Pa |
{Multipolar} | | | | |
{Multipolar} | | | | Pa |
As shown in Table
In this paper, we used a new feature selection method named BMSF for neuronal morphology classification. Feature interactions are taken into consideration to obtain highly accurate classification of neurons, and the method usually selects a small number of features, which eases interpretation. As shown in Table
We should point out that different runs of BMSF may produce different feature subsets. This arises because many possible characteristics can be used to distinguish neurons. For example, the feature subsets derived from rough set theory and from BMSF achieve similar classification accuracy when applied to the SVC classifier. Our goal is to find a minimal set of features whose combination differentiates the dependent variable well.
The reserved feature subsets on the same data set that resulted from different feature selection methods differed greatly. Li et al. [
Conventional classification techniques were built on the premise that the input data sets were balanced; if not, the classification performance would decrease sharply [
We introduced a new feature selection method named BMSF for neuronal morphology classification, obtained satisfactory accuracy for all of the datasets and each hierarchy model, and were able to select private parsimonious feature subsets for each neuron type. However, it was obvious that classification based simply on neuronal morphology was inadequate. As time goes by, dendrites may continue to grow and axons will generate additional terminals, which will undoubtedly lead to changes in the vital parameters [
The authors declare that they have no conflicts of interest regarding the publication of this paper.
This work was supported by the National Natural Science Foundation of China no. 31000666 and no. 61300130 and by China Postdoctoral Science Foundation nos. 2012M511722 and 2014T70769.