
In this paper, an evolutionary dendritic neuron model (EDNM) is proposed to solve classification problems. It utilizes synapses and dendritic branches to implement nonlinear computation. Distinct from the classical dendritic neuron model (CDNM), which is trained by the backpropagation (BP) algorithm, the proposed EDNM is trained by the metaheuristic cuckoo search (CS) algorithm, which is regarded as a global search algorithm. The CS algorithm enables EDNM to avoid several disadvantages of BP, such as slow convergence, trapping into local minima, and sensitivity to initial values. To evaluate the performance of EDNM, we compare it with a multilayer perceptron (MLP) and CDNM on two benchmark classification problems. The experimental results demonstrate that EDNM is superior to MLP and CDNM in terms of accuracy rate, receiver operating characteristic (ROC) curve, and convergence speed. In addition, the neural structure of EDNM can be completely replaced by a logical circuit, which can easily be implemented in hardware. The corresponding experimental results also verify the effectiveness of the logical circuit classifier.

Classification is a machine learning technique that assigns objects to one of several predefined classes. Many problems in science, business, and medicine can be treated as classification problems, for example, medical diagnosis, quality control, bankruptcy prediction, credit scoring, handwritten character recognition, and speech recognition [

Among them, ANNs are considered one of the most comprehensive classifiers [

Although the learning ability of MLP which utilizes McCulloch and Pitts’s model as a fundamental calculative unit makes it a powerful tool for various applications [

Different neurons own distinct dendritic structures in vivo; even a small variation in the dendritic morphology will produce a great change in neuron functions [

Recently, Legenstein and Maass proposed a single neuron model with dynamic dendritic structure based on STDP and branch-strength potentiation (BSP) [

In our previous work [

Although CDNM has been used effectively in various applications, the original BP algorithm largely limits its computational capability. BP is a gradient-based training algorithm; it requires that the neuron transfer function be differentiable. The gradient information is highly sensitive to the initial conditions, which makes BP prone to trapping into local minima [

The rest of this paper is organized as follows: Section

The proposed EDNM mimics the mechanism of signal interactions in the biological neural model. The signal processing of EDNM proceeds as follows: first, the synaptic layer receives the input signals and processes them through one of the defined connection cases; then, the results of the synapses are transferred to the dendritic branches; finally, the membrane layer sums the dendritic activations and transfers the result to the cell body. The structural morphology of EDNM is presented in Figure

Architecture of EDNM.

In the synaptic layer, each synapse connects to one feature attribute and receives the input signals of the training samples. A sigmoid function is adopted to describe this process; it can be expressed by
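The synaptic sigmoid in dendritic neuron models commonly takes the following form (reproduced here as an assumed standard form, since the equation itself is not shown; $x_{i}$ is the $i$-th input, $w_{ij}$ and $q_{ij}$ are the synaptic weight and threshold, and $k$ is a positive scaling factor):

```latex
Y_{ij} = \frac{1}{1 + e^{-k\,(w_{ij}\,x_{i} - q_{ij})}}
```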

In addition, according to different values of

Four connection types in the synaptic layer.

Six function cases of the synaptic layer. (a) Direct connection. (b) Inverse connection. (c) Constant-1 connection (c_{1}, c_{2}). (d) Constant-0 connection (d_{1}, d_{2}).

Dendritic structure plays an important role in neural computation. Different neurons own distinct dendritic structures; even a small variation in the dendritic morphology produces a great change in neural function. Thus, to realize the plasticity of the dendritic morphology, the simplest nonlinear operation, multiplication, is adopted in EDNM. Combined with the four connection cases of the synaptic layer, it can implement a neural pruning function to build a unique dendritic structure for each specific problem. The mathematical formula can be expressed as

The membrane layer receives the signals from each branch of dendrites and completes a sublinear summation operation. Then, it transfers the results to the cell body. Its equation is defined as follows:

The output signal from the membrane layer is processed by a nonlinear sigmoid function in the cell body. It is the core part of the computation of the single neural model. The signal will be compared with the threshold of the soma; if it is larger, the neuron will fire; otherwise, it will not. The function of the cell body is expressed as follows:
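Putting the four layers together, the forward pass described above can be sketched in code. This is a minimal sketch under stated assumptions: the equations are taken to be the standard dendritic-neuron-model formulation (sigmoid synapses, multiplicative dendritic branches, summing membrane, sigmoid soma), and the parameter names and default values are illustrative, not values from the paper.

```python
import math

# Sketch of the four-layer forward pass (assumed standard form):
#   synapse:   Y_ij = sigmoid(k * (w_ij * x_i - q_ij))
#   dendrite:  Z_j  = prod_i Y_ij        (nonlinear "multiplication")
#   membrane:  V    = sum_j Z_j          (summation over branches)
#   soma:      O    = sigmoid(k_soma * (V - theta_soma))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dnm_forward(x, w, q, k=5.0, k_soma=5.0, theta_soma=0.5):
    """x: list of I inputs; w, q: M x I lists of synaptic weights/thresholds."""
    branch_outputs = []
    for w_j, q_j in zip(w, q):
        y = [sigmoid(k * (wi * xi - qi)) for xi, wi, qi in zip(x, w_j, q_j)]
        branch_outputs.append(math.prod(y))   # dendritic multiplication
    v = sum(branch_outputs)                   # membrane summation
    return sigmoid(k_soma * (v - theta_soma)) # soma output in (0, 1)
```

The soma output lies in (0, 1) and is compared with a firing threshold (e.g., 0.5) to produce the class label.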

EDNM adopts the neuron-pruning function to realize the plasticity of the dendritic structure. Specifically, the neuron-pruning function prunes unnecessary synapses and dendritic branches during the training process and then builds a unique structural morphology of EDNM for each specific problem. In EDNM, the pruning mechanism contains two parts, namely, synaptic pruning and dendritic pruning.

As introduced above, if a synaptic layer is in the constant-1 connection case, its output is fixed to 1 regardless of its input. Since the fundamental mathematical operation of the dendritic layer is multiplication, and any value multiplied by 1 equals itself, the output of this synaptic layer has no influence on the result of its local dendritic branch. Thus, the synapse and the feature attribute it connects to can be ignored, and this kind of synaptic layer is discarded in EDNM.

Similarly, if a synaptic layer is in the constant-0 connection case, its output remains 0 whatever the input is. Because any value multiplied by 0 equals 0, the output of the whole dendritic branch is fixed to 0, and the branch makes no contribution to the output of the soma. Therefore, this kind of dendritic branch should be eliminated, including all the synaptic layers on it and the feature attributes they connect to.
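The two pruning rules above can be sketched as follows. This assumes each trained synapse has already been labelled with its connection case; the case labels and data layout are illustrative.

```python
# Synaptic pruning: a constant-1 synapse multiplies its branch by 1, so it is dropped.
# Dendritic pruning: a constant-0 synapse fixes the whole branch product to 0,
# so the entire branch is removed.

def prune(branches):
    """branches: list of branches, each a list of (feature_index, case) pairs
    with case in {"direct", "inverse", "const1", "const0"}.
    Returns the simplified structure (only direct/inverse synapses survive)."""
    pruned = []
    for branch in branches:
        if any(case == "const0" for _, case in branch):
            continue                      # dendritic pruning: drop the whole branch
        kept = [(i, case) for i, case in branch if case != "const1"]
        if kept:                          # a branch of only constant-1 synapses
            pruned.append(kept)           # contributes a fixed value; dropped here
    return pruned
```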

In order to further demonstrate the mechanism of neuron pruning, an example of the pruning process in EDNM is illustrated in Figure

An example of neural model pruning process.

Through the synaptic pruning and dendritic pruning, only the direct connections and inverse connections are retained and a unique simplified neural structure is formed according to the problem. Furthermore, the simplified structure can be transformed into a logical circuit by the comparators, logical AND, OR, and NOT gates. As shown in Figure

An example of the logical circuit transformation process.
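The transformation can be sketched in code: a comparator binarizes each retained synapse, an AND gate implements each dendritic branch's multiplication over binary signals, and an OR gate stands in for the membrane/soma thresholding. The thresholds, comparator directions, and data layout below are illustrative assumptions, not the trained values from the paper.

```python
def circuit(x, branches):
    """branches: list of branches, each a list of (feature_index, theta, kind)
    with kind in {"direct", "inverse"}. Returns 0 or 1."""
    def comparator(xi, theta, kind):
        # direct connection fires when the input exceeds its threshold,
        # inverse connection fires when it does not
        return xi > theta if kind == "direct" else xi <= theta
    return int(any(                                   # OR across branches
        all(comparator(x[i], theta, kind)             # AND within a branch
            for i, theta, kind in branch)
        for branch in branches
    ))
```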

The CS algorithm is inspired by the special lifestyle and aggressive reproduction strategy of cuckoo species. Cuckoos never hatch their eggs themselves; instead, they lay them in the nests of other bird species and let the host birds hatch them. Some cuckoo species (e.g.,

The CS algorithm was first proposed by Yang and Deb in 2009 [

Each cuckoo lays an egg at one time and places it in a randomly selected nest.

The nests with the highest quality of the eggs are carried over to the next generations.

The number of the available host nests is constant, and the egg is discovered by the host bird with a probability of

With these three rules, the basic steps of the CS are summarized as the pseudocode shown in Algorithm

Objective function

Initialize a population of

Lay an egg

Evaluate the quality of the nest

Randomly choose one of

Replace

Abandon a part of the worse nests with the probability

Apply Lévy flights to generate new nests;

Evaluate the quality of the new nests;

Rank the nests to find the current best one;

Update: replace
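The loop above can be sketched as follows. The population size n, abandonment probability pa, Lévy exponent β = 1.5, step scale 0.01, and search bounds are illustrative assumptions, not values from the paper; the Lévy-flight step uses Mantegna's method, a common choice in CS implementations.

```python
import math
import random

def levy_step(beta=1.5):
    """Draw one Lévy-distributed step length (Mantegna's algorithm)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0, sigma)
    v = random.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def cuckoo_search(f, dim, n=15, pa=0.25, iters=200, lo=-5.0, hi=5.0):
    """Minimize f over [lo, hi]^dim with the three CS rules."""
    nests = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    fit = [f(x) for x in nests]
    best = min(range(n), key=lambda k: fit[k])
    for _ in range(iters):
        # Rule 1: a cuckoo lays one egg, generated from a random nest
        # by a Lévy flight relative to the current best nest.
        i = random.randrange(n)
        egg = [min(hi, max(lo, x + 0.01 * levy_step() * (x - nests[best][d])))
               for d, x in enumerate(nests[i])]
        fe = f(egg)
        j = random.randrange(n)
        if fe < fit[j]:                   # Rule 2: keep the better egg/nest
            nests[j], fit[j] = egg, fe
        # Rule 3: each non-best nest is abandoned with probability pa
        # and rebuilt at a random position.
        for k in range(n):
            if k != best and random.random() < pa:
                nests[k] = [random.uniform(lo, hi) for _ in range(dim)]
                fit[k] = f(nests[k])
        best = min(range(n), key=lambda k: fit[k])
    return nests[best], fit[best]
```

In EDNM, each nest encodes one full set of synaptic weights and thresholds, and f is the training classification error.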

In this section, to compare the classification performances of the EDNM, CDNM, and MLP, we conduct the simulations on two benchmark datasets, namely, the Glass Identification Dataset (GID) and Congressional Voting Records Dataset (CVRD), which are chosen from the UCI machine learning repository. The details of these datasets are shown in Table

Dataset description.

Dataset | No. of samples | No. of attributes | No. of classes (samples divided) | Attribute characteristics
---|---|---|---|---
GID | 214 | 9 | 2 (163 : 51) | Numerical
CVRD | 232 | 16 | 2 (124 : 108) | Categorical

GID is obtained by measuring the chemical constitution of glass, fabricated by two different processes [

CVRD records the voting results of the 98th Congress. It contains 435 samples recording the votes of each U.S. House of Representatives congressman on the 16 key votes (attributes) identified by the CQA. Its classification task is to identify the correct political party affiliation of each congressman [

In our experiments, to measure the performance of each model, we adopt four performance evaluation criteria, namely, accuracy rate, receiver operator characteristic curve (ROC), convergence speed, and nonparametric statistical test.

Confusion matrix.

Predicted \ True condition | P | N
---|---|---
Y | True positive (TP) | False positive (FP)
N | False negative (FN) | True negative (TN)

where
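These criteria follow directly from the confusion-matrix counts above; a minimal sketch using the standard definitions (assumed here, as the formulas themselves are not reproduced):

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + fp + fn + tn)

def tpr(tp, fn):
    """True-positive rate: the y-axis of the ROC curve."""
    return tp / (tp + fn)

def fpr(fp, tn):
    """False-positive rate: the x-axis of the ROC curve."""
    return fp / (fp + tn)
```

Sweeping the soma firing threshold and plotting (fpr, tpr) pairs traces the ROC curve; the area under it (AUC) is the scalar summary reported below.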

In our experiment, each dataset is split into a training set and a testing set. Each set contains 50% of the samples [

Number of samples in the training and testing parts.

Dataset | No. in training data | No. in testing data | Total no.
---|---|---|---
GID | 107 | 107 | 214
CVRD | 116 | 116 | 232

In order to maintain the fairness of the comparison, the number of parameters in each model should be set as equal, or as nearly equal, as possible. The model structure of MLP differs from those of CDNM and EDNM; the numbers of weights and thresholds in MLP can be calculated as follows:
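The counting behind this balancing can be sketched as follows. The formulas are assumptions consistent with the structure table: each EDNM/CDNM synapse is taken to carry one weight and one threshold, and the MLP is taken to have one hidden layer and a single output with biases.

```python
def ednm_params(n_inputs, n_branches):
    """Adjustable parameters of EDNM/CDNM: one weight w and one threshold q
    per synapse, with n_inputs * n_branches synapses in total."""
    return 2 * n_inputs * n_branches

def mlp_params(n_inputs, n_hidden):
    """Adjustable parameters of a one-hidden-layer MLP with a single output:
    (I + 1) * H input-to-hidden weights and biases, plus H + 1 for the output."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1)
```

For GID this gives ednm_params(9, 12) = 216 and mlp_params(9, 20) = 221, matching the structure table below.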

In our experiment, when a benchmark dataset is chosen, the value of

Structures of EDNM, CDNM, and MLP for GID and CVRD.

Dataset | Model | No. of inputs | No. of branches | No. of outputs | No. of adjusted weights
---|---|---|---|---|---
GID | EDNM | 9 | 12 | 1 | 216
GID | CDNM | 9 | 12 | 1 | 216
GID | MLP | 9 | 20 | 1 | 221
CVRD | EDNM | 16 | 18 | 1 | 576
CVRD | CDNM | 16 | 18 | 1 | 576
CVRD | MLP | 16 | 32 | 1 | 557

In EDNM, there are three parameters, namely,

Simulation parameter levels for two benchmark datasets in the EDNM.

Dataset | | |
---|---|---|---
GID | 2, 5, 8, 10 | 9, 10, 11, 12 | 0.3, 0.5, 0.7, 0.9
CVRD | 2, 5, 8, 10 | 16, 18, 20, 22 | 0.3, 0.5, 0.7, 0.9

No. of experiments | | | | Testing accuracy mean ± std (%)
---|---|---|---|---
1 | 2 | 9 | 0.3 | 92.59 ± 2.39
2 | 2 | 10 | 0.5 | 91.48 ± 2.25
3 | 2 | 11 | 0.7 | 91.90 ± 2.58
4 | 2 | 12 | 0.9 | 92.87 ± 3.09
5 | 5 | 9 | 0.5 | 90.97 ± 3.01
6 | 5 | 10 | 0.3 | 92.15 ± 2.17
7 | 5 | 11 | 0.9 | 92.62 ± 2.46
8 | 5 | 12 | 0.7 | 92.02 ± 1.87
9 | 8 | 9 | 0.7 | 91.90 ± 2.20
10 | 8 | 10 | 0.9 | 90.90 ± 2.62
11 | 8 | 11 | 0.3 | 91.43 ± 2.46
13 | 10 | 9 | 0.9 | 92.87 ± 1.88
14 | 10 | 10 | 0.7 | 91.46 ± 4.11
15 | 10 | 11 | 0.5 | 92.34 ± 2.59
16 | 10 | 12 | 0.3 | 91.12 ± 2.49

No. of experiments | | | | Testing accuracy mean ± std (%)
---|---|---|---|---
1 | 2 | 16 | 0.3 | 96.18 ± 1.50
2 | 2 | 18 | 0.5 | 96.29 ± 1.59
3 | 2 | 20 | 0.7 | 94.28 ± 8.16
4 | 2 | 22 | 0.9 | 95.14 ± 2.44
5 | 5 | 16 | 0.5 | 94.40 ± 7.93
7 | 5 | 20 | 0.9 | 93.59 ± 7.64
8 | 5 | 22 | 0.7 | 94.54 ± 8.64
9 | 8 | 16 | 0.7 | 95.57 ± 2.55
10 | 8 | 18 | 0.9 | 95.09 ± 7.78
11 | 8 | 20 | 0.3 | 94.20 ± 8.15
12 | 8 | 22 | 0.5 | 95.95 ± 1.74
13 | 10 | 16 | 0.9 | 95.11 ± 4.19
14 | 10 | 18 | 0.7 | 93.51 ± 5.35
15 | 10 | 20 | 0.5 | 96.12 ± 2.35
16 | 10 | 22 | 0.3 | 94.91 ± 3.11

In order to verify the classification performance of EDNM, we compare it with MLP and the original CDNM on two benchmark datasets. Table

Results of accuracy,

Dataset | Model | Accuracy (%) | | AUC |
---|---|---|---|---|---
GID | EDNM | | — | 0.9868 | —
GID | CDNM | 86.95 ± 7.28 | 8.07 | 0.7782 | 2.42
GID | MLP | 84.21 ± 6.58 | 9.60 | 0.7444 | 5.37
CVRD | EDNM | | — | 0.9725 | —
CVRD | CDNM | 67.16 ± 20.46 | 7.38 | 0.6147 | 5.91
CVRD | MLP | 84.63 ± 8.89 | 8.94 | 0.8927 | 1.85

In addition, for the comprehensive evaluation of model performance, the convergence curves of three models on two benchmark problems are illustrated in Figure

The convergence speeds of EDNM, CDNM, and MLP for GID and CVRD.

The AUC curves of EDNM, CDNM, and MLP for GID and CVRD.

Based on the above experimental results, it can be concluded that EDNM provides more powerful classification performance on the GID and CVRD problems than MLP and CDNM. Its higher convergence speed indicates that EDNM is a more efficient classifier, which saves computation time in practical applications.

The neuron-pruning function of EDNM has been introduced in Section

Evolution of neuronal morphology on GID problem.

Firstly, we present the evolution of the structural morphology of the GID problem in Figure

Evolution of neuronal morphology on CVRD problem.

Simplification results of neuron pruning on GID and CVRD problems.

Dataset | Features (before) | Features (after) | Branches (before) | Branches (after) | Synapses (before) | Synapses (after)
---|---|---|---|---|---|---
GID | 9 | 3 | 12 | 2 | 108 | 4
CVRD | 16 | 1 | 18 | 1 | 288 | 1

As mentioned above, the simplified structures of EDNM can be completely substituted by the logical circuits. In this section, we attempt to verify the effectiveness of the logical circuit classifiers. According to the final neural structures in Figures

Logical circuit classifiers of GID and CVRD problems.

Besides, we compare the classification performances of the logical circuit classifiers and the normal EDNM in Table

Accuracy comparison of EDNM and logical circuits.

Dataset | EDNM accuracy (%) | Logical circuit accuracy (%)
---|---|---
GID | 93.27 | 92.52
CVRD | 96.41 | 96.55

In this study, an EDNM is proposed to solve classification problems. It consists of four layers, namely, the synaptic layer, the dendritic layer, the membrane layer, and the soma. This unique structure allows EDNM to implement a neural pruning mechanism, which rules out unnecessary synapses and dendritic branches. Compared with the original BP algorithm of CDNM, the CS algorithm achieves higher convergence speed and greater classification accuracy on two benchmark problems, and the statistical results demonstrate that EDNM performs significantly better than MLP and CDNM. Besides, we also present the logical circuit classifiers produced by EDNM and verify their accuracy rates. The experimental results show that the logical circuits maintain satisfying classification performance. It is noted that, to the best of our knowledge, when the logical circuit classifiers run on hardware, their classification speed will be higher than that of all the other classifiers in the literature. In our future research, we will attempt to adopt multiobjective optimization algorithms to train the structure and weights of EDNM simultaneously, which may produce a more simplified and higher-accuracy logical circuit for each classification problem.

The benchmark classification datasets could be downloaded freely at

The authors declare that they have no conflicts of interest.

This research was supported by the Guangdong Basic and Applied Basic Research Fund Project (No. 2019A1515111139) and JSPS KAKENHI (Grant no. 19K12136).