In biomedical signal analysis, artificial neural networks are often used for pattern classification because they are capable of nonlinear class separation and can be implemented efficiently on a microcontroller. Typically, the network topology is designed by hand, and a gradient-based search algorithm is used to find a set of suitable parameters for the given classification task. In many cases, however, the choice of the network architecture is a critical and difficult task. For example, hand-designed networks often require more computational resources than necessary because they rely on input features that provide no information or are redundant. In mobile applications, where computational resources and energy are limited, this is especially detrimental. Neuroevolutionary methods, which allow for the automatic synthesis of network topology and parameters, offer a solution to these problems. In this paper, we use analog genetic encoding (AGE) for the evolutionary synthesis of a neural classifier for a mobile sleep/wake discrimination system. The comparison with a hand-designed classifier trained with back propagation shows that the evolved neural classifiers perform similarly to the hand-designed networks while using a greatly reduced set of inputs, thus reducing computation time and improving the energy efficiency of the mobile system.
The traditional way to craft an artificial neural
network (ANN) for a classification task is to hand design a network topology
and to find a set of network parameters using a gradient-based
error-minimization algorithm such as back propagation [
Continuous monitoring of the sleep/wake state of
high-risk professionals such as pilots, truck drivers, or shift workers can
potentially decrease the risk of accidents and help schedule breaks and
resting times. However, implementing such a classification in a wearable device
is a challenging task. Limited energy and processing resources as well as the
increased noise level due to movement artifacts and a constantly changing environment put tight restrictions on the
choice of sensors and algorithms. Traditionally, the states of sleep and wake
are classified based on the analysis of brain wave patterns (EEG) [
For mobile sleep/wake pattern screening, a commonly
used technique is actigraphy [
Overview of the sleep/wake classification system. (a) The raw electrocardiogram (ECG) and respiratory effort (RSP) signals are cut into windows of 40.06 seconds. (b) A short-time fast Fourier transform (FFT) is used to calculate the spectral power of the windowed signals. (c) The resulting frequency data are fed to a feed-forward artificial neural network (ANN), and (d) a symmetric threshold classifies the ANN output into sleep or wake state estimates.
Portable Heally recording system mounted on a shirt. (1) ECG gel electrodes; (2) inductive belt sensor; (3) electronics modules; (4) NiMH battery.
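The processing chain summarized in the system overview (windowing, FFT spectral power, a feed-forward unit, and a threshold) can be sketched as follows. This is a minimal illustration and not the implementation used on the device: the sampling rate `fs`, the use of non-overlapping windows, the function names, and the sign convention of the output are all assumptions made for the example.

```python
import numpy as np

def spectral_power(signal, fs, window_s=40.06):
    """Cut a 1-D signal into windows and return the FFT power of each window.

    Assumptions: non-overlapping windows and a caller-supplied sampling
    rate fs (Hz); neither is specified in the text.
    """
    signal = np.asarray(signal, dtype=float)
    n = int(round(window_s * fs))              # samples per window
    n_windows = len(signal) // n               # incomplete last window is dropped
    windows = signal[: n_windows * n].reshape(n_windows, n)
    spectrum = np.fft.rfft(windows, axis=1)    # one-sided spectrum per window
    return np.abs(spectrum) ** 2               # shape: (n_windows, n // 2 + 1)

def classify(features, weights, bias):
    """Single tanh output unit followed by a threshold at zero.

    Returns +1 or -1 per window; which sign stands for sleep and which
    for wake is an arbitrary choice in this sketch.
    """
    activation = np.tanh(features @ weights + bias)
    return np.where(activation >= 0.0, 1, -1)
```

In such a sketch, the feature vector of a window would be built from selected frequency bins of the ECG and RSP power spectra before being passed to `classify`.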
An additional difficulty is that the generation of a set of labeled data for the training of
the classifier is typically a time-consuming activity for both the subjects
from whom data is collected and the technicians who must label the data
[
Neural networks can be described as directed graphs,
where the nodes represent a neuron model, and the edges of the graph are
associated with the weighted connections between the neurons, the so-called
synaptic weights. The design of a network for a particular task thus involves
the choice of the topology of the graph (i.e., the network architecture) and a
suitable set of numerical parameters (i.e., the synaptic weights and the
parameters of the neuron model). The automatic synthesis of the topology and
parameters of a neural network requires a computer representation for both
aspects of the network, combined with an algorithm capable of performing a
search in the space defined by this representation. Evolutionary algorithms
have been extensively used to evolve neural classifiers because these
algorithms can combine a flexible representation with a high potential of
stochastic exploration of the search space [
The simplest approach to this, the so-called
A promising alternative to direct and developmental
representations that is getting more and more popular is
The concept of implicit encodings like AGE
is loosely inspired by the working of biological gene regulatory networks
(GRNs). In biological GRNs, the interactions between the genes are not explicitly
encoded in the genome but follow implicitly from the physical and chemical environment in which the genome is immersed.
To simplify the picture somewhat, the activation of a biological gene depends on the
interaction of molecules produced by another gene with parts of the activated
gene called regulatory regions (Figure
(a) In biological gene networks, the link between genes is realized by molecules that are synthesized from the coding region of one gene and interact with the regulatory region of another gene. (b) Analog genetic encoding abstracts this mechanism using an artificial genome containing markers that identify the artificial genes, and an interaction map that creates links between pairs of artificial genes by associating with them a numerical value that represents the strength of the link.
Transcriptional regulation
The AGE abstraction
In summary, the AGE genome can
be decoded first by extracting the neurons with the associated (coding and
regulatory) sequences of characters. This is realized by scanning the genome
for tokens which indicate the presence of a neuron (GN). Together with
predefined terminator sequences (TE), these tokens delimit the part of the
genome associated with the respective neuron. The enclosed sequences of
characters are interpreted as the coding and regulatory sequences of the
respective neuron. Subsequently, the interaction map
A simple
artificial neural network represented with analog genetic encoding. The
interaction strengths
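The decoding steps summarized above can be illustrated with a short sketch. The tokens `GN` and `TE` are taken from the text; everything else is an assumption made for illustration: the layout `GN<coding>TE<regulatory>TE`, the function names, and in particular the interaction map, which is replaced here by a simple stand-in (the length of the longest common substring) rather than the map used in AGE.

```python
from difflib import SequenceMatcher

NEURON_TOKEN = "GN"   # token marking a neuron (from the text)
TERMINATOR = "TE"     # terminator sequence (from the text)

def extract_neurons(genome):
    """Scan the genome and return a list of (coding, regulatory) pairs.

    Assumed layout per neuron: GN <coding> TE <regulatory> TE.
    Characters outside such a pattern are treated as noncoding.
    """
    neurons, pos = [], 0
    while True:
        start = genome.find(NEURON_TOKEN, pos)
        if start == -1:
            break
        c_start = start + len(NEURON_TOKEN)
        c_end = genome.find(TERMINATOR, c_start)
        if c_end == -1:
            break
        r_start = c_end + len(TERMINATOR)
        r_end = genome.find(TERMINATOR, r_start)
        if r_end == -1:
            break
        neurons.append((genome[c_start:c_end], genome[r_start:r_end]))
        pos = r_end + len(TERMINATOR)
    return neurons

def interaction_map(coding, regulatory, min_len=2):
    """Stand-in interaction map (hypothetical, not the AGE map).

    Returns the length of the longest common substring, or 0.0 (no link)
    if that length is below min_len.
    """
    if not coding or not regulatory:
        return 0.0
    matcher = SequenceMatcher(None, coding, regulatory, autojunk=False)
    match = matcher.find_longest_match(0, len(coding), 0, len(regulatory))
    return float(match.size) if match.size >= min_len else 0.0

def decode(genome):
    """Return weights[i][j]: link strength from neuron j to neuron i."""
    neurons = extract_neurons(genome)
    return [[interaction_map(src_coding, dst_regulatory)
             for (src_coding, _) in neurons]
            for (_, dst_regulatory) in neurons]
```

Because the stand-in map accepts sequences of any length, the same decoding routine keeps working after insertions, deletions, or duplications change the genome.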
In this framework, there are several different
possibilities to implement connections from external inputs to external outputs
(see [
There are
different ways to implement external inputs and outputs in AGE [
As the sequences which define the strength of the
synaptic connections can have a variable length and the interaction map is
defined to operate on sequences of arbitrary length, a large class of genetic
operators can be used to alter the network. In particular, we use the
biologically plausible insertion, substitution, and deletion of characters and
the transposition, duplication, and deletion of fragments of genome. The
changes in the genome caused by these mutation operators can reflect both
changes in the parameters of the network and changes in the network
structure. For example, the insertion of a character in the genome can lead to
a change of the synaptic weight connecting a particular input to the output
neuron. The deletion of a fragment of genome associated with an input of the
network can lead to the removal of this particular input from the network.
Furthermore, the number of hidden neurons in the network can increase (e.g.,
after a genome fragment duplication) or decrease (e.g., after a character
substitution) over the course of evolution. Given the fact that parts of the
genome can be noncoding (i.e., they are not part of the description of a
neuron) and that the interaction map is defined to be highly redundant, many
mutations do not have an effect on the decoded networks. This allows for a high
neutrality in the search space, which can improve evolvability [
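The mutation operators listed above act directly on the character string of the genome. The following sketch shows one possible way to write them; the uppercase alphabet, the uniform choice of fragment boundaries, and all function names are assumptions made for illustration and are not taken from the AGE implementation.

```python
import random
import string

ALPHABET = string.ascii_uppercase  # assumed genome alphabet

def mutate_characters(genome, rng, p_sub, p_ins, p_del):
    """Apply per-character deletion, substitution, and insertion."""
    out = []
    for ch in genome:
        if rng.random() < p_del:
            continue                          # character deleted
        if rng.random() < p_sub:
            ch = rng.choice(ALPHABET)         # character substituted
        out.append(ch)
        if rng.random() < p_ins:
            out.append(rng.choice(ALPHABET))  # character inserted after ch
    return "".join(out)

def random_fragment(genome, rng):
    """Return (start, end) of a random non-empty fragment of a non-empty genome."""
    start = rng.randrange(len(genome))
    end = rng.randrange(start + 1, len(genome) + 1)
    return start, end

def duplicate_fragment(genome, rng):
    """Copy a random fragment and insert the copy at a random position."""
    start, end = random_fragment(genome, rng)
    pos = rng.randrange(len(genome) + 1)
    return genome[:pos] + genome[start:end] + genome[pos:]

def delete_fragment(genome, rng):
    """Remove a random fragment."""
    start, end = random_fragment(genome, rng)
    return genome[:start] + genome[end:]

def transpose_fragment(genome, rng):
    """Cut a random fragment and reinsert it at a random position."""
    start, end = random_fragment(genome, rng)
    fragment = genome[start:end]
    rest = genome[:start] + genome[end:]
    pos = rng.randrange(len(rest) + 1)
    return rest[:pos] + fragment + rest[pos:]
```

A fragment duplication that happens to copy a complete neuron description adds a neuron to the decoded network, whereas a substitution that destroys a neuron token removes one, which is how such operators can change the structure as well as the weights.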
To compare the classical approach to classifier synthesis and training with the state-of-the-art neuroevolution method based on AGE, we performed a set of experiments comparing a neural network with a fixed, hand-designed topology and weights trained with back propagation against neural networks synthesized with an evolutionary algorithm based on AGE. As anticipated, we are interested in the performance in a sleep/wake detection task in which data from a set of users are available for network synthesis and training, but the performance is expected to generalize to additional users. We thus investigated the performance of the two methods when trained on ECG and RSP data collected from multiple subjects and tested on data from a different subject.
The data used in the following experiments are
identical to those described in [
In order to
evaluate the performance of the two classifiers, we divided the data into three
different sets: a training set (TR), a validation set (VA), and a test set (TE)
(see Figure
Distribution of the experimental data used for the training, validation, and test of the hand-designed and the evolutionary synthesized neural classifiers. The numbers indicate users in the training set (TR), users in the validation set (VA), and users in the test set (TE). There are six repetitions with different combinations of users/sessions in training, validation, and test sets.
As a baseline for the classification accuracy, we used
a feed-forward ANN with no hidden layers and a single output unit with a
tangent-sigmoid transfer function. Additional experiments not reported here
showed that the use of ANNs with a hidden layer does not improve the
performance of the classifier. A similar finding has been reported by [
For the automatic synthesis of the network topology
and parameters, the AGE representation was combined with a standard genetic
algorithm (see [
The neural classifier is automatically synthesized with analog genetic encoding. The evolved network can connect to an arbitrary subset of the 409 inputs from the ECG data, the 327 inputs from the RSP data and a bias unit. As the size of the network is not fixed, the number of hidden units in the network can increase or decrease over the course of evolution. The output unit indicates sleep or wake states using a simple threshold at an activation level of zero.
Selection was performed using tournament selection and
elitism. The algorithm parameters and mutation probabilities are listed in
Table
The parameters used in the evolutionary algorithm.
| Parameter | Value |
|---|---|
| Population size | 100 |
| Tournament size | 2 |
| Elite size | 1 |
| Recombination probability | 0.1 |
| Probability of character substitution (per character) | 0.001 |
| Probability of character insertion (per character) | 0.001 |
| Probability of character deletion (per character) | 0.0015 |
| Probability of fragment transposition | 0.01 |
| Probability of fragment duplication | 0.01 |
| Probability of fragment deletion | 0.015 |
| Probability of neuron insertion | 0.01 |
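Tournament selection with elitism, using the tournament size and elite size listed in the table, can be sketched as follows. This is an illustrative outline only: recombination is omitted, the function and parameter names are hypothetical, and the tournament is assumed to draw distinct individuals and to prefer the lower error.

```python
import random

def next_generation(population, errors, mutate, rng,
                    tournament_size=2, elite_size=1):
    """Build a new population of the same size (lower error is better).

    population: list of genomes; errors: one error value per genome;
    mutate: function mapping a genome to a mutated copy.
    """
    ranked = sorted(range(len(population)), key=lambda i: errors[i])
    # Elitism: the best individuals are copied unchanged.
    new_population = [population[i] for i in ranked[:elite_size]]
    while len(new_population) < len(population):
        # Tournament: draw distinct contenders, keep the one with lowest error.
        contenders = rng.sample(range(len(population)), tournament_size)
        winner = min(contenders, key=lambda i: errors[i])
        new_population.append(mutate(population[winner]))
    return new_population
```

With a tournament size of 2, the selection pressure is mild: only the single worst individual can never win a tournament, which leaves room for neutral variation to accumulate in the population.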
For both the back-propagation training and the evolutionary process, the measure of quality of the classifier was the sum over the data points of the squares of the difference between the actual and the desired classifier output.
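The quality measure described above, and its use in gradient-based training of a network with no hidden layer and a single tangent-sigmoid output, can be sketched as follows. The target coding of +1 and -1, the learning rate, the number of epochs, and the use of plain batch gradient descent are assumptions made for this example; the text does not specify the variant of back propagation that was used.

```python
import numpy as np

def sum_squared_error(outputs, targets):
    """Sum over data points of the squared output error."""
    return float(np.sum((np.asarray(outputs) - np.asarray(targets)) ** 2))

def train_tanh_unit(x, targets, lr=0.01, epochs=200, seed=0):
    """Fit a single tanh unit by batch gradient descent on the squared error.

    x: array of shape (n_samples, n_features); targets: values in {-1, +1}
    (assumed coding). Returns the weight vector and the bias.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=x.shape[1])
    b = 0.0
    for _ in range(epochs):
        y = np.tanh(x @ w + b)
        # Derivative of the error with respect to the net input of the unit.
        delta = 2.0 * (y - targets) * (1.0 - y ** 2)
        w -= lr * (x.T @ delta)
        b -= lr * float(np.sum(delta))
    return w, b
```

The same `sum_squared_error` value could serve as the error to be minimized by the evolutionary algorithm, so that both methods are compared under an identical quality measure.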
As shown in Figure
The average classification accuracy of the evolved
networks (AGE) and the fixed topology networks trained with back
propagation (
Histogram of the number of input features used by the evolved networks in the five repetitions of each of the six training cases. From the 30 networks, 8 used from 3 to 45 inputs, 2 used from 95 to 113 inputs, 3 used from 162 to 196 inputs, 4 used from 231 to 282 inputs, 1 used 411 and 1 used 484 inputs, 2 networks used from 522 to 533 inputs, 6 networks used from 602 to 647 inputs, and 3 networks used from 628 to 732 inputs.
The performance of the evolved networks in the five repetitions of each of the six training cases. The horizontal axis represents the number of input features used by the network and the vertical axis gives the corresponding classification performance. The symbols indicate the number of neurons in the hidden layer of the network. A cross indicates 0 hidden neurons, a circle indicates 1 hidden neuron, and a star indicates 2 hidden neurons. Neither the number of inputs nor the number of hidden neurons is significantly correlated with classification accuracy (see text).
The evolved networks for the five repetitions of each of the six cases, sorted by the number of used input features. All networks use input features from both ECG and RSP data.
As mentioned above, the fixed topology network has no
hidden layer. Of the 30 evolved networks, 19 feature no hidden neurons, 7
feature one hidden neuron, and 4 feature two hidden neurons. However, there is
no correlation between the number of hidden neurons and the classification
accuracy (Spearman's rank correlation coefficient
Portable devices for biomedical signal analysis, such as sleep/wake classification, have the potential to alleviate health problems and prevent accidents. Recent advances in sensor development and miniaturization allow for the construction of small mobile devices that integrate biomedical sensors and a microprocessor with sufficient processing power for many applications. However, one critical challenge that remains is the design of efficient classifiers that can be implemented on these small mobile systems. While the classification accuracy has to be as high as possible, the computational effort, and thus the energy required for classification, has to remain low. The results presented in this paper demonstrate that analog genetic encoding (AGE) permits the automatic evolutionary synthesis of compact neural classifiers for the problem of sleep/wake classification. Compared to a hand-designed classifier trained with back propagation, the evolutionary selection of a subset of the available inputs permits a drastic reduction of the number of inputs without significant degradation of the classifier performance. For example, in the experiments presented here, the evolutionary synthesis with AGE found a classifier with an accuracy of 88.49% that uses only 15 of the 736 input features used by the hand-designed network. The implementation of this evolved solution on a digital signal controller of the dsPIC33 product family (Microchip Technology Inc., USA) requires only 5.13% of the instructions used by an implementation of the hand-designed network on the same processor. This is a reduction of the computational cost of almost 95%. Moreover, the savings in computational cost and energy can be increased even further by adapting the sensory modalities and preprocessing steps to the reduced set of input features.
This work was supported by the Swiss National Science Foundation, Grant no. 200021-112060 and the Solar Impulse Project grant of Ecole Polytechnique Fédérale de Lausanne (EPFL). Thanks to Daniel Marbach and Sara Mitri for their comments on an earlier version of this manuscript and the anonymous reviewers for their helpful suggestions.