Silicon-based computer systems have powerful computational capabilities. However, a slight program error can easily cause them to malfunction. Organisms adapt better than computer systems to environmental changes and noise. The close structure-function relation inherent in biological structures is an important feature that provides great malleability in the face of environmental change. Motivated by this biological evidence, an evolvable neuromolecular hardware that integrates inter- and intraneuronal information processing was proposed and further applied to the pattern recognition domain. The circuit was tested with Quartus II, a digital circuit simulation tool. The experimental results showed that the artificial neuromolecular hardware exhibited a close structure-function relationship, possessed several evolvability-enhancing features that combined to facilitate evolutionary learning, and was capable of functioning continuously in the face of noise.
Effective programmability is an important feature inherent in computer systems, both software and hardware, that allows us to explore various problem domains. However, most computer systems are brittle in the sense that a slight modification of a system's structure can inadvertently change its functions or cause it to malfunction [
In the early 1990s, other researchers concentrated on applying evolutionary techniques to hardware design. They attempted to use reconfigurable hardware to continually change the internal circuit structure until the desired structure emerged. This field was called evolvable hardware (EHW). EHW brought about an interdisciplinary integration: one idea is to combine the merits of biological systems and computer systems and thereby create hardware with better adaptability. For example, Sipper and Ronald [
Our goal is to provide the digital machine with a representation of the internal structure-function relations of biological systems, to capture some of the dynamic modes of the processing of these systems, and to incorporate learning algorithms of the type used in natural systems. Redundancy, weak interactions, and compartmentalization are three important features inherent in biological structures that facilitate evolutionary learning [
The ANM model was motivated by the molecular mechanisms inside real neurons. The model consists of two types of neurons: cytoskeletal neurons and reference neurons. Cytoskeletal neurons perform significant intraneuronal information processing that may directly or indirectly relate to their firing behavior; they combine, or integrate, input signals in space and time to yield temporally patterned output signals. Reference neurons serve as pointers to other neurons in a way that allows for interneuronal memory manipulation.
In this section, we introduce the intraneuronal architecture that plays the role of integrating spatiotemporal signals inside a neuron and the interneuronal architecture that orchestrates groups of neurons for performing coherent tasks. We then explain the evolutionary learning algorithm used in this model.
The model is based on two hypotheses. The first is that some brain neurons, called cytoskeletal neurons, are responsible for spatiotemporal information transduction. Cytoskeletal neurons build on a hypothesis about the interaction between the nerve cell cytoskeleton and its associated molecules: they accept a spatiotemporal input signal and transduce it into a temporal series of outputs [ The second is that some brain neurons, called reference neurons, are responsible for memory control and the organization of neuron groups. The purpose of reference neurons is to assemble cytoskeletal neurons into groups that pursue a common information-processing goal. Through the memory selection performed by reference neurons, each workgroup acquires neurons with different internal structures and is thus able to complete the group task [
It is by now firmly established that information processing inside a neuron is significant. The objective of the present study is not to identify the precise nature of these mechanisms, but rather to capture the working hypothesis that the cytoskeleton serves as a signal integration system. Our model is restricted to the membrane components. In the present implementation, the membrane cytoskeleton is abstracted as a macromolecular network (a cytoskeletal network) comprising a number of components capable of initiating, transmitting, and integrating cytoskeletal signals. Our assumption is that an interneuronal signal impinging on the membrane of a neuron is converted to an intraneuronal signal (a cytoskeletal signal) transmitted along the cytoskeleton. This process is called "transduction"; a cytoskeletal neuron can therefore be considered a transducer with a specific structure. Cytoskeletal neurons are platforms of message processing, inspired by the signal integration and memory functions of the cytoskeleton.
This research utilized 2D cellular automata (CA) [
Structure of cytoskeletal neuron.
When an external stimulus hits the membrane of a cytoskeletal neuron, it activates the readin enzyme at that location. The activation causes a signal flow that travels along a route of cytoskeletal elements of the same type. For example, after the readin enzyme at location (3, 2) receives an external input, it transmits the signal to those of its eight neighbors that hold the same cytoskeletal element; in the illustration, it can transmit the signal to the C2 elements at locations (2, 2) and (4, 2). Any cytoskeletal element that receives such a signal does the same, forming a signal flow. To ensure one-way transmission, that is, to prevent backflow or loops, a cytoskeletal element enters a temporary resting state after transmitting, called the refractory state. Note also that after a cytoskeletal element transmits a signal, the signal does not disappear immediately within the element; instead, it decays progressively until it finally vanishes. The decaying signal and newly arriving signals combine in a spatiotemporal integration reaction, a key mechanism that decides when a firing occurs.
There can be interactions among different cytoskeletal fibers. Microtubule-associated proteins (MAPs) can connect different cytoskeletal fibers, creating cross-fiber signal flow channels that assist the flow of microsubstances within neurons. For instance, when the input signal originating at location (3, 2) travels along the C2 elements of the second column, it meets a MAP-linked C1 element at location (5, 2). The C2 signal is transmitted to C1 through the MAP, and another signal flow forms in C1. However, because different types of cytoskeletal fibers have different transmission characteristics, there may be energy-transfer losses when signals cross between media. Hence, for cross-fiber signals, this research defines the signal-bearing capacities of C1, C2, and C3 as S, I, and W, meaning strong, intermediate, and weak, respectively. Because the linking function provided by MAPs allows signals to flow among different molecular elements, information processing takes place within the neuron.
When a spatiotemporally integrated cytoskeletal signal arrives at the location of a readout enzyme, its activation leads to a neuron firing. For example, the signal flows starting at locations (1, 5) and (8, 7) may be integrated at location (5, 5); the readout enzyme at that location is then activated, causing the neuron to fire. Because integrated cytoskeletal signals may appear continuously, the firing outputs form a series of signals occurring at different points in time. This research collected these signals to serve as the basis for assessing transduction efficiency.
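As a concrete illustration of the transmission mechanism described above, the following Python sketch simulates one-way signal flow along a fiber of same-type elements with refractory states. The grid, element labels, and update rule are simplified stand-ins for illustration, not the actual circuit:

```python
# Minimal sketch (hypothetical grid) of one-way cytoskeletal signal flow:
# an activated element passes the signal to same-type neighbors, then
# enters a refractory state so that no backflow or loop can form.
GRID = [
    "2..",
    "2..",
    "2..",
]  # '2' marks C2-type elements forming a vertical fiber; '.' is empty

def step(active, refractory, grid):
    """One update: each active element signals its same-type 8-neighbors,
    then becomes refractory (ensuring one-way transmission)."""
    nxt = set()
    for (r, c) in active:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (rr, cc) == (r, c):
                    continue
                if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                    if grid[rr][cc] == grid[r][c] and (rr, cc) not in refractory:
                        nxt.add((rr, cc))
    refractory |= active
    return nxt - refractory, refractory

active, refractory = {(0, 0)}, set()   # readin enzyme activated at (0, 0)
trace = [sorted(active)]
while active:
    active, refractory = step(active, refractory, GRID)
    trace.append(sorted(active))
print(trace)  # → [[(0, 0)], [(1, 0)], [(2, 0)], []]
```

The signal travels down the fiber one cell per step and dies out at the end without ever flowing backward, mirroring the refractory mechanism described above.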
In the process of digitalization, each grid cell in a cytoskeletal neuron is called a processing unit (PU). Figure
Conceptual architecture of a PU. The input block is illustrated in Figure
In the present implementation, there are two possible mechanisms for initiating a cytoskeletal signal. One is direct initiation by an external stimulus: when a PU with a readin enzyme receives an external stimulus, a new cytoskeletal signal is initiated. The other is that specific combinations of cytoskeletal signals in space and time turn a PU into a highly activated state, which in turn initiates a new signal. Each PU processes the signals sent from its eight neighboring PUs through the input block (Figure
Conceptual architecture of the input block.
The following explains how to implement signal transmission on an
A switch controlled by the interrupt block is used to regulate the signal flow from the input block to the integration block (Figure
The interrupt block (a) and the output block (b).
As mentioned earlier, there are three types of PUs for transmitting signals. In our implementation of signal integration, firing a neuron requires at least two different types of signals to rendezvous at a PU within a short period of time. In other words, a PU serves as a signal integrator that combines different signals in space and time. To capture this feature, two hypotheses are used. The first is that different PU types have different transmission speeds. The second is that an activated PU can influence the state of a neighboring PU through the MAP linking them together; that is, the latter makes a state transition when it receives a signal from the former.
We assume that a PU has six possible states: quiescent (
Transition rules of a PU. S, I, and W indicate a signal from a highly activated C1-, C2-, and C3-type PU, respectively. For example, if C1-type PU in the state
C1- type PU
C2- type PU
C3- type PU
In the present implementation, a signal traveling along C1-type PUs has the slowest speed, but also has the greatest degree of influence on the other two PU types. In contrast, a signal traveling along C3-type PUs has the fastest speed, but also has the least degree of influence on the other two PU types. The speed and the degree of influence of a C2-type signal are between those of a C1- and C3-type signal. We note that the degrees of influence between two different PU types are asymmetrical. For example, a C1-type PU in the state
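The speed/influence trade-off above can be made concrete with a small sketch. The numeric speeds and state-advance amounts below are purely illustrative stand-ins, not values from the model:

```python
# Illustrative stand-ins (not the model's actual values) for the trade-off
# described above: C1 signals are slowest but strongest (S), C3 signals are
# fastest but weakest (W), and cross-type influence is asymmetric.
SPEED = {"C1": 1, "C2": 2, "C3": 3}            # grid cells advanced per step
STRENGTH = {"C1": "S", "C2": "I", "C3": "W"}   # signal strength of each PU type

# Assumed asymmetric influence table: how many activation levels a receiving
# PU type climbs when it gets a signal of a given strength via a MAP link.
ADVANCE = {
    ("S", "C2"): 2, ("S", "C3"): 2,   # strong C1 signals move other types a lot
    ("I", "C1"): 1, ("I", "C3"): 1,
    ("W", "C1"): 0, ("W", "C2"): 1,   # weak C3 signals may not move a C1-type PU
}

def cross_fiber_transition(src_type, dst_type, dst_level):
    """New activation level of a MAP-linked neighbor after a cross-fiber signal."""
    return dst_level + ADVANCE.get((STRENGTH[src_type], dst_type), 0)

# A strong C1 signal advances a C3-type PU more than a weak C3 signal
# advances a C1-type PU, capturing the asymmetry.
print(cross_fiber_transition("C1", "C3", 0))  # → 2
print(cross_fiber_transition("C3", "C1", 0))  # → 0
```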
The integration block is the major component of the ANM design that integrates signals transmitting in space and time (Figure
Conceptual architecture of the integration block.
All of the digital circuit modules were designed in Verilog using Quartus II, a digital circuit design tool developed by the Altera Corporation (San Jose, CA). The final circuit designs were downloaded into an FPGA device produced by Altera.
The reference neuron scheme is basically a Hebbian model, in which the connection between two neurons is strengthened when they are active simultaneously. The model also has a hierarchical control feature, with which reference neurons are capable of assembling cytoskeletal neurons into groups for performing specific tasks. Orchestration is an adaptive process that varies the neurons in an assembly, selecting appropriate combinations of neurons to complete specific tasks. Currently, cytoskeletal neurons are divided into a number of comparable subnets. By comparable subnets, we mean that neurons in these subnets are similar in terms of their interneuronal connections and intraneuronal structures. Neurons in different subnets that have similar interneuronal connections and intraneuronal structures are grouped into a bundle.
Two levels of reference neurons are used to manipulate these bundles of neurons. The two levels form a hierarchical control architecture (Figure
Hierarchical inter-neuronal control architecture.
The connections between low-level reference neurons and cytoskeletal neurons are fixed. However, the connections between the high-level and low-level reference neuron layers are subject to change during evolutionary learning. This process is called orchestral learning.
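A minimal sketch of the Hebbian principle underlying the reference neuron scheme; the learning rate and the activity sequence are illustrative assumptions, not the model's actual values:

```python
# Minimal Hebbian sketch (illustrative parameters): the connection between a
# high-level and a low-level reference neuron is strengthened whenever the
# two are active in the same cycle.
def hebbian_update(w, hi_active, lo_active, lr=0.1):
    """Strengthen w only when both reference neurons fire together."""
    if hi_active and lo_active:
        return w + lr
    return w

w = 0.0
# (high-level active, low-level active) over three cycles
for hi, lo in [(1, 1), (1, 0), (1, 1)]:
    w = hebbian_update(w, hi, lo)
print(w)  # → 0.2 (strengthened twice; the middle cycle leaves w unchanged)
```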
Processing units are responsible for transmitting and integrating cytoskeletal signals. Evolution at the level of PU configurations is implemented by copying (with mutation) the PU configurations of neurons in the best-performing subnets to those of comparable neurons in the lesser-performing subnets. Variation is implemented by varying the PU configurations during the copy procedure. We note that different PU configurations exhibit different patterns of signal flows.
In the present implementation, the ANM system has 256 cytoskeletal neurons, which are divided into eight comparable subnets. As mentioned earlier, comparable subnets are similar in terms of their interneuronal connections and intraneuronal structures; thus, the neurons can also be grouped into 32 bundles. The copy process occurs among neurons in the same bundle. The initial patterns of readin enzymes, readout enzymes, MAPs, and PU types in the reproduction subnet are randomly decided; that is, the initial value of each bit is randomly assigned as 0 or 1. The evolutionary learning algorithm is shown in Algorithm
(1) Randomly initialize the patterns of each neuron in the reproduction subnet.
(2) Copy the patterns of each neuron in the reproduction subnet to those of comparable neurons in the competition subnets.
(3) Vary the patterns during the copy process.
(4) Evaluate the performance of each subnet.
(5) Copy the patterns of neurons in the best-performing subnet to those of comparable neurons in the reproduction subnet, if the former shows better performance than the latter.
(6) Repeat steps (2)-(5).
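The copy-and-vary learning cycle described above can be sketched as follows. The fitness function, bit-string encoding, and mutation rate are placeholders for illustration, not the ANM design's actual parameters:

```python
import random

# Sketch of the copy-with-variation learning cycle: the best-performing
# subnet's configuration is copied (with mutation) to the other subnets,
# while an unmutated copy preserves the best configuration (elitism).
random.seed(0)

N_SUBNETS, N_BITS = 8, 20

def fitness(config):
    return sum(config)  # placeholder fitness: count of 1-bits

def learning_cycle(subnets, mutation_rate=0.05):
    best = max(subnets, key=fitness)
    new = []
    for s in subnets:
        if s is best:
            new.append(list(best))  # keep an exact copy of the best subnet
        else:
            # copy the best configuration, flipping bits with some probability
            new.append([b ^ (random.random() < mutation_rate) for b in best])
    return new

subnets = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(N_SUBNETS)]
for _ in range(50):
    subnets = learning_cycle(subnets)
print(max(fitness(s) for s in subnets))  # elitism ensures fitness never decreases
```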
Evolution of reference neurons is implemented by copying (with mutation) the patterns of low-level reference neuron activities loaded by the most fit high-level reference neurons to less fit high-level reference neurons (details can be found in [
We applied the chip to the IRIS dataset, one of the best-known datasets in the pattern recognition literature. The dataset was taken from the machine learning repository at the University of California, Irvine. It contains 3 classes (Iris Setosa, Iris Versicolour, Iris Virginica) of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are not linearly separable from each other. Each instance has four parameters: sepal length, sepal width, petal length, and petal width.
The initial connections between these 4 parameters and cytoskeletal neurons were randomly decided, but subject to change as learning proceeded. Through evolutionary learning, each cytoskeletal neuron was trained to be a specific input-output pattern transducer. That is, each of these neurons became responsible for processing only a small subset of stimuli generated from these 4 parameters. We used five bits to encode each of these 4 parameters. In total, there were 20 bits required to encode all of them. For each parameter, the minimal and maximal values of these 150 instances were determined (to be denoted by MIN and MAX, resp.), and the difference between these two values was divided by 5 (to be denoted by INCR). The transformation of each actual parameter value (to be denoted by ACTUAL) into the corresponding 5-bit pattern was shown to be
Interface of the ANM design with the IRIS dataset.
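The 5-bit encoding can be illustrated with a sketch. Since the exact transformation formula is not reproduced in the text above, the binning scheme below (one bit per equal-width interval of [MIN, MAX]) is our assumed reading:

```python
def encode_5bit(actual, mn, mx):
    """Map an actual parameter value (ACTUAL) to a 5-bit pattern by dividing
    [MIN, MAX] into 5 equal bins of width INCR and setting the bit of the
    bin the value falls into (assumed reading; the paper's exact formula
    is not reproduced in the surrounding text)."""
    incr = (mx - mn) / 5                       # INCR = (MAX - MIN) / 5
    idx = min(4, int((actual - mn) // incr))   # clamp so ACTUAL == MAX maps to bin 4
    return [1 if i == idx else 0 for i in range(5)]

# Sepal length in the IRIS dataset spans roughly 4.3 to 7.9 cm.
print(encode_5bit(5.1, 4.3, 7.9))  # → [0, 1, 0, 0, 0]
```

Concatenating the four 5-bit patterns yields the 20-bit input described above.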
The proposed hardware architecture incorporated four parameter levels, PU types, MAPs, readin enzymes, and readout enzymes (denoted as P, M, I, and O, resp.), each of which could be turned on or off independently for evolutionary learning. We first studied how problem-solving capability (or evolvability) depended on each level of parameter change, and then investigated the effects of increasing the number of parameter levels opened to evolution. For each case of parameter changes (described later), five runs were performed. For each run, 100 of the 150 instances in the IRIS dataset were selected at random as the training set, and the remaining 50 instances formed the testing set. All results reported below are the average differentiation rates over five runs. We first trained the chip for 1,200 cycles (at which point learning appeared to slow down significantly), and then tested the trained chip with the testing set. A learning cycle takes about 30 seconds, so the total running time of this experiment is about 50 hours (30 s × 1,200 cycles × 5 runs). Compared with a BP neural network or an SVM, the time our system needs is much longer than expected. This is because the whole system was simulated on a computer, where simulation must proceed step by step. Once the hardware design is fully realized on a real digital chip, it might take only microseconds to accomplish the assigned tasks.
To investigate the significance of each parameter, we first allowed only one parameter to change during learning while evolution at the other three levels was turned off. In total, four experiments were performed. Among the four parameters, the chip achieved the highest recognition rate when only readin enzymes were allowed to evolve (91.6%), the second highest when only PU types were allowed (90.0%), the third highest when only MAPs were allowed (84.4%), and the lowest when only readout enzymes were allowed (83.6%). This provided some preliminary information about which levels of parameter change might be friendlier to evolution than others.
The following experiment was to study the manner in which problem-solving capability (or evolvability) depended on different combinations of parameter changes. We first increased the number of parameter changes to two (i.e., two parameters were allowed to evolve simultaneously). There were six combinations of two parameter changes. As shown in Figure
Learning performance of each learning mode.
For comparison, we applied SVM and BP to the same training and testing sets. As above, five runs were performed. In the former model, the average differentiation rates of the training and testing sets were 95.6% and 91.9%, respectively, while in the latter they were 96.7% and 91.7%. These results suggest that the chip's performance was comparable to that of either BP or SVM (Table
Average differentiation rates of different models with the Iris dataset.
Model | Training | Testing |
---|---|---|
BP neural network | 96.7% | 91.7% |
Support vector machine | 95.6% | 91.9% |
ANM | 94.8% | 94.0% |
This experiment tested the capability of the ANM design, after substantial learning, to tolerate noise. By gradually increasing the degree of noise, we observed the input/output relationship of the ANM design. For each test, we kept the system's structure unchanged but varied the pattern of signals sent to cytoskeletal neurons. If the system's outputs changed gradually with the extent of the pattern variations, this would in part support the claim that the system's structure has some degree of noise tolerance. The ANM design trained for 1,200 cycles was used.
In the following, we first tested the system with spatial noise imposed on the training patterns. To generate a test set, we made a copy of the training set but altered some bits during the copy process (changing a bit to "1" if it was "0" and to "0" if it was "1"). Five levels of variation were imposed during the copy process: 5%, 10%, 15%, 20%, and 25%. At the 5% level of variation, for example, each bit has a 5% chance of being altered. For each level of variation, ten test sets were generated. The total clock difference (TCD) value is the measure we used; it indicates the difference in firing times in the circuit. The TCD value increased only slightly when we imposed a 5% level of variation on the patterns, and even when we increased the variation level to 25%, the system still produced acceptable results. The TCD value increased from 38 to 310 at the 5% level of variation (an increase of 272), to 569 at the 10% level (an increase of 531), to 663 at the 15% level (an increase of 625), to 626 at the 20% level (an increase of 588), and to 1030 at the 25% level (an increase of 992). Dividing these increments by the TCD value before learning (i.e., 4781), the rate of TCD growth increased gradually as we augmented the noise levels (Table
Effect of increasing the degree of noise on the rate of the TCD value growth.
Level of variations imposed | 5.0% | 10.0% | 15.0% | 20.0% | 25.0% |
---|---|---|---|---|---|
Rate of the TCD value increased | 5.7% | 11.1% | 13.1% | 12.3% | 20.8% |
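The bit-flip noise procedure described above can be sketched as follows; the 20-bit pattern and the RNG seed are illustrative stand-ins:

```python
import random

# Sketch of the spatial-noise test: copy a training pattern and flip each
# bit independently with probability p (0 -> 1 and 1 -> 0).
def add_spatial_noise(pattern, p, rng):
    """Return a copy of pattern with each bit flipped with probability p."""
    return [b ^ (rng.random() < p) for b in pattern]

rng = random.Random(42)
clean = [1, 0] * 10                      # a 20-bit input pattern
noisy = add_spatial_noise(clean, 0.25, rng)
flips = sum(a != b for a, b in zip(clean, noisy))
print(flips)  # about 25% of the 20 bits on average, i.e. roughly 5 flips
```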
When a system runs in the real world, it inevitably confronts noise generated either by the environment or by the system itself. When noise occurs only temporarily, structural changes may not be necessary (i.e., the system may ignore the noise). By contrast, a system must alter its structure in response to noise that reflects a permanent change in the environment. In such a case, the system has to learn in a moving landscape, as environmental change occurs from time to time. Two main results were obtained in the noise tolerance experiment. One is that learning is more difficult in a noisy environment than in a noiseless one, and that the system is able to keep learning when noise occurs only temporarily. The other is that the system demonstrates a close structure/function relation. We note that noise occurring at different levels of the cytoskeletal structure has different degrees of influence on the outputs: as we gradually modify the structure, the outputs change accordingly. We also examined the system's outputs while gradually imposing noise in space and time on its input patterns; the output changed gradually (i.e., proportionally) as the degree of noise increased. An interesting result is that the output does not necessarily change accordingly as the degree of noise generated in time increases. Delaying a single signal may alter a neuron's firing activity, but this may not hold when several signals are delayed simultaneously, as these signals may still integrate at a later time (though this delays the firing timing). These results demonstrate that the system tolerates spatiotemporal changes in its inputs well, implying that it possesses an adaptive surface that facilitates evolutionary learning. With this feature, the ANM design can be applied to various real-world problems.
We note that the ability to separate patterns is clearly a prerequisite for pattern recognition. However, it is equally important to recognize a family of patterns that vary slightly in space and time. If a system is trained on a particular training set, any ability it has to respond correctly to noise-induced variations of this set is a form of generalization. The manner of generalization depends on the system's integrative dynamics (i.e., the flow of signals in the cytoskeleton), which is directly or indirectly influenced by a neuron's PU configuration. Generally speaking, the input patterns recognized by a neuron with internal dynamics are generalized in a more selective way than those of simple threshold neurons.
This paper was supported in part by the R.O.C. National Science Council (Grant NSC 98-2221-E-224-018-MY3).