Robust Template Decomposition without Weight Restriction for Cellular Neural Networks Implementing Arbitrary Boolean Functions Using Support Vector Classifiers

If the given Boolean function is linearly separable, a robust uncoupled cellular neural network can be designed as a maximal margin classifier. On the other hand, if the given Boolean function is linearly separable but has a small geometric margin, or if it is not linearly separable, a popular approach is to find a sequence of robust uncoupled cellular neural networks implementing the given Boolean function. In past research using this approach, the control template parameters and thresholds were restricted to a given finite set of integers, a restriction that is unnecessary for template design. In this study, we remove this restriction. Minterm- and maxterm-based decomposition algorithms utilizing the soft margin and maximal margin support vector classifiers are proposed to design a sequence of robust templates implementing an arbitrary Boolean function. Several illustrative examples are simulated to demonstrate the efficiency of the proposed method by comparing our results with those produced by other decomposition methods with restricted weights.


Introduction
Cellular neural networks (CNNs) are large-scale nonlinear circuits composed of locally connected cells, which were introduced in 1988 by Chua and Yang [1,2]. CNNs have a wide variety of applications in the fields of dynamic systems and signal processing [3-9]. The analysis of the dynamic behavior of CNNs without feedback interconnections from neighboring cells, namely the uncoupled CNNs, is a popular research topic. A main feature of uncoupled CNNs is that the binary steady-state output, expressed in terms of the binary input of the CNN, can be represented by a linearly separable Boolean function (LSBF) [1,2].
In the study of uncoupled CNNs, most of the elementary applications can be derived and analyzed via Boolean functions, and this is directly related to the CNN template parameters. For LSBFs, Chen et al. [10] developed an essential relationship among the template, offset levels, a basis of the binary input vector set, and a neat truth table of the corresponding Boolean functions. In their work, they found a criterion for LSBFs that depends only on symbolic relations among the outputs of the Boolean functions. In [11,12], the authors proposed an efficient method for implementing an LSBF and successfully realized all the 1882 and 94572 LSBFs by designing the corresponding CNN templates of 4 and 5 input variables, respectively. However, since the number of Boolean functions increases exponentially with the number of input variables, their method can hardly be extended.
The robustness indicator of a CNN is a measure quantifying the degree to which the parameters of a CNN can be perturbed while still producing the desired output. The definition of robustness may differ from context to context. Rigorous definitions of absolute and relative robustness of CNNs were given in [13]. A theoretical upper bound for relative robustness was derived, and it was shown that the absolute robustness can be arbitrarily increased by template scaling. CNN autoassociative memories were designed in [14,15] using the particle swarm optimization method, with the robustness of the designed memories taken into consideration.

Mathematical Problems in Engineering
The dynamic behavior of a class of third-order competitive CNNs depending on two parameters was investigated in [16]. For the special class of a one-parameter family of symmetric CNNs, the authors discussed the robustness of complete stability with respect to nonsymmetric perturbations of the neuron interconnections. In the VLSI implementation of the CNN-UM (CNN Universal Machine), the template values usually deviate from the ideal template values for numerous reasons. Therefore robust design is crucial to guarantee correct outputs. To make the chip react as an ideal CNN structure, the adaptive simulated annealing (ASA) method, a chip-specific optimization method, was proposed in [17] to automatically tune the template values. The result of the optimization process is the template least sensitive to the actual chip instance.
Recently, some machine learning techniques have been applied to CNN applications. The recurrent fuzzy cellular neural network (RFCNN) was proposed for automatically constructing a multiple-CNN integrated neural system in [18-20]. The proposed RFCNN can automatically learn the network structure and the parameters simultaneously for uncoupled or coupled CNNs. The structure learning includes the creation of fuzzy rules of the fuzzy neural network and of CNNs with pattern clustering algorithms. The parameter learning consists of tuning the fuzzy membership functions and the CNN templates.
For an arbitrarily given Boolean function, Chen et al. [21,22] proposed a generalization of Rosenblatt's perceptron model (the universal perceptron) with a DNA-like learning and decomposition algorithm. Their papers considered all Boolean functions via a single-layer perceptron using the DNA-like learning algorithm. The proposed algorithms first train the DNA-like offset sequence and then decompose the given linearly inseparable Boolean function into a sequence of LSBFs, with logic XOR as the conjunction of the sequence of LSBFs. On the other hand, the CFC decomposition method, proposed by Crounse et al. [23], can be used to find a sequence of uncoupled CNNs implementing the given Boolean function, whether linearly separable or inseparable. In the CFC method, the entries of the control templates of the required uncoupled CNNs are restricted to {0, ±1} and the thresholds are integers. The conjunctions of the sequence of CNNs are traditional logic operators. The method is a brute-force one, yet it is simple and easy to implement.
It was pointed out in [24] that the geometric margin of a linear classifier with respect to a training data set, a notion borrowed from machine learning theory, can conveniently be used to define the robustness of an uncoupled CNN implementing a linearly separable Boolean function. A larger geometric margin indicates better robustness against perturbations in both the template parameters and the input data. Consequently, the so-called maximal margin classifiers (MMCs) can be devised to provide the most robust template design for uncoupled CNNs implementing linearly separable Boolean functions. An uncoupled CNN is said to be robust if it is a maximal margin classifier; that is, its template values are the solution of a maximal margin classification problem.
The decomposition method proposed by Lin et al. [24], abbreviated as the LHJ method in this study, extends the CFC method in the sense that the entries of the control templates are restricted to the more general sets {0, ±1, ±2} or {0, ±1, ±2, ±3}, and all robust CNNs with template entries belonging to these weight-restricted constraint sets are characterized. Hence, under essentially the same search mechanism as in the CFC method, a much simpler search space is considered, consisting of all possible robust CNNs with template values belonging to the weight-restricted set. Extending the restricted weights beyond {0, ±1, ±2, ±3} is computationally expensive because the number of candidate templates becomes enormous.
We wish to point out that the entries of the control template of a robust CNN need not be restricted to a set of small-magnitude values. For instance, the uncoupled CNN with the following control template and threshold:  = (25 22

Uncoupled CNN
Consider a standard $M \times N$ CNN. In this study, we consider exclusively the most commonly used 3 × 3 neighborhood for each cell $C(i, j)$. Thus the CNN parameters may conveniently be represented by a triple $(A, B, z)$, where $A$ and $B$ are the 3 × 3 feedback and control templates, respectively, and $z$ is the threshold value. In this study, we consider uncoupled CNNs, that is, CNNs without feedback interconnections from neighboring cells. With static binary inputs $u_1, \dots, u_9$ to the cell, the steady-state output $y_{ij}(\infty)$ of $C(i, j)$ in such a CNN can be calculated explicitly. See Theorem 6.1 in [25].
It is well known that a (local) Boolean function $h(u_1, \dots, u_9)$ of nine variables is realizable by every cell of an uncoupled CNN if and only if $h(\cdot)$ can be expressed as the sign of a discriminant function whose weights and offset are real constants and in which $u_i \in \{1, -1\}$, $i \in \{1, \dots, 9\}$, is the $i$th Boolean variable. See Theorem 6.2 in [25]. It is important to note that the discriminant function is an affine-linear function of $u \in \mathbb{R}^9$. Thus implementing a Boolean function by an uncoupled CNN can be regarded as a linear classification problem.
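As a minimal sketch of this linear-classification view, the following Python snippet evaluates a cell output of the form sgn(⟨b, u⟩ + z) over all 2^9 = 512 binary input patterns. The names `cell_output`, `b`, and `z`, the sign convention for the threshold, and the example weights are assumptions made for illustration only.

```python
from itertools import product

def cell_output(b, z, u):
    """Sign of the affine discriminant <b, u> + z (assumed convention)."""
    s = sum(bi * ui for bi, ui in zip(b, u)) + z
    return 1 if s > 0 else -1

# Hypothetical template: logical AND of all nine inputs.
b = [1.0] * 9
z = -8.0

# Truth table over all 512 input patterns with entries in {-1, +1}.
table = {u: cell_output(b, z, u) for u in product((-1, 1), repeat=9)}
ones = [u for u, y in table.items() if y == 1]
```

With these hypothetical weights, only the all-ones input is mapped to 1, so the cell realizes a single minterm.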

Support Vector Classifiers
Since maximal margin and soft margin support vector classifiers will be utilized in our proposed decomposition methods, we briefly review the support vector classifiers in this section.
Let $X \subseteq \mathbb{R}^n$ and $Y := \{1, -1\}$. Suppose we are given the training set

$$S := \{(x_1, y_1), \dots, (x_\ell, y_\ell)\} \subseteq X \times Y. \tag{7}$$

The training set $S$ in (7) is said to be linearly separable if there is a hyperplane $H_{w,b}$ of the form

$$f_{w,b}(x) := \langle w, x \rangle + b = 0 \tag{8}$$

that correctly classifies all training data. By treating the truth table of a given Boolean function as the training set with $\ell = 512$ training data, this training set must be linearly separable in order for the Boolean function to be realizable by an uncoupled CNN.
The geometric margin of the classifier $f_{w,b}(\cdot)$ with respect to the training set $S$ is defined by [28]

$$\gamma_S(w, b) := \min_{1 \le i \le \ell} \frac{y_i \left( \langle w, x_i \rangle + b \right)}{\| w \|}. \tag{9}$$

This will be used as the robustness indicator of an uncoupled CNN implementing a linearly separable Boolean function [24]. A larger geometric margin indicates better robustness against perturbations in both the template parameters and the input data.
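The geometric margin can be computed directly from its definition as the smallest signed distance of a training point to the hyperplane. The sketch below is illustrative: the function name `geometric_margin`, the three-variable truth table, and the hyperplane parameters are assumptions, not values taken from the paper.

```python
import math
from itertools import product

def geometric_margin(w, b, xs, ys):
    """Minimum over the data of y * (<w, x> + b) / ||w||."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(xs, ys)
    )

# Hypothetical three-variable AND function with entries in {-1, +1}.
xs = list(product((-1, 1), repeat=3))
ys = [1 if x == (1, 1, 1) else -1 for x in xs]

margin = geometric_margin([1, 1, 1], -2, xs, ys)
```

A positive value means every training point is classified correctly; a negative value signals at least one misclassified point.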
Let the Boolean function be linearly separable. The maximal margin classifier (MMC) is obtained by solving the optimization problem (10a) and (10b). Given a solution of (10a) and (10b), the optimal weights are given by (11) and the optimal discriminant function by (12). Using the KKT conditions of the optimization problem (10a) and (10b), the corresponding margin can be expressed in closed form [28]. Clearly, the maximal margin classification problem (10a) and (10b) has no solution if the training data of a Boolean function are linearly inseparable. Note that once the maximal margin is obtained, the outputs of the optimal discriminant function retain the same positive or negative signs even when the template parameters and the input data are perturbed. Since the optimal separating hyperplane has the maximal geometric margin, the best robustness is achieved for the designed CNN.
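For orientation, one standard primal statement of the hard-margin problem in the support vector literature is sketched below. The symbols $w$, $b$, $x_i$, $y_i$, and $\ell$ are assumed notation, and this form is not necessarily the one used in (10a) and (10b).

```latex
\min_{w,\,b}\ \tfrac{1}{2}\,\langle w, w \rangle
\quad \text{subject to} \quad
y_i \bigl( \langle w, x_i \rangle + b \bigr) \ge 1,
\qquad i = 1, \dots, \ell .
```

In this formulation the constraints are feasible only for linearly separable data, and a minimizer $(w^*, b^*)$ attains the geometric margin $1 / \| w^* \|$.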
To allow for misclassifications of training data, we now introduce the slack variables for classification problems.
Let $\gamma > 0$ be given. The slack variable $\xi_i$ of an example $(x_i, y_i)$ with respect to the hyperplane $H_{w,b}$ and target margin $\gamma$ is defined by [28]

$$\xi_i := \max \bigl( 0,\ \gamma - y_i ( \langle w, x_i \rangle + b ) \bigr). \tag{14}$$

See Figure 1. The 2-norm soft margin classifiers (SVC2) will be used in this study. The 1-norm soft margin classifiers (SVC1) may also be used. In soft margin classification problems, the target margin $\gamma$ is set to the canonical value of 1.
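The slack variable of a single example can be evaluated as in the sketch below, which follows the margin slack definition of [28]. The function name `slack`, the two-variable XOR data, and the hyperplane parameters are illustrative assumptions.

```python
def slack(w, b, x, y, gamma=1.0):
    """Amount by which (x, y) fails to reach functional margin gamma."""
    functional_margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, gamma - functional_margin)

# Hypothetical data: two-variable XOR, which is not linearly separable.
xs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
ys = [-1, 1, 1, -1]

# Hypothetical hyperplane realizing OR; only the last point is misclassified.
w, b = (1, 1), 1
slacks = [slack(w, b, x, y) for x, y in zip(xs, ys)]
```

A slack of zero means the point meets the target margin, a value between 0 and the target margin means it is correctly classified but too close to the hyperplane, and a larger value means it is misclassified.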
The SVC2 is obtained by solving the optimization problem (16a) and (16b). Here, the regularization (or smoothing) parameter $C > 0$ controls the tradeoff between the complexity of the machine and the number of nonseparable points. Given a solution of (16a) and (16b), the optimal weights and discriminant function are given by (11) and (12), respectively. Moreover, the slack variables $\xi_i$, $i = 1, \dots, \ell$, are given by (17).
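For orientation, a common primal form of the 2-norm soft margin problem is sketched below. The notation is assumed, the scaling of the two terms varies between references, and this form is not necessarily the one used in (16a) and (16b).

```latex
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\,\langle w, w \rangle
  + \tfrac{C}{2} \sum_{i=1}^{\ell} \xi_i^{2}
\quad \text{subject to} \quad
y_i \bigl( \langle w, x_i \rangle + b \bigr) \ge 1 - \xi_i,
\qquad i = 1, \dots, \ell .
```

Unlike the hard-margin problem, this problem is feasible for any training set, which is why it can be solved before knowing whether the Boolean function is linearly separable.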

Minterm and Maxterm CNNs
To understand our proposed decomposition algorithm, we first introduce the minterm and maxterm Boolean functions [24].
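Every minterm (or maxterm) Boolean function is linearly separable [25]. In the truth table of a minterm (resp., maxterm) Boolean function, let $u_i^{\min}$ (resp., $u_i^{\max}$), $i \in \{1, \dots, n\}$, be the values (−1 or 1) of the input variables corresponding to the row with output 1 (resp., −1). Then the optimal discriminant function with maximum (geometric) margin $1/\sqrt{n}$ is given by

$$f^{\min}(u) = \sum_{i=1}^{n} u_i^{\min} u_i - (n - 1) \quad \Bigl(\text{resp., } f^{\max}(u) = -\sum_{i=1}^{n} u_i^{\max} u_i + (n - 1) \Bigr).$$

For the usual case of $n = 9$, the optimal discriminant function with maximum (geometric) margin $1/\sqrt{9}$ is given by

$$f^{\min}(u) = \sum_{i=1}^{9} u_i^{\min} u_i - 8 \quad \Bigl(\text{resp., } f^{\max}(u) = -\sum_{i=1}^{9} u_i^{\max} u_i + 8 \Bigr).$$

Table 1: Simple Boolean function.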
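The stated margin $1/\sqrt{n}$ of a minterm can be checked numerically. The sketch below assumes the canonical scaling in which the weight vector equals the minterm's input pattern and the offset is $-(n - 1)$; the names `minterm_discriminant` and `u_min` and the chosen pattern are hypothetical.

```python
import math
from itertools import product

def minterm_discriminant(u_min):
    """Weights and offset of a canonical minterm classifier (assumed form)."""
    n = len(u_min)
    return list(u_min), -(n - 1)

u_min = (1, -1, 1)  # hypothetical input pattern whose output is 1
w, b = minterm_discriminant(u_min)

# Discriminant value at every input pattern with entries in {-1, +1}.
values = {
    u: sum(wi * ui for wi, ui in zip(w, u)) + b
    for u in product((-1, 1), repeat=len(u_min))
}
margin = min(abs(v) for v in values.values()) / math.sqrt(len(w))
```

The discriminant is 1 at the minterm's own pattern and at most −1 elsewhere, so the functional margin is 1 and the geometric margin is $1/\sqrt{3}$ for $n = 3$.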
A CNN which implements a minterm (resp., maxterm) Boolean function in each cell is called a minterm (resp., maxterm) CNN.Our proposed algorithm is based on the following realization theorem [25].

Theorem 2. Every local Boolean function of $n$ variables can be realized by ORing (in minterm decomposition) or ANDing (in maxterm decomposition) at most $2^n$ uncoupled CNNs.
Let us consider for instance the Boolean function with $n = 3$ given in the truth table shown in Table 1, where $u_1$, $u_2$, and $u_3$ are the input variables and $y$ is the output variable.
Table 1 can also be conveniently represented by a figure listing only the output values, where black and white dots represent logic values 1 and −1, respectively. Since there are five 1s in the output, according to the realization theorem, the Boolean function can be decomposed by ORing five minterm CNNs, each implementing a minterm Boolean function that has a "1" in one output position and "−1"s elsewhere, with "∨" denoting the OR logic operator. It is desirable to shorten the length of the minterm decomposition. A natural idea is to group some minterms so that the resulting Boolean function is still linearly separable. Furthermore, we require that the uncoupled CNN realizing the resulting Boolean function be robust in the sense that it is a maximal margin classifier. For instance, a shorter decomposition into combined Boolean functions can produce exactly the same Boolean function: The first and second combined Boolean functions  min ( The problem is how to group appropriate minterms.
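The minterm realization can be illustrated with a short sketch that splits a truth table into minterm functions and recovers it by ORing them. The truth table `outputs` below is hypothetical (it merely has five 1s) and is not the content of Table 1; the helper names are also assumptions.

```python
from itertools import product

def minterm(u_min):
    """Boolean function that is 1 at u_min and -1 everywhere else."""
    return lambda u: 1 if tuple(u) == tuple(u_min) else -1

def logical_or(values):
    """OR over logic values encoded as +1 (true) and -1 (false)."""
    return 1 if any(v == 1 for v in values) else -1

inputs = list(product((-1, 1), repeat=3))
outputs = [1, -1, 1, 1, -1, 1, -1, 1]  # hypothetical table with five 1s

# One minterm per row whose output is 1.
minterms = [minterm(u) for u, y in zip(inputs, outputs) if y == 1]

# ORing all minterms reproduces the original truth table.
rebuilt = [logical_or(m(u) for m in minterms) for u in inputs]
```

Grouping several of these minterms into one linearly separable function shortens the decomposition, which is the goal of the algorithms described next.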

Decomposition Algorithms
In this section, we describe our proposed minterm- and maxterm-based decomposition methods. Before proceeding, we give some motivation for the minterm-based decomposition algorithm. First notice that the maximal margin classification problem (10a) and (10b) has no solution if the training data of a Boolean function are not linearly separable. Suppose now we are given a general Boolean function.
To find the first template in our proposed algorithm, we start from the working Boolean function, which is initially the given Boolean function, now thought of as its minterm decomposition. Since we do not know whether the Boolean function is linearly separable or not, we solve (16a) and (16b) for an SVC2, which is guaranteed to have a solution. This results in an optimal separating hyperplane, but some data may be misclassified. Let $I_{\text{remain}}$ be the set of all indices of data with output value 1 in the working Boolean truth table. If there are no misclassified data, then the algorithm stops because the working Boolean function is linearly separable, and we may find the corresponding robust CNN by solving the maximal margin classification problem (10a) and (10b). Otherwise suppose that there are misclassified data, and that the $i$th datum with $i \in I_{\text{remain}}$ attains the maximal slack variable among all data with indices in $I_{\text{remain}}$. The key idea is that this datum may represent a point that is rather difficult for the current hyperplane to classify correctly. In this case, the minterm with output value 1 in the $i$th datum is dropped from the working Boolean function. This process is repeated until we find a linearly separable working Boolean function or end up with a single minterm, which is certainly linearly separable.
The preceding discussion can be summarized as the following top-down (or pruning) algorithm, which is based on the minterm realization theorem. Note that the entries of the Boolean truth table are either +1 or −1.

Goal. A sequence of robust templates implementing the given Boolean truth table.
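Step 1. Choose the regularization parameter $C > 0$ and the desired upper bound, between 0 and 1 inclusive, for the slack variables $\xi_i$.
Step 5.2. $I_{\text{index}} \leftarrow I_{\text{index}} \setminus I_{\text{remain}}$, $I_{\text{remain}} \leftarrow I_{\text{index}}$, and go to Step 3.
Step 6. If $\#(I_{\text{remain}}) > 1$, from the current training data with the current target set, solve the soft margin classification problem. Define the index set $I_{\text{miss}}$ from the resulting slack variables and the upper bound chosen in Step 1.
Step 7. If $I_{\text{miss}} = \emptyset$, increment the template counter. Then the next robust template is obtained by solving the maximal margin classification problem with the current target set.
Step 7.1. Form the new target set by setting $y_i = -1$ for all $i \in I_{\text{remain}}$.
Step 7.2. $I_{\text{index}} \leftarrow I_{\text{index}} \setminus I_{\text{remain}}$, $I_{\text{remain}} \leftarrow I_{\text{index}}$, and go to Step 3.
Step 8.2. Form the new target set by setting $y_i = -1$ for all $i \in I_{\text{delete}}$.
Step 8.3. $I_{\text{remain}} \leftarrow I_{\text{remain}} \setminus I_{\text{delete}}$ and go to Step 3.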
The given Boolean function is realized by ORing the Boolean functions corresponding to the robust templates generated by the algorithm. In the worst case, the algorithm yields templates that are all minterms.
Note that in Step 6, if the upper bound equals 1, the points with $\xi_i \ge 1$ (including the case $\xi_i = 1$, corresponding to the points on the separating hyperplane) must be included in the set $I_{\text{miss}}$. However, if the upper bound equals 0, the points with $\xi_i = 0$ (corresponding to the points achieving the target functional margin of value 1) must not be included in the set $I_{\text{miss}}$. In Step 8.1, the indices to be deleted from $I_{\text{remain}}$ are all indices (possibly all indices except one) in the set $I_{\text{temp}}$.
Different values of the regularization parameter $C$ and of the desired upper bound in Step 1 affect the length and margins of the decomposition. In general, the choice of both parameters is problem-dependent. In our experience, the choice of $C \ge 1$ and an upper bound of 1 in Step 1 works well in most cases we have encountered. However, the algorithm may result in a long sequence of robust templates or in poor robustness of some templates if both parameters are small.
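The pruning idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the algorithm of the paper: a plain perceptron with an epoch cap (hypothetical value 200, adequate for three inputs) stands in for both the SVC2 and the maximal margin solver, so the resulting hyperplanes are not maximal margin, and exactly one minterm is dropped per iteration. All function names are hypothetical.

```python
from itertools import product

def train_perceptron(xs, ys, epochs=200):
    """Classic perceptron; reports success only after an error-free pass."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(xs, ys):
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False

def slack(w, b, x, y):
    """Slack with respect to target margin 1."""
    return max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))

def minterm_prune(xs, ys):
    """Group the 1-rows of a truth table into linearly separable sets."""
    pending = {i for i, y in enumerate(ys) if y == 1}
    groups = []
    while pending:
        remain = set(pending)
        while True:
            target = [1 if i in remain else -1 for i in range(len(xs))]
            w, b, separable = train_perceptron(xs, target)
            if separable or len(remain) == 1:
                break
            # Drop the minterm that the current hyperplane fits worst.
            worst = max(sorted(remain), key=lambda i: slack(w, b, xs[i], 1))
            remain.discard(worst)
        groups.append(sorted(remain))
        pending -= remain
    return groups

xs = list(product((-1, 1), repeat=3))
parity = [1 if x[0] * x[1] * x[2] == 1 else -1 for x in xs]
groups = minterm_prune(xs, parity)
```

ORing the Boolean functions defined by the returned groups reproduces the original truth table. For the three-variable parity function used here, no two minterms can be merged, so the sketch falls back to the worst case of one minterm per group.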

Notice that we may also state a similar algorithm based on the maxterm realization theorem, namely, the maxterm-based decomposition algorithm. In this algorithm, we simply change $y_i = -1$ to $y_i = 1$ in Steps 3, 5.1, 7.1, and 8.2 of the minterm-based decomposition algorithm.

Illustrative Examples
In this section, we provide six illustrative examples. In Example 1, the Boolean function is generated from the so-called Game of Life, and those in Examples 2-6 are given by the (minimal CNN) truth tables [25], where black and white dots represent logic values 1 and −1, respectively. In the following simulations, the maxterm-based decomposition algorithm using SVC2 with both Step 1 parameters set to 1 is implemented in Examples 1-3, and the minterm-based decomposition algorithm using SVC2 with Step 1 parameter values 1 and 2 is implemented in Examples 4-6. We wish to point out that the template values in the following simulations are rounded to integers. All the simulated results obtained from the proposed decomposition algorithms are compared with those obtained from the LHJ algorithm with the weight-restricted set {0, ±1, ±2, ±3}.
Example 1. Consider a two-dimensional matrix of cells. The Boolean function for the Game of Life, which is not linearly separable, is generated from local rules (a)-(c), which update the state value of a cell according to the number of its neighbors whose state values are 1. The LHJ and maxterm-based decompositions are shown in Tables 2 and 3, respectively. As can be seen, both algorithms produce essentially the same results, including the length and margins of the decompositions.
Example 2. Consider the Boolean function given by the truth table of Figure 2. The LHJ and maxterm-based decompositions are shown in Tables 4 and 5, respectively. It is seen that
Example 3. Let the Boolean function be given by the truth table shown in Figure 3. The LHJ decomposition is shown in Table 6 and the maxterm-based decomposition is given in Table 7. It is observed that the length of the maxterm-based decomposition is much shorter than that of the LHJ decomposition, but the minimal margin of the templates produced by the maxterm-based algorithm is slightly smaller.
Example 4. Consider the Boolean function given by the truth table of Figure 4. The resulting decomposition by the LHJ algorithm is given in Table 8. As shown in Table 9, the length and the minimal margin of the minterm-based decomposition are the same as those of the LHJ decomposition.
Example 5. Let the Boolean function be given by the truth table shown in Figure 5. The LHJ and minterm-based decompositions are shown in Tables 10 and 11, respectively. Note that the length of the minterm-based decomposition is shorter than that of the LHJ decomposition. Note also that the first templates in both decompositions have small margins.
Example 6. Consider the Boolean function given by the truth table of Figure 6. The resulting decompositions by the LHJ and minterm-based algorithms are given in Tables 12 and 13, respectively. Again, the minterm-based decomposition has shorter length but smaller margins. In Table 13, it is seen that the first Boolean function has a smaller margin than the other Boolean functions in the minterm-based decomposition.

Conclusion
In this study, simple minterm- and maxterm-based decomposition algorithms utilizing soft margin and maximal margin classifiers have been proposed to design a sequence of robust templates implementing an arbitrary Boolean function. In contrast to past research, the control template parameters and thresholds in our approach are not restricted to a given finite set of integers.
Several numerical examples have been provided to illustrate the use of the proposed method. The CFC and LHJ decomposition methods are in general faster than the proposed minterm- and maxterm-based decomposition methods. However, in many cases, the minterm- and maxterm-based decomposition methods produce a shorter sequence of robust templates. To further speed up the decomposition algorithm, a natural idea is to make use of more logic functions as conjunctions for the sequence of robust templates, instead of only the OR operator in the minterm-based decomposition or the AND operator in the maxterm-based decomposition. This constitutes an interesting future research topic. Furthermore, it is worthwhile to investigate whether one can devise a decomposition algorithm such that each resulting robust template in the decomposition has a margin greater than or equal to a prespecified value. This constitutes another interesting future research topic.

Figure 1: Slack variables for linear classification problems.
In Figure 1, $\xi_i$ measures the amount by which the example $(x_i, y_i)$ fails to have margin $\gamma$ with respect to the hyperplane $H_{w,b}$. Consequently, the quantity defined in (15) measures the amount by which the training set fails to have margin $\gamma$ and takes into account any misclassifications of the training data.
Local rules for the Game of Life (Example 1):
(a) If there are two neighbors whose state values are 1, then set the state value to 1.
(b) If there are three neighbors whose state values are 1, then retain the state value.
(c) For situations apart from (a) and (b), set the state value to −1.