Applying Neural Networks to Find the Minimum-Cost Coverage of a Boolean Function

Finding a minimal expression of a Boolean function includes a step that selects the minimum-cost cover from a set of implicants. Since this selection process is an NP-complete problem, finding an optimal solution is impractical for large input sizes. A neural network approach is used to solve this problem. We first formalize the problem, then define an "energy function" and map it to a modified Hopfield network, which automatically searches for the minima. Simulation of simple examples shows that the proposed neural network can obtain good solutions most of the time.

INTRODUCTION

Logic minimization is to simplify a Boolean function so that the implementation cost can be reduced. Among the different techniques, two-level minimization is the most well understood and is widely applied in the design of Programmable Logic Arrays (PLAs) [2]. This technique normally involves two basic procedures: the generation of Prime Implicants (PIs) and the selection of PIs. The generation of PIs can be exhaustive, as in the traditional Quine-McCluskey method [15] and in McBoole [6], or heuristic, as in Espresso [2]. Once the PIs are found, we need to select a subset that covers the function completely and has minimal total cost. This selection process is an NP-complete problem [9] and may take exponential computation time to obtain an optimal answer. Although certain pre-processing techniques, such as column reduction, row reduction [14], and partition [17], can reduce the size of the input table, there is no effective way to obtain the optimal solution from the reduced table.
A neural network is an interconnected network of a large number of simple processors [18]. Although neural networks are primarily used for information processing and biological modeling, it has been shown that they can collectively compute good solutions to a wide range of complex optimization problems, such as Traveling Salesman, 3-Satisfiability, etc. [11] [12] [20] [19]. Since it may become possible to implement a massive neural network on a VLSI chip [16], neural networks can be an alternative approach for obtaining answers to some "hard" computational problems. In this paper, we investigate the possibility of applying the neural network approach to solve the coverage problem. We first formalize this problem and define the "energy function", and then derive the network configuration accordingly.
The remainder of the paper is organized as follows: Section 2 gives an overview of neural networks and their application to optimization; Section 3 shows the formulation of the network; Section 4 discusses implementation issues and gives two examples; and the last section summarizes the study.

OVERVIEW OF OPTIMIZATION NEURAL NETWORKS
An artificial neural network is a network of a large number of simple computation elements, known as neurons, connected by links with variable weights. Instead of performing a program of instructions sequentially, a neural network explores many competing hypotheses simultaneously using its massively parallel net. A typical neuron is shown in Figure 1(a). The neuron's operation contains two basic steps. The first step calculates the activation $u$ of the neuron by summing the $n$ weighted incoming signals (i.e., for neuron $j$, it computes $\sum_{i=1}^{n} w_{ij} v_i(t) - \theta_j$ and assigns the value to $u_j(t+1)$). The second step passes the result through an output function $g(\cdot)$ (i.e., for neuron $j$, it computes $g(u_j(t+1))$ and assigns the value to $v_j(t+1)$). Normally, $g(\cdot)$ is a nonlinear sigmoid function, which is monotone increasing and bounded. A neuron can be extended to a "nonlinear" neuron if the summation in the first step is replaced by a more general nonlinear function $f(v_1(t), \ldots, v_n(t))$.

A neuron can also be implemented as an analog device with a nonlinear amplifier of gain $g(\cdot)$ [16]. The analog version of the neuron is shown in Figure 1(b). The two steps can now be described as $du_j(t)/dt = \sum_{i=1}^{n} w_{ij} v_i(t) - \theta_j$ and $v_j(t) = g_j(u_j(t))$. The time unit here is normalized to the RC time constant of the integrator. The analog neuron can also be extended to a nonlinear one by generalizing the first step: $du_j(t)/dt = f_j(v_1(t), \ldots, v_n(t))$. In the remaining discussion, we will use the analog version.

While the primary research interests in neural networks are concentrated in information processing and biological modeling, it has been shown that neural networks can collectively compute good solutions to complex optimization problems [19]. This approach comes from the observation that in certain properly designed networks, the dynamics of the network will force it to converge to a minimal energy state [10].
If we can map the problem solution to this minimal energy state, the network can "automatically" solve the optimization problem.
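To make the analog neuron model concrete, the following is a minimal simulation sketch, assuming a simple Euler discretization of the dynamics; the gain, step size, weights, and threshold values are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def g(u, gain=50.0):
    """High-gain sigmoid output function; monotone increasing and
    bounded, as required of g(.) in the text."""
    return 1.0 / (1.0 + np.exp(-gain * u))

def neuron_step(u_j, v, w_j, theta_j, dt=0.01):
    """One Euler step of the analog neuron:
        du_j/dt = sum_i w_ij * v_i(t) - theta_j,   v_j = g(u_j).
    Time is normalized to the RC constant of the integrator.
    u_j: scalar activation; v: outputs of the n incoming neurons;
    w_j: the n incoming weights; theta_j: threshold."""
    u_j = u_j + dt * (np.dot(w_j, v) - theta_j)
    return u_j, g(u_j)
```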
The procedure listed below (extended from [7]) outlines the basic steps of this approach:
1. Introduce a set of variables $\{v_1(t), v_2(t), \ldots, v_n(t)\}$ to represent the quantities to be optimized.
2. Determine an energy function $E(v_1(t), v_2(t), \ldots, v_n(t))$ as a measurement of "optimality". $E$ is a scalar function; it should be bounded below, and its minima should correspond to the desired solutions of the optimization problem.
3. Define the output function of each neuron, normally a high-gain sigmoid $v_i(t) = g(u_i(t))$.
4. Define the activation function of each neuron by choosing $du_i(t)/dt = -\partial E/\partial v_i(t)$.
5. Construct the network accordingly and let it evolve to a minimal energy state.
In this procedure, step (1) determines the number of required neurons, and steps (4) and (3) specify the activation function and output function, respectively. Once these entities are determined, the actual network can be constructed accordingly. By choosing $du_i(t)/dt = -\partial E/\partial v_i(t)$, we can show that $dE/dt \le 0$ (the proof is in the Appendix). The inequality implies that as time progresses, the energy of the network will decrease or remain the same. However, since $E$ is bounded below, the energy will eventually reach a minimal point (in a local sense). In other words, because of our choice of $E$, output function, and activation function, the dynamics of the network will force it to converge to a minimal energy state, which corresponds to the desired solution of the optimization problem. The purpose of the high-gain sigmoid function is to reduce the width of the "linear" region (the region with a value close to 0.5) and to ensure that the output moves away from the "undetermined" area. Since steps (1) to (5) only guarantee that the network converges to a local minimum, we may sometimes need to apply relaxation techniques, such as simulated annealing, to obtain a global minimum [1, 13]. Relaxation can be thought of as "coarse searching", which helps the network escape from local minima.
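One way to picture the relaxation idea is to inject a decaying noise term into the activation update; the sketch below is an illustrative stochastic-descent variant under an assumed exponential cooling schedule, not the specific method of [1, 13].

```python
import numpy as np

rng = np.random.default_rng()

def relaxed_step(u, grad_E, t, dt=0.01, T0=0.5, tau=5.0):
    """One noisy descent step: du/dt = -dE/dv plus additive noise
    whose variance decays with time (a crude annealing schedule).
    Early noise lets the state hop out of shallow local minima;
    as the temperature cools, the update approaches pure descent."""
    temperature = T0 * np.exp(-t / tau)   # assumed cooling schedule
    noise = rng.normal(0.0, np.sqrt(temperature), size=u.shape)
    return u + dt * (-grad_E + noise)
```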

FORMULATION

In logic minimization, the set of PIs and the vertices to be covered are normally represented as a table, with rows as PIs and columns as vertices. For each column, a check mark is placed in a row if that vertex is covered by the corresponding PI. Each PI has an associated positive number representing the cost of implementing it (such as the number of literals). The goal of the coverage process is to select a set of PIs so that there is at least one check mark in each column and the total cost of the set is minimal.

Mathematically, we can use a matrix and a cost vector to describe the PI-vertex table. Let $m$ and $n$ be the numbers of vertices and PIs. The cost vector is $(c_j)$, $1 \le j \le n$, where $c_j$ is the cost associated with the $j$th PI and $c_j > 0$. The relation of vertices and PIs is described by a matrix $[a_{ji}]$, where $a_{ji} \in \{0, 1\}$, $1 \le j \le n$, $1 \le i \le m$; $a_{ji} = 1$ if the $j$th PI covers the $i$th vertex and $a_{ji} = 0$ otherwise. After introducing these two entities, we can formalize the coverage problem as an optimization problem [3]: let $\{x_1, x_2, \ldots, x_n\}$ be a set of variables, where $x_j \in \{0, 1\}$ and $x_j = 1$ if $P_j$ is selected; assign proper values to the $x_j$ so that

$$\text{minimize } \sum_{j=1}^{n} c_j x_j$$
$$\text{subject to } \sum_{j=1}^{n} a_{ji} x_j \ge 1, \quad i = 1, \ldots, m.$$

The first line represents the total cost that needs to be minimized, and the second line represents the $m$ constraints, which state that for each column there must be at least one check mark.

Formal Description of the Network

With the formal description, a neural network can be constructed as follows:
1. Use $n$ neurons to represent the set $\{x_1, x_2, \ldots, x_n\}$ and use the output of the $i$th neuron, $v_i$, to represent the value of $x_i$.
2. Define the energy function $E$ as (a code sketch follows this list)
$$E = \sum_{j=1}^{n} c_j v_j + D \sum_{i=1}^{m} F\Bigl(\sum_{j=1}^{n} a_{ji} v_j - 1\Bigr), \qquad F(y) = \begin{cases} y^2 & \text{if } y < 0 \\ 0 & \text{if } y \ge 0. \end{cases}$$
3. Define the output function as a high-gain sigmoid function.
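For concreteness, the energy function of step 2 can be written down directly; this is a minimal sketch assuming NumPy arrays for the cost vector and coverage matrix, with the index convention stated in the comments.

```python
import numpy as np

def F(y):
    """Penalty kernel: F(y) = y^2 if y < 0, and 0 if y >= 0."""
    return np.where(y < 0.0, y * y, 0.0)

def energy(v, c, A, D):
    """E = sum_j c_j v_j + D * sum_i F(sum_j a_ji v_j - 1).
    v: length-n neuron outputs; c: length-n PI costs (c_j > 0);
    A: n x m 0/1 matrix with A[j, i] = a_ji; D: penalty weight."""
    slack = v @ A - 1.0            # slack_i = sum_j a_ji v_j - 1
    return c @ v + D * F(slack).sum()
```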

The Energy Function
There are two major terms in the energy function: the first term, $\sum_j c_j v_j$, represents the total cost of the selected PIs, and the second term, $D \sum_i F(\sum_j a_{ji} v_j - 1)$, represents the penalty for the violation of constraints. The first term is always greater than 0 because $c_j$ is always positive. The second term is greater than or equal to 0, depending on whether the constraints are satisfied. It returns a positive value if any constraint is violated (i.e., for some $i$, $\sum_j a_{ji} v_j < 1$) and returns 0 if all constraints are satisfied (i.e., for all $i$, $\sum_j a_{ji} v_j \ge 1$).
There is a constant scaling factor $D$ in the penalty term. The value of $D$ can be interpreted as the penalty for the violation of a single constraint and can be used to specify the relative weights of the cost term and the penalty term. Since it is more desirable to obtain a valid sub-optimal solution than an invalid solution, any constraint violation should be given a larger penalty; $D$ is used to assure this.
Our choice of $D$ comes from the following observation. $\sum_{j=1}^{n} c_j$ can be interpreted as the cost incurred when all PIs are selected. This is a valid solution, since all the vertices are covered. It is also the worst valid solution, since its cost is larger than that of any other valid solution. In other words, $\sum_{j=1}^{n} c_j$ represents the cost (and energy value as well) of the worst valid solution. Since any invalid solution is worse than the worst valid solution, its energy value should be larger than $\sum_{j=1}^{n} c_j$. Thus, we choose $D > \sum_{j=1}^{n} c_j$ and guarantee that a constraint violation will always be penalized more.

Note that the calculation of $f(\sum_{j=1}^{n} a_{ji} v_j - 1)$, where $f$ is the derivative of $F$, resembles the operation of a regular neuron, except that the output function is $f(\cdot)$ and there is no integrator. Thus, we can introduce a set of slightly modified neurons to perform this task. The detailed implementation is explained by the following example.
The coverage table of the first example is shown in Table 1; it has 4 PIs and 5 vertices. In this example, there are $2^4 = 16$ possible selections in total, and $\{P_2, P_4\}$ is the optimal solution.
The implementation derived from Section 3.1 is shown in Figure 2, where $D$ is chosen to be 11.1. In this configuration, the activation function (denoted by a square box) is complicated and hard to implement. Further, the computation of $D \sum_{i=1}^{m} (-a_{ji})\, f(\sum_{k=1}^{n} a_{ki} v_k - 1)$ is duplicated in every neuron.
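To see where this expression comes from, one can carry out the differentiation $du_j(t)/dt = -\partial E/\partial v_j(t)$ on the energy function explicitly; here $f$ denotes the derivative of the penalty kernel $F$:

```latex
\frac{du_j}{dt} = -\frac{\partial E}{\partial v_j}
               = -c_j + D \sum_{i=1}^{m} (-a_{ji})\,
                 f\!\left(\sum_{k=1}^{n} a_{ki} v_k - 1\right),
\qquad
f(y) = \frac{dF}{dy} = \begin{cases} 2y & \text{if } y < 0,\\
                                     0  & \text{if } y \ge 0. \end{cases}
```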
A better alternative implementation is shown in Figure 3. In this configuration, we simplify the implementation of the original neurons by distributing the computation of the activation functions to a new group of neurons. There are two groups of neurons in this implementation. The first group contains $n$ neurons; its input part is now very simple, in standard linear summation form, similar to a conventional neuron. The second group contains $m$ neurons, each of which performs the computation of $f(\sum_{j=1}^{n} a_{ji} v_j - 1)$. They are regular neurons with $f$ as their output function and without the integrators.

FIGURE 3 The Alternative Implementation of the Proposed Network
This network is similar to the one used in [20]. Although the two types of neurons look similar, their operating speeds need to be different. Since the neurons in the second group are not involved in the network energy, they are essentially just "computational elements" that compute the activation functions. Their computation should be completed before any neuron in the first group changes to a new output value. Therefore, the neurons in the second group should operate much faster than the neurons in the first group. We may need to add additional capacitance at the inputs of the first-group neurons to ensure that they have a proper time constant.
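A software simulation of this two-group configuration can be sketched as follows. The fast second-group neurons are modeled as instantaneous function evaluations inside each Euler step, reflecting their much shorter time constant; the step size and gain are illustrative assumptions.

```python
import numpy as np

def f(y):
    """Output function of the second-group neurons: f = dF/dy."""
    return np.where(y < 0.0, 2.0 * y, 0.0)

def g(u, gain=50.0):
    """High-gain sigmoid output function of the first-group neurons."""
    return 1.0 / (1.0 + np.exp(-gain * u))

def simulate(c, A, D, u0, dt=0.01, steps=1000):
    """Euler integration of the two-group network.
    First group (n neurons): du_j/dt = -c_j - D * sum_i a_ji * f(slack_i).
    Second group (m neurons): evaluated instantly within each step,
    reflecting their much shorter time constant.
    c: length-n costs; A: n x m matrix with A[j, i] = a_ji;
    u0: initial activations.  Returns the final outputs v."""
    u = u0.astype(float).copy()
    for _ in range(steps):
        v = g(u)
        slack = v @ A - 1.0                  # second group, one per vertex
        u += dt * (-c - D * (A @ f(slack)))  # first group: energy descent
    return g(u)
```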
The operation of the network is simulated in software. Fifty randomly generated initial conditions, in which each $u_j$ is uniformly distributed over $[-0.5, 0.5]$, are used. Simulation results show that all of them converge to the correct value, and the convergence normally takes less than one time constant. A typical convergence trace is shown in Figure 4.
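Building on the sketch above, the 50-trial experiment can be outlined as follows; note that the coverage tables of Tables 1 and 2 are not reproduced in this text, so the arrays c, A and the weight D must be supplied from the actual data.

```python
rng = np.random.default_rng()

def run_trials(c, A, D, n_trials=50):
    """Repeat the simulation from random initial activations u_j,
    uniform over [-0.5, 0.5], and tally the PI selections found."""
    tally = {}
    for _ in range(n_trials):
        u0 = rng.uniform(-0.5, 0.5, size=len(c))
        v = simulate(c, A, D, u0)
        selection = tuple(np.flatnonzero(v > 0.5) + 1)  # 1-based PI indices
        tally[selection] = tally.get(selection, 0) + 1
    return tally
```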
The coverage table of the second example is shown in Table 2, which is adapted from [15].
$\{P_2, P_3, P_6\}$ and $\{P_1, P_2, P_5\}$ are the two optimal selections. This is a difficult case, since the table is cyclic and no heuristic rules, such as essential PIs, dominance, etc., can be used to reduce the complexity. $D$ is chosen to be 12.1. Again, 50 randomly generated initial conditions are used. Simulation results show that 46% converge to $\{P_2, P_3, P_6\}$, 40% converge to $\{P_1, P_2, P_5\}$, and 14% converge to indeterminate states (with some output values near 0.5). The convergence is normally within one time constant. A typical convergence trace is shown in Figure 5.

SUMMARY
In this paper, we apply the neural network approach to obtain the minimal-cost coverage of a given PI-vertex table. We first formalize the problem and derive the energy function, and then map it to a neural network. The actual implementation includes two groups of neurons: one group serves as regular neurons that work toward the minimal energy state, and the other group serves as computational elements that assist in the calculation of the activation functions. Simulations on simple examples show that the neural network can obtain good solutions in a short amount of time.
In its current form, the major problem of this approach is the required computation time. Simulating the neural network essentially amounts to using numerical analysis to solve a set of nonlinear ordinary differential equations, which is computation intensive. This limits the total number of PIs to a small number and makes the approach infeasible for large practical problems.
However, this scheme may become an attractive alternative when analog neural network VLSI is available [16]. The VLSI device would eliminate the need for time-consuming numerical simulation. In the future, it may be possible to include a generic programmable neural network as an "accelerator" chip in a regular CAD system, and to use the chip to assist in solving the "core computation" of certain optimization problems.

APPENDIX

We show that with the choice $du_i(t)/dt = -\partial E/\partial v_i(t)$ and $v_i(t) = g(u_i(t))$, the energy $E$ is non-increasing along the network dynamics:

$$\frac{dE}{dt} = \sum_{i=1}^{n} \frac{\partial E}{\partial v_i}\frac{dv_i}{dt} = -\sum_{i=1}^{n} \frac{du_i}{dt}\frac{dv_i}{dt} = -\sum_{i=1}^{n} \bigl(g^{-1}(v_i)\bigr)' \Bigl(\frac{dv_i}{dt}\Bigr)^2.$$

Since $g(\cdot)$ is monotone increasing, $g^{-1}(\cdot)$ is monotone increasing and $(g^{-1}(\cdot))' \ge 0$. Thus,

$$\frac{dE}{dt} \le 0.$$
FIGURE 1 Diagram of a Single Neuron

FIGURE 2 First Implementation of the Proposed Network

FIGURE 4 A Representative Convergence Trace for Example 1

FIGURE 5 A Representative Convergence Trace for Example 2

TABLE 1 The Coverage Table for the First Example

TABLE 2 The Coverage Table for the Second Example