TOPS: A Target-Oriented Partial Scan Design Package Based on Simulated Annealing

C.P. RAVIKUMAR and H. RASHEED
Department of Electrical Engineering, Indian Institute of Technology, New Delhi 110016, INDIA

(Received December 3, 1992, Revised March 22, 1993)

In this paper, we describe algorithms based on Simulated Annealing for selecting a subset of flip-flops to be connected into a scan path. The objective for selection is to maximize the coverage of faults that are aborted by a sequential fault simulator. We pose the problem as a combinatorial optimization, and present a heuristic algorithm based on Simulated Annealing. The SCOAP testability measure is employed to assess the selection of flip-flops during the course of optimization. Our algorithms form a part of an integrated design package, TOPS, which has been designed as an enhancement to the OASIS standard-cell design automation system available from MCNC. We discuss the TOPS package and its performance on a number of ISCAS'89 benchmarks. We also present a comparative evaluation of the benchmark results.

Key Words: Design for Testability, Partial Scan Design, Target Faults, Simulated Annealing

1 INTRODUCTION

In designing digital integrated circuits, a popular method for achieving testability is full scan design, where the flip-flops in the circuit are threaded into a chain, which can perform as a shift register when the circuit is placed in the test mode. A test vector is shifted serially into the shift register, and the response of the circuit is shifted out serially for observation [1]. One of the serious drawbacks of full scan design is the area overhead that results due to

- the extra wiring required to thread the flip-flops into a chain
- the extra space occupied by flip-flops when they are modified into scan cells.

In order to reduce the overhead of full scan and still maintain the advantages offered by the scan methodology, several authors have designed techniques for partial scan design where only a subset of flip-flops are scanned. In order to guide the selection of as small a subset as possible, the following approaches have been considered in the literature. In all these approaches, the essential idea is to exploit the underlying structure of the circuit; a graph called S-graph is derived which captures the structural information. An S-graph has one node corresponding to each of the flip-flops in the original circuit, and a directed edge from node i to node j if (and only if) there exists a combinational path from the flip-flop i to flip-flop j in the circuit. In other words, an edge (i, j) in S-graph represents the combinational logic that separates the output of flip-flop i from the input of flip-flop j.

- Cheng and Agrawal observed that a circuit is poorly testable if its S-graph has long cycles [3]. When a flip-flop in the circuit is converted into a scan cell, the operation corresponds to deleting the corresponding node in the S-graph. In order to achieve high testability, the S-graph must be rendered acyclic by deleting as few nodes as possible. We refer to this acyclic graph as the A-graph in this paper. This problem, also called the feedback vertex set problem, is known to be NP-complete. Cheng and Agrawal gave heuristic algorithms to obtain small feedback vertex sets.

- Gupta and Breuer introduced the concept of balance in a sequential circuit in addition to the concept of acyclicity [6]. The advantage of transforming the circuit structure into a balanced acyclic structure is that a combinational test vector generator can be employed to generate test patterns for the partially scanned circuit.

- Some authors believe that poor testability of a circuit may be ascribed to its large sequential depth. The sequential depth of a circuit is defined as the length of the longest path in the A-graph. Thus, Lin and Reddy [8] considered the following problem: delete the fewest nodes from the S-graph such that the graph becomes acyclic and the length of the longest path in the A-graph is minimized. The authors described a two-step heuristic algorithm for the problem; in the first step, cycles in the S-graph are removed by deleting nodes, and in the second step, the sequential depth is reduced by deleting more nodes. Each deleted node forms part of the final partial scan chain.
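The structural approaches listed above all operate on the S-graph. As a minimal sketch (the example graph, the flip-flop names `f1` to `f4`, and the helper names are hypothetical and not taken from the cited papers), the following Python fragment deletes the scanned nodes, checks whether the remaining graph is acyclic, and reports its sequential depth as the longest path length.

```python
from collections import defaultdict

def remove_scanned(nodes, edges, scanned):
    """Scanning a flip-flop deletes its node (and incident edges) from the S-graph."""
    kept_nodes = [n for n in nodes if n not in scanned]
    kept_edges = [(u, v) for u, v in edges if u not in scanned and v not in scanned]
    return kept_nodes, kept_edges

def longest_path_if_acyclic(nodes, edges):
    """Return the longest path length (in edges), or None if the graph has a cycle."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    depth = {n: 0 for n in nodes}
    ready = [n for n in nodes if indegree[n] == 0]
    processed = 0
    while ready:  # topological traversal; nodes on a cycle never become ready
        u = ready.pop()
        processed += 1
        for v in successors[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if processed != len(nodes):
        return None
    return max(depth.values(), default=0)

# Hypothetical S-graph: f1 -> f2 -> f3 -> f1 is a cycle, and f3 also feeds f4.
nodes = ["f1", "f2", "f3", "f4"]
edges = [("f1", "f2"), ("f2", "f3"), ("f3", "f1"), ("f3", "f4")]
a_nodes, a_edges = remove_scanned(nodes, edges, scanned={"f3"})
print(longest_path_if_acyclic(nodes, edges))      # cyclic before scanning
print(longest_path_if_acyclic(a_nodes, a_edges))  # depth of the resulting A-graph
```

In this toy graph, scanning the single flip-flop `f3` breaks the only cycle, which is the kind of selection the feedback-vertex-set heuristics look for.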

An entirely different approach may be considered for partial scan. If a sequential test generation algorithm is available, one could attempt to run this algorithm on the unscanned circuit to generate test vectors within a certain time limit to detect as many faults as possible. The remaining faults, also called aborted faults or target faults, are the only ones which need to be addressed by the partial scan mechanism. In a similar setup, one may use a random test vector generator and a sequential fault simulator to generate a list of target faults.

Chickermane and Patel observed empirically that hard-to-detect faults tend to lie in strongly connected components (SCCs) of the S-graph [4]. As a result, the partial scan system proposed by these authors examines the SCCs in the S-graph and computes a profit function \( p_i \) for each node \( i \). If a node \( i \) is part of cycles \( i_1, i_2, \ldots, i_k \), the profit \( p_i \) obtained by scanning the node \( i \) is given by

\[
p_i = \sum_{j=1}^{k} W(i_j)
\]

where \( W(i_j) \) indicates the weight of cycle \( i_j \). In turn, the weight of a cycle is the number of hard-to-detect faults that lie on the cycle. The objective in [4] is to select cells for scanning, such that the cumulative profit is maximized without exceeding the upper bound on the cost of scanning. The authors presented heuristic techniques to obtain good solutions to the above optimization problem.
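The profit definition can be illustrated with a small Python sketch. The cycles, their weights, and the function name `scan_profit` are hypothetical; the code only restates the summation above and is not the implementation used in [4].

```python
def scan_profit(node, weighted_cycles):
    """p_i: sum of W(c) over every cycle c that contains node i."""
    return sum(weight for members, weight in weighted_cycles if node in members)

# Each entry is (set of flip-flops on the cycle, number of hard-to-detect faults on it).
weighted_cycles = [
    ({"f1", "f2", "f3"}, 5),
    ({"f2", "f4"}, 2),
    ({"f3"}, 1),  # a self-loop on f3
]
profits = {ff: scan_profit(ff, weighted_cycles) for ff in ("f1", "f2", "f3", "f4")}
print(profits)
```

Here `f2` lies on two cycles and therefore collects the weight of both.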

In this paper, we apply the Simulated Annealing algorithm [7] to the partial scan design problem. Unlike the references cited above, our algorithm does not rely on structural properties such as cycles, weighted cycles, sequential depth, or strongly connected components. Instead, we regard the search space in a uniform manner when looking for a solution to the partial scan problem. In our approach, the structural information is used to evaluate the testability of a configuration during the course of annealing. In the following section, we describe the details of the algorithm. In Section 4, we discuss the results obtained by applying our technique to several ISCAS benchmark circuits. We also implemented a greedy algorithm to compare the results obtained through the annealing procedure. The greedy algorithm and its results are also discussed in Section 4. Our algorithms are integrated into a package named TOPS, which enhances the OASIS design automation system [9] available from MCNC by providing partial scan capability. We discuss the salient features of TOPS in Section 3.

2 PARTIAL SCAN FOR TARGET-FAULT DETECTION

The Simulated Annealing algorithm [7] has been widely applied to a number of optimization problems in Design Automation such as floorplanning, partitioning, placement, and routing (see [10], [11]). The annealing algorithm is similar to an iterative improvement algorithm in nature, with the exception that inferior states are also accepted with a certain probability. This probability is a function of a parameter known as the temperature, which is decreased slowly during the course of the algorithm. At higher temperatures, the annealing algorithm accepts inferior states with a high probability, hence behaving like a random search procedure. As the temperature is lowered, the acceptance probability for inferior states drops, and the procedure attains greedy characteristics at temperatures close to 0. The reader may refer to [7] for a complete description of the annealing algorithm.

The partial scan design problem may be phrased in terms of a state-space search as follows. A state, or configuration, consists of a subset of cells. Let \( n \) be the total number of flip-flops in the circuit. A subset of \( k \) flip-flops can be selected in \( \binom{n}{k} \) ways, and hence there are \( \sum_{k=1}^{n} \binom{n}{k} = 2^n - 1 \) possible solutions to the partial scan problem. A perturbation of a state consists of deleting a flip-flop from the present configuration, or adding a flip-flop to the present configuration, or both. The cost of a configuration is the area overhead that results from scanning the flip-flops which correspond to the present configuration. Our cost measure consists of two components.

- the increase in functional area
- an estimate of the increase in wiring area due to scan path

The profit of a configuration is measured by the SCOAP testability index of the configuration [5], as extended to the case of target faults. We will describe our cost and profit function in more detail later in this section.
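The state space and the perturbation move described above can be sketched in Python. The function names, the flip-flop labels, and the choice to keep every configuration non-empty are assumptions made for this illustration.

```python
import random
from math import comb

def count_configurations(n):
    """Number of non-empty subsets of n flip-flops, i.e. 2**n - 1."""
    return sum(comb(n, k) for k in range(1, n + 1))

def perturb(config, all_ffs, rng):
    """Return a neighbouring configuration: delete one flip-flop, add one, or both."""
    config = set(config)
    outside = sorted(set(all_ffs) - config)
    moves = []
    if len(config) > 1:          # assumption: never delete the last flip-flop
        moves.append("delete")
    if outside:
        moves.append("add")
    if config and outside:
        moves.append("swap")     # "both": one deletion plus one addition
    if not moves:
        return frozenset(config)
    move = rng.choice(moves)
    if move in ("delete", "swap"):
        config.remove(rng.choice(sorted(config)))
    if move in ("add", "swap"):
        config.add(rng.choice(outside))
    return frozenset(config)

rng = random.Random(0)
flip_flops = ["f1", "f2", "f3", "f4"]
print(count_configurations(len(flip_flops)))
print(sorted(perturb({"f1", "f2"}, flip_flops, rng)))
```

Because a swap removes a member and adds a non-member, every move yields a configuration that differs from the current one and changes its size by at most one.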

The procedure Anneal is outlined in Figure 1. The annealing schedule is given by the initial temperature $T_0$, the final temperature $T_f$, the cooling rate $\alpha$, the number of iterations per temperature $M$, and the rate $\beta$ at which $M$ is increased progressively over temperatures. The initial configuration $S_k$ is also an input to the procedure; it consists of a randomly selected $k$-sized subset of the flip-flops in the circuit. The function perturb returns a new configuration $S'_k$ by perturbing the subset $S_k$ as explained earlier. The new configuration is accepted under two conditions.

- If both the profit and cost parameters of the new configuration are better, then $S'_k$ is accepted.
- When the new configuration is inferior in either the cost measure or the profit measure, or both, then the Metropolis criterion [7] is separately applied to both cost and profit terms. $S'_k$ is accepted if both the Metropolis criteria succeed.
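The two acceptance conditions and the schedule parameters can be sketched in Python as follows. This is an illustration rather than the TOPS source (which the paper states is written in C): the toy profit, cost, and perturbation functions and all numeric parameter values are hypothetical.

```python
import math
import random

def accept(delta_profit, delta_cost, temperature, rng):
    """Acceptance rule with a separate Metropolis test for profit and for cost."""
    if delta_profit > 0 and delta_cost < 0:
        return True
    # Exponents are clamped at 0: a probability of 1 always passes, and
    # math.exp cannot overflow when one of the two terms is strongly favourable.
    p_profit = math.exp(min(0.0, delta_profit / temperature))
    p_cost = math.exp(min(0.0, -delta_cost / temperature))
    return rng.random() < p_profit and rng.random() < p_cost

def anneal(state, profit, cost, perturb, t0, tf, alpha, beta, m, rng):
    """Geometric cooling by alpha; trials per temperature grow by beta."""
    temperature, trials = t0, float(m)
    while temperature > tf:
        for _ in range(int(trials)):
            candidate = perturb(state, rng)
            delta_profit = profit(candidate) - profit(state)
            delta_cost = cost(candidate) - cost(state)
            if accept(delta_profit, delta_cost, temperature, rng):
                state = candidate
        temperature *= alpha
        trials *= beta
    return state

# Toy stand-ins (hypothetical): profit counts selected "useful" items, cost is subset size.
USEFUL = frozenset({0, 1})

def toy_perturb(state, rng):
    return state ^ {rng.randrange(5)}  # toggle one of five items

result = anneal(frozenset({4}), lambda s: len(s & USEFUL), len, toy_perturb,
                t0=2.0, tf=0.05, alpha=0.8, beta=1.1, m=20, rng=random.Random(1))
print(sorted(result))
```

Requiring both Metropolis tests to succeed makes the rule stricter than a single test on a combined objective, which is one way to avoid choosing a weighting between profit and cost.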

2.1 Calculating the Profit

The profit function is an implementation of the SCOAP testability analysis procedure. In the SCOAP terminology, $SC^0[x]$ indicates the sequential 0-controllability of a line $x$. $SC^1[x]$ is defined similarly. The sequential observability of a line $x$ is denoted by $SO[x]$. The conventional SCOAP testability index for a sequential circuit is given by

$$ U = \sum_x \left( SC^0[x] + SC^1[x] + SO[x] \right) \quad (2) $$

where the summation is carried over all lines $x$. In our work, since we are mainly interested in target faults, we define an alternate testability index $T$. Let $\delta^r(x)$ be a 0-1 function which evaluates to $1$ if and only if there exists a target fault of the form $x$-stuck-at-$r$.

$$ T = \sum_l \delta^1(l) \cdot SC^0[l] + \sum_l \delta^0(l) \cdot SC^1[l] + \sum_l \left( \delta^0(l) + \delta^1(l) - \delta^0(l) \cdot \delta^1(l) \right) \cdot SO[l] \quad (3) $$

where the summation is carried over all lines $l$.
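A target-fault index of this kind can be sketched in Python. The sketch assumes one particular reading of the index: a target fault $l$-stuck-at-$r$ contributes the $(1-r)$-controllability of $l$, and the observability of $l$ is counted once even when both stuck-at faults on $l$ are targets. The data layout (a set of `(line, r)` pairs and three dictionaries) and all values are hypothetical.

```python
def target_testability_index(sc0, sc1, so, target_faults):
    """Sum SCOAP measures over the lines that carry at least one target fault."""
    total = 0
    for line in {line for line, _ in target_faults}:
        if (line, 1) in target_faults:
            total += sc0[line]   # detecting stuck-at-1 requires driving the line to 0
        if (line, 0) in target_faults:
            total += sc1[line]   # detecting stuck-at-0 requires driving the line to 1
        total += so[line]        # observability counted once per line
    return total

sc0 = {"a": 2, "b": 5}
sc1 = {"a": 3, "b": 4}
so = {"a": 1, "b": 6}
targets = {("a", 0), ("b", 0), ("b", 1)}
print(target_testability_index(sc0, sc1, so, targets))
```

Line `a` contributes 3 + 1 and line `b` contributes 5 + 4 + 6 in this example.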

```
procedure Anneal(S_k, T_0, T_f, α, β, M);
(* S_k is the initial configuration, with k flip-flops.
   T_0 is the initial temperature. T_f is the final temperature.
   α and β are the cooling parameters.
   M is the number of trials attempted at any temperature. *)

begin
    T := T_0;
    while (T_f < T) do begin
        for i := 1 to M do begin
            S'_k := perturb(S_k);
            Δ_p := profit(S'_k) - profit(S_k);
            Δ_c := cost(S'_k) - cost(S_k);
            (* Metropolis test applied separately to profit and cost *)
            if ((Δ_p > 0) and (Δ_c < 0))
                or ((random < exp(Δ_p / T))
                    and (random < exp(-Δ_c / T))) then
                S_k := S'_k;
        end;
        T := T · α;
        M := M · β;
    end
end
```

FIGURE 1 Simulated Annealing Algorithm for Partial Scan Selection

Scanning a flip-flop $F$ affects the $T$ index in two ways.

- the controllabilities of lines that are reachable from the output of $F$ may improve, and
- the observabilities of lines which lead to the input of $F$ may improve.

Let $\gamma(i, j)$ be a 0-1 function which evaluates to 1 if and only if there exists a directed path from line $i$ to line $j$. Let $I$ and $O$ respectively indicate the input and output of flip-flop $F$. Let $T^{sc}$ denote the testability index for the partially scanned circuit, where $S$ is the set of flip-flops selected for scan. The expression for $T^{sc}$ is given below.

$$T^{sc} = \sum_{F \in S} \Big[ \sum_{l} \gamma(O, l) \cdot \delta^1(l) \cdot SC^0[l] + \sum_{l} \gamma(O, l) \cdot \delta^0(l) \cdot SC^1[l] + \sum_{l} \gamma(l, I) \cdot \left( \delta^0(l) + \delta^1(l) - \delta^0(l) \cdot \delta^1(l) \right) \cdot SO[l] \Big]$$

Here $\delta^r(l)$ evaluates to 1 if and only if $l$-stuck-at-$r$ is a target fault, and each inner summation is carried over all lines $l$. The reader should note that while evaluating $T^{sc}$, the line observabilities $SO[l]$ and controllabilities $SC^0[l]$ and $SC^1[l]$ must be recomputed for the partially scanned circuit. For simplicity, we have used the same notation to indicate the observabilities and controllabilities for both unscanned and scanned circuits.

The profit function computes the difference $T^{sc} - U$ for a given circuit and a given subset of $k$ flip-flops selected for partial scan. This is done in two steps.

1. A forward breadth-first search identifies the lines whose controllabilities are affected by the partial scan. If the search process encounters a line $l$ such that $l$-stuck-at-$r$ is a target fault, then the procedure accumulates the value $SC^{1-r}[l]$ into $PSC^{1-r}$ (the sequential $(1-r)$-controllability of the partially scanned circuit).

2. A backward breadth-first search identifies the lines whose observabilities are affected by the partial scan. If, during the search, the procedure encounters a line $l$ such that $l$-stuck-at-$r$ is a target fault, then the procedure accumulates $SO[l]$ into $PSO$. Here, $PSO$ indicates the sequential observability of the partially scanned circuit.

It is clear that $T^{sc} = PSO + PSC^0 + PSC^1$. The unscanned testability index $U$ is computed by the program once initially. The worst-case time complexity of calculating the profit function is linear in the number of nodes of the circuit.
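The two searches can be sketched in Python for a single scanned flip-flop. The netlist representation (a fanout dictionary keyed by line name), the line names, and the SCOAP values are hypothetical, and the values are assumed to have been recomputed already for the partially scanned circuit. The sketch also does not stop the search at other flip-flops, which a real implementation may need to do.

```python
from collections import deque

def bfs(start, neighbours):
    """Return every line reachable from start (inclusive) via the neighbours map."""
    seen = {start}
    queue = deque([start])
    while queue:
        line = queue.popleft()
        for nxt in neighbours.get(line, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(fanout):
    """Build the fanin map needed for the backward search."""
    fanin = {}
    for src, dsts in fanout.items():
        for dst in dsts:
            fanin.setdefault(dst, []).append(src)
    return fanin

def accumulate(fanout, ff_input, ff_output, targets, sc0, sc1, so):
    """Return (PSC0, PSC1, PSO) accumulated over the target faults of one flip-flop."""
    forward = bfs(ff_output, fanout)            # lines whose controllability may change
    backward = bfs(ff_input, reverse(fanout))   # lines whose observability may change
    psc = [0, 0]
    for line, r in targets:
        if line in forward:
            psc[1 - r] += (sc0, sc1)[1 - r][line]
    pso = sum(so[line] for line in {l for l, _ in targets} if line in backward)
    return psc[0], psc[1], pso

fanout = {"a": ["b"], "b": ["D"], "Q": ["c"], "c": ["d"]}  # D, Q: flip-flop input, output
targets = {("c", 0), ("d", 1), ("a", 1)}
sc0 = {"a": 1, "c": 2, "d": 3}
sc1 = {"a": 1, "c": 4, "d": 5}
so = {"a": 6, "c": 7, "d": 8}
print(accumulate(fanout, "D", "Q", targets, sc0, sc1, so))
```

Each search visits a line at most once, which is consistent with the linear worst-case bound stated above.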

2.2 Computing the Cost

The computation of the cost of a configuration depends necessarily on the implementation technology and the layout style. In our work, the target technology is 2μ SCMOS, and the layout style is standard cell. The size of the cell which implements the unscanned flip-flop is 58.0 × 64.0λ². The scanned cell takes an area of 58 × 72.0λ². The functional area overhead due to scanning a single cell is therefore 464λ² units. An increase in wiring area is also incurred due to partial scan; this area is estimated as follows. We generate a placement $P$ of the unscanned circuit using the available standard-cell placement tool. We assume that the same placement of cells is used in the scanned circuit as well. Under this assumption, we have implemented an estimator for the increase in wiring area. This estimator first calculates the increase in track density for each channel due to the extra wiring required to implement the scan path. The order in which the scan cells are connected into a scan path is crucial in the above calculation. However, the problem of determining the best ordering is an instance of the Travelling Salesperson Problem, and is hence computationally difficult. We generate a good heuristic solution to the problem and use this ordering to estimate the increase in the channel track-densities. The cumulative increase in track densities, multiplied by the width of a single track, is used as an estimate of the wiring overhead.
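The cost arithmetic above can be checked with a short Python sketch. The cell dimensions come from the text; the track-density increases and the track width passed in the example are hypothetical, and summing the two components into one cost value is an assumption of the sketch.

```python
UNSCANNED_CELL = (58.0, 64.0)  # cell dimensions in lambda, from the text
SCANNED_CELL = (58.0, 72.0)

def functional_overhead(num_scan_cells):
    """Extra cell area (lambda^2) from replacing flip-flops by scan cells."""
    per_cell = (SCANNED_CELL[0] * SCANNED_CELL[1]
                - UNSCANNED_CELL[0] * UNSCANNED_CELL[1])
    return num_scan_cells * per_cell

def wiring_overhead(track_increase_per_channel, track_width):
    """Cumulative increase in channel track density times the width of one track."""
    return sum(track_increase_per_channel) * track_width

def configuration_cost(num_scan_cells, track_increase_per_channel, track_width):
    return (functional_overhead(num_scan_cells)
            + wiring_overhead(track_increase_per_channel, track_width))

print(functional_overhead(1))                    # overhead of a single scan cell
print(configuration_cost(3, [1, 0, 2], 8.0))     # three scan cells, three channels
```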

2.3 A Greedy Algorithm

In this section, we describe a greedy approach to the partial scan design problem. We use this procedure to generate a good initial solution to the selection problem; this initial solution is passed on as input to the Anneal procedure. As a result, annealing can begin at a relatively low temperature. We found this method very effective in reducing the total computational requirements of Simulated Annealing.

The essential idea behind the procedure Greedy is to rank each of the flip-flops individually by its target testability improvement index which we define below. Given a sequential circuit with $n$ flip-flops $f_1, f_2, \ldots, f_n$, the target testability improvement index of a flip-flop $f_i$ is defined as

$$t(f_i) = T^{sc(f_i)} - U \quad (4)$$

procedure Greedy(n, limit, C);
(* n is the number of flip-flops in the circuit. limit is the upper bound on the area overhead that can be tolerated. C is the circuit description. *)
begin
    for i := 1 to n do begin
        S := {i};
        R(i).ff := i;
        R(i).gain := profit(S);
    end;
    sort(R); (* Ascending Order *)
    P := {};
    totalcost := 0;
    for i := 1 to n do begin
        P := {R(i).ff} ∪ P;
        totalcost := totalcost + FOVHD + wire_ovhd(P);
        if totalcost ≥ limit then return(P);
    end
end

FIGURE 2 Greedy Algorithm for Partial Scan Selection. FOVHD is the functional area overhead contributed by a single scan cell. wire_ovhd is a procedure which estimates the wiring area overhead for a given subset P of flip-flops.

In other words, $t(f_i)$ measures the improvement in target testability obtained by scanning only the flip-flop $f_i$. Since the functional area overhead resulting from scanning any of the flip-flops is the same, a greedy strategy for scan selection is to pick those flip-flops with the highest values for $t$. The upper limit on the area overhead is used to guide the number of selected flip-flops. The complete procedure is shown in Figure 2. If the target faults are distributed uniformly over the circuit and not clustered in a small region, the greedy algorithm is likely to perform well. We discuss the experimental results on the greedy algorithm in Section 4.
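A runnable counterpart of the greedy idea can be sketched in Python. This is only an illustration under stated assumptions: `gain` and `wire_overhead` are hypothetical stand-ins for the profit function and the wiring estimator, and the numeric values are invented. The sketch assumes that a larger gain is better, recomputes the cost of the whole candidate set instead of accumulating it, and stops before the limit is exceeded; these are choices of the sketch, not details confirmed by Figure 2.

```python
def greedy_select(flip_flops, gain, wire_overhead, fovhd, limit):
    """Pick flip-flops in decreasing order of individual gain while the cost fits."""
    ranked = sorted(flip_flops, key=gain, reverse=True)
    selected = []
    for ff in ranked:
        candidate = selected + [ff]
        cost = fovhd * len(candidate) + wire_overhead(candidate)
        if cost > limit:
            break
        selected = candidate
    return selected

# Hypothetical per-flip-flop gains and a toy wiring estimator.
gains = {"f1": 5, "f2": 9, "f3": 1}
chosen = greedy_select(
    flip_flops=list(gains),
    gain=gains.get,
    wire_overhead=lambda subset: 100.0 * len(subset),
    fovhd=464.0,
    limit=1200.0,
)
print(chosen)
```

With these numbers two scan cells cost 1128 and three would cost 1692, so the selection stops after the two highest-ranked flip-flops.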

3 THE TOPS PACKAGE

The algorithms discussed in the previous section have been coded in the C programming language on a Sun SPARC workstation. In this section, we discuss the overall organization of the TOPS package. As mentioned earlier, the TOPS package is designed to interface with the OASIS design automation software from MCNC [9]. The input to the package is the structural description of a sequential circuit given in either the ISCAS format, HILO format, or the VPNR format. VPNR (Vanilla Place and Route) is a circuit description language developed by MCNC. The advantage of using VPNR is that it allows us to describe the circuit at various levels of detail [9]. If the input is provided in ISCAS or HILO format, we internally convert it into VPNR format (see Figure 3). The VPNR description can be compiled into a layout using two programs cplrt and dglrt, which generate a standard-cell placement and routing, respectively. The layout is generated using unscanned flip-flops (cell \texttt{dr2s}).

FIGURE 3 The Organization of TOPS package

The \texttt{dftaudit} program is used to prepare a circuit description as required by the sequential fault simulator \texttt{sift}. \texttt{sift} applies a specified number of random test patterns to the circuit and reports the list of faults which could not be detected. The number of random test patterns, \(N\), plays an important role in the performance of the partial scan design system. If \(N\) is chosen large, the list of target faults may become smaller, giving less work to the partial scan selection algorithms; however, the fault simulator would then require an excessive amount of CPU-time. Of course, there are hard-to-test circuits (such as the \texttt{s420} benchmark from ISCAS) for which increasing \(N\) beyond a certain limit does not help in reducing the number of target faults. Presently, we select \(N\) by a trial-and-error procedure where \(N\) is initially set to 1000 and doubled in every iteration. If two successive values of \(N\) do not reduce the number of target faults, we use the smaller value of \(N\) to generate the final list of target faults. If the final value of \(N\) selected by our procedure is \(N_f\), it is easy to see that we need \(\log_2(N_f/1000) + 1\) runs of sequential fault simulation up to and including \(N = N_f\). Assuming linear-time performance from the fault simulator, the total time spent on fault simulation is seen to be \(O(2N_f - 1000)\).
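The total stated above follows from a geometric sum. As a short worked derivation, assume \(N_f = 1000 \cdot 2^m\) for some integer \(m \ge 0\), and count the runs from the initial one at \(N = 1000\) up to and including \(N = N_f\) (any later run that only confirms the lack of further improvement is not counted):

\[
\sum_{i=0}^{m} 1000 \cdot 2^{i} = 1000\,(2^{m+1} - 1) = 2N_f - 1000, \qquad m + 1 = \log_2(N_f/1000) + 1 .
\]

For example, \(N_f = 8000\) gives \(m = 3\), that is, four runs with 1000, 2000, 4000, and 8000 patterns, or 15000 simulated patterns in total.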

The TOPS package receives as inputs the original netlist, the placement information, routing information, and the list of target faults. After the selection process, the TOPS package modifies the layout description file (VPNRL format) to convert the selected flip-flops into scan cells (cell type \texttt{dsr2s}). The \texttt{scan} program is used to thread the flip-flops into a scan path.

4 EXPERIMENTAL RESULTS

The effectiveness of the TOPS package was tested against several standard ISCAS'89 benchmark circuits. These are tabulated in Table 1. \(NF\) is the number of flip-flops in the original circuit. \(UFC\) indicates the unscanned fault coverage obtainable by running a random test pattern generator as explained in the previous section. \(SS\) is the size of the scan set (number of flip-flops selected for scan). \(SFC(G)\) is the fault coverage obtained through the scan set selected by the \texttt{Greedy} procedure. \(SFC(G + A)\) is the fault coverage obtained by first running the \texttt{Greedy} procedure and then improving the solution by running the \texttt{Anneal} procedure.

We compared our benchmark results with other published work, namely, [4] and [8]. Our results were better in three cases (\texttt{s298}, \texttt{s386}, and \texttt{s510}), and comparable in the remaining cases. It is to be noted that in [4] and [8], the authors used a deterministic sequential test pattern generator on the unscanned circuit. As a result, the unscanned fault coverage reported by these authors is significantly higher than that in Table 1. As an example, for the circuit \texttt{s526}, the unscanned fault coverage is 49.4\% in [8]; the random test pattern generator which we used could only generate a fault coverage of 9.9\%. Similarly, the unscanned fault coverage for \texttt{s386} is 67.44\% in our system, whereas it is 81.8\% in [4]. Table 1 also throws light on the performance of the Simulated Annealing algorithm in comparison to the greedy algorithm. The greedy algorithm competes with the annealing algorithm in most cases, but the annealing algorithm performs better in three of the nine cases tested. This is to be expected, since the greedy algorithm ranks each flip-flop individually.

<table>
<thead>
<tr>
<th>Circuit</th>
<th>NF</th>
<th>UFC</th>
<th>NUFC</th>
<th>SS</th>
<th>SFC(G)</th>
<th>SFC(G + A)</th>
</tr>
</thead>
<tbody>
<tr>
<td>s208</td>
<td>8</td>
<td>49.3</td>
<td>8K</td>
<td>3</td>
<td>67</td>
<td>71.16</td>
</tr>
<tr>
<td>s298</td>
<td>14</td>
<td>87.98</td>
<td>128K</td>
<td>3</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>s386</td>
<td>6</td>
<td>67.44</td>
<td>32K</td>
<td>2</td>
<td>96.61</td>
<td>96.61</td>
</tr>
<tr>
<td>s420</td>
<td>16</td>
<td>33.04</td>
<td>2K</td>
<td>8</td>
<td>64.42</td>
<td>83.95</td>
</tr>
<tr>
<td>s510</td>
<td>6</td>
<td>0.00</td>
<td>1K</td>
<td>1</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>s526</td>
<td>21</td>
<td>9.9</td>
<td>16K</td>
<td>8</td>
<td>94.95</td>
<td>98.74</td>
</tr>
<tr>
<td>s820</td>
<td>5</td>
<td>50.11</td>
<td>128</td>
<td>1</td>
<td>89.88</td>
<td>89.88</td>
</tr>
<tr>
<td>s832</td>
<td>5</td>
<td>35.17</td>
<td>1K</td>
<td>1</td>
<td>98.39</td>
<td>98.4</td>
</tr>
<tr>
<td>s1238</td>
<td>18</td>
<td>76.75</td>
<td>128K</td>
<td>2</td>
<td>94.70</td>
<td>94.70</td>
</tr>
<tr>
<td>s1488</td>
<td>6</td>
<td>61.98</td>
<td>512K</td>
<td>1</td>
<td>99.5</td>
<td>99.5</td>
</tr>
<tr>
<td>s5378</td>
<td>179</td>
<td>68.99</td>
<td>512K</td>
<td>54</td>
<td>97.17</td>
<td>98.7</td>
</tr>
</tbody>
</table>

The annealing procedure, on the other hand, can start with the solution generated by *Greedy* and improve it further by applying local transformations.

5 CONCLUSIONS AND FUTURE WORK

In this paper, we have described a tool suite known as TOPS which we have implemented around an existing Design Automation package. TOPS is a hybrid of two heuristics for the partial scan selection problem. A greedy procedure is used to first select a good starting solution, which is improved iteratively using the Simulated Annealing procedure. The essential idea behind TOPS is to first use a sequential test generator to catch as many faults as possible. The aim of the scan selection algorithms is then to catch the remaining faults using the fewest scan cells. TOPS is an integrated package and interfaces to a standard-cell design automation package. We have described the performance of TOPS on several standard benchmark circuits. The benchmark results indicate that TOPS provides an effective method for partial scan selection.

We are presently working on extending the TOPS package on several fronts. First, we feel that the greedy algorithm can be further improved by posing the selection problem as a *linear assignment problem*. Second, we are studying the relationship between the structural properties of the circuit (such as acyclicity) and circuit testability. The greedy procedure and the annealing algorithm presented in this paper do not directly take into account such structural properties. Instead, they rely on the SCOAP testability measure in deciding the contribution of a flip-flop to total circuit testability. SCOAP, in turn, uses the circuit structure in assessing the controllabilities and observabilities of individual nodes. Recent studies by Cheng and Agrawal [3], Chickermane and Patel [4], and Lin and Reddy [8] have indicated the usefulness of several structural properties in judging the testability of a circuit. We have initiated an effort to improve TOPS through the use of similar structural properties such as acyclicity.

Acknowledgements

We thank Dr. Raghu Hudli of IBM Fishkill for helping us verify the correctness of our SCOAP implementation by supplying us the SCOAP testabilities computed independently.

References


Biographies

C.P. RAVIKUMAR obtained his B.E. degree (Electronics) from Bangalore University, India (1983), M.E. (Computer Science) from Indian Institute of Science (1987) and Ph.D. (Computer Engineering) from University of Southern California (1991). Since 1991, he has been with the Department of Electrical Engineering, IIT Delhi, where he is an assistant professor. His research interests are in the areas of VLSI Design Automation, Parallel Architectures and Algorithms, and Combinatorial Optimization.

HAROO RASHEED obtained his B.E. (Electronics) from REC, Calicut (India) in 1989, and M.TECH (Integrated Electronics and Circuits) from Indian Institute of Technology, Delhi in 1992. Since 1992, he has been employed at S.G.S. Thomson, New Delhi, as a design engineer.