Low-Power and Min-Crosstalk Channel Routing for Deep-Submicron Layout Design

Consider a set of nets given by horizontal segments S = {s1, s2, ..., sn} and a set of tracks T = {t1, t2, ..., tk} in a channel. A track assignment is an assignment of the nets to the tracks such that no two nets assigned to the same track overlap. One important goal is to find a track assignment with the minimum number of tracks such that the signal interference between nets assigned to neighboring tracks is minimized. This problem is called crosstalk minimization. For a given track assignment with k tracks, crosstalk can be reduced by finding another track assignment for S with k tracks (i.e., by permuting tracks). However, considering all possible permutations requires exponential time. For a general cost function measuring crosstalk, the problem is NP-hard. Several heuristic approaches have been presented previously. In this paper, we consider special instances of the crosstalk-minimization problem where the cost function depends only on the length of the segments that run in parallel and all pairs of segments intersect. An algorithm solving this problem in O(n log n) time is presented. An extension to instances with a more general cost function of switching activity and mixed-signal sensitivity, which reduces both crosstalk and power consumption, is also presented.


INTRODUCTION
As CMOS technology advances into deep submicron, some of the nets interconnecting modules can be so long that their wire resistance, $R_{wire}$, is comparable to the resistance of the driver. As interconnection delays become more and more dominant in the overall delay, performance-driven (for timing and low power) routing becomes important, in addition to the traditional goals of area and interconnect minimization. The coupling capacitance, $C_{coupling}$, between minimum-pitch wires on a 0.25 µm CMOS IC can account for over 80% of the total capacitance of a wire. This makes interconnect crosstalk noise one of the biggest challenges in VLSI design today. When $R_{wire} C_{coupling}$ is comparable to the signal rise time $t_{rising}$, noise becomes a problem. The increase of crosstalk holds not only for coupling via the interconnect, but also for crosstalk via the substrate. Crosstalk is also proportional to the power consumption in CMOS circuits; thus minimizing crosstalk also leads to minimizing power consumption. A battery-operated multimedia system in 0.1 µm technology will require 40 NiMH battery cells, which makes the system no longer portable.

*This work was supported by the Korean Science and Engineering Foundation under Grant 965-0918-005-2.
†Corresponding author, e-mail: jdcho@yurim.skku.ac.kr
Therefore, as interconnection delays become more and more dominant in the overall delay, performance-driven routing becomes important, in addition to the traditional goals of area and interconnect minimization. Timing, routability, size, and power problems are not discovered until after detailed routing. Therefore, deep-submicron designs require crosstalk-free detailed routing. The major contribution of this paper is a very fast optimal algorithm that improves routing quality in terms of crosstalk for special instances of the channel routing problem. One can extend the proposed basic algorithm framework to incorporate other performance issues for practical use.

Crosstalk Measure
Crosstalk is a capacitive and inductive interference caused by the noise voltage developed on signal lines when nearby lines change state. It can occur in the following manner, as shown in Figure 1. When the voltage of an aggressor signal changes while the voltage of a victim signal is in transition through the high-gain region of a receiver circuit, a ripple or a small glitch is formed in the victim signal if the switching directions are opposite to each other. This high-gain crosstalk can affect a circuit in many different ways. In a static CMOS design, this glitch increases the wire delay considerably. It also increases the receiving circuit delay since it alters the effective input rise/fall time. This causes speed-related logic errors.
Sometimes, this glitch can propagate through many fanout gates. If the signal feeds into a dynamic logic gate, it can discharge the stored charge during the evaluation phase and cause a logic error.
Crosstalk is a function of the separation between signal lines and the linear distance over which the signal lines run parallel to each other. To maximize system speed, crosstalk must be reduced to levels where no extra time is required for the signal to stabilize. Signals that are highly sensitive to crosstalk, such as clocks, should be isolated from signals on other layers by reference planes and/or by extra-wide line-to-line spacing.
Signals are grouped into categories based on waveshape control requirements, crosstalk limits, or other special requirements. For example, clocks, strobes, buses, memory address, data, chip-select, and write lines, asynchronous signals, ECL signals, and analog signals have special routing requirements as follows [1].

Data buses: Crosstalk between buses tends to be data-pattern-sensitive and is worse when all address or data lines change in the same direction at the same time.

Memory address and data signals: Cross-coupling between any combination of memory input or output signals, as well as crosstalk between nearby unrelated signals, can be disruptive and must be guarded against. Excessive cross-coupling from memory data lines to address lines during read cycles can result in positive feedback that degrades the response time of the memory device and in extreme cases can cause unstable oscillatory operation. Such data-to-address-line cross-coupling may upset address lines sufficiently to cause writes to incorrect memory locations. Memory write lines require the highest possible degree of isolation from crosstalk. They must be isolated by reference planes and by extra-wide line-to-line spacing from other signals, particularly other memory chip-selects, address lines, and data buses. When signals are not isolated by planes, care must be taken to ensure that memory signals are not run directly above or below a critical signal for any significant length.

Clock signals and strobes: To meet crosstalk limits, clock signals must be isolated and confined between reference layers. Other signals must not be mixed with clocks. Clock signals on a given layer must have extra spacing between lines. Clock signals of different frequencies must have extra-wide spacing, as must clock signals and other signals if they are to be mixed.

ECL and analog: ECL and analog signals require a high degree of isolation from TTL- or CMOS-level signals. They must be physically isolated in separate board areas with separate ground and voltage planes that are isolated from TTL or CMOS switching currents.
Each net in a mixed analog/digital circuit is classified according to its crosstalk sensitivity [2].

Noisy: high impedance signals that can disturb other signals, e.g., clock signals.
High-Sensitivity: high impedance analog nets; the most noise-sensitive nets, such as the input nets to operational amplifiers.
Mid-Sensitivity: low/medium impedance analog nets.
Low-Sensitivity: digital nets that directly affect the analog part in some cells, such as control signals.
Non-Sensitivity: the most noise-insensitive nets, such as pure digital nets.
The crosstalk between two interconnection wires also depends on the frequencies (i.e., signal activities) of the signals traveling on the wires.
Once the electrical designer has established the electrical requirements or limits of each signal category based on the system performance requirements and error budgets, the requirements must be translated into specific mechanical requirements for routing. For example, twisting signal lines with ground lines provides some shielding, reducing the chance of coupling into adjacent wiring. As VLSI technology improves, more layers become available for routing. As a result, there is a need for multilayer routing schemes that reduce the die size (and thus the average wirelength) and crosstalk. For example, the layer assignment of a 7-layer process is as follows. Layers 1 and 2: local routing; layers 3 and 4: inter-block routing; layers 5 and 6: power and ground; top layer: clock.
A number of papers have been published on crosstalk issues: mixed analog and digital applications [2]; crosstalk-minimum layer assignment [3, 4]; a spacing algorithm [5]; and a channel routing enhancement that considers crosstalk through linear programming of track permutations [6]. A channel routing algorithm was also addressed in [7], and [8] proposed an optimal algorithm for the problem of minimizing crosstalk between vertical wires in 3-layer VHV channel routing. Recently, a crosstalk-minimum rainbow k-color permutation based on left-edge dynamic programming was presented in [9].

Power Measure
Power consumption in CMOS circuits is due to three sources: dynamic power consumption due to charging and discharging of capacitive loads during output transitions at gates, the short-circuit current that flows during output transitions at gates, and the leakage current. The last two factors can be made sufficiently small with proper device and circuit design techniques. Thus, research in design automation for low power has focused on minimizing the first factor, the dynamic power consumption.
The average dynamic power $P_{av}$ consumed by a CMOS gate is given below, where $C_l$ is the load capacitance at the output of the node, $V_{dd}$ is the supply voltage, $T_{cycle}$ is the global clock period, $N$ is the number of transitions of the gate output per clock cycle, $C_g$ is the load capacitance due to the input capacitance of fanout gates, and $C_w$ is the load capacitance due to the interconnection tree formed between the driver and its fanout gates.

$$P_{av} = 0.5 \frac{V_{dd}^2}{T_{cycle}} C_l N = 0.5 \frac{V_{dd}^2}{T_{cycle}} (C_g + C_w) N \qquad (1)$$

Logic synthesis for low power attempts to minimize $\sum_i C_{g_i} N_i$, whereas physical design for low power tries to minimize $\sum_i C_{w_i} N_i$. Here $C_{w_i}$ consists of $C_{x_i} + C_{s_i}$, where $C_{x_i}$ is the capacitance of net $i$ due to its crosstalk, and $C_{s_i}$ is the substrate capacitance of net $i$.
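As a numerical illustration of Eq. (1), the following Python sketch evaluates $P_{av} = 0.5 (V_{dd}^2 / T_{cycle})(C_g + C_w) N$ for one gate. The function name and the parameter values (2.5 V supply, 10 ns cycle, 20 fF gate load, 80 fF wire load, 0.5 transitions per cycle) are hypothetical and chosen only for illustration; they are not taken from the experiments in this paper.

```python
def average_dynamic_power(v_dd, t_cycle, c_gate, c_wire, n_transitions):
    """Average dynamic power of one CMOS gate in watts (SI units throughout).

    Implements P_av = 0.5 * V_dd^2 / T_cycle * (C_g + C_w) * N.
    """
    return 0.5 * v_dd ** 2 / t_cycle * (c_gate + c_wire) * n_transitions


# Illustrative (assumed) values, not measurements.
V_DD = 2.5          # supply voltage in volts
T_CYCLE = 10e-9     # clock period in seconds
C_GATE = 20e-15     # fanout gate input capacitance in farads
C_WIRE = 80e-15     # interconnect capacitance in farads
N = 0.5             # output transitions per clock cycle

p_total = average_dynamic_power(V_DD, T_CYCLE, C_GATE, C_WIRE, N)
p_wire = average_dynamic_power(V_DD, T_CYCLE, 0.0, C_WIRE, N)
wire_share = p_wire / p_total   # fraction of the power addressable by physical design
```

Because Eq. (1) is linear in the capacitance, the share of power attributable to the wire equals $C_w / (C_g + C_w)$, which is the part that physical design can influence.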
For low-power layout applications, power dissipation due to crosstalk is minimized by ensuring that wires carrying high-activity signals are placed sufficiently far from other wires. Similarly, power dissipation due to substrate capacitance is proportional to the wirelength and its signal activity.
We need to minimize $C_{x_i} + C_{s_i}$ to minimize both crosstalk and power consumption. In this paper, we present an effective algorithm for the crosstalk minimization problem. An extension to layout with minimum power consumption is also presented.
This paper is organized as follows. We formulate the problem in Section 2. In Section 3, we present an optimal algorithm for the crosstalk minimization problem in the special case of channel routing. Sections 4 and 5 present experimental results and the conclusion, respectively.

FORMULATION OF THE PROBLEMS
In a channel, we are given a set of multi-terminal nets N specified by the locations of their terminals on the two channel sides. The top layer is usually reserved for vertical wires and the bottom layer for horizontal wires. The general 2-layer crosstalk-minimum channel routing problem is known to be NP-complete.
The complexity of channel routing stems from the vertical constraints. The vertical constraints imply that the vertical wires of two nets whose pins are at the same column in the channel cannot overlap. Two-layer channels are usually dense and crosstalk-sensitive. Thus, it is crucial to attain the desired crosstalk minimization solution.
In the early 1990s, a third metal layer became feasible. Most current gate-array technologies use three layers for routing. For example, the Motorola 2900ETL, DEC's Alpha chip, and Intel's 486 chip used a three-metal-layer process, and the original Intel Pentium was also fabricated on a similar process. Three-layer routing algorithms can be classified into two main categories: the reserved layer model and the unreserved layer model. The reserved layer model can further be classified into the VHV model and the HVH model. Note that in VHV routing, the vertical constraints between nets no longer exist. Therefore, a channel height equal to the maximum density can always be realized using the Left-Edge Algorithm. Without vertical constraints, more nets are permutable in a channel. Thus, at the cost of one more layer, VHV routing is effective in terms of both area and crosstalk.
Some pairs of nets cannot be assigned to adjacent tracks because they might strongly interfere with each other. Note that we can reduce crosstalk by maximizing the track separation between pairs of nets with high crosstalk. Thus, we formulate the crosstalk minimization problem in the VHV model as follows.

DEFINITION
Given $k$ tracks $T = (t_1, t_2, \ldots, t_k)$ and $n$ intervals, i.e., horizontal segments of nets, $S = (s_1, s_2, \ldots, s_n)$, $k \ge 1$. The crosstalk minimization problem is to find an assignment $\phi$ of the intervals to the tracks, i.e., $\phi: S \to T$, such that no two intervals assigned to the same track intersect, and the cost function

$$\sum_{(s_i, s_j) \in S} w_{ij} \quad \text{subject to } |\phi(s_i) - \phi(s_j)| = 1$$

is minimum, where $w_{ij}$ is $X_{ij}(N_i + N_j)A_{ij}$ and
$N_i$ (resp. $N_j$) = switching activity of net $i$ (resp. net $j$);
$A_{ij}$ = signal sensitivity (for mixed signal interactions) between nets $i$ and $j$;
$X_{ij} = L_{ij}/D_{ij}$ = coupled noise between nets $i$ and $j$;
$L_{ij}$ = coupled length between horizontal wires of nets $i$ and $j$;
$D_{ij}$ = separation distance between horizontal wires of nets $i$ and $j$.
A special case of the crosstalk minimization problem, where $n = k$, is track permutation for crosstalk minimization.
Note that in the worst case the crosstalk is proportional to $(N_i + N_j)$, for example, when all address or data lines change in the same direction at the same time.
Let us identify horizontal segments of nets in the channel with intervals. Then an interval clique is a set of intervals whose corresponding interval graph is a clique. That is, when scanning the channel from left to right, say, we consider the sets of intervals intersecting a vertical cut-line, as depicted in Figure 2.
In the following, we are only concerned with crosstalk-minimum track permutation. In Section 3, we describe an algorithm considering only $X_{ij}$, called the first order model, in Section 3.1, and extend the algorithm to consider $X_{ij}(N_i + N_j)A_{ij}$, called the second order model, in Section 3.2.
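The cost of a track permutation under the two models can be sketched in Python as follows. This is a minimal illustration, assuming one interval per track (the case $n = k$), intervals given as (left, right) pairs, and a unit separation distance $D_{ij} = 1$ between adjacent tracks. The function names and the `activity` and `sensitivity` arguments are hypothetical names introduced here for $N_i$ and $A_{ij}$.

```python
def overlap_length(a, b):
    """Coupled length L_ij of two intervals given as (left, right) pairs."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def pair_weight(i, j, intervals, activity=None, sensitivity=None, distance=1.0):
    """Weight w_ij of placing intervals i and j on adjacent tracks.

    First order model (activity is None):  w_ij = X_ij = L_ij / D_ij.
    Second order model:                    w_ij = X_ij * (N_i + N_j) * A_ij.
    """
    x = overlap_length(intervals[i], intervals[j]) / distance
    if activity is None:
        return x
    a = 1.0 if sensitivity is None else sensitivity[i][j]
    return x * (activity[i] + activity[j]) * a


def track_permutation_cost(order, intervals, activity=None, sensitivity=None):
    """Total crosstalk of a permutation: sum of w_ij over adjacent tracks.

    order[t] is the index of the interval placed on track t.
    """
    return sum(
        pair_weight(order[t], order[t + 1], intervals, activity, sensitivity)
        for t in range(len(order) - 1)
    )


# Three nested intervals: moving the shortest one to the middle track
# halves the first order crosstalk of this small example.
example = [(0, 10), (2, 8), (4, 6)]
cost_sorted = track_permutation_cost([0, 1, 2], example)
cost_interleaved = track_permutation_cost([0, 2, 1], example)
```

Only adjacent tracks contribute, which mirrors the constraint $|\phi(s_i) - \phi(s_j)| = 1$ in the definition.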

Traveling Salesman Problem (TSP)
INSTANCE Set $S$ of $n$ cities, distance $W(s_i, s_j) \in Z^+$ for each pair of cities $s_i, s_j \in S$, positive integer $B$.
QUESTION Is there a tour of all cities in $S$ with total length $B$ or less?
The problem is NP-complete in general graphs.
A brute-force algorithm generates $n!$ tours, and a dynamic programming algorithm uses $O(n^2 2^n)$ time.
TSP is a special case of crosstalk minimization for an interval clique, provided an arbitrary cost function is given. So crosstalk minimization with an arbitrarily chosen cost function is NP-hard, even in the interval clique case. The first order model we consider is easier not because of the restriction to interval cliques, but because of our restriction to a very special cost function, i.e., just the length of the interferences.
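To make the two complexity figures concrete, the sketch below solves track permutation with an arbitrary symmetric weight matrix `w` in two ways: by enumerating all $n!$ orders, and by a subset dynamic program in the style of the Held-Karp TSP algorithm with $O(n^2 2^n)$ time. It is an illustration only, not one of the algorithms of this paper. It treats the permutation as an open path (no return edge), and the function names are hypothetical.

```python
from itertools import permutations


def min_order_cost_brute(w):
    """Minimum adjacent-pair cost over all n! track orders (n >= 1)."""
    n = len(w)
    return min(
        sum(w[p[t]][p[t + 1]] for t in range(n - 1))
        for p in permutations(range(n))
    )


def min_order_cost_dp(w):
    """Same minimum via dynamic programming over subsets, O(n^2 * 2^n) time.

    best[mask][j] is the cheapest ordering of the intervals in `mask`
    whose last (bottom-most) interval is j.
    """
    n = len(w)
    inf = float("inf")
    best = [[inf] * n for _ in range(1 << n)]
    for j in range(n):
        best[1 << j][j] = 0
    for mask in range(1 << n):
        for j in range(n):
            cur = best[mask][j]
            if cur == inf:
                continue
            for k in range(n):
                if mask & (1 << k):
                    continue  # interval k is already placed
                nxt = mask | (1 << k)
                cand = cur + w[j][k]
                if cand < best[nxt][k]:
                    best[nxt][k] = cand
    return min(best[(1 << n) - 1])
```

Both routines return the same optimum. Only the dynamic program remains usable slightly beyond ten tracks, and both are exponential, which motivates the restricted cost function studied next.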
FIGURE 2 Two interval cliques in a channel.

Algorithm on the First-order Crosstalk Model
Consider an interval clique $S = \{s_1, s_2, \ldots, s_n\}$ where $s_i = (\ell_i, r_i)$, $\ell_i \le r_i$, and $\ell_i$ and $r_i$ represent the x-coordinates of the left and right end points of the interval $s_i$. The length $L(s)$ of an interval $s = (\ell, r)$ is defined as the quantity $|r - \ell|$. A simple heuristic for the problem of finding a minimum-crosstalk track assignment on interval cliques is to adapt a "greedy" algorithm.

ALGORITHM Greedy (Interval Clique)
Step 1 assigned = Null; unassigned = $S$; $s_0$ = a virtual segment corresponding to the top channel shore; $s_{n+1}$ = a virtual segment corresponding to the bottom channel shore.
Step 2 Select two segments $s_i$ and $s_j$ from unassigned such that $w_{ij}$ is the largest, and assign $s_i$ to $t_1$ and $s_j$ to $t_2$; assigned = $\{s_i, s_j\}$; unassigned = $S - \{s_i, s_j\}$.
Step 3 Select a segment $s_k$ such that the crosstalk gain $g(ij, k) = w_{ij} - (w_{ik} + w_{kj})$ is maximized when $s_k$ is inserted (Case 1) between $s_0$ and $s_i$, (Case 2) between $s_i$ and $s_j$, or (Case 3) between $s_j$ and $s_{n+1}$. For Case 1, assign $s_k$ to $t_1$, $s_i$ to $t_2$, and $s_j$ to $t_3$; for Case 2, assign $s_i$ to $t_1$, $s_k$ to $t_2$, and $s_j$ to $t_3$; for Case 3, assign $s_i$ to $t_1$, $s_j$ to $t_2$, and $s_k$ to $t_3$. assigned = assigned + $\{s_k\}$; unassigned = unassigned $- \{s_k\}$.
Step 4 Repeat Step 3 until all segments are inserted at the position with the most gain.

Even though this approach generates an optimal solution in most instances, the time complexity of the Greedy Algorithm is $O(n^3)$, which may not be practical for large $n$. Also, we do not know yet whether the approach always yields an optimal solution. We show next that there exists a polynomial-time algorithm for the crosstalk-minimum track permutation problem on an interval clique.
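The greedy insertion procedure can be sketched in Python as below. This is one possible reading of the steps, under stated assumptions: the weights are given as a symmetric matrix `w`, the virtual shore segments $s_0$ and $s_{n+1}$ couple with nothing (weight 0), and in later rounds a segment may be inserted between any two consecutive assigned segments, not only between the first pair. The function name is hypothetical.

```python
def greedy_track_order(w):
    """Greedy insertion order for an interval clique with weights w[i][j].

    Returns a list `order` where order[t] is the segment placed on track t.
    Runs in O(n^3): n insertions, each scanning O(n) segments x O(n) slots.
    """
    n = len(w)
    if n < 2:
        return list(range(n))

    # Seed with the pair of largest weight; later insertions can separate it.
    i, j = max(
        ((a, b) for a in range(n) for b in range(a + 1, n)),
        key=lambda p: w[p[0]][p[1]],
    )
    order = [i, j]
    unassigned = set(range(n)) - {i, j}

    while unassigned:
        best = None  # (gain, segment, position)
        for k in sorted(unassigned):
            for pos in range(len(order) + 1):
                left = order[pos - 1] if pos > 0 else None          # None = top shore
                right = order[pos] if pos < len(order) else None    # None = bottom shore
                removed = w[left][right] if left is not None and right is not None else 0
                added = (w[left][k] if left is not None else 0) + \
                        (w[k][right] if right is not None else 0)
                gain = removed - added  # g(ij, k) = w_ij - (w_ik + w_kj)
                if best is None or gain > best[0]:
                    best = (gain, k, pos)
        _, k, pos = best
        order.insert(pos, k)
        unassigned.remove(k)
    return order
```

For the weight matrix `[[0, 5, 1], [5, 0, 2], [1, 2, 0]]`, the heaviest pair (0, 1) is seeded first and segment 2 is then inserted between them, since that is the only slot with positive gain.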
An interval clique can be partitioned into two subsets S and S(R) as follows.
PROCEDURE Clique-Partition (Input $S$; Output $S$, S(R))
Step 1 Consider a vertical cut-line that intersects all intervals in $S$, and for $s_i \in S$ denote by left($s_i$) the part of $s_i$ to the left of that cut-line and by right($s_i$) the part of $s_i$ to the right of that cut-line. Accordingly, partition the interval clique into two sets $S^{left}$ and $S^{right}$.
Step 2 Sort the intervals in $S^{left}$ in increasing order of their wire length, and let $S^{left}_{short}$ contain the $\lfloor n/2 \rfloor$ shortest intervals from $S^{left}$ and $S^{left}_{long}$ the $n - \lfloor n/2 \rfloor$ longest intervals from $S^{left}$. We denote by $\tau(s_i) \in \{SHORT, LONG\}$ the type of the interval $s_i \in S^{left}$. Then, $\tau(s_i) = SHORT$ for $s_i \in S^{left}_{short}$ and $\tau(s_i) = LONG$ for $s_i \in S^{left}_{long}$.
Step 3 Apply the process in Step 2 to $S^{right}$.
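The cut-and-classify steps of Clique-Partition can be sketched in Python as follows. The sketch assumes intervals are (left, right) pairs, places the cut-line by default at the largest left end point (any x between the largest left end point and the smallest right end point intersects every interval of a clique), and breaks ties in length by input order. The function name and return layout are hypothetical.

```python
def clique_partition(intervals, cut=None):
    """Split each interval at a common cut-line and label the parts.

    Returns (cut, left_types, right_types), where left_types[i] and
    right_types[i] are "SHORT" or "LONG" for the left and right part
    of interval i. The floor(n/2) shortest parts on each side are SHORT.
    """
    if not intervals:
        raise ValueError("empty interval set")
    max_left = max(l for l, _ in intervals)
    min_right = min(r for _, r in intervals)
    if max_left > min_right:
        raise ValueError("intervals do not form a clique")
    if cut is None:
        cut = max_left
    elif not (max_left <= cut <= min_right):
        raise ValueError("cut-line does not intersect all intervals")

    n = len(intervals)

    def classify(lengths):
        ranked = sorted(range(n), key=lambda i: lengths[i])  # increasing length
        types = [None] * n
        for rank, i in enumerate(ranked):
            types[i] = "SHORT" if rank < n // 2 else "LONG"
        return types

    left_types = classify([cut - l for l, _ in intervals])
    right_types = classify([r - cut for _, r in intervals])
    return cut, left_types, right_types


# A clique whose left and right end points both increase with the index.
cut, left_types, right_types = clique_partition([(0, 6), (1, 7), (2, 8), (3, 9)])
```

In this example the intervals with LONG left parts have SHORT right parts and vice versa, which is the pattern the later results on monotone interval cliques rely on.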

COROLLARY
Let the cardinality of S be n, and the cardinalities of S and S(R) be n and n(R) respectively.
Proof The track assignment generated by Algorithm 1 is an alternating LONG-SHORT sequence (refer to Fig. 3). Let the intervals be indexed in increasing order of length. When $n_e$ is odd, the lower bound on crosstalk is $\sum_{i=1}^{\lfloor n_e/2 \rfloor} 2L(s_i)$. Similarly, the lower bound on crosstalk when $n_e$ is even is $\sum_{i=1}^{n_e/2 - 1} 2L(s_i) + L(s_{n_e/2})$. It is obvious that the crosstalk for the alternating LONG-SHORT sequence meets the above lower bounds. The time complexity is dominated by sorting the interval lengths. ∎

COROLLARY 2 Consider a track assignment generated by Algorithm 1 for a Containment Interval Clique $S$. Then the crosstalk for a track assignment induced by a permutation on $S_1$ and $S_2$ respectively is again minimum.

Consequently, we can resolve a vertical constraint (i.e., the case where two vertical wire segments overlap) without increasing the crosstalk, by exchanging two intervals having end points of the same x-coordinate.

ALGORITHM 2 Track Permutation on a Monotone Interval Clique S(R)
Step 1 Partition each interval in S(R) into two sets (left and right sets) using a vertical cut-line that intersects all intervals in S(R). We denote the left part of an interval $s_i$ as left($s_i$) and the right part as right($s_i$). Let $S_{(R)}^{left}$ (resp. $S_{(R)}^{right}$) denote the set of intervals containing all left($s_i$) (resp. right($s_i$)). Then, both $S_{(R)}^{left}$ and $S_{(R)}^{right}$ are considered as Containment Interval Cliques.
Step 2 Apply Algorithm 1 to $S_{(R)}^{left}$.

Proof Algorithm 1 applied to $S_{(R)}^{left}$ generates a track permutation where the intervals from $S_{(R)}^{left}$ form an alternating LONG-SHORT sequence. Corollary 1 implies that $S_{(R)}^{right}$ also forms an alternating LONG-SHORT sequence. ∎

COROLLARY 3 Consider a track assignment generated by Algorithm 2 for a Monotone Interval Clique S(R). Then the crosstalk for a track assignment induced by a permutation on $S_1 = \{s_i \in S(R) : \tau(left(s_i)) = LONG\}$ and $S_2 = \{s_i \in S(R) : \tau(left(s_i)) = SHORT\}$ respectively is again minimum.

procedure for the first order model. Note that by partitioning the intervals as in Figure 6, the instance of the second order model consists of four smaller instances of the first order model. This is true because we no longer need to consider the signal sensitivity part. For each partitioned interval group, we apply Algorithm 1 or Algorithm 2 according to its type. Then, we apply Algorithm 3 to merge those four partitioned interval cliques.
For brevity, we omit the detailed algorithm description.

EXPERIMENTAL RESULTS
We implemented our algorithm (Algorithm 3) in C/C++ and ran it on a SUN Ultra-Sparc 2 and a Pentium-Pro machine. We compared our algorithm with the left-edge algorithm, and also with a brute-force method that finds an optimal solution by enumeration. Table I shows the results obtained by our algorithm, the left-edge algorithm, and the brute-force method, respectively, for the track permutation problem on an interval clique. We ran each algorithm 10,000 times on randomly generated interval cliques. For the examples in Table I with more than 9 tracks, the brute-force method could not produce a result because of its exponential running time. For the cases with fewer than 9 tracks, our algorithm generates the same result as the brute-force method. For all cases, the average crosstalk is about 30.
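A miniature version of the comparison against brute force can be sketched in Python for the first order model on containment (nested) interval cliques. Because the listing of Algorithm 1 is not reproduced in this text, the interleaving rule used here (LONG intervals on tracks 1, 3, 5, ..., SHORT intervals in increasing length on tracks 2, 4, 6, ...) is an assumption inferred from the alternating LONG-SHORT description in the proofs. The generator of random cliques and all function names are likewise hypothetical.

```python
import random
from itertools import permutations


def overlap(a, b):
    """Coupled length of two intervals given as (left, right) pairs."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def crosstalk(order):
    """First order crosstalk: total overlap between adjacent tracks."""
    return sum(overlap(order[t], order[t + 1]) for t in range(len(order) - 1))


def long_short_order(intervals):
    """Alternate LONG and SHORT intervals (assumed form of Algorithm 1).

    The floor(n/2) shortest intervals are SHORT. Sorting dominates the
    running time, so the procedure takes O(n log n).
    """
    by_len = sorted(intervals, key=lambda s: s[1] - s[0])
    n = len(by_len)
    shorts, longs = by_len[: n // 2], by_len[n // 2:]
    return [longs[t // 2] if t % 2 == 0 else shorts[t // 2] for t in range(n)]


def lower_bound(intervals):
    """Lower bound from the proof, with intervals indexed by increasing length."""
    lengths = sorted(r - l for l, r in intervals)
    n = len(lengths)
    if n % 2 == 1:
        return 2 * sum(lengths[: n // 2])
    return 2 * sum(lengths[: n // 2 - 1]) + lengths[n // 2 - 1]


def random_containment_clique(n, rng):
    """n nested intervals: left ends increase while right ends decrease."""
    lefts = sorted(rng.sample(range(0, 100), n))
    rights = sorted(rng.sample(range(100, 200), n), reverse=True)
    return list(zip(lefts, rights))


def brute_force_min(intervals):
    """Exhaustive optimum; practical only for a handful of tracks."""
    return min(crosstalk(p) for p in permutations(intervals))
```

On small nested cliques the interleaved order, the bound from the proof, and the exhaustive optimum coincide, while the exhaustive search becomes impractical as the number of tracks grows.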

CONCLUSION
In this paper, we consider special instances of the crosstalk-minimization problem where the cost function depends only on the length of the segments that run in parallel and all pairs of segments intersect. An algorithm solving this problem in O(n log n) time is presented. An extension to instances with a more general cost function of switching activity and mixed-signal sensitivity, which reduces both crosstalk and power consumption, is also presented. The presented algorithm can be applied to performance-driven low-power channel routing in deep-submicron VLSI designs.
FIGURE 1 Crosstalk between two nets.

LEMMA 2 (Algorithm 2) Algorithm 2 generates a track permutation for a Monotone Interval Clique with minimum crosstalk in O(n log n) time [refer to Fig. 4].