Deep-submicron Placement Minimizing Crosstalk

Placement of multiple dies on an MCM or high-performance VLSI substrate is a non-trivial task in which multiple criteria need to be considered simultaneously to obtain a true multi-objective optimization. Unfortunately, the exact physical attributes of a design are not known in the placement step until the entire design process is carried out. When the performance issues are considered, crosstalk noise constraints in the form of net separation and via constraint become important. In this paper, for better performance and wirability estimation during placement for MCMs, several performance constraints are taken into account simultaneously. A graph-based wirability estimation along with the Genetic placement optimization technique is proposed to minimize crosstalk, crossings, wirelength and the number of layers. Our work is significant since it is the first attempt at bringing the crosstalk and other performance issues into the placement domain.


I. INTRODUCTION
Rapid growth of multimedia and communication systems demands the use of both analog/digital mixed-signal ICs and deep sub-micron (below 0.6 µm) CMOS technologies. The higher density and improved electrical performance of MCM technology are needed in these systems.
The first step of the physical design process is placement. Over the years, a wide variety of placement algorithms have been developed. For a comprehensive overview of placement and routing algorithms, see [19]. One example is from Esbensen and Kuh [9,8], who present Genetic-based multi-objective optimization in a floorplanning tool called Explorer, which generates alternative solutions with different main objectives and simultaneously minimizes layout area, deviation from target aspect ratio, routing congestion and the maximum path delay.
* Corresponding author, e-mail: jdcho@skku.ac.kr
In mixed analog-digital layout design and deep sub-micron CMOS technologies, automated synthesis of interconnections during the early placement stage of the design cycle is emerging as a most promising approach. High-speed design techniques and signal integrity analysis become important when the propagation delay across the interconnect is 20-25% of the rise/fall time. Current placement-level design models do not simultaneously capture important physical-design signal integrity effects such as crosstalk, power, and timing, which are becoming first-order factors in chip performance. Crosstalk noise should be considered because crosstalk between long wires increases delay (because of the larger effective line capacitance), degrades signal integrity, and causes logic faults. Crosstalk contributes as much as 50-75% of interconnection delay [1] as the width of the wires and the spacing between them are reduced.
As physical feature sizes decrease, the delay of electrical signals traveling in the interconnect between active devices and gates approaches the delay through the devices and gates themselves. The parasitic information of the interconnect is absolutely critical to predicting circuit performance. Thus, physical interconnection delay will overtake gate delay as a design concern by the year 2000, mandating a shift in the physical design flow for deep-submicron designs. Consequently, iterations between synthesis and layout increase dramatically due to timing and routability problems. The key to solving this problem is knowing more about the physical design, i.e., placement and estimated interconnect, early in the design cycle. The RTL is being defined to accurately predict size, timing and power early in the design cycle and to avoid downstream iterations. This means the design engineer needs to get back to the fundamentals of physics. It is the goal of this paper to explore physical-level solutions to the deep-submicron problems.
In MCMs (resp. deep-submicron layouts), some of the nets connecting bare dies (resp. modules) can be so long that their resistance is comparable to the resistance of the driver. Performance-driven placement is very important since the interconnect delays form a major part of the system cycle time. A resistance-driven placement algorithm has been proposed in [23].
However, when more performance issues are considered, e.g., in deep-submicron technologies, additional constraints in the form of net separation and via constraints become important. This is because densely routed designs may suffer from low fabrication yield or excessive crosstalk. Excessive local congestion gives rise to routing difficulty later on and also increases the potential crosstalk noise in high-speed signal lines. Electrical interference (crosstalk and reflection) has long been considered in PCB/IC design, and it is very likely to be encountered in MCM designs because of the fast signal rise times and long chip-to-chip connections. Furthermore, larger capacitance increases the power consumption of the dies. Crosstalk is minimized by ensuring that wires carrying high-activity signals are placed sufficiently far from the other wires. Moreover, for high-performance MCM routing, intersections of wires cause the use of more vias which, in turn, require more routing resources (because of the large via pitch), lower the manufacturing yield, and cause noise problems (because of the mismatched characteristic impedance between wires and vias) [26]. As a result, wire design becomes the central problem as we move toward sub-0.1 µm design.
The problem of crosstalk is typically addressed after the placement step. It is now the responsibility of the P&R tool to detect potential crosstalk as early as possible. The next step in physical design is to assign every global route in the layout environment to a plane pair, called layer assignment, so that the capacity constraints are satisfied on all plane pairs, and the number of plane pairs [...] transmission lines. A new detailed routing method for analog full-custom ICs is described in [21]. The router is gridless and performs electrical optimization of crosstalk, resistivity, ground capacitance and electrical symmetry at a symbolic level. The algorithm is a sequential routing based on a wire ordering. A spacing algorithm for performance enhancement and crosstalk minimization is proposed in [16], and a channel routing enhancement to minimize crosstalk is proposed in [12,2]. In [6], the routing area is decomposed into channels, which are dense and thus crosstalk-sensitive areas. The main goal of the MCM router developed in [7] is to route all the nets with a minimum number of layers and to reduce the crosstalk by separating high-frequency wires, with a bound on the number of vias used in routing. The routing is carried out by scanning the routing region horizontally or vertically from one end of the substrate to the other. The algorithm is an extension of [17]. A post-global-routing crosstalk risk estimation and reduction method is presented in [30].
There are no known reports on crosstalk minimization during placement for high-performance circuits. We need a reasonably fast timing and wirability estimation during placement. Timing errors due to crosstalk and timing violations after layout require another design iteration. However, if we handle this problem at an earlier stage such as placement, timing errors are much less likely in the later routing stage. This shortens the design cycle and thus reduces the design cost.
As a result, early estimation of wirability during placement is important, but net topology is difficult to estimate at the placement stage. One way to overcome the problem is to consider placement and global routing simultaneously.
However, the high problem complexity may prevent an effective solution. The advantage of our solution over existing methods is that we introduce a fast topology-mapped global routing (described in the next section) to take several performance constraints into account simultaneously. The problem is formulated as a graph-based optimization problem. Our work is significant and innovative since it is the first attempt at bringing the crosstalk and crossing minimization problem into the placement domain.
In Section 2, we formulate the problem. In Section 3, a new heuristic to find a global routing estimation using one-bend routes is presented. An efficient solution to the problem using a Genetic algorithm is then proposed in Section 4. Experimental results and the conclusion are presented in Sections 5 and 6, respectively.

II. THE NEW ESTIMATION
Our placement model targets MCMs, but can also be applied to module placement in a chip layout. The input is a set of rectangular chips of the same size, with pins fixed within each block, and a specification of n nets, including timing constraints on nets. Each output solution specifies an absolute position for each chip. The problem is stated as follows: given a set of chips C and a set of chip sites S, find a mapping $\phi: C \to S$ so as to minimize the crosstalk, crossings and total wirelength needed for routing and to ensure routability of the design in a minimum number of routing layers. Conventionally, a cost metric based on wirelength plus congestion is used to improve wirability. In our formulation, however, we do not consider the congestion measure explicitly. We observed experimentally that congestion is minimized automatically when crosstalk and crossings are minimized simultaneously, because this distributes wires evenly over the MCM substrate. Note that minimizing the number of crossings reduces the wirelength, whereas minimizing the crosstalk does not always do so. Next, we introduce a new interference measure based on crosstalk and crossings.

II.1. Net Topology and Graph Generation
Multi-terminal nets have many possible routing topologies such as daisy chain, Steiner tree, star and A-tree [29,3]. However, it is impractical to consider all configurations of a large fan-out net because the number of net topologies grows rapidly with the number of receivers. In [18], Raghavan, Cohoon and Sahni demonstrated a polynomial-time solution ($O(n^2)$ time) for a one-layer routing problem on two-terminal nets called the single bend wirability problem, which is the problem of determining whether there exists a planar routing with at most one bend per net. The problem can be reduced to the 2-satisfiability problem. However, allowing multiple terminals renders the single bend wirability problem NP-complete [31].
The formulation cannot be directly applied to our problem, which considers multiple constraints on wires. A bounding-box measure (of wiring interference) for placement, which does not take net topologies into account, is not sufficient. Thus, we consider, for each two-terminal net $i$, two possible one-bend global routes, denoted $i(\tau_1)$ and $i(\tau_2)$. It is desirable that multi-terminal nets are routed within the smallest bounding box enclosing their terminals, and with their favorite topologies as mentioned previously. For example, one may restrict oneself to a specific routing pattern for a multi-terminal net, such as a mincost Steiner tree with minimum wirelength, minimum bends, and minimum stubs.
A stub or branch in a tree introduces extra delay and/or ringing in the received signal waveform [22]. Evidently, the topology estimate obtained from a placement in this way is poorer than the estimate from global routing, but it is a necessary compromise for a strong coupling between placement and global routing.
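The two one-bend candidates $i(\tau_1)$ and $i(\tau_2)$ of a two-terminal net can be enumerated directly from the terminal coordinates. The following is a minimal sketch, assuming integer grid coordinates and axis-aligned wires; the names `one_bend_routes` and `wirelength` are illustrative and do not come from the paper.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]
Route = List[Segment]


def one_bend_routes(a: Point, b: Point) -> List[Route]:
    """Return the candidate one-bend (L-shaped) routes between terminals a and b.

    Assumption: wires are axis-aligned. If the terminals share a row or a
    column, the single straight wire is the only candidate.
    """
    (ax, ay), (bx, by) = a, b
    if ax == bx or ay == by:
        return [[(a, b)]]
    corner_1 = (bx, ay)  # horizontal first, then vertical
    corner_2 = (ax, by)  # vertical first, then horizontal
    return [
        [(a, corner_1), (corner_1, b)],
        [(a, corner_2), (corner_2, b)],
    ]


def wirelength(route: Route) -> int:
    """Manhattan length of a route made of axis-aligned segments."""
    return sum(abs(x1 - x2) + abs(y1 - y2) for (x1, y1), (x2, y2) in route)
```

Both candidates have the same Manhattan length, so the choice between them only affects crosstalk and crossings, which is what the topology selection of the later sections exploits.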
Based on these facts, given a placement, we create an interference graph $G = (V, E)$ (refer to Fig. 2), where $|V| = 2n$ and $|E| \le 2n(n-1)$ (in the case of two-terminal nets), to formulate the interference relation between $n$ nets. In $G$, there are $n$ nets of two types ($i(\tau_1)$ and $i(\tau_2)$), thus $|V| = 2n$. Edges are formed by connecting every node $i(\tau_1)$ to the $2(n-1)$ other nodes, that is, to all nodes except $i(\tau_2)$. Each node in $V$ represents a net, and the weight on an edge in $E$ represents the net-pair crosstalk and crossings measured as below. If there is no crosstalk effect between two nets, then there is no corresponding edge in the graph $G$; thus $|E|$ can be less than or equal to $2n(n-1)$. In general, a graph $G(V, E)$ is a comparability graph if it satisfies the transitive orientation property, i.e., if it has an orientation such that in the resulting directed graph $G(V, F)$, $\{(v_i, v_j) \in F$ and $(v_j, v_k) \in F\} \Rightarrow (v_i, v_k) \in F$ [13]. Note that the interference graph $G$ is a comparability graph because it satisfies the transitive orientation property when edges are directed from left to right as shown in Figure 2.
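The node and edge counts above can be checked with a small construction. This is a sketch under stated assumptions: nets are numbered 0..n-1, each net has exactly two topologies, and `weight` is a hypothetical callback that returns the interference between two (net, topology) nodes.

```python
from itertools import combinations


def build_interference_graph(n, weight):
    """Build the interference graph for n two-terminal nets.

    Nodes are (net, topology) pairs with topology in {1, 2}.
    `weight(u, v)` is a caller-supplied (hypothetical) interference estimate;
    pairs with zero interference get no edge, as in the text.
    """
    nodes = [(net, topo) for net in range(n) for topo in (1, 2)]
    edges = {}
    for u, v in combinations(nodes, 2):
        if u[0] == v[0]:
            continue  # the two topologies of one net are never connected
        w = weight(u, v)
        if w > 0:
            edges[(u, v)] = w
    return nodes, edges
```

When every pair of distinct nets interferes, the graph has exactly $2n(n-1)$ edges, which is the upper bound quoted in the text.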

II.2. Crosstalk Measure
A popular approach used in the past to model the dependency of performance functions on parasitics is net classification. Nets are classified according to the type of signal they carry (stable, large swing, sensitive to noise, etc.). A bus of several sensitive nets running parallel to each other with correlated signals might inject considerable noise into a single net. The crosstalk-critical region is defined as a region enclosed by two wire segments of net $i$ and net $j$ such that their coupling distance $d(i,j)$ (i.e., the distance between two wire segments of different nets that run in parallel) is less than or equal to a small constant $\delta$. The value of $\delta$ depends on the device technology; for AC device technology on an MCM-L layer, it is given in centimeters in [5]. The shaded regions in the figure correspond to the set of crosstalk-critical regions induced by the given global routes of the two nets. The crosstalk between two nets, $i(\tau_p)$ with topology $\tau_p$ and $j(\tau_q)$ with topology $\tau_q$, denoted $\mu(i(\tau_p), j(\tau_q))$, is estimated as proportional to the maximum length for which the two nets run in parallel and inversely proportional to the minimum separation between the parallel wires:
$$\mu(i(\tau_p), j(\tau_q)) = \sum_{k \in K(i(\tau_p), j(\tau_q))} \frac{\ell_k}{d_k},$$
where $K(i(\tau_p), j(\tau_q))$ is the set of crosstalk-critical regions between net $i$ with topology $\tau_p$ and net $j$ with topology $\tau_q$, and $\ell_k$ and $d_k$ denote the coupled length and the wire separation in region $k$. An interference graph, whose edge weights are the net-pairwise crosstalk values, is established in $O(n^2)$ time. Then, the noise tolerance $T_i(\tau_p)$ for net $i$ with topology $\tau_p$ with respect to the crosstalk measure $\mu$ is approximated as
$$T_i(\tau_p) = M_i(\tau_p) - \sum_{\forall j(\tau_q)} \mu(i(\tau_p), j(\tau_q)),$$
where $M_i(\tau_p)$ is the maximum allowable coupled noise for net $i$ with topology $\tau_p$, and $j(\tau_q)$ is a crosstalk-critically adjacent net, with topology $\tau_q$, with respect to net $i$. We aim to identify the placement which either maximizes the sum of the noise tolerances or maximizes the minimum noise tolerance over all nets. Thus, our goal is to remove all noise tolerance violations.
For nets with many terminals, a mincost Steiner heuristic is used. As an example of a simpler interference measure, an edge can be added between two vertices iff the bounding boxes of the corresponding nets intersect. If the bounding boxes of two nets intersect in a highly congested region, the routability is more severely affected than if they intersect in a region with very few nets.
FIGURE (a) Crosstalk Estimation in Global Routing, (b) Placement based on Genetic Algorithm with the global routing.
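The crosstalk measure and the noise tolerance can be sketched for axis-aligned routes as follows. This is an illustrative reading of the measure, not the authors' implementation: each pair of parallel, overlapping segments closer than `delta` is treated as one crosstalk-critical region contributing overlap length divided by separation, and segments lying on the same track (separation 0) are skipped by assumption.

```python
def _coupling(s, t, delta):
    """Return l_k / d_k for one pair of segments, or 0.0 if not critical."""
    (x1, y1), (x2, y2) = s
    (u1, v1), (u2, v2) = t
    if y1 == y2 and v1 == v2:      # both horizontal
        dist = abs(y1 - v1)
        lo = max(min(x1, x2), min(u1, u2))
        hi = min(max(x1, x2), max(u1, u2))
    elif x1 == x2 and u1 == u2:    # both vertical
        dist = abs(x1 - u1)
        lo = max(min(y1, y2), min(v1, v2))
        hi = min(max(y1, y2), max(v1, v2))
    else:
        return 0.0                 # perpendicular segments do not couple here
    overlap = hi - lo
    if overlap <= 0 or dist == 0 or dist > delta:
        return 0.0
    return overlap / dist


def crosstalk(route_i, route_j, delta):
    """Sum of l_k / d_k over all critical segment pairs of two routes."""
    return sum(_coupling(s, t, delta) for s in route_i for t in route_j)


def noise_tolerance(max_noise, route_i, other_routes, delta):
    """T_i = M_i minus the total crosstalk injected by the other routes."""
    return max_noise - sum(crosstalk(route_i, r, delta) for r in other_routes)
```

A negative return value from `noise_tolerance` corresponds to a noise tolerance violation in the sense used above.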

II.3. Crossing Measure
To minimize the number of signal line crossings and to minimize both congestion and wirelength, we also incorporate wire crossings into our cost function. Especially for analog nets, crossings are one of the dominant sources of crosstalk noise. The crossing effect of a net pair, $\chi(i(\tau_p), j(\tau_q))$, is computed as the number of intersection points between net $i$ with topology $\tau_p$ and net $j$ with topology $\tau_q$.
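Counting intersection points between two axis-aligned routes is straightforward. The sketch below assumes that only a horizontal segment and a vertical segment can cross, and that touching at an endpoint counts as an intersection; both choices are assumptions made for illustration.

```python
def _is_horizontal(seg):
    (x1, y1), (x2, y2) = seg
    return y1 == y2 and x1 != x2


def _is_vertical(seg):
    (x1, y1), (x2, y2) = seg
    return x1 == x2 and y1 != y2


def _cross(h, v):
    """True if horizontal segment h and vertical segment v intersect."""
    (hx1, hy), (hx2, _) = h
    (vx, vy1), (_, vy2) = v
    return (min(hx1, hx2) <= vx <= max(hx1, hx2)
            and min(vy1, vy2) <= hy <= max(vy1, vy2))


def crossings(route_i, route_j):
    """Number of intersection points between two axis-aligned routes."""
    count = 0
    for s in route_i:
        for t in route_j:
            if _is_horizontal(s) and _is_vertical(t):
                count += int(_cross(s, t))
            elif _is_vertical(s) and _is_horizontal(t):
                count += int(_cross(t, s))
    return count
```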

II.4. Timing Constraints
We need to consider net topologies and sizes that cause delays larger than the performance requirements or paths longer than the maximum allowable driver-to-receiver path length. A method of generating bounds on both net length and width of lossy transmission line interconnects to satisfy timing and overshoot constraints of MCM and PCB designs is described in [11].
Then, the timing tolerance $S_i(\tau)$ for net $i$ with topology $\tau$ with respect to the maximum driver-to-receiver path length $L_i(\tau)$ is approximated as $S_i(\tau) = M_i(\tau) - L_i(\tau)$, where $M_i(\tau)$ is the maximum allowable driver-to-receiver path length for net $i$ with topology $\tau$.

II.5. Object (Fitness) Function
The first approach to our placement algorithm is to find a placement $\phi$ in $\Phi$ (a set of placements), together with its global routing $\omega$ among a set of global routing solutions $\Omega(\phi)$, that attains $\min_{\phi \in \Phi} f(\phi)$, where the objective function $f(\phi)$ is
$$f(\phi) = \min_{\omega \in \Omega(\phi)} \Big\{ \alpha \cdot \mathrm{norm}\Big(\max_{i(\tau) \in N} \{-T_i(\tau)\}\Big) + \beta \cdot \mathrm{norm}\Big(\sum_{(i(\tau_p), j(\tau_q)) \in E} \chi(i(\tau_p), j(\tau_q))\Big) + \gamma \cdot \mathrm{norm}\Big(\max_{i(\tau) \in N} \{-S_i(\tau)\}\Big) \Big\}.$$
Here, the user-defined parameters $\alpha$, $\beta$ and $\gamma$ reflect the relative importance of minimizing the crosstalk, crossings, and timing errors, respectively, and norm() is a function that normalizes the values of crosstalk, crossings and timing errors so that they lie between 0 and 1. A point of the design space is called a Pareto point if there is no other point (in the design space) with at least one strictly smaller objective value and all other objective values smaller or equal. A Pareto point corresponds to a global optimum in a monodimensional design evaluation space. The image of the Pareto points in the design evaluation space is the set of the optimal trade-off points. Their interpolation yields a trade-off curve (2-D) or surface (3-D). In our case, the Pareto point set corresponds to optimum solutions of a placement problem minimizing either the total wirelength or the crosstalk or the timing errors, obtained by varying $\alpha$, $\beta$ and $\gamma$. We seek a Pareto point set by determining appropriate weights $\alpha$, $\beta$ and $\gamma$ to meet the various design constraints. An example with various $\alpha$ and $\beta$ is shown in Figure 4 in our experimental results.
The design evaluation space, with wirelength/crosstalk/number-of-layers trade-off points, is shown in Figure 5.
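The weighted, normalized cost and the Pareto filter can be illustrated on a handful of candidate solutions. This is a minimal sketch, assuming min-max normalization over the candidate set (the paper does not specify how norm() is computed) and that every objective is to be minimized; all function names are illustrative.

```python
def normalize(values):
    """Min-max normalize a sequence into [0, 1] (an assumed form of norm())."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(points, weights):
    """Weighted sum of normalized objectives, one score per candidate."""
    columns = [normalize(col) for col in zip(*points)]
    return [sum(w * v for w, v in zip(weights, row)) for row in zip(*columns)]


def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def pareto_front(points):
    """Candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For instance, with made-up (crosstalk, crossings) pairs `[(1, 5), (2, 2), (3, 3), (4, 1)]`, the point `(3, 3)` is dominated by `(2, 2)`, and changing the weights moves the minimizer of the weighted score along the remaining trade-off points.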

III. FINDING A TOPOLOGY-MAPPED GLOBAL ROUTING WITH LESS CROSSTALK AND CROSSINGS
We first select a placement $\phi$ which minimizes the wirelength while satisfying the given timing constraints.
Given the initial placement, crosstalk and crossings are the next objectives to be minimized.
Note that we select a topology for a net from a set of topologies for the net given by the designer. Thus we refer to the problem as "Topology-Mapped Global Routing". The problem of identifying the global routing which minimizes the total crosstalk noise and the number of crossings can be formulated as finding a minimum edge-weighted clique $K_n$ of size $n$ in $G$. Note that in $G$ the maximum clique is of size $n$. In general graphs, both the maximum node-weighted and the edge-weighted k-clique problems are NP-complete, but when restricted to a comparability graph, the exact algorithm for the node-weighted maximum clique problem can be implemented to run in time linear in the size of the graph [13]. However, no exact polynomial-time algorithm for the minimum edge-weighted k-clique problem is known, even in comparability graphs. In general, given a complete undirected graph $G(V,E)$ with edge weights $w(i,j)$ and an integer $k$, the minimum edge-weighted k-clique problem is to find a minimum edge-weight clique with $k$ nodes. The problem can be formulated as the following 0-1 integer programming problem, which can be solved by any ILP package:
$$\min \sum_{(i,j) \in E} w(i,j)\, y(i,j) \qquad (1)$$
$$\text{s.t.} \quad \sum_{i \in V} x_i = k, \quad x_i \in \{0, 1\}, \quad y(i,j) \in \{0, 1\},$$
$$y(i,j) - x_i \le 0, \quad y(i,j) - x_j \le 0, \quad x_i + x_j - y(i,j) \le 1, \quad \forall (i,j) \in E. \qquad (2)$$
Here $x_i$ indicates whether node $i$ is selected and $y(i,j)$ whether both endpoints of edge $(i,j)$ are selected. The problem of minimizing the maximum crosstalk noise can be formulated similarly. We conjecture that the problem can be shown to be NP-complete. Evidently, ILP is not our choice for fast wirability estimation even though it can be used for global routing.
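For very small instances, the minimum edge-weighted k-clique can be found by exhaustive enumeration, which is useful as a reference when checking a heuristic. The sketch below is not the ILP formulation and not the authors' code; the graph and its weights are made-up example data for three nets (a, b, c) with two topologies each.

```python
from itertools import combinations

# Made-up interference weights between nodes of different nets.
EXAMPLE_NODES = ["a1", "a2", "b1", "b2", "c1", "c2"]
EXAMPLE_WEIGHTS = {
    frozenset(pair): w for pair, w in [
        (("a1", "b1"), 4.0), (("a1", "b2"), 1.0),
        (("a2", "b1"), 2.0), (("a2", "b2"), 5.0),
        (("a1", "c1"), 3.0), (("a1", "c2"), 2.0),
        (("a2", "c1"), 1.0), (("a2", "c2"), 4.0),
        (("b1", "c1"), 2.0), (("b1", "c2"), 6.0),
        (("b2", "c1"), 5.0), (("b2", "c2"), 1.0),
    ]
}


def min_edge_weight_clique(nodes, weights, k):
    """Exhaustively search for a k-clique of minimum total edge weight.

    `weights` maps frozenset({u, v}) to the edge weight; a missing pair
    means the two nodes are not adjacent. Exponential time: small inputs only.
    """
    best, best_cost = None, float("inf")
    for subset in combinations(nodes, k):
        cost = 0.0
        for u, v in combinations(subset, 2):
            w = weights.get(frozenset((u, v)))
            if w is None:
                break              # subset is not a clique
            cost += w
        else:
            if cost < best_cost:
                best, best_cost = subset, cost
    return best, best_cost
```

On the example data, `min_edge_weight_clique(EXAMPLE_NODES, EXAMPLE_WEIGHTS, 3)` selects one topology per net, because the two topologies of a net are never adjacent.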
Our experimental results with some heuristics for this problem showed that the edge-weighted formulation does not work well in practice, due to the intractable complexity of finding a minimum edge-weighted clique.
Motivated by this fact, for fast and reasonable estimation, the problem is transformed into a version of finding a maximum node-weighted clique of the comparability graph $G$. Each node $i(\tau_p)$ is assigned its average crosstalk noise tolerance
$$T_i(\tau_p) = M_i(\tau_p) - \sum_{\forall j(\tau_q)} \frac{\mu(i(\tau_p), j(\tau_q))}{|Q|},$$
where $|Q|$ is the number of different topologies for net $j$ and the other symbols are defined as in the previous section. Refer to Figure 2. Given the edge-weighted net interference graph $G$ in Figure 2b, Figure 2c shows the weights (crosstalk, crossings) on the nodes. The crosstalk value on each node is obtained by computing $T_i(\tau_p)$ for each node $i(\tau_p)$ that corresponds to net $i$ with topology $\tau_p$. Figure 2d shows an optimal solution in terms of minimizing the crosstalk. Here, the total crosstalk is the sum of the edge weights, that is, 1.5 + 3.8 + 1.9 = 7.2. In contrast, Figure 2e shows a possible approximate solution obtained by finding a maximum node-weighted clique. In this case, the total crosstalk (= 9.5) is a little more than in the case of Figure 2d.
The crossing measure on a node can be computed similarly. The problem can be solved optimally in linear time by selecting, for each net, the vertex with maximum weight among the vertices corresponding to that net.
In this case, the result of minimizing the maximum crosstalk (equivalently, maximizing the minimum crosstalk noise tolerance) is the same as that of minimizing the total crosstalk.
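The node-weighted selection can be sketched as follows: compute the average tolerance of every node, then keep the best node of each net in one pass. The example data are made up for illustration (three nets a, b, c with two topologies each, and the same allowable noise of 10 for every node); they are not the values of Figure 2.

```python
# Made-up interference weights between nodes of different nets.
NODES = ["a1", "a2", "b1", "b2", "c1", "c2"]
NET_OF = {node: node[0] for node in NODES}
WEIGHTS = {
    frozenset(pair): w for pair, w in [
        (("a1", "b1"), 4.0), (("a1", "b2"), 1.0),
        (("a2", "b1"), 2.0), (("a2", "b2"), 5.0),
        (("a1", "c1"), 3.0), (("a1", "c2"), 2.0),
        (("a2", "c1"), 1.0), (("a2", "c2"), 4.0),
        (("b1", "c1"), 2.0), (("b1", "c2"), 6.0),
        (("b2", "c1"), 5.0), (("b2", "c2"), 1.0),
    ]
}
MAX_NOISE = {node: 10.0 for node in NODES}


def average_tolerance(node, num_topologies=2):
    """M_i minus the crosstalk to all other nets, averaged over |Q| topologies."""
    total = sum(WEIGHTS.get(frozenset((node, other)), 0.0)
                for other in NODES if NET_OF[other] != NET_OF[node])
    return MAX_NOISE[node] - total / num_topologies


def pick_topologies():
    """For each net, keep the node with the largest average tolerance."""
    best = {}
    for node in NODES:
        t = average_tolerance(node)
        net = NET_OF[node]
        if net not in best or t > best[net][1]:
            best[net] = (node, t)
    return {net: node for net, (node, _) in best.items()}


def total_crosstalk(selection):
    """Sum of edge weights among the selected nodes."""
    chosen = sorted(selection.values())
    return sum(WEIGHTS.get(frozenset((u, v)), 0.0)
               for idx, u in enumerate(chosen) for v in chosen[idx + 1:])
```

On this data the selection is a1, b2, c1 with a total crosstalk of 9, whereas exhaustive search over the same weights finds a1, b2, c2 with a total of 4. The gap illustrates, on different numbers, the same kind of difference as between Figures 2d and 2e: the node-weighted estimate is fast but only approximate.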

IV. PLACEMENT WITH GENETIC ALGORITHM
Over the years, a wide variety of placement algorithms have been developed. Shahookar and Mazumder [25] have provided a survey of various placement techniques. For solving simultaneous multi-objective optimization problems, iterative probabilistic search algorithms such as simulated annealing [20] or genetic algorithms are used, and the process is iterated until some stopping criterion is satisfied. Several genetic placement algorithms have been presented [24,15,28]. The problem of crosstalk is typically addressed after the placement step. There is no known report on minimizing crossings and crosstalk simultaneously during placement; thus our work is significant and innovative since it is the first attempt at bringing crosstalk and crossings into the placement domain.
The usual string representation in a genetic placement algorithm is as follows. The i-th element of a string gives the location of the i-th chip, indexed in row-major order over all possible placement locations in all the rows of the MCM layout. Each string is also associated with the fitness cost function described before. Finding an appropriate set of parameters for a GA is crucial to its performance. A genetic algorithm is characterized by the number of offspring to be generated, expressed as a rate C (0 < C < 1), and the fraction of the population to be mutated, M (0 < M < 1). Optimal values for these parameters, C and M, are obtained automatically by running another meta-genetic algorithm that optimizes those parameters.
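The string representation and the roles of C and M can be illustrated with a generic permutation-encoded genetic algorithm. This is a textbook-style sketch, not the algorithm of this paper: order crossover, swap mutation, elitist survivor selection, and the parameter names `crossover_rate` and `mutation_rate` are all assumptions chosen for illustration, and `cost` is any caller-supplied fitness function to be minimized.

```python
import random


def order_crossover(p1, p2, rng):
    """Order crossover (OX): copy a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(p1[a:b + 1])
    fill = iter(g for g in p2 if g not in kept)
    for idx in range(n):
        if child[idx] is None:
            child[idx] = next(fill)
    return child


def swap_mutation(perm, rng):
    """Exchange the sites of two randomly chosen chips."""
    i, j = rng.sample(range(len(perm)), 2)
    perm = list(perm)
    perm[i], perm[j] = perm[j], perm[i]
    return perm


def genetic_placement(n_chips, cost, pop_size=20, generations=50,
                      crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Return the best string found; string[i] is the site index of chip i."""
    rng = random.Random(seed)
    population = [rng.sample(range(n_chips), n_chips) for _ in range(pop_size)]
    for _ in range(generations):
        n_children = max(1, int(crossover_rate * pop_size))
        children = []
        for _ in range(n_children):
            p1, p2 = rng.sample(population, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < mutation_rate:
                child = swap_mutation(child, rng)
            children.append(child)
        # Elitist selection: keep the pop_size cheapest strings.
        population = sorted(population + children, key=cost)[:pop_size]
    return min(population, key=cost)
```

In a placement setting, `cost` would evaluate the fitness function of Section II.5 for the placement encoded by the string; here any function of a permutation can be plugged in to exercise the loop.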
Esbensen and Mazumder [10] have combined the genetic algorithm and the simulated annealing algorithm to speed up the optimization search and obtain better placements than either algorithm alone. Our Genetic paradigm is similar to the above approach except that we set C = M = 1. Our algorithm can be described as follows.
In summary, our approach is depicted in Figure 3.

V. EXPERIMENTAL RESULTS
To evaluate the effectiveness of the algorithm, the placer was implemented in C and was tested on MCC1, Ami49, and apte from the MCNC benchmarks. Table I gives the description of the designs on which the placer was tested. All the experiments were run on a SUN SPARCclassic.
To see the impact of crosstalk and crossings, we compared the proposed method with placement that does not consider crosstalk and crossovers, and with different values of $\alpha$ and $\beta$ in the objective function of the previous section.
Our approach is similar to the approach of [9], but differs in generating a set of alternative solutions by controlling the parameters $\alpha$ and $\beta$. Among all the solutions obtained with different values of the parameters, we select the ones that correspond to the Pareto point set. The user can then choose his or her favorite solutions.
Among 9 different combinations of $\alpha$ and $\beta$, we selected the one with minimum crosstalk. Figure 4 shows the result for MCC1. The best solution was obtained with $\alpha = 0.2$ and $\beta = 0.8$. After performing our placement algorithm, the wirability (i.e., the number of layers required), wirelength, and crosstalk were compared by running a crosstalk-driven MCM router [7]. As shown in Table II, crosstalk is significantly reduced to 26% on average. Furthermore, wirelength and the number of layers were decreased by 8.9% and 8.3%, respectively, on average. For example, in the case of MCC1, the wirelength and crosstalk were improved by 14% and 17%, respectively, without an increase in the number of layers. Running our algorithm took about 2 hours of wall-clock time (not CPU time) for MCC1, 30 minutes for Ami49 and 10 minutes for apte; the computing time was mainly due to the wirability estimation step described in Section 3.
Finally, we investigated the effect of minimizing crosstalk on wirelength. Recall that crosstalk is proportional to the maximum length for which two nets run in parallel. Figure 6 shows the relationship between wirelength and crosstalk. It is interesting to observe that our placement algorithm implicitly minimized the wirelength metric without a wirelength measure being included in our objective function. Thus, crosstalk can serve as a "universal" metric for optimizing deep-submicron physical designs.

VI. CONCLUSION AND FUTURE WORKS
We have presented an effective crosstalk-minimum placement algorithm that considers minimum-bend global routing simultaneously. A Genetic approach to solving the placement problem is presented. A novel graph formulation to find a minimum node-weighted k-clique on a comparability graph is presented for fast wirability and performance enhancement. The minimum edge-weighted k-clique problem in interval graphs is not known to be NP-complete; thus it is an open problem. One future direction is therefore to develop a more efficient heuristic to find a near-optimal or optimal minimum edge-weighted k-clique in an interference graph.
FIGURE Crosstalk and Crossing. Minimizing crosstalk may introduce more crossings.

FIGURE 4 Test results of MCC1 with varying $\alpha$ and $\beta$.

TABLE II A comparison between placement without and with our method