Optimal Differential Routing based on Finite State Machine Theory

Noise margins in high speed digital systems continue to erode. Full differential signal 
routing provides a mechanism for deferring these effects. This paper proposes a three 
stage routing process for solving the adjacent placement routing problem of differential 
signal pairs, and proves that it is optimal. The process views differential pairs as logical 
nets; routes the logical nets; then bifurcates the result to achieve a physical realization. 
Finite state machine theory provides the critical theoretical underpinning and formal 
proof of correctness necessary for linear time bifurcation. Regular expressions map the 
theoretical solution to an appropriate implementation strategy that employs feature 
vectors for net recognition.


I. INTRODUCTION A. Motivation
Current trends in technology indicate that adja- cent placement routing of differential signal pairs is gradually gaining acceptance as a mechanism for coping with reduced noise margins.By routing a signal and its complement in adjacent tracks throughout their run from source to terminus, the ability to reject common mode noise is enhanced.Digital Equipment's new macropipe- lined VAX and IBM's 9000 and AS400 now employ differential signals on critical nets and clock distribution signals [1,2,3].However, differ- ential routing is not without costs.Chip area increases, and interconnect doubles [4].Until  recently, these costs were considered prohibitive.
Our work produces an optimal solution to adjacent placement routing, while only adding a linear time factor to that required to route a single ended design of equivalent complexity.Finite state machine theory (FSM) provides the theoretical basis and formal verification for the bifurcation process.This is crucial because not all nets are bifurcatable, and they must be identified.Using regular expressions as a mapping function, the actual implementation strategy employs feature vectors to accomplish net recognition and identi- fication.A glossary of terms appears at the end of the paper.

B. Results Obtained
Traditional routing topics have been investigated extensively in the technical literature, to include recent articles on mixed signal routing topics [5].However, there is an absence of information concerning differential routing of integrated cir- cuits.The prevailing approach independently wires each member of a pair in succession with a conventional router.Often, long runs of closely tracking wire pairs can be obtained in this manner.
However, inevitably, one member of a pair will encounter an obstacle, forcing the pair to deviate from close traveling.
Our approach insures complete adjacent track- ing, with the following significant results: (1) Provides the optimal solution to the differen- tial routing problem.
(3) Identifies a basis set of bifurcatable net categories.
(4) Uses regular expressions to map the theore- tical identification technique into a readily implementable strategy using feature vectors.
The details of these findings will be examined beginning with an analysis of the router flow diagram and a proof of optimality.The focus then shifts to the bifurcation problem.The generalized bifurcation algorithm suffers from exponential running time.Finite state machine analysis re- vealed a fundamental basis set of bifurcatable net configurations.The actual implementation strat- egy relies on feature vectors, generated through regular expression mappings, to recognize the bifurcatable nets, and identify those that are not splittable for re-route.

A. Overview
Our three stage process is depicted in Figure 1.It begins by preprocessing net lists so that differential pairs are collapsed to logical nets, routing these single logical entities (SLE), and then bifurcating the resulting wires.This guarantees placement of differential signals pairs in adjacent tracks, while at !1 the same time making use of the best available routing tools.

B. Algorithm Analysis by Stage
(1) Stage I generates an image of the standard cell library where each differential port pair on a standard cell is merged into a single logical port as shown in Figure 2. Concurrently, the differential net list is collapsed, so that all differential pair nets are transformed into SLE nets.As the merging progresses, inversions are identified and extracted, since inversion infor- mation is not preserved in SLE form.These are all linear time translations.
(2) Stage II uses the best available commercial routing tool.Functioning in a modified environment that consists of-(a) a cell library Image, (b) an SLE net list, and (c) a modified design rule file, the core router routes the layout using large geometry or "fat" wires.
The time complexity of Stage lI is equivalent to that required to route a single ended design of similar logic complexity.
(3) Stage Ill is a post processing phase.The "fat" wires are bifurcated, the standard cell images are replaced by the real standard cells, and inversion information is re-introduced.The result is a chip with adjacent track routing of all differential nets.
Using a generalized bifurcation algorithm, the worst case time complexity is exponential as a function of decision points, or "fat" vias that must be resolved.Thus, to split all nets, an upper bound on the computational complexity is given by Eq. ( 1), Bifurcation_Time_

Complexity
where n represents the total number of nets, and Fi, the number of vias in net i.This upper bound arises, because to properly bifurcate a net you may have to exhaustively explore all configurations before finding a solution, if one exists.We propose a more effective bifurcation alternative based on FSM theory.This idea will lead to a linear time algorithm as we will demonstrate in Section IV.

III. OPTIMALITY OF ROUTING PROCESS
If we assume that a linear time solution is achievable for the third stage (bifurcation opera- tion), then the proposed solution is optimal in both computational complexity and physical layout space.

A. Optimality of Solution
Metrics such as chip area, vias, total interconnect and computational complexity define the optim- ality of the solution.The fundamental routing is accomplished by the core router.Since the pre and post processing stages do not affect the relative ratios of these metrics, the optimality of the fully differential solution will be a direct function of the core router.The only constraint imposed on the router is that it conform to the Manhattan constraint with respect to interconnect layer directions.
With respect to increases in chip area, each differential pair is always routed within the bounds of the fat wire, and generates a minimal block out region at any crossover.This also takes into account the introduction of inversions where necessary, since they will be applied at the "fat" vias.Any approach that routes the differential wires independently will generate crossover block out regions at least as large.
In the area of computational complexity, the fat-wire approach reduces the problem space from size N to size N/2, where N represents the total number of nets (counting each wire of a differ- ential pair as an individual net).This is the direct result of the single logical entity (SLE) net concept.
Let N represent the total number of thin wires.Now, the run time of a traditional router can be expressed by Eq. ( 2), where N represents the number of nets and M the available routing tracks [6].
If the number of fat wires is N/2, then the routing time for our proposed solution is given by Eq. (3).
For actual routing problems where the number of nets is in the thousands, Eq. ( 4) holds true.
T(R)Fat__Wire T(R)Traditional Therefore, if the assumption that bifurcation can be accomplished in linear time is true, then this process will always be equal to or better than a pure differential router, and the solution produced is indeed optimal from both a space and computa- tional complexity standpoint.

IV. FSM BASED BIFURCATION
A. Background Regardless of bifurcation algorithm selected, certain net topologies are unsplittable in the allocated fat wire tracks, without a local reroute.Examples of two such topologies are provided in Figure 3.By imposing the Manhattan constraint on wiring layers, the class of nets shown in Figure 3(B) can be split, Figure 3

(C). This is possible
because the constraint guarantees a via at each direction change.Each fat via introduces a degree of freedom for polarity propagation, Figure 4.The ability to perform the gross classification of nets into the two categories of bifurcatable and not bifurcatable is crucial to linear time bifurcation.A generalized bifurcation algorithm can be I l FIGURE 4 Large via split represents degree of freedom.formulated using exhaustive search.It explores all possible via settings and terminates with one of two outcomes: (1) successful bifurcation, or (2) realization that the net is not bifurcatable.Either of these cases may entail exponential running time as a function of vias in the net.Additionally, this technique ignores the inversion problem that must also be handled during bifurcation.An example of this shortcoming follows.Given the fat net in Figure 5, the algorithm will explore all setting options in an effort to resolve polarities.
Ultimately, it will fail to find a solution.Two results are apparent from this exercise.First, even with the Manhattan constraint on routing layers, some nets will not be bifurcatable.Second, without the benefit of "knowing" where to begin in the bifurcation process, the algorithm may have to explore all configurations before determining that a solution does not exist.In the worst case, this is exponential as a function of the vias in the net.
The configuration in Figure 5 that causes the problem is an over-constrained via.Each via has a single degree of freedom and can assure proper connection to a single port pair regardless of pin ordering.When two port segments converge at a via, sufficient freedom does not exist to insure proper connections.A net of this type would have to be designated for local reroute to introduce a jog to offset the segment junctions.Where the generalized algorithm fails, our solution successfully categorizes the net as to its type, and if bifurcatable, successfully splits it, reintroducing inversions where appropriate.Nets that are not bifurcatable are identified as such in linear time and marked for local reroute.
B. Basis Set of Net Categories An analysis of net structure reveals a major partitioning of bifurcatable and non-bifurcatable, Figure 6(A).Further study revealed the bifurca- table taxonomy of Figure 6(B), which consists of five categories and a group known as multiple backbone [7].The bifurcatable taxonomy repre- sents a basis set of net categories from which all other nets can be constructed.Multi-backbone nets are merely single backbone nets joined by a vertical segment.The ends of the joining segment terminate in a via on each individual backbone.
Passing a "cut line" through the adjoining segment produces a single backbone net and either another As noted earlier, the critical factor in accomplish- ing linear time bifurcation is the gross categorization of nets.The most effective way of accomplishing the task in the net list domain is by attempting to "recognize" a net as one of the bifurcatable types.If it is identified, then it is bifurcatable, otherwise it falls into the nonbifur- catable class.SLE nets can be decomposed into tokens, of which there is a finite set shown in Figure 7. Using the taxonomy as a guide, finite state machines can be constructed to identify or recognize sequences of input tokens.An example of this process is provided in Figure 8.Using the sample SLE net in  By constructing FSMs to recognize the bifurca- table categories shown in the taxonomy, arbitrary nets can be "recognized" in linear time.Addition- ally, failure to be recognized amounts to categor- ization as non-bifurcatable.This also occurs in linear time regardless of net complexity.There are two very important effects associated with this methodology.First, formal verification of correctness is achieved.For design automation tools, this is of paramount importance due to the nature of the regime.Second, system testing only requires a single net from each category be examined, as opposed to attempting extensive testing of randomly generated nets in an effort to achieve some degree of coverage.These concerns dominate as design automation system complexity continues to grow.

D. Implementation Strategy using Feature Vectors
Although the FSM approach provides an idea theoretical basis for net recognition and ultimate bifurcation, implementation of such a technique is burdensome at best.By mapping the FSM theory into an alternative domain a suitable implementation strategy can be derived.
The field of computer vision has employed feature vectors as a mechanism for recognition for many years.Let us define a feature vector F;, which is composed of components reflecting counts of primary net elements, such as number of vias, number of ports, and number of con- nectors.Each of these characteristics can be counted in time proportional to one scan over the components of the net.The value of ranges from to R, where R represents the number of net categories.By examining the relationship among the components of the vector, a categorization can be accomplished.
Fi {number of vias, number of ports, number of connectors(E/W), number of connectors(N/S), number of layers/backbones, (constraints) } The node labels of the tree in Figure 6(B) represent either the number of connectors con- tained in the net, e.g., c, indicating one con- nector.Nodes labeled E/W, N/S and E/W and N/S refer to nets with either an East (or) West facing connector, a North (or) South facing connector, or a combination of the two possibilities respectively.
Rewriting the vector components so that the number of vias is X1, the number of ports is X2, the number of E,/W connectors is X3, the number of N/S connectors is X4, and number of layers or backbones is Xs, F; can be written as V,: {X, X2, X3, X4, Xsl(constraints) } where constraints represents the conditions im- posed on the relations among the vector compo- nents.For Category 2 type nets, vector F2 with explicit conditions is defined as- F :{X, X, , X4, xlx x, x o, x4 o, xs } For the net portrayed in Figure 9, its feature vector, Fobscrvcd can be constructed by tallying

1}
Since the relationship among the components of Fobserved satisfy the conditions of F2, the net can be categorized as a Type 2 net and the bifurcation process commenced.

E. FSM-Feature Vector Equivalence
We now demonstrate equivalence between the theoretically sound FSM technique and the empirically derived feature vector approach.Fig- ure 10 shows a generic FSM.The corresponding regular expression is shown in Eq. ( 5).
Reg_Expression ab(cb)d ( 5) In this form, the strict mathematical relations among the components is established.For Eq. ( 5) the constraints are:{a 1}, {d 1}, {b > 1}, {#e #b-l} Using the FSM recognizers con- structed previously, and relabeling their arcs using FIGURE l0 Sample FSM to generate regular expression.
formal token symbol labels from Table I, the feature vectors for all categories can be generated directly.
As an example, the process will be applied to the SLE net in Figure 9. Beginning with the farthest left port, the input tokens form the expression shown below-

PVSVHSVPHSVPHSVP
Rewriting this expression, expanding segment names, results in the following form- P(VSeg)V(HSeg)VP(HSeg)VP(HSeg)VP Discounting the Seg(vertical segment) and HSe- g(horizontal segment) entries, the expression can be rewritten as- PVVPVPVP which resolves to the regular expression form- PV(VP}* From this expression, the feature vector conditions for this type net are written directly as- {Vias _> 1, Ports >_ 1, E/W Conn. 0,N/S Conn.
0, Layers 1,4Ports pVias} SLE nets can be broken into a number of distinct tokens.These tokens can then be used as input strings to FSM recognizers.A representative Type 3 Net and its corresponding FSM recognizer are given in Figure 11.
Beginning at the start state, S03 (corresponding to the farthest left port), a vertical segment to a via with a connector is encountered (VSVC).If there were no additional input tokens, the net would be accepted in state A3.For this example, there are now three occurrences of horizontal segments to vias with port segments emanating from them.This is represented by the HSYP transition to state A4, and the subsequent cycling back to state A4.
The last token found is the horizontal segment to the East/West connector on the right end (HSC(E/ W)) that causes the transition to acceptor state As.
If our example net did not have the left side E/W connector, then the transition path through the state machine would have been across the top path.
The feature vector concept provides the readily implementable, linear time method of categoriza- tion.Finite State Machine (FSM) theory provides the theoretical underpinnings of this approach.Regular expressions derived from these recogni- zers demonstrate equivalence between the theore- tical foundation provided by FSMs and the set of feature vectors.G. Analysis of the Algorithm By using the underlying FSM theory, SLE wires can be recognized and categorized in linear time by means of a feature vector.Once categorized, the SLE wires can be split in time proportional to scanning the components, replicating those parts necessary to construct the complimentary wire of the differential pair, and computing the via split decision at each junction.At the same time, inversions which were extracted from the problem space during the collapsing of the net list are reintroduced as part of the process.What remains to be shown is that the via split decision making and re-introduction of inversions can be dealt with in linear time.
The Manhattan routing layer constraint im- posed earlier guarantees that for all nets, other than the degenerate case of a single metal connection crossing a channel and connecting to a port on the opposite side, there will exist at least one horizontal segment and vias connecting that segment to ports.Since a via must exist for each connection from a port to a backbone, polarity matching is accomplished through the two via splitting options of Figure 4. Three variables affect the via splitting decision: (1) the port ordering of the differential signal pair along the edge of the standard cell, (2) cell orientation in the layout, and (3) inversion data extracted during the differential net list collapsing operation.
Since each input variable and the via split operation have two possible configurations, they can be viewed as state variables and represented by binary digits, Table II.By applying a state variable transformation, the logic operation for deciding the proper via cut can be elegantly defined by the concise formula shown in Eq. ( 6).
Via_ cut (Original_ Port_ Polarity)Q (Cell_reflection) Q (Inversion) (6) This methodology facilitates the re-introduction of inversions, and allows linear time decision making when determining the proper cut at each via during the bifurcation process.

VI. RESULTS AND CONCLUSIONS
When evaluating the results of this research, extreme care must be exercised when comparing the quality metrics generated by this approach and other solutions.In this relatively unexplored routing regime there are no established bench- marks against which one can measure the quality of the approach.For this reason, data collected from our system is only compared to the results produced by the core router (COMPASS Design Automation) routing differential pairs in a non- adjacent manner.The first production test of the system was provided by an ARPA sponsored RISC project.
All standard cell and block routing was performed by our system.Chips are currently undergoing functional testing.Closeup photos of selected chip sections are provided in Figure 13(A and B).
For analysis purposes we believe we have chosen sample routing problems which are representative of the entire class.The results of our experiments, showing total interconnect, vias and circuit area are provided in Table III.Data provided represent percent reductions (improvements) for each metric over the pure differential routing.Although our approach may appear simplistic, it is the first large scale effort aimed at solving the differential routing problem.An important test of any theory is its capacity to solve actual problems [8], and the results in Table III confirm our theoretical expectations.Our research has pro- duced both theoretical and practical results.The theoretical underpinnings rooted in FSM theory and the basis set of bifurcatable nets are exciting.The FSM-Feature Vector equivalence shown to exist through the regular expression link provides both closure and an ideal implementation medium.The overall concept appeals to intuition and serves to effectively halve the routing problem size.Finally, it produces a differential solution whose optimality is equivalent to the core router's capability, with only a linear time penalty factor.
However, nets with doglegs may cause some difficulty with our FSM approach.This may be primarily due to the number of cases that doglegs will introduce in FSM states.Our continuing research has extended the chip routing solution into the MCM domain [9].Along with this effort, other applications of FSM Theory in the recogni- tion arena are being investigated.
ARPA/ARODAAH04-93-G-0477 and DARPA/ARODAAL03-90-G-0187. The views, opinions, and/or findings of this paper should not be construed as an officail position or policy and do not represent the position of the U.S. Military Academy, U.S. Army, Department of Defense, or U.S. Government.interconnect and a standard cell.wire whose geometry encompasses the width of two standard wires and the interwire spacing.point of connection at the end of a tack to which another block would be connected by the block level router.These can be on the north, south, east, or west edges of the standard cell area.Manhattan -for our purposes it refers to the constraint sub-constraint that all layer interconnect run vertically, and all layer two interconnect run horizontally.

FSM
Finite State Machine.bifurcation conceptual act of splitting or resolving fat wires into their re- spective two wire components for the physical realization of the layout.

Figure 8 (
Figure 8(A), input tokens are generated and depicted graphically in Figure 8(B).Figure 8(C) demonstrates the functioning of the recognizer given the input string.

FIGURE 7
FIGURE 7 SLE net token set.

FIGURE 13 (
FIGURE 13(A) Standard cell area of test chip.
backbone net, or a final single backbone one.Partitioning continues recursively until the result of the partitioning no longer produces a multi-backbone net as one of the products. multiple

TABLE Formal
TABEL II State variable mapping table