An Efficient Algorithm for the Split K-Layer Circular Topological Via Minimization Problem

The split k-layer (k ≥ 2) circular topological via minimization (k-CTVM) problem is reconsidered here. The problem is to find a topological routing of the n nets, using the k available layers, such that the total number of vias is minimized. The optimal solution of this problem is solved in O(n2k


INTRODUCTION
For the purpose of increasing both the yield and the circuit performance of a VLSI chip, the number of vias used in a VLSI layout should be kept as small as possible. Thus, via minimization is an important issue in the routing problem of a VLSI layout. There are two approaches to minimizing the number of vias: constrained via minimization and topological via minimization. The theoretical study of the constrained via minimization problem has been completely resolved from the standpoint of maximum junction degree [9].
Topological via minimization, the other approach to the routing problem, was first proposed by Hsu [4]. This topic has attracted many practical and theoretical researchers [2], [4], [6], [8], [10], [11], [12], [14], [15]. The k-layer topological via minimization (k-TVM) problem is formally defined as follows. A set of nets on a k-layer (k ≥ 2) routing region R is given. Each net is a set of terminals to be electrically connected. The problem is to route the terminals of each net by a set of wire segments, in which each wire segment is assigned to a layer and vias are formed to connect every two adjacent wire segments on different layers. The routing must ensure that no two wire segments representing two different nets intersect on the same layer, and that the number of vias is minimized. The wires and the vias are assumed to have infinitely small widths and sizes, respectively. Moreover, there are no shape restrictions for wires and vias (hence the name topological for the problem).

*To whom all correspondence should be addressed. This work was supported by the Minister of Economic Affairs under Grant 37H3100.
In this case, the routing region R is a circular channel with n two-terminal nets; each two-terminal net has one terminal located on the inner circle and the other terminal located on the outer circle of the circular channel. Such a k-TVM problem on a circular permutation channel was studied by Rim et al. [10], where it is referred to as a split k-layer circular topological via minimization (k-CTVM) problem. Based on a dynamic programming approach, they proposed an algorithm of time complexity $O(n^{2k+1})$ to solve this problem. However, such a polynomial time algorithm may not be efficient enough, even for k = 2, since the net number n in a VLSI layout can be very large. In this paper, a heuristic algorithm with a time complexity of $O(k n^4)$ is presented. The proposed algorithm is based on the strategy of "generating a single-layer routable subset at a time while keeping the length of the longest cyclic decreasing subsequence of the remaining sequence as short as possible."

The rest of the paper is organized as follows. Section 2 lists some definitions and preliminary results. Section 3 presents the heuristic algorithm. The correctness and time complexity of the algorithm are given in Section 4. Section 5 summarizes the experimental results. Finally, Section 6 concludes the paper.

PRELIMINARIES
Without loss of generality, the n terminals on the outer circle of a circular permutation channel can be numbered clockwise as 1, 2, ..., n. The n terminals (also numbered clockwise) on the inner circle of the circular permutation channel can be regarded as a permutation of the numbers 1, 2, ..., n. Let the permutation be $\pi = (\pi(1), \pi(2), \ldots, \pi(n))$. The two terminals of each net are represented as a terminal on the outer circle and the terminal $\pi(j)$ on the inner circle for some j. Let $\pi_s$ denote the cyclic permutation obtained by circularly shifting $\pi$ by s positions counter-clockwise, i.e., $\pi_s(i) = \pi(s + i)$ for $1 \le i \le n - s$, and $\pi_s(i) = \pi(s + i - n)$ for $n - s < i \le n$. Let S be a subset of the n nets of the circular permutation channel. The following two lemmas were shown by Rim et al. [10].
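As a worked illustration of the cyclic shift defined above, the following evaluates $\pi_s$ for s = 5 on the eight-net permutation used in the paper's Figure 1 example. The choice of s = 5 is only for illustration.

```latex
\begin{align*}
\pi &= (3, 5, 8, 4, 6, 1, 7, 2), \qquad n = 8, \quad s = 5 \\
\pi_5(i) &= \pi(5 + i), \quad 1 \le i \le 3
  &&\Rightarrow\ (\pi(6), \pi(7), \pi(8)) = (1, 7, 2) \\
\pi_5(i) &= \pi(5 + i - 8) = \pi(i - 3), \quad 3 < i \le 8
  &&\Rightarrow\ (\pi(1), \ldots, \pi(5)) = (3, 5, 8, 4, 6) \\
\pi_5 &= (1, 7, 2, 3, 5, 8, 4, 6)
\end{align*}
```

The subsequence (1, 2, 3, 5, 8) of $\pi_5$ is increasing, which is what makes the corresponding five nets routable on one layer.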
Lemma 1 [10]. S is single-layer routable if and only if the set of corresponding terminal numbers of S on the inner circle (clockwise) is an increasing subsequence of some cyclic permutation $\pi_s$, $1 \le s \le n - 1$.

Lemma 2 [10]. Let m be an integer such that $0 \le m \le n$. The n nets can be routed on k layers using m vias if and only if there exist k disjoint single-layer routable subsets $S_1, S_2, \ldots, S_k$ such that $\sum_{i=1}^{k} |S_i| = n - m$. For those nets that are not in the k single-layer routable subsets, by using two adjacent layers one via is sufficient for routing each net [8], [10].

Since the given net number n is fixed, the via number m is minimized when $\sum_{i=1}^{k} |S_i|$ is maximized.
An instance (n = 8) of the k-CTVM problem with k = 2 is shown in Figure 1(a), where the permutation is given as $\pi = (3, 5, 8, 4, 6, 1, 7, 2)$. An optimal solution for the instance without any via is shown in Figure 1(b). The two disjoint single-layer routable subsets $S_1 = (1, 2, 3, 5, 8)$ and $S_2 = (4, 6, 7)$ are routed by layer 1 wire segments and layer 2 wire segments, respectively. A possible non-optimal solution for the instance with one via is shown in Figure 1(c), where the two disjoint subsets of nets $S_1 = (1, 2, 3, 4, 6)$ and $S_2 = (5, 7)$ are single-layer routable subsets. The nets in $S_1$ are routed by layer 1 wire segments and the nets in $S_2$ are routed by layer 2 wire segments. Net 8 is in neither $S_1$ nor $S_2$; it is routed by a wire segment on layer 1 and a wire segment on layer 2, which are connected by one via.

Some terms need to be explained. An increasing (decreasing) subsequence of $\pi_s$ for some s, $0 \le s \le n - 1$, is called a cyclic increasing (decreasing) subsequence (CI(D)S) of $\pi$. Note that a CIS or CDS of $\pi$ has a length of at least 2. The following lemma follows directly from Lemma 1.

Lemma 3. The maximum single-layer routable subset of the n two-terminal nets on the circular permutation channel is a longest CIS (LCIS) of $\pi$.
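The routability test of Lemma 1 and the via count implied by Lemma 2 can be checked mechanically on the Figure 1 instance. The following is a minimal sketch, not code from the paper: the function names are hypothetical, the test is brute force over all n cyclic shifts (s = 0 included), and each net outside the chosen subsets is assumed to cost exactly one via.

```python
def cyclic_shift(pi, s):
    """Return pi_s: pi circularly shifted s positions counter-clockwise."""
    pi = tuple(pi)
    return pi[s:] + pi[:s]


def is_subsequence(sub, seq):
    """True if `sub` occurs in `seq` in order (not necessarily contiguously)."""
    it = iter(seq)
    return all(x in it for x in sub)


def is_cis(nets, pi):
    """Lemma 1 test (brute force): the labels of `nets`, read in increasing
    order, must form a subsequence of some cyclic shift of pi."""
    target = sorted(nets)
    return any(is_subsequence(target, cyclic_shift(pi, s))
               for s in range(len(pi)))


def via_count(pi, subsets):
    """Vias needed when every net outside the disjoint subsets uses one via."""
    routed = [x for subset in subsets for x in subset]
    assert len(routed) == len(set(routed)), "subsets must be disjoint"
    assert all(is_cis(subset, pi) for subset in subsets), "not a CIS"
    return len(pi) - len(routed)


pi = (3, 5, 8, 4, 6, 1, 7, 2)
print(via_count(pi, [(1, 2, 3, 5, 8), (4, 6, 7)]))  # Figure 1(b): prints 0
print(via_count(pi, [(1, 2, 3, 4, 6), (5, 7)]))     # Figure 1(c): prints 1
```

The check costs O(n^2) per subset, which is fine for verifying small examples but is not meant as an efficient routine.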
We now return to the split k-CTVM problem. Lemma 2 and the definition of CIS suggest that the problem is equivalent to finding k CIS's $S_1, S_2, \ldots, S_k$ of $\pi$ such that $\sum_{i=1}^{k} |S_i|$ is maximized. A simple and straightforward approach to the problem is to apply the LCIS algorithm k times. This approach has a time complexity of $O(k n^2)$.
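The straightforward approach of applying an LCIS routine k times can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: `lcis` here simply runs a standard patience-sorting LIS on every cyclic shift, which costs O(n^2 log n) per call rather than the bound quoted above, and all function names are hypothetical.

```python
from bisect import bisect_left


def lis(seq):
    """Longest increasing subsequence of distinct values (patience sorting)."""
    tail_idx, tail_val = [], []      # per length: index and value of best tail
    prev = [None] * len(seq)         # back-links for reconstruction
    for i, x in enumerate(seq):
        j = bisect_left(tail_val, x)
        prev[i] = tail_idx[j - 1] if j > 0 else None
        if j == len(tail_val):
            tail_idx.append(i)
            tail_val.append(x)
        else:
            tail_idx[j], tail_val[j] = i, x
    out, i = [], (tail_idx[-1] if tail_idx else None)
    while i is not None:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]


def lcis(pi):
    """Longest cyclic increasing subsequence: best LIS over all cyclic shifts."""
    best = []
    for s in range(len(pi)):
        cand = lis(pi[s:] + pi[:s])
        if len(cand) > len(best):
            best = cand
    return best


def greedy_k_lcis(pi, k):
    """Extract an LCIS k times, removing the chosen nets after each round."""
    remaining, subsets = list(pi), []
    for _ in range(k):
        chosen = lcis(remaining)
        subsets.append(chosen)
        taken = set(chosen)
        remaining = [x for x in remaining if x not in taken]
    return subsets


print(greedy_k_lcis((3, 5, 8, 4, 6, 1, 7, 2), 2))
# prints [[1, 2, 3, 5, 8], [4, 6, 7]]
```

On the Figure 1 instance this greedy extraction happens to recover the via-free solution; in general, taking the longest CIS first gives no guarantee about what remains.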

THE ALGORITHM FOR THE SPLIT K-CTVM PROBLEM
Let the length of a longest CDS (LCDS) of $\pi$ be r.
For an LCDS of length 2, the minimum layer number for routing the corresponding two nets is one. In this case, all the n nets in the circular permutation channel can be routed in one layer since $\pi_i = (1, 2, \ldots, n - 1, n)$ exists for some integer i. Furthermore, it can be easily verified that the minimum layer number for routing the corresponding nets in an LCDS of length 3 or 4 is two. By induction, the corresponding nets in an LCDS of length r require $\lceil r/2 \rceil$ layers to route them. All the other nets which are not in the LCDS may use either the existing $\lceil r/2 \rceil$ layers or new layers to realize them. Hence, a via-free routing solution for the circular permutation channel requires at least $\lceil r/2 \rceil$ layers.
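The LCDS length r and the resulting lower bound on the number of layers for a via-free routing can be computed directly for small instances. The sketch below is a brute-force illustration with hypothetical function names: it uses the observation that a decreasing subsequence of a cyclic shift of $\pi$ is an increasing subsequence of a cyclic shift of the reversed permutation.

```python
from bisect import bisect_left
from math import ceil


def lis_length(seq):
    """Length of a longest increasing subsequence of distinct values."""
    tails = []  # tails[j] = smallest tail of an increasing subsequence of length j+1
    for x in seq:
        j = bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(tails)


def lcds_length(pi):
    """Length r of a longest cyclic decreasing subsequence of pi."""
    rev = list(pi)[::-1]
    return max((lis_length(rev[s:] + rev[:s]) for s in range(len(rev))),
               default=0)


def layer_lower_bound(pi):
    """Lower bound ceil(r / 2) on the layers needed for a via-free routing."""
    return ceil(lcds_length(pi) / 2)


pi = (3, 5, 8, 4, 6, 1, 7, 2)
print(lcds_length(pi), layer_lower_bound(pi))  # prints 4 2
```

For this permutation an LCDS is (7, 5, 4, 1), so at least two layers are needed, which agrees with the two-layer via-free routing of Figure 1(b).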
Using the strategy of "generating a CIS of the current $\pi$ while keeping the LCDS length of the remaining permutation (obtained by excluding the CIS from the current $\pi$) as short as possible," the algorithm first finds an LCDS M of $\pi$. Let the length of M be r. The algorithm then takes k iterations. At the i-th iteration, i = 1, 2, ..., k, the current length of $S_i$, say currentlength, is initialized to zero. Then, the following steps are executed: (I) If r = 2, then all the n nets of the current $\pi$ can be routed in one layer. Let $S_i$ be the current $\pi$, and the execution of the algorithm stops. Otherwise, (II) a CIS of the current $\pi$ is generated as $S_i$ such that the LCDS length, say $r'$, of the remaining permutation (obtained by excluding the CIS from the current $\pi$) is kept as short as possible. (III) The CIS $S_i$ is removed from the current $\pi$. Let the remaining permutation be the current $\pi$, with n updated as n := n - currentlength and with r possibly updated in step (II). Then, the next iteration is started.

At the i-th iteration, the details of step (II) are as follows. This step takes $\binom{n}{2} = n(n - 1)/2$ loops. In each loop, two elements $\pi(j_1)$ and $\pi(j_2)$ of the current $\pi$ are selected as the starting pair. For each $j_1$ in the range $1 \le j_1 \le n - 1$, $j_2$ takes the values $j_1 + 1, j_1 + 2, \ldots, n$. Then, the following substeps are executed: (1) The pair of elements $\pi(j_1)$ and $\pi(j_2)$ is used to generate a CIS (to be described in the next paragraph). Let the length of the CIS be $\ell$. (2) The CIS is temporarily excluded from the current $\pi$; let the remaining permutation be $\pi'$. (3) An LCDS of the remaining permutation $\pi'$ is generated. Let the LCDS length be $r'$. (4) If $r' \le 2$ and $i < k$, then the set of the nets in the remaining permutation $\pi'$ is one-layer routable. Let $S_i$ be the CIS, set currentlength to $\ell$, and go to step (III). (5) If $r' < r$, then the CIS replaces the old $S_i$ as the new $S_i$ and the length r is updated to $r'$. Set currentlength to $\ell$. Otherwise, (6) if $r' = r$ but $\ell$ is greater than the currentlength of the previously selected $S_i$, then the CIS replaces the old $S_i$ as the new $S_i$ and the currentlength of the new $S_i$ is updated to $\ell$. Then, another pair of elements from the current $\pi$ is selected and the execution goes to the next loop.
From the discussion, the formal algorithm for finding a set of k disjoint CIS's of 7r is as follows.
Step 1. Find an LCDS M of $\pi$ (by finding an LCIS of the reverse permutation $\pi^r$). Let its length be r.
Step 2. For i := 1 to k Do
    currentlength := 0;
    Step 2.1. If r = 2, then /* The nets in the current $\pi$ are one-layer routable. */
        let $S_i$ be the current $\pi$ and stop.
    Endif
    /* Generate a CIS of the current $\pi$ while keeping the LCDS length of the remaining permutation as short as possible. */
    Step 2.2. For $j_1$ := 1 to n - 1 Do
        For $j_2$ := $j_1$ + 1 to n Do
            Step 2.2.1. Generate a CIS by using the procedure GENCIS($\pi$, $j_1$, $j_2$). Let its length be $\ell$.
            Step 2.2.2. Excluding the CIS from $\pi$, let the remaining permutation be $\pi'$.
            Step 2.2.3. Find an LCDS of $\pi'$. Let its length be $r'$.
            Step 2.2.4. If $r' \le 2$ and $i < k$, then
                let $S_i$ be the CIS. Set currentlength := $\ell$ and r := $r'$. Go to Step 2.3.
            Endif
            Step 2.2.5. If $r' < r$, then
                let $S_i$ be the CIS. Set currentlength := $\ell$ and r := $r'$.
            Else
                Step 2.2.6. If $r' = r$ and $\ell$ > currentlength, then
                    let $S_i$ be the CIS. Set currentlength := $\ell$.
                Endif
            Endif
        Endfor /* The $j_2$ for-loop. */
    Endfor /* The $j_1$ for-loop. */
    Step 2.3. Let $\pi := \pi - S_i$ be the current permutation; n := n - currentlength.
Endfor /* The i for-loop. */
End of Algorithm.

... 7) with n = 3 and updated r = 2 for the second iteration (i = 2). In step 2.1, the current permutation $\pi = (4, 6, 7)$ is obviously one-layer routable because r = 2. The current permutation $\pi = (4, 6, 7)$, which is a CIS, is, therefore, selected as $S_2$. Then, the execution of the algorithm stops. The required solution is represented by ...

THE CORRECTNESS AND TIME COMPLEXITY

The correctness of the algorithm is proved in the following. A CIS of the current $\pi$ is generated at the end of each iteration. Since the remaining permutation $\pi'$ is obtained by removing the elements of the CIS from the current $\pi$ for each iteration, the LCDS length of the remaining permutation $\pi'$ cannot be greater than that of the current $\pi$. Otherwise, there would exist another LCDS of the original permutation, and its length is longer than the one just selected. This is impossible. Therefore, the algorithm must be stopped when k CIS's $S_1, S_2, \ldots, S_k$ are generated. When all nets are assigned to j (j < k) CIS's, then the subsets $S_{j+1}, \ldots, S_k$ become empty sets.

The time complexity of the algorithm is analyzed as follows. Step 1 takes $O(n^2)$ time, in the worst case, to generate an LCDS. Step 2 takes k iterations. From steps 2.1 to 2.3, a constant time is taken to assign currentlength := 0 at each iteration. Step 2.1 requires O(n) time. Step 2.2 is a nested loop that has substeps 2.2.1 to 2.2.6. These substeps will be executed at most $\binom{n}{2} = n(n - 1)/2 = O(n^2)$ times; the dominating step 2.2.3 computes the LCDS, which takes $O(n^2)$ time in the worst case.
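The selection loop of Steps 2.1 to 2.3 can be sketched in code as follows. This is only an illustrative sketch: the procedure GENCIS is not specified in the text above, so `gen_cis` below is a hypothetical stand-in (it starts from the given pair and greedily extends it with a longest increasing run of later, larger elements), and the results it produces need not match those reported for the authors' implementation. The guard `r <= 2` in Step 2.1 is also an assumption, added so that remainders of fewer than two nets are handled.

```python
from bisect import bisect_left


def lis(seq):
    """Longest increasing subsequence of distinct values (patience sorting)."""
    tail_idx, tail_val, prev = [], [], [None] * len(seq)
    for i, x in enumerate(seq):
        j = bisect_left(tail_val, x)
        prev[i] = tail_idx[j - 1] if j > 0 else None
        if j == len(tail_val):
            tail_idx.append(i)
            tail_val.append(x)
        else:
            tail_idx[j], tail_val[j] = i, x
    out, i = [], (tail_idx[-1] if tail_idx else None)
    while i is not None:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]


def lcds_length(pi):
    """Steps 1 and 2.2.3: LCDS length, via LIS on shifts of the reversed pi."""
    rev = list(pi)[::-1]
    return max((len(lis(rev[s:] + rev[:s])) for s in range(len(rev))), default=0)


def gen_cis(pi, j1, j2):
    """HYPOTHETICAL stand-in for GENCIS: a CIS that starts with the pair."""
    if pi[j1] > pi[j2]:
        j1, j2 = j2, j1                    # begin with the smaller element
    rot = pi[j1:] + pi[:j1]                # cyclic shift starting at pi[j1]
    pos2 = (j2 - j1) % len(pi)             # where pi[j2] sits in that shift
    tail = [x for x in rot[pos2 + 1:] if x > pi[j2]]
    return [pi[j1], pi[j2]] + lis(tail)


def heuristic_ctvm(pi, k):
    """Return k disjoint CIS's chosen by the rules of Steps 2.2.4 to 2.2.6."""
    pi, subsets = list(pi), []
    r = lcds_length(pi)                                    # Step 1
    for i in range(1, k + 1):
        if r <= 2:                                         # Step 2.1
            subsets.append(pi)
            pi = []
            break
        best, current_length, done = [], 0, False
        n = len(pi)
        for j1 in range(n - 1):                            # Step 2.2
            for j2 in range(j1 + 1, n):
                cis = gen_cis(pi, j1, j2)                  # 2.2.1
                taken = set(cis)
                r_new = lcds_length([x for x in pi if x not in taken])  # 2.2.2-3
                if r_new <= 2 and i < k:                   # 2.2.4
                    best, current_length, r, done = cis, len(cis), r_new, True
                    break
                if r_new < r:                              # 2.2.5
                    best, current_length, r = cis, len(cis), r_new
                elif r_new == r and len(cis) > current_length:  # 2.2.6
                    best, current_length = cis, len(cis)
            if done:
                break
        subsets.append(best)                               # Step 2.3
        taken = set(best)
        pi = [x for x in pi if x not in taken]
    return subsets + [[] for _ in range(k - len(subsets))]


print(heuristic_ctvm((1, 2, 3, 4, 5), 2))  # prints [[1, 2, 3, 4, 5], []]
```

Because `lcds_length` here is brute force over all shifts, this sketch is slower than the bound analyzed above; it is meant to make the control flow concrete, not to reproduce the running times.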

EXPERIMENTAL RESULTS
The heuristic algorithm was written in C and tested on a SUN workstation. To measure how far the solutions obtained by the algorithm are from optimal, and to measure the algorithm's speed, an exhaustive algorithm for finding optimal solutions was also written. The exhaustive algorithm first creates k directed acyclic graphs [10], then finds the corresponding k disjoint directed paths such that the total length of these k directed paths is maximized. The search method tries all possible directed paths; each directed path of the k directed acyclic graphs corresponds to a CIS. Using the optimal solutions obtained by the exhaustive algorithm as a base, the discrepancy of the solutions found by the proposed algorithm can be measured and is summarized as follows.
First, all permutations for n = 8, 10, and 12 are generated and tested with k = 2 and 3, respectively. To measure the discrepancy between the number of optimal solutions obtained from the heuristic algorithm and from the exhaustive algorithm, the optimal ratio of the heuristic algorithm is defined as the percentage given by the number of optimal solutions obtained by the heuristic algorithm from these n! permutations divided by the n! optimal solutions found by the exhaustive algorithm. Table I lists the results.
When n = 20 or 30, the execution time is too long to generate all optimal solutions for all possible permutations; hence, only 1000 permutations are randomly generated and tested with k = 2, 3, and 4. In this case, the optimal ratio is defined as the number of optimal solutions obtained by the heuristic algorithm from these 1000 permutations divided by the 1000 optimal solutions found by the exhaustive algorithm. As indicated in Table II, the ratio is not impressive in the cases of k = 2 and n = 20 or 30, or n = 30 and k = 2 or 3. These results indicate that the proposed algorithm cannot find the optimal solution for every permutation.
Two observations are made from Tables I and II. First, the optimal ratio decreases as n increases for fixed k. The reason is simple: the possibility that the nets can be grouped into one layer is lowered when the total number of nets is increased for a fixed k. Second, the optimal ratio of the algorithm increases as k increases for fixed n because the more layers are available, the more nets can be selected into these layers.
Table III shows the total execution time of the proposed algorithm and of the exhaustive algorithm for some test cases with k = 2. For n = 10 and 12, the data are calculated for all permutations. For n = 20 and 30, the data are based on the 1000 permutations generated randomly. When the execution time of the proposed algorithm and that of the exhaustive algorithm are compared, the execution time of the proposed algorithm can be almost neglected when n ≥ 12.

†According to the element's sequence in a permutation, the execution time for each permutation is widely discrepant. As an example, the optimal solution's execution time of the permutation (1, 2, 3, 4, 5) is much shorter than that of the permutation (5, 4, 3, 2, 1). Hence, to represent the relative speed between the heuristic algorithm and the exhaustive algorithm, the total execution time is more appropriate than the average time.

CONCLUSIONS
A heuristic algorithm with a complexity of $O(k n^4)$ is proposed for solving the split k-CTVM problem. According to the experimental results, this heuristic algorithm does provide an efficient solution to the problem. However, it may not always find an optimal solution. The reason is as follows. At each iteration, the algorithm only selects the first CIS for which the LCDS length of the remaining permutation is as short as possible. There may be another CIS which gives a better solution for the remaining permutation.
The circular permutation channel has two related problems. One is determining a minimum number of layers, say r, needed to provide a via-free routing solution (as described in Section 3) for the n nets on a circular permutation channel. This problem, by similar arguments as in Section 2, is equivalent to determining the minimum number r needed to partition the permutation $\pi$ into r disjoint CIS's $S_1, S_2, \ldots, S_r$ such that $\sum_{i=1}^{r} |S_i| = n$. It seems that the strategy of generating a CIS of $\pi$ while keeping the remaining permutation's LCDS length (obtained by excluding the CIS from $\pi$) as short as possible could be used to solve this problem.
The other problem is finding a single-layer routable subset of the n nets with maximum cardinality. This problem, according to Lemma 3, is equivalent to finding a CIS $S_1$ of $\pi$ such that $|S_1|$ is a maximum, i.e., an LCIS of $\pi$. The $O(n \log(t + 1) + nt)$ LCIS algorithm in the Appendix can be used to solve this problem, where t is the size of the LCIS. An $O(n \log n + nt)$ LCIS algorithm proposed by Lou and Sarrafzadeh [7] has recently been brought to our attention. Both algorithms improve the $O(n^2 \log n)$ result previously found in [10].
Note that the problem of finding a single-layer routable subset with maximum weight for a circular channel with n weighted multi-terminal nets has been studied by Liao et al. [5]. They proposed an $O(nt)$ algorithm for solving this problem under global routing information, where t is the total number of terminals. Whether this same problem could be solved without global routing information remains open.

APPENDIX THE LCIS PROBLEM
The LCIS problem is to find an LCIS of $\pi$. According to the definition of CIS, an LCIS is a longest increasing subsequence (LIS) of some cyclic permutation $\pi_j$, $0 \le j \le n - 1$. Note that the first element of $\pi_j$ is the last element of $\pi_{j+1}$, j = 0, 1, ..., n - 2. Using this fact and other properties stated below, an efficient algorithm is presented which can find such an LCIS of $\pi$.

A.1 The LCIS Algorithm

The algorithm consists of two phases. In the first phase, a set of the best partial solutions (increasing subsequences) for $\pi_0 = \pi$ is formed, from which an LIS of $\pi_0$ is obtained. Then, in the second phase, the set of best partial solutions for $\pi_{i-1}$ is transformed into the set of best partial solutions for $\pi_i$, from which an LIS of $\pi_i$ is obtained, i = 1, 2, ..., n - 1. Among the n LIS's, the one with maximal cardinality is selected to be the LCIS.
The details of these two phases are stated below.
The first phase consists of the following steps: (1) The same concept as in [13] is used to keep track of the best partial solutions for $\pi_0$. This step has n loops. Starting with an empty solution set $CS = \emptyset$, it adds the elements of $\pi_0 = (\pi(1), \pi(2), \ldots, \pi(n))$, one by one (from left to right), into the solution set CS. The last element of the best partial solution of length l is stored in CS(l). While the element $\pi(i)$, i = 1, 2, ..., n, is being processed, if there are previous partial solutions with last elements greater than $\pi(i)$, the smallest such last element (assume it is stored in CS(j)) is replaced by setting CS(j) := $\pi(i)$; otherwise, $\pi(i)$ is added to the current longest partial solution and forms a new, longer partial solution. The best partial solution is represented by its last element and by a link to a partial solution which has one element less than the one being processed currently. In addition to the links, the first elements of the best partial solutions are recorded.
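The bookkeeping of the first phase can be sketched as follows. This is a minimal illustration rather than the authors' code: the array `cs` plays the role of CS, and the dictionaries `prev_link` and `first_elem` are hypothetical names for the previous-element links and the recorded first elements. The position in CS is found with binary search, which is one common way to realize the replacement rule.

```python
from bisect import bisect_left


def first_phase(pi0):
    """Build CS for pi_0, with previous-element links and first elements."""
    cs = []            # cs[l - 1]: last element of the best partial solution of length l
    prev_link = {}     # prev_link[x]: element preceding x in its partial solution
    first_elem = {}    # first_elem[x]: first element of the partial solution ending at x
    for x in pi0:
        j = bisect_left(cs, x)           # cs[j] is the smallest last element > x
        if j > 0:
            prev_link[x] = cs[j - 1]
            first_elem[x] = first_elem[cs[j - 1]]
        else:
            prev_link[x] = None          # x starts a partial solution of length 1
            first_elem[x] = x
        if j == len(cs):
            cs.append(x)                 # extends the longest partial solution
        else:
            cs[j] = x                    # overwrites a larger last element
    # Trace the links back from the last element of the longest partial solution.
    lis, x = [], (cs[-1] if cs else None)
    while x is not None:
        lis.append(x)
        x = prev_link[x]
    return cs, lis[::-1], first_elem


cs, lis, first_elem = first_phase((3, 5, 8, 4, 6, 1, 7, 2))
print(cs)   # prints [1, 2, 6, 7]
print(lis)  # prints [3, 4, 6, 7]
```

Note that the final CS array is not itself an increasing subsequence of the input; the LIS (3, 4, 6, 7) is only recovered by following the links, which is why the links have to be stored.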
The second phase has (n - 1) loops. The element $\pi(i)$, i = 1, 2, ..., n - 1, is processed in the i-th loop. It can be seen that $\pi(i)$ is the last element of $\pi_i$ and the first element of $\pi_{i-1}$ because $\pi_i$ is obtained by circularly shifting $\pi_{i-1}$ by one element. Moreover, the element $\pi(i)$ either (I) is the first element of the longest partial solution or (II) is not in any partial solution of $\pi_{i-1}$. The purpose of each loop is to obtain the best partial solutions for the preceding (n - 1) elements of $\pi_i$ from the best partial solutions of $\pi_{i-1}$, and then to insert the element $\pi(i)$ into the right position to obtain the best partial solutions of $\pi_i$. Each loop consists of the following steps: (i) If case (I) holds, then the length of the partial solutions of $\pi_{i-1}$ with $\pi(i)$ as the first element is shortened by one unit. This procedure is used to retain the best partial solutions for the preceding (n - 1) elements of $\pi_i$. The first element's corresponding information for the shortened partial solutions is also modified. Otherwise, for case (II), the best partial solutions for $\pi_{i-1}$ are already the best partial solutions for the preceding (n - 1) elements of $\pi_i$. (ii) The element $\pi(i)$ is then inserted into the best position using the same concept as in [13]. The corresponding previous element and the first element's information are added. After inserting $\pi(i)$, a new set of the best partial solutions for $\pi_i$ is formed in which the longest partial solution is the LIS of $\pi_i$. (iii) If the LIS is longer than the one found in the previous LCIS solution, then the LIS is obtained by back-tracking the corresponding links and it is saved as the current LCIS solution. The next element $\pi(i + 1)$ is picked, and the execution goes to the (i + 1)-th loop.
Using the permutation $\pi = (3, 5, 8, 4, 6, 1, 7, 2)$ as an example, Figure 2 shows a sample execution of this step. When the element $\pi(1) = 3$ is processed, this element is added into CS by setting CS(1) = 3, and a partial solution (3) of length 1 is formed. The first element of this partial solution is the element 3 itself. The notation F(CS(j)) denotes the first element of the partial solution of length j with CS(j) as the last element. Thus, for this case, F(CS(1)) = F(3) = 3, as shown in Figure 2(a), where the number in the brace is the information for the first element. When the element $\pi(2) = 5$ is processed, it is added into CS by setting CS(2) = 5, and a partial solution (3, 5) of length 2 is formed. The previous element of CS(2) = 5 is the element 3. The notation P(CS(j)) denotes the link to the previous element of CS(j) for the partial solution of length j with CS(j) as the last element. This notation is used to trace back the partial solution of length j since the CS set may be overwritten by subsequent elements. For this case, P(CS(2)) = P(5) = 3 and F(CS(2)) = F(5) = 3, as shown in Figure 2(b), where the previous element is represented by a directed edge pointing to that element. When the element $\pi(3) = 8$ is processed, the partial solution (3, 5, 8) of length 3 is formed with CS(3) = 8, P(CS(3)) = 5, and F(CS(3)) = 3, as shown in Figure 2(c). When the element $\pi(4) = 4$ is processed, the partial solution (3, 4) of length 2 is found to be better than the previous partial solution (3, 5) because the partial solution (3, 4) has a better chance to be lengthened in the future. Thus, CS(2) = 5 is overwritten by setting CS(2) = 4, and a new partial solution (3, 4) of length 2 is formed with P(CS(2)) = 3 and F(CS(2)) = 3, as shown in Figure 2(d), where the overwritten element 5 is marked by an underline. When the element $\pi(5) = 6$ is processed, the partial solution (3, 4, 6) is better than the previous partial solution (3, 5, 8), since 6 is less than 8; choosing 6 as the member therefore gives the partial solution (3, 4, 6) a better chance to be lengthened. CS(3) = 6, P(CS(3)) = 4, and F(CS(3)) = 3 are shown in Figure 2(e). When the element $\pi(6)$ is processed, the ...

FIGURE 2 A sample execution sequence of the first phase of the LCIS algorithm in which the best partial solution for $\pi_0 = \pi = (3, 5, 8, 4, 6, 1, 7, 2)$ is obtained (final panel: LIS = (3, 4, 6, 7), currentlength = 4).

FIGURE 3 A sample execution sequence of the second phase of the LCIS algorithm in which the best partial solution for $\pi_i$ is obtained as the element $\pi(i)$ is being processed.

FIGURE 1 (a) An instance of the split k-CTVM problem with k = 2. (b) An optimal solution without vias for the instance. (c) A possible solution with one via for the instance.

TABLE I The Optimal Ratio of the Proposed Algorithm for all Permutations of

TABLE II The Optimal Ratio of the Proposed Algorithm for 1000 Random

TABLE III The Total Execution Time Between the Proposed Algorithm and the Exhaustive Algorithm for k = 2