Symmetric and Programmable Multi-Chip Module Low-Power Prototyping System

The advantages of a Multi-Chip Module (MCM) product are its low power and small size. However, the design of an MCM system usually requires weeks of engineering effort, so a generic MCM substrate with programmable interconnections is needed to accelerate system prototyping. In this paper, we propose a Symmetric and Programmable MCM (SPMCM) substrate for this purpose. The SPMCM substrate consists of a symmetrical array of slots for bare-chip attachment and Field Programmable Interconnect Chips (FPICs) for substrate routing. Experimental results demonstrate that our FPIC polygonal routing module uses 12% fewer switches than the conventional routing module for interconnecting bare-chip slots with 84 pads. Experiments are also conducted to determine proper parameters for the VLSI implementation of our FPIC.


INTRODUCTION
Portable systems and add-on cards impose stringent constraints on power and size. A Multi-Chip Module (MCM) is a device in which several bare chips are attached to a single substrate and then packaged as a small-size and low-power system. Furthermore, MCM packaging technology used in electronic systems translates semiconductor speed into system performance [1-3], but low-power and high-density MCMs are expensive to fabricate and usually require weeks of engineering effort for system prototyping and product verification. The engineering delay in designing and fabricating such MCMs becomes unacceptable in today's competitive market. The need for quick turnaround time, high product yield, and low cost has led to the development of another approach, called Symmetric and Programmable Multi-Chip Module (SPMCM) [4-7]. This SPMCM technology provides designers with a pre-characterized MCM substrate and programmable interconnections so that they can produce a fast prototype or a final consumer product in a short time. The advantages of SPMCM are that the field programmable technology can reduce the product development cycle and NRE (Non-Recurring Engineering) cost, while MCM technology can achieve low power and small size.
Several systems have been proposed for low-power and small-size prototyping system design on MCM [8][9][10][11][12][13]; most of them interconnect Field Programmable Gate Arrays (FPGAs) with Field Programmable Interconnect Chips (FPICs) [14][15][16][17] on an MCM substrate. For instance, the BORG [8, 9] system is a reconfigurable prototyping board for FPGAs based on the Clos network. Galloway et al. [10] proposed a reconfigurable system, called Transmogrifier-2, which is a hierarchical design based on the I-CUBE [15] routing chip. The Field Programmable Multi-Chip Module (FPMCM) [11] system is a reconfigurable system combining state-of-the-art FPGA and MCM technologies. Thomae and Bout [12] devised a multi-FPGA board for rapid prototyping, in which a ring architecture is used to interconnect the FPGAs. A board for logic emulation has been developed by Babb et al. at MIT [13], which uses the virtual-wires technique to overcome pin count limitations. From the above reconfigurable systems [8][9][10][11][12][13], we observe that most of the effort has been spent on designing a flexible interconnection architecture to mitigate the pin limitation.
To address this problem, we propose an SPMCM structure, which consists mainly of a symmetrical array of bare-chip slots surrounded by FPICs for slot interconnections. The bare-chip slots allow bare chips (BCs) from different manufacturing processes to be attached to the MCM substrate; therefore, our architecture is more flexible and can be used to realize a low-power and small-size prototyping system containing bare chips of various technologies. Our proposed FPIC architecture uses polygonal routing modules and the virtual-wires [13] technique to reduce the number of programmable switches and the pin count required. Since this architecture has a lower hardware cost and a regular structure, it is suitable for VLSI implementation. Moreover, cascading the architecture can scale up the routing resources. This paper focuses on the design of efficient FPICs and the structure of a flexible bare-chip slot in an SPMCM, which can be used to support a low-power and small-size prototyping system.
The remainder of this paper is organized as follows. First, we present the SPMCM and the bare-chip slot structure in Section 2. Section 3 briefly reviews the conventional routing module and then presents our proposed polygonal routing module architecture and some experimental results. Section 4 describes our FPIC VLSI implementation, its polygonal routing modules, and the virtual-wires technique. Conclusions are given in Section 5.

SPMCM ARCHITECTURE
Our SPMCM is a programmable MCM substrate [5][6][7] that consists of an array of bare-chip slots and interconnection FPICs [14][15][16][17] on an MCM substrate, as shown in Figure 1. The MCM substrate and FPICs are pre-fabricated in large volume and well tested. On the MCM substrate, some of the pads are designed for the FPICs; the others are for the commercial or customized bare chips attached to the bare-chip slots. The FPICs are attached to the MCM substrate using flip-chip bonding technology, while the bare chips are attached using wire bonding technology. Thus, the bare chips attached to the SPMCM can be manufactured with different processes. On the substrate, each bare-chip pad is connected via a substrate metal wire to one of the FPIC pads, and net routing is accomplished by programming the FPICs.
The purpose of these flexible bare-chip slots is to allow bare chips to be attached to an SPMCM in different combinations. Figures 2(a) and 2(b) show the bare-chip slot structure and its substrate metal wiring. Each bare-chip pad (illustrated by a black pad) is connected via a substrate metal wire to one of the FPIC pads. We can have four small bare chips attached to the four small bare-chip slots, as shown in Figure 2(a), or a large bare chip occupying all four slots, as shown in Figure 2(b).
When applying this SPMCM design methodology for prototyping, designers do not need to consider the MCM substrate design, because the SPMCM provides a reprogrammable interconnection MCM substrate that permits quick reconfiguration of the prototyping system. With this SPMCM, we can flexibly attach the I/O pads on the bare chips to the bare-chip slots on the MCM substrate. For low- to medium-volume MCM designs, this SPMCM design methodology can lower design costs and shorten design cycles. Moreover, design engineers do not need advanced MCM design skills in order to design an MCM system.
While SPMCM technology significantly reduces the engineering delay and cost of MCM development, it also degrades system performance in comparison with fully customized MCMs, because programmable switches usually have high resistance and capacitance and occupy a large area. The number of programmable switches in an FPIC affects its speed, die size, and routability. Intuitively, increasing the number of programmable switches in an FPIC improves routability. However, an FPIC with fewer programmable switches reduces the impedance of interconnect paths, and the overall speed of the SPMCM can thus be improved.
Our proposed SPMCM is similar to the symmetric FPGA (Xilinx XC4000-type) [18][19]. A brief review of the symmetric FPGA architecture is given as follows. A typical symmetric FPGA consists of an array of logic modules that can be interconnected using routing resources, as shown in Figure 3(a). The routing resources comprise metal wires and routing modules. Thus, an arbitrary digital circuit can be implemented by appropriately configuring these routing modules and logic modules. A routing module (RM) consists of two connection modules (CMs) connected to a switch module (SM), and each of these modules itself contains many programmable switches, as shown in Figure 3(a). The routing module is the section of the routing resources that is replicated across the entire symmetric FPGA. The logic module (L) contains configurable digital circuits to implement logic functions. The input and output pins of a logic module are connected to its surrounding connection modules, which in turn are connected to the switch modules.
Starting from the symmetric FPGA model shown in Figure 3(a), if we replace the logic modules with bare-chip slots and each of the routing modules with an FPIC, we obtain an SPMCM system, as shown in Figure 1. In this manner, the routing algorithm and architecture of the SPMCM are similar to those of the symmetric FPGA [19]. Therefore, we use the terms "routing module" and "FPIC" interchangeably in this paper. In the following sections, we first show that the conventional routing module presents an obstacle to the implementation of an FPIC in terms of the number of programmable switches. We then propose a polygonal routing module architecture to minimize the number of programmable switches.

Conventional Routing Module
In a conventional symmetric FPGA, the switch module is a 4-sided block, denoted as SM(4, m), where m is the number of terminals on each side of the switch module. For example, a Xilinx XC4000-type SM(4, m) can be partitioned into m independent submodules SM(4, 1), as shown in Figure 3(b).
Let the flexibility of a switch module be $F_s$ [18], defined as the number of terminals on the other three sides to which each terminal can be connected through programmable switches. For a conventional switch module with $F_s = 3$, each submodule SM(4, 1) needs one switch for each of the 6 pairs among its 4 terminals, so SM(4, m) contains 6m programmable switches.
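The 6m switch count can be checked by enumerating the terminal pairs of one submodule. The following is a minimal sketch, assuming one bidirectional switch per pair of sides in each SM(4, 1); the function names `switch_pairs` and `sm_switch_count` are illustrative and do not come from the paper.

```python
from itertools import combinations


def switch_pairs(num_sides: int) -> list[tuple[int, int]]:
    """Side pairs joined by one switch in a submodule SM(num_sides, 1).

    Assumption: one bidirectional switch per pair of sides, so every
    terminal can reach all num_sides - 1 other sides (F_s = 3 for 4 sides).
    """
    return list(combinations(range(num_sides), 2))


def sm_switch_count(num_sides: int, m: int) -> int:
    """Total switches in SM(num_sides, m), built from m independent submodules."""
    return m * len(switch_pairs(num_sides))


if __name__ == "__main__":
    print(switch_pairs(4))          # the side pairs of one SM(4, 1)
    print(sm_switch_count(4, 6))    # switch count for SM(4, 6)
```

Each side index appears in exactly three of the enumerated pairs, which matches $F_s = 3$.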
A connection module, denoted as CM(m, n), is an $m \times n$ rectangular block, where m is the number of tracks connected to the switch modules, and n is the number of tracks connected to the bare-chip slots (logic modules), as shown in Figure 3(c).
Therefore, each bare-chip slot can have at most 2n pads. A conventional routing module consisting of two connection modules CM(m, n) and a switch module SM(4, m) is denoted as RM(4, m, n), as shown in Figure 3(a). The flexibility of a connection module [18], $F_c$, is defined as the number of tracks to which each pad in a bare-chip slot (logic module) can be connected; for the example in Figure 3(c), $F_c = 6$. Consequently, a connection module contains $F_c n$ programmable switches. In a connection module CM(m, n), the ratio of $F_c$ to m is called the flexibility ratio, i.e., $r_{F_c} = F_c / m$. This $r_{F_c}$ is the probability that a wire arriving at a particular track in the connection module is able to connect to the required pin of a bare-chip slot (logic module) [18]; thus $0 \le r_{F_c} \le 1$. Rose and Brown [18] suggested that $F_s = 3$ and a high value of $F_c$, i.e., $r_{F_c}$ close to 1, are sufficient to provide high routability in a symmetric FPGA. For example, the Xilinx XC4000 family FPGAs use $F_s = 3$ and $r_{F_c} = 1$.

Number of Switches in a Conventional Routing Module
Let the number of programmable switches in an RM(4, m, n) be denoted as PS(4, m, n), which is the total number of programmable switches in two CM(m, n) and one SM(4, m). This is given by:

$$PS(4, m, n) = 2 F_c n + 6m = 2 r_{F_c} m n + 6m = 2m(r_{F_c} n + 3) \tag{1}$$

For the Xilinx XC4000 family FPGAs with $F_s = 3$ and $r_{F_c} = 1$, we have:

$$PS(4, m, n) = 2m(n + 3) \tag{2}$$

In terms of the number of switches, we will show in our experiments that the conventional routing module RM(4, m, n) is unsuited for interconnecting bare-chip slots with a high pin count, because the PS(4, m, n) values obtained are very large. Furthermore, bare-chip slots with a large number of pins (2n) are very common in an SPMCM system. This presents an obstacle to the VLSI implementation of an RM(4, m, n) FPIC.
Therefore, in order to improve the switch efficiency, we propose a polygonal routing module that consists of many small connection modules connected to a polygonal switch module for interconnecting bare-chip slots with a high pin count.
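To make the growth of Eq. (1) with the pad count concrete, the following minimal sketch evaluates it directly. The function name `ps_conventional` is illustrative, and the pad counts in the loop are example inputs only (84 pads is the slot size used in this paper; the others are arbitrary).

```python
def ps_conventional(m: int, n: int, r_fc: float = 1.0) -> float:
    """Eq. (1): switches in RM(4, m, n), i.e. two CM(m, n) plus one SM(4, m)."""
    cm_switches = r_fc * m * n   # F_c * n per connection module, with F_c = r_fc * m
    sm_switches = 6 * m          # switch module with F_s = 3
    return 2 * cm_switches + sm_switches


if __name__ == "__main__":
    # RM(4, 6, 4) of Figure 3(a) with r_Fc = 1.
    print(ps_conventional(6, 4))
    # Switches per routing track (m = 1) as the slot pad count 2n grows.
    for pads in (20, 40, 84):
        n = pads // 2
        print(pads, ps_conventional(1, n))
```

With $r_{F_c} = 1$, the per-track cost is 2(n + 3), so it grows linearly with the number of pads on a slot.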

Polygonal Routing Module
Based on the conventional routing module RM(4, m, n) shown in Figure 3(a), we can divide each connection module CM(m, n) into s smaller connection modules CM(m', n') such that each CM(m', n') is connected to one of the 4s sides of the polygonal switch module, as shown in Figure 4(a), where $m = s m'$, $n = s n'$, $F_c = s F_c'$, and $F_c$ and $F_c'$ are the flexibilities of CM(m, n) and CM(m', n'), respectively. The polygonal switch module is a 4s-sided block, denoted as SM(4s, m'), where m' is the number of terminals on each side of the polygonal switch module, as shown in Figure 4(b). Furthermore, a terminal on one side can be connected to a terminal on any of the other (4s - 1) sides of the SM(4s, m') through programmable switches; thus the flexibility $F_s'$ of an SM(4s, m') is equal to (4s - 1). A polygonal switch module SM(4s, m') can be partitioned into m' independent submodules SM(4s, 1). Compared with the conventional routing module, a polygonal routing module RM(4s, m', n') comprises 2s smaller connection modules CM(m', n') interconnected by a 4s-sided switch module SM(4s, m'), as shown in Figure 4(a). That is to say, the conventional routing module RM(4, m, n) is a special case of our polygonal routing module RM(4s, m', n') with s = 1. For example, Figure 4(a) represents a polygonal routing module RM(8, 3, 2) with s = 2, and Figure 3(a) represents a conventional routing module RM(4, 6, 4) with s = 1. In Figure 4, $F_c' = 3$, $F_s' = 7$, $m' = 3$, $n' = 2$, $r_{F_c'} = r_{F_c}$, and $s = 2$.
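A minimal sketch of how the switch count might generalize to RM(4s, m', n') follows. It assumes that each of the 2s connection modules CM(m', n') contains $F_c' n'$ switches and that each submodule SM(4s, 1) has one switch per pair of sides, which gives $2 s m' (r_{F_c'} n' + 4s - 1)$. This is an illustrative reconstruction under those assumptions, not necessarily the exact expression used in the experiments; the function name `ps_polygonal` is hypothetical.

```python
def ps_polygonal(s: int, m_prime: int, n_prime: int, r_fc: float = 1.0) -> float:
    """Assumed switch count of a polygonal routing module RM(4s, m', n')."""
    sides = 4 * s
    # 2s connection modules, each with F_c' * n' switches (F_c' = r_fc * m').
    cm_switches = 2 * s * (r_fc * m_prime) * n_prime
    # m' submodules SM(4s, 1), one switch per pair of sides (assumption).
    sm_switches = m_prime * sides * (sides - 1) // 2
    return cm_switches + sm_switches


if __name__ == "__main__":
    print(ps_polygonal(1, 6, 4))   # s = 1: the conventional RM(4, 6, 4)
    print(ps_polygonal(2, 3, 2))   # RM(8, 3, 2) of Figure 4(a)
```

For s = 1 the expression reduces to $2m(r_{F_c} n + 3)$, which agrees with Eq. (1).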

Experimental Results
In Figures 1 and 3(a), each of our bare-chip slots has 84 (2n) pads to connect to the FPICs in the SPMCM system. To explore the effects of the s, m' and n' values of a polygonal routing module on the switch efficiency of an SPMCM, we implemented a maze router in the C language on a SUN Ultra-1 workstation. We examine how the three parameters s, m' and n' relate to the number of switches needed for the CGE [18] and SEGA [20] benchmark circuits. Note that no industrial benchmarks for SPMCM are available. To model the bare chips in the SPMCM system, the 4-input look-up tables (4-LUTs) in these circuits are grouped N at a time into larger logic modules, giving thirteen module sizes with N = 4, 5, ..., 16. A larger logic module (bare chip) has 8N pins, and each pin in a large logic module can be connected to any of the m tracks ($r_{F_c} = 1$) in a connection module. Because net ordering often affects the performance of a maze router, we route the benchmark circuits using the following three net-ordering schemes to avoid possible biases: (1) original net order in the benchmark circuits, (2) longest net first, and (3) shortest net first.
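The three net-ordering schemes can be sketched as follows. This is an illustration only: the paper's router is written in C and its net-length measure is not stated, so the half-perimeter bounding-box estimate used here, the `Net` representation, and the function names are all assumptions.

```python
Net = list[tuple[int, int]]  # pin coordinates (x, y); hypothetical representation


def net_length(net: Net) -> int:
    """Half-perimeter of the pins' bounding box (assumed length estimate)."""
    xs = [x for x, _ in net]
    ys = [y for _, y in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def order_nets(nets: list[Net], scheme: str) -> list[Net]:
    """Return the nets in the order in which the router would process them."""
    if scheme == "original":
        return list(nets)
    if scheme == "longest_first":
        return sorted(nets, key=net_length, reverse=True)
    if scheme == "shortest_first":
        return sorted(nets, key=net_length)
    raise ValueError(f"unknown net-ordering scheme: {scheme}")


if __name__ == "__main__":
    a = [(0, 0), (3, 4)]
    b = [(1, 1), (2, 1)]
    c = [(0, 0), (5, 0), (2, 3)]
    for scheme in ("original", "longest_first", "shortest_first"):
        print(scheme, [net_length(net) for net in order_nets([a, b, c], scheme)])
```

Running the same circuit under all three orders, as the experiments do, reduces the chance that a result is an artifact of one particular order.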
We evaluated the switch efficiency of our polygonal routing module by performing detailed routing on these large logic modules of different pin counts. For the original net order in the benchmark circuits, Tables I and II show the results. From the routing results of a 96-pin (N = 12) logic module listed in Table I, we first determined the minimum number of tracks m' required for 100% routing completion for each circuit, in each of the four cases of polygonal routing modules RM(4s, m', n') with s = 1, 2, 3 and 4, respectively. We then obtain the PS(4s, m', n') value from Eq. (4). Table II shows how the total number of programmable switches PS(4s, m', n') for the 14 benchmarks varies with s for larger logic modules (modeling bare chips) with 8N pins, where s = 1, 2, 3 and 4, and N = 4, 5, ..., 16. For the longest-net-first and shortest-net-first orders, Tables III and IV show the results, respectively.
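The search for the minimum number of tracks can be sketched as a simple loop around the router. Here `routes_completely` is a hypothetical callback standing in for one run of the maze router at a given m'; the stub used in the usage example is a placeholder and does not represent any measured result.

```python
from typing import Callable, Optional


def min_tracks(routes_completely: Callable[[int], bool],
               max_tracks: int) -> Optional[int]:
    """Smallest m' in 1..max_tracks with 100% routing completion, else None."""
    for m_prime in range(1, max_tracks + 1):
        if routes_completely(m_prime):
            return m_prime
    return None


if __name__ == "__main__":
    # Placeholder router: pretends that routing completes once m' >= 5.
    stub_router = lambda m_prime: m_prime >= 5
    print(min_tracks(stub_router, max_tracks=20))
```

A linear scan is used because routability is not guaranteed to be monotonic in m' for a heuristic router; if it were, a binary search would need fewer router runs.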
Experimental results demonstrate that the conventional routing module RM(4, m, n) works well only for bare chips with fewer than 40 pins. Because each of our bare-chip slots has 84 (2n) pads, the RM(8, m', 21) FPIC is a good choice for interconnecting bare chips in the SPMCM. From Tables II, III and IV, the RM(8, m', 21) FPIC achieves an average reduction of 12% in the number of switches compared with the conventional (4-sided) routing module. Thus, our polygonal routing modules can be used to improve the switch efficiency of an SPMCM system. Although the polygonal routing module needs more tracks, it needs fewer switches. State-of-the-art VLSI technologies providing multiple metal layers can accommodate the larger number of metal tracks. The experimental results, together with these VLSI technologies, indicate that implementing our proposed polygonal routing module RM(8, m', 21) in an FPIC makes an SPMCM system more practical.
Scalability is very important to the architecture design of an FPIC and to its VLSI implementation. We will show that our polygonal routing modules are scalable. An FPIC using RM(8, 1, 21)'s as its routing modules is shown in Figure 5(a). Each RM(8, 1, 21) uses one switch module SM(8, 1). Cascading m' RM(8, 1, 21)'s can be used to interconnect bare chips, as shown in Figure 5(a). In this manner, the m' cascaded RM(8, 1, 21)'s are equivalent to an RM(8, m', 21) routing module. Thus, the number of routing tracks is increased by a factor of m'. That is to say, the routing resources are increased by a factor of m' in a cascaded SPMCM. Our FPIC uses an RM(8, 6, 21) with 132 pins, where m' = 6, and is packaged as a 160-pin CQFP. An RM(8, 6, 21) polygonal routing module is shown in Figure 6.
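One way to account for the 132 pins of an RM(8, 6, 21) is sketched below. The breakdown is an assumption rather than something stated in the text: it counts n' slot pads for each of the 2s connection modules plus m' track terminals on each of the 4s sides of the switch module. The function name `rm_pin_count` is illustrative.

```python
def rm_pin_count(s: int, m_prime: int, n_prime: int) -> int:
    """Assumed pin count of a routing module RM(4s, m', n')."""
    slot_pads = 2 * s * n_prime      # 2s connection modules, n' pads each
    track_pins = 4 * s * m_prime     # assumed: m' terminals on each of the 4s sides
    return slot_pads + track_pins


if __name__ == "__main__":
    # Cascading m' copies of RM(8, 1, 21) adds track pins but no slot pads.
    for m_prime in (1, 2, 6):
        print(m_prime, rm_pin_count(2, m_prime, 21))
```

Under this accounting, each cascaded RM(8, 1, 21) contributes 8 track pins while the 84 slot pads are shared, and m' = 6 reproduces the 132 pins quoted above.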

Virtual Wires
Virtual-wires [13] technology is used in our FPIC to improve the routing resources and to overcome the pin count limitation by multiplexing each physical wire among multiple logical wires. Figure 7 shows an example of four logical wires allocated to four physical wires in an RM(8, 6, 21). Figure 8 shows the same example with the four logical wires multiplexed onto a single physical wire. The signals on the physical wire have to be multiplexed and demultiplexed between the bare chip and the FPIC. The FPIC VLSI implementation combining four RM(8, 6, 21)'s and multiplexing is shown in Figure 8.

Implementation
The chip layout of our FPIC is shown in Figure 9. It will be fabricated in a 0.6 μm Single-Poly-Triple-Metal (SPTM) CMOS technology through the Chip Implementation Center (CIC), National Science Council, R.O.C. Its performance data are summarized in Table V. As mentioned above, this chip architecture is highly scalable, uses fewer programmable switches, and has a lower pin count. Thus it is very suitable for VLSI implementation.

CONCLUSIONS
For low-power and small-size prototyping system design, an area-efficient and flexible Symmetric and Programmable MCM (SPMCM) has been described, and a symmetric-array FPIC VLSI architecture has been proposed for substrate routing between the bare-chip slots. This FPIC architecture consists of four polygonal routing modules and multiplexing structures, which can significantly reduce the number of programmable switches and the pin count compared with a conventional routing module. We have implemented this VLSI architecture in a 0.6 μm CMOS technology to verify the function of our proposed SPMCM. In addition, this VLSI routing chip architecture can be easily scaled up to increase the routing resources. With its field programmable MCM substrate, our SPMCM can be used for implementing various prototyping designs based on users' requirements without going through the foundry facility. The advantages are that the field programmable technology can reduce the product development cycle and NRE (Non-Recurring Engineering) cost, while MCM technology can achieve low power and small size.

Acknowledgement
This work was partly supported by the National Science Council, R.O.C., under Grant NSC-88-2215-E002-037.

FIGURE 1 SPMCM architecture.
FIGURE 2 Bare-chip slots structure; (a) Attached with several small bare chips, and (b) Attached with a large bare chip.
FIGURE 8 FPIC