Embedding N × N Crossbar Switches into ILLIAC(N, N − 1) Torus Networks for Optical Interconnection Networks

A focus of research on crossbar switches has been to reduce the number of 2 × 2 switch elements, each of which not only occupies a relatively large area but is also costly when implemented with optical devices. We point out that conventional N × N crossbar switches, where N is the switch size, actually function as (N + 1) × (N + 1) switches and thus contain redundant 2 × 2 switch elements that can be removed. We show that ILLIAC(N, N − 1) torus networks, which have fewer 2 × 2 switch elements than crossbar switches of the same switch size, are equivalent to true N × N crossbar switches. We describe in detail how the crossbar switches can be embedded into the equivalent torus networks. We also discuss the switch control complexity of the equivalent torus networks and show that it can be reduced to O(1).


Introduction
The crossbar switch (XBS) is the practical choice for implementing switching fabrics in optical networks [1]. There are two typical configurations of conventional optical XBSs [2]. The first type, which uses 1 × 1 transfer switches and constitutes a full-mesh network, is strict-sense nonblocking but suffers from large power loss, crosstalk, and wiring area, and is therefore outside the scope of this paper. We consider the second type of optical XBS, which is composed of 2 × 2 switching elements, each of which has two connection states (i.e., bar and cross), and is wide-sense nonblocking. For simplicity, we refer to the 2 × 2 optical switching element, typically implemented with an optical directional coupler, as a cell. An XBS has up to N² cells, where N is the switch size.

A focus of research has been to reduce the number of cells (or simply crosspoints) required for optical XBSs, because each individual optical cell occupies a relatively large space, consumes power, and is costly [3]. In 1968, Kautz et al. devised several switches with fewer than N² cells, all of which were derived from XBSs, for example, triangular switches with N(N − 1)/2 crosspoints [4]; unlike XBSs, however, these are only rearrangeably nonblocking. In 1988, Smyth gave a pruned XBS with N² − 3 cells that preserves the wide-sense nonblocking capability [5]. In 2008, Obara suggested in a letter a new pruned XBS with N(N − 1) cells embedded into ILLIAC(N, N − 1) torus networks [6]. That discussion, however, left open the question of how conventional XBSs can be transformed into the ILLIAC torus networks.

In this paper, we begin with some definitions and notations for XBSs that are needed for the later discussion. We then describe the transformation process and discuss the switch control complexity of the equivalent torus networks.
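As a quick numerical illustration of the cell counts quoted above, the following minimal sketch evaluates each formula for a given switch size. The function name and dictionary keys are hypothetical labels chosen for this example; only the formulas themselves come from the text.

```python
def cell_counts(n):
    """Cell counts quoted in the text for switch size n (labels are illustrative)."""
    return {
        "conventional": n * n,            # full N x N crossbar
        "triangular": n * (n - 1) // 2,   # Kautz et al. [4], rearrangeably nonblocking
        "smyth": n * n - 3,               # pruned XBS [5]
        "torus_embedded": n * (n - 1),    # pruned XBS of [6]
    }


for name, count in cell_counts(4).items():
    print(f"{name}: {count}")
```

For N = 4 the torus-embedded design saves four cells relative to the conventional crossbar, that is, one cell per port.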

Definitions and Notations for XBSs
We consider a conventional N × N XBS composed of N² cells, as shown in Figure 1, where i and j represent its input and output port numbers, 0 ≤ i, j ≤ N − 1. (i, j) denotes the crosspoint of the ith row and the jth column, and c(i, j) denotes the cell on that crosspoint. The cell has two connection modes, that is, cross and bar, as shown in Figures 2(a) and 2(b). All the cells are initially set to the cross-state (i.e., the default state). When the ith input is to be connected to the jth output, only the single cell at (i, j) is set to the bar state. Upon releasing the connection, the same c(i, j) is set back to the cross-state. Thus, the switch control for XBSs is fairly simple, with a complexity of O(1).

In the default state an input signal exits from the rightmost port, and an output port corresponds to the vertical input of the cells in the top row; that is, conventional N × N XBSs actually function as (N + 1) × (N + 1) SWs. Here we can see the redundancy inherent in conventional XBSs. Note that an N_1 × N_2 switch (SW) means a switch matrix with N_1 input ports and N_2 output ports.

The cell in Figure 1 can be logically decomposed into a pair of 1 × 2 and 2 × 1 SWs, as shown in Figure 3. It is easily seen that the 1 × 2 SWs associated with an input port i are aligned in a row and the 2 × 1 SWs associated with an output port j are aligned in a column. From a functional point of view, the 1 × 2 SWs distribute an input signal to every output, while the 2 × 1 SWs concentrate a signal from any input onto a designated output. We see that a linear network is used for both distribution and concentration in Figure 2, where we also see that the rows and the columns are rotatable. The positions of input and output ports need not be fixed on the far left column and the bottom row as shown in Figure 1. Instead, they are independently movable in the horizontal and vertical directions to arbitrary positions, as shown in Figure 4, although the rotation of the port positions requires wraparound links in both directions. Note that there is no vertical wraparound link on the far left column because there is no vertical port shift. There is no horizontal wraparound link on the top row for a similar reason.

We define the lengths of the port excursions in the horizontal and vertical directions, h_i and v_j, as the number of cells that the input and output ports move over. In Figure 4, for example, h_0 and v_3 are three and one, respectively. Note that the horizontal and vertical directions of movement are fixed to rightward and downward, respectively.
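The single-cell control rule and the extra exit at the right end of each row can be illustrated with a toy model. This is a minimal sketch, not the authors' implementation: the class and method names are hypothetical, and `trace` assumes a valid configuration with at most one bar-state cell per row and per column, so that a signal diverted into column j passes through the rest of that column undisturbed.

```python
CROSS, BAR = "cross", "bar"


class Crossbar:
    """Toy model of a conventional N x N crossbar built from 2 x 2 cells."""

    def __init__(self, n):
        self.n = n
        # Default state: every cell is in the cross-state.
        self.state = [[CROSS] * n for _ in range(n)]

    def connect(self, i, j):
        # Exactly one cell changes per call setup, independent of N.
        self.state[i][j] = BAR

    def release(self, i, j):
        # The same cell is set back to the default state.
        self.state[i][j] = CROSS

    def trace(self, i):
        """Return the output reached from input i.

        The value n stands for the exit at the right end of row i,
        i.e. the extra (N + 1)th output of the conventional crossbar.
        """
        for j in range(self.n):
            if self.state[i][j] == BAR:
                return j  # diverted into column j
        return self.n  # passed straight through the whole row


xb = Crossbar(4)
print(xb.trace(2))   # no bar cell in row 2: prints 4 (rightmost exit)
xb.connect(2, 1)
print(xb.trace(2))   # prints 1
```

The fact that `trace` can return N as well as 0 to N − 1 is the redundancy noted above: the array offers one more exit per row than an N × N switch needs.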

Procedure of Transformation from XBS to Torus Network
It is readily seen in Figure 1 that one of the input or output ports of each cell in the top row and the rightmost column is left unused. These unused ports are reserved for scaling the switch size: if we combine an XBS with three other XBSs of the same size, we obtain an XBS of double the size. However, if the switch size is fixed, all the idle ports are of no use, and the idle 1 × 2 and 2 × 1 SWs in the cells can be eliminated; thus we have a new N × N XBS with fewer than N² cells. The following discussion holds for any N, but we depict all figures for the case of N = 4 for simplicity and clarity.

We begin with the XBS shown in Figure 4, which is deformed by the vertical and horizontal rotations, where h_i and v_j are given by Equation (1). Equation (1) means that both input and output ports are aligned diagonally down to the right. Input ports are on the diagonal cells, while output ports are set on the cells next to the diagonal cells. Output port 1 (i.e., j = 1 in (1)), for example, appears to be shifted upward by N − 1, but it is really shifted downward by 1 in accordance with the definitions given in Section 2. Note that c(0, 3) and c(3, 3) in Figure 4 lie next to each other through a vertical wraparound link, although they seem to be far apart.

In Figure 4, we see that the plain squares at c(i, N − 1 − i) for i = 0 to N − 1, each of which has an input port i, function as 1 × 2 SWs. We also see that the shaded squares at c(N − 2 − j, j) for j = 0 to N − 2 and at c(N − 1, N − 1), each of which is adjacent to a plain square, function as 2 × 1 SWs. Recall that a pair of 1 × 2 and 2 × 1 SWs constitutes a cell, as shown in Figure 3. As a result we have the modified XBS in Figure 5, where all the pairs of adjacent 1 × 2 and 2 × 1 SWs in Figure 3 are merged into cells, which are shown as the shaded cells. Note that the dashed cells are empty cells corresponding to the original 2 × 1 SWs that have been merged into the shaded cells.

Finally, move all the cells under the shaded cells upward by one and also move c(0, N − 1) to (N − 1, N − 1). As a consequence, the bottom line of cells in Figure 5 disappears, and we have an ILLIAC(N, N − 1) torus network as shown in Figure 6, where all the cells are interconnected in the horizontal direction to constitute a global loop, while each set of vertically aligned cells constitutes a local loop [7]. We refer to it as a torus-embedded XBS (TE-XBS). The number of cells in the N × N TE-XBS is N(N − 1), which is N fewer than the N² cells of conventional XBSs.
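The positions of the half-idle cells listed above can be checked mechanically. The sketch below is an illustration under one stated assumption: it pairs each 1 × 2 SW at c(i, N − 1 − i) with the cell one row index lower in the same column, modulo N. That pairing rule is inferred here from the listed positions and from the remark that c(0, 3) and c(3, 3) are neighbours through the vertical wraparound link; it is not stated explicitly in the text. All function names are hypothetical.

```python
def half_idle_pairs(n):
    """Pair each 1x2 SW with the adjacent 2x1 SW it is merged with."""
    pairs = []
    for i in range(n):
        col = n - 1 - i
        splitter = (i, col)              # plain square holding input port i
        combiner = ((i - 1) % n, col)    # assumed vertical neighbour, with wraparound
        pairs.append((splitter, combiner))
    return pairs


def listed_combiners(n):
    """The 2x1 SW positions exactly as listed in the text."""
    cells = {(n - 2 - j, j) for j in range(n - 1)}
    cells.add((n - 1, n - 1))
    return cells


n = 4
pairs = half_idle_pairs(n)
# The assumed neighbour rule reproduces the positions given in the text.
assert {combiner for _, combiner in pairs} == listed_combiners(n)
# Each merged pair removes one cell from the N x N array.
assert n * n - len(pairs) == n * (n - 1)
print(pairs)
```

Since there is one pair per input port, merging removes exactly N cells, which matches the count N(N − 1) of the TE-XBS.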

Switch Control Complexity
We consider the TE-XBS shown in Figure 7, whose input ports are renumbered so that each has the same number as the output port in its column. We label every cell with a pair of numbers, that is, (p, q). p increases from left to right in each row, and thus it is identical to the output port number of the column (0 ≤ p ≤ N − 1). Note that q starts from the cell that has an input port and ends at the cell that has an output port within each column (0 ≤ q ≤ N − 2). We assume that all the cells are initially set to the cross-state, in which each input port is connected to the output port with the identical number. It is easy to see that each column is devoted to the route to the corresponding output port.

Now we show how an internal route is provided when a pair of source and destination numbers, i and j, is given, that is, when a call arrives at c(i, 0) destined for output j. We have the following simple switch control algorithm:

if i = j
    Set c(i, h) to the cross-state for h = 0 to N − 2.
else
    Set c(i, 0) and c(j, k) to the bar state,
    Set c((i + s) mod N, N − 1 − s) to the cross-state for s = 1 to N − k,
    Set c(j, k + t) to the cross-state for t = 1 to …    (2)

In other words, when i = j, a route is provided vertically within a column. When i ≠ j, a route starting from the cell with the ith input port goes along the global loop to the jth column and is then diverted to exit from the jth output port straight through the column. In either case the internal route is provided over one or two sets of contiguous cells aligned in the horizontal or vertical direction. When the connection is released, the cells fixed during its setup are left unchanged, unlike in conventional XBSs, unless another input port is connected to the idle output port.

Consequently, the number of cells to be set in the TE-XBS ranges from N − 1 to 2N − 2, while that in XBSs is only one. The processing time for the switch control becomes O(N) when a conventional CPU is used as the switch control circuit. However, the switch control algorithm is so simple that the sets of cells given in (2) can be identified simultaneously by a digital circuit with parallel processing capability, and thus the switch control complexity can be reduced to O(1).
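The global loop and the per-column local loops along which these routes run can be illustrated with a small topology sketch. It rests on assumptions that are not taken from the text: cells are numbered row by row over N − 1 rows and N columns, a horizontal link joins cell m to cell (m + 1) mod N(N − 1), and a vertical link joins cell m to cell (m + N) mod N(N − 1), in the style of the classic ILLIAC mesh. The function name is hypothetical, and the sketch does not implement the control algorithm (2).

```python
def illiac_loops(n):
    """Return (global loop length, list of local loop lengths) for ILLIAC(n, n - 1)."""
    rows, cols = n - 1, n
    total = rows * cols

    def right(idx):
        # Assumed horizontal link: the end of a row continues into the next row.
        return (idx + 1) % total

    def down(idx):
        # Assumed vertical link: stays in the same column, wrapping at the edge.
        return (idx + cols) % total

    # Follow horizontal links from cell 0 until the walk returns to the start.
    global_len, idx = 0, 0
    while True:
        global_len += 1
        idx = right(idx)
        if idx == 0:
            break

    # Follow vertical links from the first cell of every column.
    local_lens = []
    for start in range(cols):
        length, idx = 0, start
        while True:
            length += 1
            idx = down(idx)
            if idx == start:
                break
        local_lens.append(length)

    return global_len, local_lens


print(illiac_loops(4))
```

Under these assumptions the horizontal links form a single loop through all N(N − 1) cells and every column forms a loop of N − 1 cells, so setting up one call touches cells on at most two such loops.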

Conclusions
We have presented how conventional N × N crossbar switches with N² cells can be embedded into ILLIAC(N, N − 1) torus networks with N(N − 1) cells. The key idea is to rotate the input and output ports in the horizontal and vertical directions and to merge adjacent half-idle cells. Although the idea is simple and straightforward, we believe that this is the first time it has been published. The number of cells to be switched for one call in the torus-embedded crossbar switch ranges from N − 1 to 2N − 2, while that in the crossbar is only one. Here we can see a trade-off between the number of cells and the switch control complexity. The switch control complexity of the torus-embedded crossbar switches, however, can be reduced to O(1) when a suitable parallel control circuit is used.

Figure 2: Two connection patterns of a cell.

Figure 3: Decomposition of a cell into a pair of 1×2 and 2×1 SWs.

Figure 4: Deformed XBS with rotated port positions in diagonal direction.

Figure 5: Merging each pair of adjacent half-idle cells into a single cell.

Figure 7: Renumbering input ports and cells in the torus-embedded XBS.