On Asymptotic Analysis of Packet and Wormhole Switched Routing Algorithm for Application-Specific Networks-on-Chip

The application of multistage interconnection networks (MINs) in systems-on-chip (SoC) and networks-on-chip (NoC) has been an active research topic since 2002. Nevertheless, MINs have not been used in practice for parallel communication. To address this, a new method is proposed that uses a MIN to provide intra-(global) communication among application-specific NoCs in networks-in-package (NiP). For this, four O(n) fault-tolerant parallel algorithms are proposed. They allow different NoCs to communicate in parallel using either the fault-tolerant irregular Penta multistage interconnection network (PNN) or the fault-tolerant regular Hexa multistage interconnection network (HXN). These two act as interconnects-on-chip (IoC) in the NiP. Both IoCs use packet switching and wormhole switching to route packets from the source NoC to the destination NoC. The results are compared in terms of packet losses, and wormhole switching comes out to be better than packet switching. The comparison of the IoCs on cost and MTTR shows that the HXN has a higher cost than the PNN, but its MTTR values are lower. This signifies that the HXN tolerates faults better and is repaired online faster than the PNN.


Introduction and Motivation
Parallel processing refers to speeding up the execution of a program by dividing it into multiple fragments that can execute simultaneously, each on its own processor. A program executed across n processors might run n times faster than it would on a single processor.
One way for processors to communicate data is through shared memory and shared variables. However, this is unrealistic for large numbers of processors. A more realistic assumption is that each processor has its own private memory and that data communication takes place by message passing via an interconnection network (IN).
INs originated from the design of high-performance parallel computers. They are a major factor differentiating modern multiprocessor architectures and are categorized according to criteria such as topology, routing strategy, and switching technique. An IN is built up of switching elements; the topology is the pattern in which the individual switches are connected to other elements, such as processors, memories, and other switches.

Interconnection Networks.
"Interconnection Networks should be designed to transfer the maximum amount of information within the least amount of time (and cost, power constraints) so as not to bottleneck the system." INs have a long development history [1-8]. Circuit-switched networks have been used in telephony. In the 1950s, the interconnection of computers and cellular automata was developed in a few prototypes, with full use awaited until 1960. Solomon developed a multicomputer network in 1962. Staran with its flip network, C.mmp with a crossbar, and Illiac-IV with a wider 2D network received attention in the early 1970s. This period also saw several indirect networks used in vector and array processors to connect multiple processors to multiple memory banks. This problem led to the development of several variants of MINs. The BBN Butterfly in 1982 was one of the first multiprocessors to use an indirect network. The binary e-cube or hypercube network was proposed in 1978 and implemented in the Caltech Cosmic Cube in 1981. In the early 1980s, the academic focus was on the mathematical properties of these networks and became increasingly separated from the practical problems of interconnecting real systems.
The last decade was a golden period for IN research. Driven by the demanding communication problems of multicomputers and enabled by the ability to construct single-chip Very Large Scale Integration (VLSI) routers, researchers made a series of breakthroughs that revolutionized digital communication systems. The Torus Routing Chip, in 1985, was one unique achievement. The first of a series of single-chip routing components, it introduced wormhole routing and virtual channels used for deadlock avoidance. The whole family of chips laid the framework for the analysis of routing, flow-control, deadlock, and livelock issues in modern direct networks. A flurry of research followed, with new theories of deadlock and livelock, new adaptive routing algorithms, and new methods for performance analysis. Research on collective communication and network architectures progressed steadily. By the early 1990s, low-dimensional direct networks had largely replaced the indirect networks of the 1970s and the hypercubes of the 1980s and could be found in machines from Cray, Intel, Mercury, and some others. INs began to be adopted in digital communication systems with the appearance of Myrinet in 1995. Point-to-point networks replaced buses, which were running into performance limits due to electrical constraints, and were used in the barrier network of the Cray T3E as an economical alternative to dedicated wiring. However, interconnection network technology faced certain design barriers, and various researchers and engineers have analyzed these networks [1, 4-8].

Multistage Interconnection Networks.
As the acceptance and subsequent use of multiprocessor systems have increased, the reliability, availability, performability, and performance characteristics of the networks that interconnect processors to processors, processors to memories, and memories to memories have received increased attention. A brief survey of INs and a survey of the fault-tolerant attributes of MINs are reported in [1-8]. A MIN is an IN that consists of a cascade of switching stages, each containing switching elements (SEs). MINs are widely used for broadband switching technology and for multiprocessor systems. They also offer an attractive way of implementing switches used in data communication networks. With the performance requirements of switches exceeding several terabits/sec and teraflops/sec, it becomes imperative to make them dynamic and fault tolerant [9-14].
Typical modern applications of MINs include fault-tolerant packet switches and the design of multicast and broadcast router fabrics, while SoCs and NoCs are currently among the most active research topics [9-14]. The following aspects are normally considered when designing fault-tolerant MINs: the topology chosen, the routing algorithm used, and the flow control mechanism adopted. The topology helps exploit the characteristics of present chip technology in order to obtain higher bandwidth, throughput, processing power, processor utilization, and probability of acceptance from MIN-based applications at an optimum hardware cost. Therefore, this work considers both irregular and regular fault-tolerant MINs as an application for NoCs.

1.2. Networks-in-Package.
Networks-in-package (NiP) designs provide integrated solutions to challenging design problems in the field of multimedia and real-time embedded applications. The main characteristics of NiP platforms are as follows: (1) chip-to-chip networking in a single package, (2) lower development cost than the NoC approach, (3) low power consumption, (4) high performance, and (5) small area.
Along with these characteristics, there are various fields to explore in NiP. This paper focuses on an emerging paradigm that effectively addresses and presumably overcomes the many on-chip interconnection and communication challenges that already exist in today's chips or will likely occur in future chips. This new paradigm is commonly known as the NoC paradigm [15-18]. The NoC paradigm is one, if not the only one, fit for the integration of an exceedingly large number of computational, logic, and storage blocks in a single chip. Notwithstanding this school of thought, the adoption and deployment of NoC face important issues relating to design and test methodologies and automation tools. In many cases, these issues remain unresolved.

Networks-on-Chip.
NoC is an emerging paradigm for communications within VLSI systems implemented on a single silicon chip. In a NoC system, modules such as processor cores, memories, and specialized Intellectual Property (IP) blocks exchange data using a network as a "public transportation" subsystem for the information traffic. A NoC is constructed from multiple point-to-point data links interconnected by switches, such that messages can be relayed from any source module to any destination module over several links by making routing decisions at the switches. A NoC is similar to a modern telecommunications network, using digital bit-packet switching over multiplexed links. Although packet switching is sometimes claimed to be a necessity for NoC, there are several NoC proposals utilizing circuit-switching techniques. This router-based definition is usually interpreted so that a single shared bus, a single crossbar switch, or a point-to-point network is not a NoC, but practically all other topologies are. This is somewhat confusing, since all of the above are networks but are not considered NoCs. Note that some articles erroneously use NoC as a synonym for mesh topology, although the NoC paradigm does not dictate the topology. Likewise, regularity of topology is sometimes considered a requirement, which is obviously not the case in research concentrating on "Application-Specific NoC". The wires in the links of the NoC are shared by many signals. A high level of parallelism is achieved because all links in the NoC can operate simultaneously on different data packets. Therefore, as the complexity of integrated systems keeps growing, a NoC provides enhanced performance and scalability in comparison with previous communication architectures. Of course, the algorithms must be designed in such a way that they offer large parallelism and can hence utilize the potential of the NoC.
Several forces drive the adoption of NoC architecture. From a physical design viewpoint, in nanometer Complementary Metal-Oxide Semiconductor (CMOS) technology, interconnects dominate both performance and dynamic power dissipation, as signal propagation in wires across the chip requires multiple clock cycles. NoC links can reduce the complexity of designing wires for predictable speed, power, noise, reliability, and so on, thanks to their regular, well-controlled structure. From a system design viewpoint, with the advent of multicore processor systems, a network is a natural architectural choice. A NoC can provide separation between computation and communication, support modularity and IP reuse via standard interfaces, handle synchronization issues, serve as a platform for system test, and, hence, increase engineering productivity.
Although NoC can borrow concepts and techniques from the well-established domain of computer networking, it is impractical to reuse features of old networks and symmetric multiprocessors. In particular, NoC switches should be small, energy efficient, and fast. Neglecting these aspects, along with proper quantitative comparison, was typical of early NoC research, but today all are considered in more detail. The routing algorithms must be implemented by simple logic, and the number of data buffers should be minimal. Network topology and properties may be application specific. NoCs need to support quality of service, that is, meet the various requirements in terms of throughput, end-to-end delays, and deadlines. To date, several NoC prototypes have been designed and analyzed in industry and academia, but only a few have been implemented on silicon. Many challenging research problems remain to be solved at all levels, from the physical link level through the network level and all the way up to the system architecture and application software.
Most NoCs are used in embedded systems, which interact with their environment under more or less hard time constraints. The communication in such systems has a strong influence on the global timing behavior. Methods are needed to analyze the timing, in terms of both average throughput and worst-case response time [17]. However, from a VLSI design perspective, the energy dissipation profile of the interconnect architecture is of prime importance, as it can represent a significant portion of the overall energy budget. The silicon area overhead due to the interconnect fabric is important too. The common characteristic of these kinds of architectures is that the processor/storage cores communicate with each other through high-performance links and intelligent switches and that the communication design is represented at a high abstraction level. Different NoC topologies are already used in [19], and these topologies give different communication structures in NoC [20]. The application of MINs in SoCs [10, 15-18] and NoCs [10, 16-18] has consistently drawn attention since 2002. Parallel communication among application-specific NoCs [9, 10] is a major problem for researchers to handle. Nevertheless, MINs have not been used in practice for parallel communication. The literature survey reveals that Star, Common Bus, and Ring topologies were used as a medium to set up intra-NoCs communication [21]. However, these communication systems have many tradeoffs and disadvantages, as listed below: (1) high latency, (2) low scalability, (3) poor performance, (4) zero fault tolerance, (5) no on-chip repairability.
The rest of the paper is organized as follows: Section 2 describes the general NiP architecture, including the fault-tolerant parallel algorithms designed to provide parallel communication among different NoCs using HXN and PNN, followed by their comparison on cost and Mean Time to Repair (MTTR). Section 3 provides the conclusion, followed by the references.

Application-Specific NiP Architecture Using Irregular PNN and Regular HXN
The general architecture of NiP resembles the Open Systems Interconnection (OSI) model. The Physical layer covers the electrical details of wires and the circuits and techniques used to drive information, while the Data Link layer ensures reliable transfer regardless of any unreliability in the physical layer and deals with medium access. At the Network layer there are issues related to the topology and the consequent routing scheme, while the Transport layer manages the end-to-end services and packet segmentation/reassembly. The upper layers, up to the Application layer, can be viewed as merged into a sort of adaptation layer that implements services in hardware or through part of an operating system and exposes the NoC infrastructure according to a suitable programming model, for example, the Message Passing (MP) paradigm.
NiP is a specific approach that provides a common interface through which all NoCs can communicate more efficiently and robustly. It contains three different types of building blocks, appropriately interconnected with each other, and a patented network topology that promises to deliver the best price/performance trade-off in future NiP applications [20]. Figures 1 and 2 show a general NiP architecture in which four NoC chips are mounted on a single package. These NoCs communicate with each other through an intermediate chip, known as the IoC [22].

2.1. Interconnects-on-Chip Architecture.
Figures 3 and 4 show the two types of IoC architecture: one belongs to the class of irregular fault-tolerant MINs and the other to the class of regular fault-tolerant MINs. The first chip, shown in Figure 3, consists of five routers working as switching elements (SEs) and is known as the PNN; Figure 4 shows the architecture of the HXN, with six SEs. These routers are connected by the main link and by chaining or express links, which makes the IoC highly fault tolerant. The architectural design of the IoC is similar to a small MIN, of the kind widely used for broadband switching technology and for multiprocessor systems. It also offers an attractive way of implementing switches/routers used in data communication networks. With the performance requirements of switches/routers exceeding several terabits/sec and teraflops/sec, it becomes imperative to make them dynamic and fault tolerant. Typical modern applications of MINs include fault-tolerant packet switches and the design of multicast and broadcast router fabrics, while SoC and NoC are currently among the most active topics [9-14].

Switching Methodologies, Testbed, and Assumptions.
Switching techniques determine when and how internal switches connect their inputs to outputs and the time at which message components may be transferred along these paths.For uniformity, the same approach for all NoC architectures has been used here.There are different types of switching techniques [6][7][8] as follows.
Definition 1 (Circuit Switching). A physical path from source to destination is reserved prior to the transmission of the data. The path is held until all the data has been transmitted. The advantage of this approach is that the network bandwidth is reserved for the entire duration of the data transfer. However, valuable resources are also tied up for the duration of the transfer, and the setup of an end-to-end path causes unnecessary delays [5-8].
[Algorithm listing fragment: Input four nodes from user; (5) k, c, d, e, l, g = 0; (6) FOR i = 0 to 3; (7) FOR j = 0 to 3; (8) arr [...] (15) FOR j = 0 to 12 (16) [...]]

Definition 2 (Packet Switching).
Data is divided into fixed-length blocks called packets and, instead of establishing a path [...]
Definition 3 (Wormhole Switching). The packets are divided into fixed-length flow control units (flits), and the input and output buffers are expected to store only a few flits. As a result, the buffer space requirement in the switches can be small compared to that generally required for packet switching. Thus, using a wormhole switching technique, the switches will be small and compact. The first flit of a packet, that is, the header flit, contains routing information. Header flit decoding enables the switches to establish the path, and subsequent flits simply follow this path in a pipelined fashion. As a result, each incoming data flit of a message packet is simply forwarded along the same output [...] subsequent flits also have to wait at their current locations [5-8].
We have used packet and wormhole switching algorithms to send data from one NoC to another in a parallel environment. Wormhole routing is the better choice in today's scenario; however, low cost must also be considered, as it is a major factor for system performance. The inclusion of buffers complicates the system and increases its cost considerably.
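The division of a packet into flits described for wormhole switching can be illustrated with a minimal sketch. The `Flit` class, the `packet_to_flits` function, and the flit size are hypothetical names and values chosen for illustration; they are not taken from the paper's algorithms.

```python
from dataclasses import dataclass


@dataclass
class Flit:
    kind: str     # "header" or "body"
    content: str  # routing information for the header, a payload slice otherwise


def packet_to_flits(destination: int, payload: str, flit_size: int) -> list:
    """Split one packet into a header flit followed by body flits.

    Only the header flit carries routing information (here, the
    destination NoC number); body flits carry payload slices and would
    follow the header along the same path. The last body flit may be
    shorter if the payload length is not a multiple of flit_size.
    """
    if flit_size <= 0:
        raise ValueError("flit_size must be positive")
    flits = [Flit("header", str(destination))]
    for start in range(0, len(payload), flit_size):
        flits.append(Flit("body", payload[start:start + flit_size]))
    return flits


flits = packet_to_flits(destination=3, payload="111222333", flit_size=3)
for flit in flits:
    print(flit.kind, flit.content)
```

Because a switch only needs to hold a few such flits at a time rather than a whole packet, its buffers can stay small, which is the trade-off the definition points to.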
Deadlocks, livelocks, and starvation arise because the number of resources is finite. Additionally, some of these situations may produce the others. For instance, a deadlock permanently blocks some packets. As those packets occupy some buffers, other packets may require those buffers to reach their destination and end up being continuously misrouted around their destination node, producing livelock. It is extremely important to remove deadlocks, livelocks, and starvation when implementing an interconnection network. Otherwise, some packets may never reach their destination. The following definitions and problems are therefore very important, and care should be taken while designing the code.
Definition 4 (Deadlock). A deadlock occurs when some packets cannot advance toward their destination because the buffers requested by them are full. A packet may also be permanently blocked in the network because the destination node does not consume it. This kind of deadlock is produced by the application [5-8].
Definition 5 (Livelock). A livelock is a situation in which some packets are not able to reach their destination, even though they never block permanently. A packet may keep traveling around its destination node, never reaching it because the channels required to do so are occupied by other packets. It can only occur when packets are allowed to follow nonminimal paths [5-8].
Definition 6 (Starvation). A packet may be permanently stopped if traffic is intense and the resources requested by it are always granted to other packets also requesting them. It usually occurs when an incorrect resource assignment scheme is used to arbitrate in case of conflict [5-8].
Two types of policies exist for transferring packets through the IoC to the NoCs in a parallel communication environment, as follows.
Definition 7 (Milk Policy). This policy states that the newer packet kills the previously residing packet. The older packet is destroyed.

Definition 8 (Wine Policy). This policy states that the older packet survives and the newer packet is destroyed. In other words, whenever a new transfer takes place, the next SE (with respect to the current SE) is checked for an idle condition; if it is not idle, no transfer takes place, that is, the older packet stays and the newly arriving packet is destroyed. This paper uses the wine policy, as it best suits our problem.
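The difference between the two policies can be stated compactly in code. The following is a minimal sketch, assuming a switching element holds at most one payload at a time; the function name `arrive` and the policy strings are illustrative and are not part of the paper's algorithms.

```python
from typing import Optional


def arrive(resident: Optional[int], incoming: int, policy: str) -> int:
    """Return the payload held by a switching element after a new arrival.

    resident is None when the switching element is empty.
    """
    if resident is None:
        return incoming  # empty SE: the new packet is simply stored
    if policy == "milk":
        return incoming  # the newer packet kills the resident packet
    if policy == "wine":
        return resident  # the older packet survives, the newcomer is destroyed
    raise ValueError(f"unknown policy: {policy}")


print(arrive(222, 333, "milk"))   # the newer payload, 333, is kept
print(arrive(222, 333, "wine"))   # the older payload, 222, is kept
print(arrive(None, 333, "wine"))  # an empty SE accepts the payload
```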

Output Simulation Scenario to Simulate the NiP Architecture Using PNN as IoC.
The following are assumptions to simulate the NiP architecture using PNN as IoC.
(1) No packet can survive for more than 5 clock cycles, where each clock cycle equals one iteration of the FOR loop.

Output Simulation Scenario to Simulate the NiP Architecture Using HXN as IoC.
The following assumptions are considered while simulating the NiP architecture using HXN as IoC.
(1) No packet can survive for more than 6 clock cycles, where each clock cycle equals one iteration of the FOR loop.
(3) Once a packet arrives at SE "2" or "3", it cannot backtrack. This assumption is made to reduce the network latency. So only one disjoint path is made available, in the form of SE "3" or "2" accordingly.
Total time, expressed in terms of n:
(2) For the FOR loop starting on line #13, time $T_2 = n \times t_3$ (here $t_3$ is a constant).
(4) For the FOR loop starting on line #24, time $T_4 = 2t_5$ (here $t_5$ is a constant).
Table 13: The following source-destination pairs of NoCs are used to show 62.5% and 72.5% efficiency of the NiP (based on PNN, used as IoC). All source-destination pairs delivered the information to the respective NoC. The data reflect the parallel communication among the NoCs using packet and wormhole switching algorithms. Some of the payload information is dropped in the network due to nonavailability of paths.

Output Simulations of NiP Using PNN as IoC.
To show the simulations, two kinds of simulation scenarios are considered here for intra-NoC communication: (1) best case and (2) worst case.
Best Case. For this communication scenario, the following source NoC and destination NoC have been chosen as an example, shown in Table 1.

First Step. Refer to Figure 8 and Table 2.
(1) When communication is set up between NoC 0 and NoC 3, the payload to be sent is 111; therefore, the payload first moves to SE 0.
(2) For communication between NoC 1 and NoC 3, the payload to be sent is 222; therefore, the payload moves to SE 1.
(3) For communication between NoC 2 and NoC 3, the payload to be sent is 333; therefore, the payload moves to SE 3.
(4) For communication between NoC 3 and NoC 1, the payload to be sent is 444; therefore, the payload moves to SE 4.

Second Step. Refer to Figure 9 and Table 3.
(3) For communication between NoC 2 and NoC 3, the payload 333 should move to one of SEs 2, 0, or 4; because these are already carrying payloads, the packet is destroyed and this communication does not proceed further.

Third Step. Refer to Figure 10 and Table 4.
(3) For communication between NoC 3 and NoC 1, the payload 444 has reached its destination NoC 1 and is transferred.

Fourth Step. Refer to Figure 11 and Table 5.
(1) For communication between NoC 0 and NoC 3, the payload 111 has reached its destination NoC 3 and is transferred.

Fifth Step. Refer to Figure 12 and Table 6.
(1) For communication between NoC 1 and NoC 3, the payload 222 has reached its destination NoC 3 and is transferred.
Worst Cases. For this communication scenario, the following source NoC and destination NoC have been chosen as an example, shown in Table 7. This case is only 50% efficient.

First Step. Refer to Figure 13 and Table 8.
(1) When communication is set up between NoC 0 and NoC 2, the payload to be sent is 111; therefore, the payload first moves to SE 0.
(2) For communication between NoC 1 and NoC 3, the payload to be sent is 222; therefore, the payload moves to SE 1.
(3) For communication between NoC 1 and NoC 2, the payload to be sent is 333; therefore, the payload moves to SE 1 and overwrites the residing packet 222. Hence, payload 333 is at SE 1.
(4) For communication between NoC 1 and NoC 0, the payload to be sent is 444; therefore, the payload moves to SE 1 and overwrites the residing packet 333. Hence, payload 444 is at SE 1.
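The first step of this worst case can be traced with a short sketch. The NoC-to-SE entry mapping below is an assumption read off the first-step descriptions of the PNN best and worst cases (NoC 0 enters at SE 0, NoC 1 at SE 1, NoC 2 at SE 3, NoC 3 at SE 4); the function and variable names are hypothetical and do not come from the paper's algorithm listing.

```python
# Assumed entry switching element (SE) for each NoC in the PNN, taken from
# the first-step descriptions in the text (illustrative only).
ENTRY_SE = {0: 0, 1: 1, 2: 3, 3: 4}


def first_step(requests):
    """Place each payload on its source NoC's entry SE, in request order.

    requests: list of (source_noc, destination_noc, payload) tuples.
    A later payload entering the same SE overwrites the earlier one,
    as described for the first step of the worst case.
    Returns (se_state, overwritten_payloads).
    """
    se_state = {}
    overwritten = []
    for source, destination, payload in requests:
        se = ENTRY_SE[source]
        if se in se_state:
            overwritten.append(se_state[se][2])  # resident payload is lost
        se_state[se] = (source, destination, payload)
    return se_state, overwritten


worst_case = [(0, 2, 111), (1, 3, 222), (1, 2, 333), (1, 0, 444)]
state, lost = first_step(worst_case)
print({se: info[2] for se, info in state.items()})  # payload held by each SE
print(lost)                                         # payloads overwritten
```

For the four pairs above, the sketch leaves payload 111 at SE 0 and payload 444 at SE 1, with 222 and 333 overwritten, which agrees with items (1) to (4).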

Second Step. Refer to Figure 14 and Table 9.
(2) For communication between NoC 1 and NoC 3, the payload 222 moves to SE 4.

Third Step. Refer to Figure 15 and Table 10.
(1) For communication between NoC 0 and NoC 2, the payload 111 has reached its destination NoC 2 and is transferred.
(2) For communication between NoC 1 and NoC 0, the payload 444 has reached its destination NoC 0 and is transferred.
(2) For the FOR loops starting on line #13, time $T_2 = n \times t_3$ (here $t_3$ is a constant).
Figure 7 shows the architecture of the NiP model in the form of blocks. The same is used to show the simulation behavior of the inter-NoC communication in NiP for the best and the worst case.
Best Cases. For this communication scenario, the following source NoC and destination NoC have been chosen as an example, shown in Table 14.

First Step. Refer to Figure 16 and Table 14.
(1) When communication is set up between NoC 0 and NoC 1, the payload to be sent is 111; therefore, the payload first goes to SE 0.
(2) For communication between NoC 1 and NoC 0, the payload to be sent is 222; therefore, the payload goes to SE 1.
(3) For communication between NoC 2 and NoC 3, the payload to be sent is 333; therefore, the payload goes to SE 4.
(4) For communication between NoC 3 and NoC 2, the payload to be sent is 444; therefore, the payload goes to SE 5.

Second Step. Refer to Figure 17 and Table 15.
(1) For communication between NoC 0 and NoC 1, the payload 111 moves to SE 2.
(2) For communication between NoC 1 and NoC 0, the payload 222 moves to SE 0.
(3) For communication between NoC 2 and NoC 3, the payload 333 moves to SE 3.
(4) For communication between NoC 3 and NoC 2, the payload 444 moves to SE 4.

Third Step. Refer to Figure 18 and Table 16.
(1) For communication between NoC 0 and NoC 1, the payload 111 moves to SE 1.
(2) For communication between NoC 1 and NoC 0, the payload 222 has reached its destination NoC 0 and is transferred.
(3) For communication between NoC 2 and NoC 3, the payload 333 moves to SE 5.
(4) For communication between NoC 3 and NoC 2, the payload 444 has reached its destination NoC 2 and is transferred.

Fourth Step. Refer to Figure 19 and Tables 17 and 18.
(1) For communication between NoC 0 and NoC 1, the payload 111 has reached its destination NoC 1.
(2) For communication between NoC 2 and NoC 3, the payload 333 moves to SE 3.
Worst Cases. For this communication scenario, the following source NoC and destination NoC have been chosen as an example, shown in Table 19. This case is only 50% efficient.

First Step. Refer to Figure 20 and Table 19.
(1) For communication between NoC 0 and NoC 1, the payload to be sent is 111; therefore, the payload first moves to SE 0.
(2) For communication between NoC 0 and NoC 2, the payload to be sent is 222; therefore, the payload moves to SE 0 and overwrites the residing packet 111. Hence, payload 222 is at SE 0.
(3) For communication between NoC 0 and NoC 3, the payload to be sent is 333; therefore, the payload moves to SE 0 and overwrites the residing packet 222. Hence, payload 333 is at SE 0.
(4) For communication between NoC 3 and NoC 1, the payload to be sent is 444; therefore, the payload moves to SE 5.

Second Step. Refer to Figure 21 and Table 20.
(1) For communication between NoC 0 and NoC 3, the payload 333 moves to SE 3.

Third Step. Refer to Figure 22 and Table 21.

Fourth Step. Refer to Figure 23 and Tables 22 and 23.
(1) For communication between NoC 0 and NoC 3, the payload 333, residing at SE 5, is transferred to NoC 3.
(2) For communication between NoC 3 and NoC 1, the payload 444, residing at SE 1, is transferred to NoC 1.
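The 50% figure quoted for this worst case is consistent with counting delivered payloads: four payloads are injected and two of them (333 and 444) reach their destinations. The sketch below shows that arithmetic, assuming efficiency is defined as delivered payloads over sent payloads; this definition and the helper name `efficiency` are assumptions made for illustration, since the formula is not stated explicitly in this section.

```python
def efficiency(sent, delivered):
    """Percentage of sent payloads that reached their destination NoC."""
    if not sent:
        raise ValueError("no payloads were sent")
    reached = set(delivered) & set(sent)  # ignore anything that was never sent
    return 100.0 * len(reached) / len(sent)


sent = [111, 222, 333, 444]   # payloads injected in the first step
delivered = [333, 444]        # payloads transferred in the fourth step
print(efficiency(sent, delivered))
```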

Results and Discussions
From Figure 24, in the case of packet switching, the NiP using HXN and PNN as IoC shows 100% efficiency for single and double pairs of communication among the NoCs. However, as the number of pairs increases, that is, for quad pairs of communication, the system using PNN as IoC shows 62.5% efficiency, whereas the same system using HXN as IoC shows only 75% efficiency. In the case of wormhole switching, the NiP using HXN and PNN as IoC shows 100% efficiency for single and double pairs of communication among the NoCs. However, for quad pairs of communication, the system using PNN as IoC shows 72.5% efficiency, whereas the same system using HXN as IoC shows only 82.5% efficiency.
Let the constant failure rate of individual switches be λ and the constant repair rate be μ. Now consider a MIN with M switches and network size N. For a single fault-tolerant network, the Markov chain model is shown in Figure 25. A Markov chain describes the states of a system at successive times. At these times, the system may have changed from the state it was in the moment before to another state, or stayed in the same state. The changes of state are called transitions. The Markov property means that the system is memoryless: it does not "remember" the states it was in before, only "knows" its present state, and hence bases its "decision" about which future state to move to purely on the present, not on the past. [...] 27. From the graph, it seems that with an increase in the size of the MIN, the MTTR improvement factor actually decreases. This is due to the conservative assumption that exactly one fault is successfully tolerated. In reality, with increasing size of the MIN, the average number of faults tolerated increases. Moreover, the cost of the HXN is higher than that of the PNN. As the size grows, that is, to the order of 512 × 512 or more, the difference in cost increases further. Nevertheless, the number of faults tolerated by the HXN is higher than the number of faults tolerated by the PNN.

Conclusion
This paper presents a new method that allows global communication between NoCs in NiP. For this, four O(n) 2 fault-tolerant parallel algorithms have been proposed. They allow different NoCs to communicate in parallel using either the fault-tolerant irregular PNN or the fault-tolerant regular HXN. These two act as an IoC in the NiP and can also tolerate faults. The two NiP architectures have been automated, and the simulation results are provided in Tables 11-13 and 24-26.
In the case of packet switching, the NiP using HXN and PNN as IoC has shown 100% efficiency for single and double pairs of communication among the NoCs. However, as the number of pairs increases, that is, for quad pairs of communication, the system using PNN as IoC has shown an efficiency of 62.5%, whereas the same system using HXN as IoC has shown an efficiency of only 75%.
In the case of wormhole switching, the NiP using HXN and PNN as IoC has shown 100% efficiency for single and double pairs of communication among the NoCs. However, as the number of pairs increases, that is, for quad pairs of communication, the system using PNN as IoC has shown an efficiency of 72.5%, whereas the same system using HXN as IoC has shown an efficiency of only 82.5%.
The comparison of the IoCs on cost and MTTR shows that the HXN has a higher cost than the PNN, but the MTTR values of the HXN are lower than those of the PNN. This signifies that the HXN tolerates faults better and is repaired online faster than the PNN. The various features of the old system and the proposed system are tabulated in Table 28. From the provided data, one can easily compare the systems and their efficiency. The proposed system comes out to be superior in every aspect.

Figure 5: Following are the legends used in the simulation technique.

Figure 12: The simulation state of NiP-PNN model at fifth step of best case.

Figure 14: The simulation state of NiP-PNN model at second step of worst case.

Figure 16: The simulation state of NiP-HXN model at first step of best case.
Input: n, the number of NoCs communicating in parallel. Source, a NoC-type array that stores the source NoC numbers (part of the NoC structure). Destination, a NoC-type array that stores the destination NoC numbers (part of the NoC structure). Payload, the part of the NoC structure that holds the data generated at the source NoC.
Output: [...]
FOR ([...] = 0 to 6) DO /* Initialization of the elements of the Switching Element structure */
  [...]
  [...] = 0 /* Initializing the payload of all 4 NoCs to 0 */
  Number = I /* Numbering all 4 NoCs from 0 to 3 */
END FOR
Get the number of parallel communicating NoCs, n
FOR (l = 0 to n) DO
  Get the source and destination NoC S[l], D[l]
  Get the respective payloads
END FOR
FOR (x = 0 to 6) DO
BEGIN
  /* Stage1: Transferring the "Payload" values from the NoC to the respective "Info" values of the Switching Element structure */
  FOR (y = 0 to n−1) DO
    Transfer the payloads of the NoC to the Info of the next immediate Switching Elements
    Transfer the source NoC number to the Source of the Switching Element
    Transfer the destination number to the Switching Element
  END FOR
  /* Stage2: If the communicating NoCs are on the same side, then check the source and respective destination number of the Switching Element and transfer it to the next Switching Element having an empty "Info" value [...]

Figure 23: The simulation state of NiP-HXN model at fourth step of worst case.
Figure 26: MTTF under repair of PNN and HXN along with the MTTF lower bounds.

Figure 6 shows the architecture of the NiP model in the form of blocks. The same is used for showing the simulation behavior of the intra-NoC communication in NiP for the best and the worst case (note: Figures 6 to 23 have been designed using the legends declared in Figure 5).
Statuses for HXN-Packet Switching and HXN-Wormhole Switching (see Tables 25, 26, 27 and Figure 24).

3.1. Mean Time to Repair of IoC. In a fault-tolerant MIN, it is always expected that the detection of a fault in an SE initiates the repair of the fault, to protect the SE from the occurrence of a second, extremely harmful, fault. Only the conservative approximation of the Mean Time to Failure (MTTF) of a single fault-tolerant MIN assumes the repair of the faults.

Table 1: Initial source and destination information used to set up communication between NoCs.

Table 2: Intermediate SE information while data moves from source NoC to destination NoC at first step of best case.

Table 3: Intermediate SE information while data moves from source NoC to destination NoC at second step of best case.

Table 4: Intermediate SE information while data moves from source NoC to destination NoC at third step of best case.

Table 5: Intermediate SE information while data moves from source NoC to destination NoC at fourth step of best case.

Table 6: The payload successfully reached its desired destination NoC.

Table 7: Initial source and destination information used to set up communication between NoCs.
...ng any data, whenever the source has a packet to be sent, it transmits the data. The need for storing entire packets in a switch in the case of conventional packet switching makes the buffer requirement high in these cases. In a SoC environment, the requirement is that switches should not consume a large fraction of silicon area compared to the IP blocks [5][6][7][8].
Input: n, the number of NoCs communicating in parallel. Source, a NoC-type array that stores the source NoC numbers (part of the NoC structure). Destination, a NoC-type array that stores the destination NoC numbers (part of the NoC structure). Payload, the part of the NoC structure that holds the data generated at the source NoC.
Output: Payload.
/* Stage3: If the linked NoC is the destination, that is, NoC-0 for SE[0], NoC-1 for SE[1], NoC-3 for SE[2], NoC-4 for SE[3], then transfer the "Info" of the Switching Element to its respective linked NoC */
/* Stage2: If the communicating NoCs are on the same side, then check the source and respective destination number of the Switching Element and transfer it to the next Switching Element having an empty "Info" value */
/* If the "Info" of the first linked Switching Element is not empty, then transfer the packet to the "Info" of the second linked Switching Element */
/* If the "Info" of the first linked Switching Element is not empty, then transfer the packet to the "Info" of the second linked Switching Element */
/* If the linked NoC is not the destination, then transfer the "Info" of the Switching Element (SE) to the next empty Switching Element. If not, then destroy the packet */
/* Stage4: Transfer the Info of the Switching Element (that could not be transferred in the previous stage) to the destination NoC */
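The forwarding rule stated in the comments above (try the first linked switching element, fall back to the second, otherwise destroy the packet, and deliver when the linked NoC is the destination) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the names `Packet`, `SwitchingElement`, `forward`, and `deliver` are hypothetical, and the two linked SEs are passed in as arguments because the actual PNN and HXN link tables are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    source: int       # source NoC number
    destination: int  # destination NoC number
    payload: int      # data generated at the source NoC


@dataclass
class SwitchingElement:
    info: Optional[Packet] = None  # None means the "Info" slot is empty


def forward(packet: Packet, first: SwitchingElement, second: SwitchingElement) -> bool:
    """Place the packet in the first linked SE if its Info slot is empty,
    otherwise in the second; return False if both are occupied, in which
    case the packet is destroyed (hypothetical helper)."""
    for se in (first, second):
        if se.info is None:
            se.info = packet
            return True
    return False


def deliver(se: SwitchingElement, linked_noc: int) -> Optional[Packet]:
    """Hand the packet to the linked NoC only if that NoC is its destination,
    freeing the SE's Info slot; otherwise leave the SE unchanged."""
    if se.info is not None and se.info.destination == linked_noc:
        packet, se.info = se.info, None
        return packet
    return None
```

In this sketch each SE holds a single packet, so a third packet contending for the same two SEs is dropped; this is how packet loss arises under the rule as stated.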

Table 8: Intermediate SE information while data moves from source NoC to destination NoC at first step of worst case.

Table 9: Intermediate SE information while data moves from source NoC to destination NoC at second step of worst case.

Table 10: The payload successfully reached its desired destination NoC.
channel as the preceding data flit, and no packet reordering is required at destinations. If a certain flit faces a busy channel,

Table 11: The following source-destination pairs of NoCs are used to show 100% efficiency of the NiP (based on PNN, used as IoC). All source-destination pairs delivered the information to the respective NoC. The following data does not reflect the parallel communication among the NoCs using packet and wormhole switching algorithms.
Here PS represents packet switching and WS represents wormhole switching.

Table 12: The following source-destination pairs of NoCs are used to show 100% efficiency of the NiP (based on PNN, used as IoC). All source-destination pairs delivered the payload information to the respective NoC. The following data does reflect the parallel communication among the NoCs using packet and wormhole switching algorithms.

Table 14: Initial source and destination information used to set up communication between NoCs.

Table 15: Intermediate SE information while data moves from source NoC to destination NoC at second step of best case.

Table 16: Intermediate SE information while data moves from source NoC to destination NoC at third step of best case.

Table 17: Intermediate SE information while data moves from source NoC to destination NoC (in between, some information has already reached the desired destination successfully) at fourth step of best case.

Table 18: The payload successfully reached its desired destination NoC.
2.3.2. Fault-Tolerant Wormhole Switched Algorithm for Dynamic Communication among NoCs Using Irregular PNN. The run-time complexity of Algorithm 3: NoC WS IRREGULAR PNN is O(n^2).

Table 19: Initial data used to set up communication between NoCs.

Table 20: Intermediate SE information while data moves from source NoC to destination NoC (in between, some information was destroyed before reaching the destination) at second step of worst case.

Table 21: Intermediate SE information while data moves from source NoC to destination NoC at third step of worst case.

Table 22: Intermediate SE information while data moves from source NoC to destination NoC at fourth step of worst case.

Table 23: The payload successfully reached its desired destination NoC.

Table 24: The following source-destination pairs are used to show 100% efficiency of the NiP (based on HXN, used as IoC). All source-destination pairs delivered the information to the respective NoC. The following data does not reflect the parallel communication among the NoCs using packet and wormhole switching algorithms.

Table 25: The following source-destination pairs of NoCs are used to show 100% efficiency of the NiP (based on HXN, used as IoC). All source-destination pairs delivered the payload information to the respective NoC. The following data does reflect the parallel communication among the NoCs using packet and wormhole switching algorithms.

Table 26: The following source-destination pairs of NoCs are used to show 75% and 82.5% efficiency of the NiP (based on HXN, used as IoC).

Table 27: Values of MTTR of comparative networks.

Table 28: The comparison of the existing and proposed parallel communication model.
Here the Markov chain model is represented with three conservative states: state A represents the no-fault state, state B represents the single-fault state, and state C is the two-fault state. The IoC network can tolerate more than one fault. Now it is assumed that if the MIN reaches state C, it has failed. Since the schemes presented here can tolerate more than one faulty switch in many cases, this model should give a lower bound for the MTTF of the system [9]: