RELIABLE NETWORK ON CHIP DESIGN ANMOL SHAHANI 829905938

Presentation reliable NoC


Page 1: Presentation reliable NoC

RELIABLE NETWORK ON CHIP DESIGN

ANMOL SHAHANI 829905938

Page 2: Presentation reliable NoC

Why Fault Tolerance?

Offers many advantages:
◦ Avoids costly packet retransmissions
◦ Avoids catastrophic data loss
◦ Can increase chip yield
◦ Allows higher speed operation

In NoC specifically
◦ Ensures success of interconnect
◦ Grows in importance as technology scales

Page 3: Presentation reliable NoC

Fault Classes

Transient faults (or soft errors): random appearance and disappearance
◦ Alpha particles, cosmic-ray-induced neutrons, etc.

Intermittent faults: appear only under certain conditions
◦ Occur repeatedly at the same location
◦ Tend to occur in bursts
◦ Replacement of the faulty component removes the fault

Permanent faults (or hard errors): occur always but may be masked
◦ Static (occurring at manufacture time): Process Variability (PV), manufacturing imperfections
◦ Dynamic (occurring at run time): Electro-Migration (EM), Negative Bias Temperature Instability (NBTI), oxide breakdown, Stress-Induced Voiding (SIV), Hot Carrier Injection (HCI), etc.

Page 4: Presentation reliable NoC

Making NoCs Reliable

Current Methods
• Timing-error (T-error) tolerant NoC design
• Error control
◦ Error detection and correction codes
◦ Hop-by-hop (HBH) retransmission mechanism
• Reliable task mapping
• Fault tolerant rerouting

Page 5: Presentation reliable NoC

Timing error tolerant NoC design

Page 6: Presentation reliable NoC

Error correction and detection

Page 7: Presentation reliable NoC

Power consumption Analysis

Power consumption of the schemes

Page 8: Presentation reliable NoC

Power consumption Observations

The ee-par scheme consumes more power than the ee-crc and hybrid schemes.

The flit-based scheme consumes more power because, as the number of flits per packet increases, the number of useful bits decreases.

The packet buffer requirements affect the power consumption. Hence, as the number of hops increases, the power overhead of the ss-flit scheme increases.
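
One way to read the flit-overhead observation above is as a simple ratio. The sketch below assumes a fixed packet payload that is split into more flits, with a fixed number of check bits attached to every flit. The function name and the numbers (128 payload bits, 8 check bits per flit) are hypothetical and are not taken from the slides.

```python
def useful_bit_fraction(payload_bits: int, flits_per_packet: int,
                        check_bits_per_flit: int) -> float:
    """Fraction of transmitted bits that carry payload when every flit
    carries its own check bits (assumed model, not from the slides)."""
    if payload_bits <= 0 or flits_per_packet <= 0 or check_bits_per_flit < 0:
        raise ValueError("sizes must be positive (check bits may be zero)")
    # Every flit adds its own check bits on top of the shared payload.
    total_bits = payload_bits + flits_per_packet * check_bits_per_flit
    return payload_bits / total_bits


if __name__ == "__main__":
    # Hypothetical packet: 128 payload bits, 8 check bits per flit.
    for flits in (1, 2, 4, 8):
        print(flits, round(useful_bit_fraction(128, flits, 8), 3))
```

Under these assumptions the useful fraction falls as the flit count grows, so more of the energy spent per packet goes into check bits rather than payload.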

Page 9: Presentation reliable NoC

HBH Retransmission Scheme

Advantages
• Avoids deadlock
• Eliminates the need to provide an escape channel to the destination node
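
The hop-by-hop idea can be sketched in a few lines. This is a minimal model, not the scheme from the slides: it assumes a single even-parity bit as the detection code and a stop-and-wait ACK/NACK exchange on every link. All function names and parameters are hypothetical.

```python
import random


def parity(bits):
    """Even-parity bit over a list of 0/1 values."""
    return sum(bits) % 2


def send_over_link(flit, flip_probability, rng):
    """Model a faulty link: each bit flips independently."""
    return [b ^ 1 if rng.random() < flip_probability else b for b in flit]


def hop_transfer(payload, flip_probability, rng, max_attempts=16):
    """Move one flit across one hop, retransmitting on a parity mismatch.

    Returns (received_payload, attempts_used).
    """
    # The sender keeps this copy in its retransmission buffer until ACKed.
    flit = payload + [parity(payload)]
    for attempt in range(1, max_attempts + 1):
        received = send_over_link(flit, flip_probability, rng)
        if parity(received[:-1]) == received[-1]:
            return received[:-1], attempt  # receiver ACKs, buffer is freed
        # Receiver NACKs; the loop resends the buffered copy.
    raise RuntimeError("link looks permanently faulty; rerouting needed")


def send_along_path(payload, hops, flip_probability, seed=0):
    """Forward a flit over several hops, checking it at every hop."""
    rng = random.Random(seed)
    data, total_attempts = list(payload), 0
    for _ in range(hops):
        data, attempts = hop_transfer(data, flip_probability, rng)
        total_attempts += attempts
    return data, total_attempts
```

A single parity bit only catches an odd number of flipped bits, so a real design would use a stronger code. The sketch also shows why every hop needs its own retransmission buffer.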

Page 10: Presentation reliable NoC

Reliability Aware Task Mapping

Page 11: Presentation reliable NoC

Fault tolerant route generation

Switch design to support multipath routing with in-order packet delivery

Page 12: Presentation reliable NoC

Resilience against NBTI

Fig. adaptive router architecture

Page 13: Presentation reliable NoC
Page 14: Presentation reliable NoC

ROBUST: SELF HEALING ROUTER

Universal Logic Block (ULB)

Crossbar protection using multiple ULBs

Advantages

It has a higher silicon protection factor and a higher reliability improvement factor.

Page 15: Presentation reliable NoC

Future challenges

◦ All the schemes presented to improve the reliability of the NoC architecture have a power overhead associated with them. This increases the power dissipated and hence the chip temperature, which can reduce the mean time to failure (MTTF).

◦ All the techniques should be thermal aware in order to prevent this effect.

◦ Instead of evenly wearing out all cores in MPSoCs, a method should be designed to self-heal failed cores.

◦ Most error-resilient schemes today focus primarily on making routers and links fault tolerant. There should also be some focus on making memories more reliable.
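
The link between extra power, temperature and MTTF in the first challenge can be made concrete with a rough calculation. The sketch below assumes the Arrhenius temperature term of Black's equation for electromigration, MTTF = A · J^(-n) · exp(Ea / (k·T)), at a fixed current density. The activation energy of 0.9 eV and the example temperatures are illustrative assumptions, not values from the slides, and other wear-out mechanisms such as NBTI follow different models.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mttf_ratio(temp_hot_c: float, temp_ref_c: float,
               activation_energy_ev: float = 0.9) -> float:
    """MTTF(hot) / MTTF(ref) under an Arrhenius temperature dependence.

    Current density is assumed unchanged, so the A * J**(-n) term of
    Black's equation cancels. The default activation energy is an
    illustrative assumption.
    """
    t_hot = temp_hot_c + 273.15  # convert Celsius to kelvin
    t_ref = temp_ref_c + 273.15
    if t_hot <= 0 or t_ref <= 0:
        raise ValueError("temperatures must be above absolute zero")
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K
    return math.exp(exponent * (1.0 / t_hot - 1.0 / t_ref))


if __name__ == "__main__":
    # Hypothetical case: reliability logic heats a router from 80 C to 90 C.
    print(round(mttf_ratio(90.0, 80.0), 3))
```

With these assumed numbers, a 10 °C rise cuts the estimated electromigration lifetime by more than half. This is why the slides ask for thermal-aware reliability techniques.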

Page 16: Presentation reliable NoC

Conclusion

The ideas presented in this paper make the NoC architecture resilient to permanent and intermittent errors. Several techniques are employed to improve reliability: the T-error tolerant mechanism, the self-healing router architecture, reliability-driven task mapping, the deadlock recovery mechanism, and error detection and correction schemes. Several of these rely on redundant hardware components. The area cost is acceptable because, with "dark silicon", it is impossible to turn on every component on the die anyway. However, most techniques increase the power consumption of the NoC architecture, which is by far their main drawback. Designing systems to be resilient to errors is crucial to exploiting the advantages of networks-on-chip.
