1. FastTrack: Leveraging Heterogeneous FPGA Wires to Design Low-Cost High-Performance Soft NoCs
- Author
- Nachiket Kapre and Tushar Krishna
- Subjects
010302 applied physics, Router, Computer science, Network packet, business.industry, Dataflow, Bandwidth (signal processing), Clock rate, Overlay network, 02 engineering and technology, 01 natural sciences, 020202 computer hardware & architecture, Network on a chip, Embedded system, 0103 physical sciences, Hardware_INTEGRATEDCIRCUITS, 0202 electrical engineering, electronic engineering, information engineering, Hardware_ARITHMETICANDLOGICSTRUCTURES, Field-programmable gate array, business
- Abstract
Networks-on-Chip (NoCs) implemented on FPGAs have to be designed differently from ASICs to fully exploit the unique architectural features and properties of the FPGA fabric. The FPGA-friendly bufferless, deflection-routed Hoplite NoC is almost an order of magnitude smaller and runs at a faster operating frequency than competing classic buffered FPGA NoCs. It is able to achieve this by sacrificing NoC link utilization, which suffers due to the cost of packet deflections and the associated high-latency traversals. In this paper, we address these shortcomings by developing FastTrack, which is an FPGA-optimized, high-radix NoC that exploits the segmented interconnect structure of modern FPGAs. We adapt the NoC organization to use express bypass links in the NoC to skip multiple router stages in a single cycle. Our FastTrack design can be tuned to use different express link lengths for performance, and supports depopulation strategies for controlling the balance between FPGA LUT and wiring cost. For the Xilinx Virtex-7 485T FPGA, an 8×8 FastTrack NoC is 1.7–2.5× larger than a base Hoplite NoC, but operates at almost the same clock frequency. FastTrack delivers throughput and latency improvements across a range of statistical workloads (2.5×), and traces extracted from FPGA accelerator case studies such as Sparse Matrix-Vector Multiplication (2.5×), Graph Analytics (2.8×), Token LU Factorization Dataflow (1.4×) and Multi-processor overlay applications (2×). FastTrack also shows energy efficiency improvements by factors of up to 2.2× over baseline Hoplite due to higher sustained rates and high-speed operation of express links made possible by fast FPGA interconnect.
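To illustrate why express bypass links can shorten traversals, the following is a minimal sketch that counts router hops on a unidirectional torus with and without express links. It is not the authors' router logic or simulator. It assumes a fully populated design in which every router has an express link of length `express_len` in each dimension, X-then-Y dimension-ordered routing, and no deflections. All function and parameter names (`ring_distance`, `hops_express`, `express_len`, and so on) are hypothetical.

```python
def ring_distance(src: int, dst: int, n: int) -> int:
    """Forward distance from src to dst on a unidirectional ring of n routers."""
    return (dst - src) % n


def hops_baseline(src: int, dst: int, n: int) -> int:
    """Hops with only nearest-neighbour links: one hop per router traversed."""
    return ring_distance(src, dst, n)


def hops_express(src: int, dst: int, n: int, express_len: int) -> int:
    """Hops when each router also has an express link skipping express_len routers.

    Assumption: take express links while they do not overshoot the
    destination, then finish on nearest-neighbour links.
    """
    dist = ring_distance(src, dst, n)
    return dist // express_len + dist % express_len


def torus_hops(src, dst, n, express_len=None):
    """Total hops on an n x n unidirectional torus, routing X first and then Y."""
    (sx, sy), (dx, dy) = src, dst
    if express_len is None:
        return hops_baseline(sx, dx, n) + hops_baseline(sy, dy, n)
    return (hops_express(sx, dx, n, express_len)
            + hops_express(sy, dy, n, express_len))


if __name__ == "__main__":
    n = 8  # 8x8 NoC, the size quoted in the abstract
    src, dst = (0, 0), (7, 7)
    print("baseline hops:", torus_hops(src, dst, n))
    for d in (2, 4):
        print(f"express length {d}:", torus_hops(src, dst, n, express_len=d))
```

Under these assumptions the hop count drops for longer routes, but the sketch says nothing about deflection behaviour, clock frequency, or the LUT and wiring cost that the depopulation strategies trade off.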
- Published
- 2018