
Improving Deep Learning with a customizable GPU-like FPGA-based accelerator

Authors :
Alessandro Cilardo
Edoardo Fusella
Mirko Gagliardi
Source :
PRIME
Publication Year :
2018
Publisher :
IEEE, 2018.

Abstract

An ever-increasing number of challenging applications are being approached using Deep Learning, obtaining impressive results in a variety of different domains. However, state-of-the-art accuracy requires deep neural networks with a large number of layers and a huge number of different filters with millions of weights. GPU- and FPGA-based architectures have been proposed as possible solutions to meet this enormous demand for computing resources. In this paper, we investigate the adoption of different architectural features, i.e., the SIMD paradigm, multithreading, and non-coherent on-chip memory, for Deep Learning-oriented FPGA-based accelerator designs. Experimental results on a Xilinx Virtex-7 FPGA show that the SIMD paradigm and multithreading can improve execution time by up to $5\times$ and $3.5\times$, respectively. A further enhancement of up to $1.75\times$ can be obtained using a non-coherent on-chip memory.
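The SIMD and multithreading speedups reported above come from executing many of the multiply-accumulate operations of a convolution filter in parallel. The following C sketch is purely illustrative and is not taken from the paper: the lane width (LANES), filter size (WEIGHTS), and all variable names are assumptions chosen only to show how grouping a filter's dot product into SIMD-width lane bundles reduces the number of loop iterations a datapath must issue.

```c
/* Illustrative sketch only: a lane-parallel multiply-accumulate loop of the
 * kind a SIMD datapath could execute for one convolution filter.
 * LANES and WEIGHTS are hypothetical values, not figures from the paper. */
#include <stdio.h>

#define LANES   4    /* assumed SIMD width of the accelerator datapath */
#define WEIGHTS 16   /* assumed filter size (must be a multiple of LANES) */

int main(void) {
    float w[WEIGHTS], x[WEIGHTS];
    for (int i = 0; i < WEIGHTS; i++) { w[i] = 0.5f; x[i] = (float)i; }

    /* Each outer iteration issues LANES multiply-accumulates at once,
     * mimicking how a SIMD lane group cuts the cycle count of the
     * filter's dot product by roughly a factor of LANES. */
    float acc[LANES] = {0.0f};
    for (int i = 0; i < WEIGHTS; i += LANES)
        for (int l = 0; l < LANES; l++)
            acc[l] += w[i + l] * x[i + l];

    /* Final horizontal reduction across the lane accumulators. */
    float sum = 0.0f;
    for (int l = 0; l < LANES; l++) sum += acc[l];
    printf("dot product = %f\n", sum);
    return 0;
}
```

In the same spirit, multithreading would interleave several such filter computations to hide memory latency; the paper evaluates these features on a Xilinx Virtex-7 FPGA, whereas this snippet only conveys the general idea in software.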

Details

Database :
OpenAIRE
Journal :
2018 14th Conference on Ph.D. Research in Microelectronics and Electronics (PRIME)
Accession number :
edsair.doi.dedup.....1aeba6c91378b7dcfc1a6ef5c0b0365c
Full Text :
https://doi.org/10.1109/prime.2018.8430335