WinoNN: Optimizing FPGA-Based Convolutional Neural Network Accelerators Using Sparse Winograd Algorithm.

Authors :
Wang, Xuan
Wang, Chao
Cao, Jing
Gong, Lei
Zhou, Xuehai
Source :
IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems. Nov 2020, Vol. 39, Issue 11, p4290-4302. 13p.
Publication Year :
2020

Abstract

In recent years, a variety of FPGA-based accelerators have been proposed to speed up convolutional neural networks (CNNs) in many domain-specific application fields. In addition, optimization techniques such as fast algorithms and network sparsity have greatly reduced the theoretical computational workload of CNN inference. A few FPGA accelerators currently support both the fast Winograd algorithm (WinoA) and network sparsity to minimize the amount of computation. However, on the one hand, these architectures feed data into processing elements (PEs) in units of blocks, so some boundary losses caused by sparse irregularities cannot be avoided. On the other hand, these works have not discussed design space exploration under the sparse condition. In this article, we propose a novel accelerator called WINONN. We fully discuss the challenges of supporting WinoA, weight sparsity, and activation sparsity simultaneously. To minimize the online encoding overhead caused by activation sparsity, an efficient encoding format called multibit mask (MBM) is proposed. To handle the irregularities of sparse data, we propose a novel Scatter-Compute-Gather method in hardware design, combined with a freely sliding buffer that achieves fine-grained data loading to minimize boundary waste. Finally, we combine theoretical analysis with an experimental method to explore the design space, allowing WINONN to get the best performance on a specific FPGA. Our highly scalable design enables us to deploy sparse Winograd accelerators on very small embedded FPGAs, which previous works do not support. The experimental results on VGG16 show that we achieve the highest digital signal processing unit (DSP) efficiency and the highest energy efficiency compared with state-of-the-art sparse architectures. [ABSTRACT FROM AUTHOR]
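
For readers unfamiliar with the fast Winograd algorithm named in the abstract, the following is a minimal sketch of the textbook one-dimensional case F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not taken from the paper and does not represent the WINONN hardware design; the function names winograd_f23 and direct_f23 are hypothetical, chosen only for this illustration.

```python
def winograd_f23(d, g):
    """Textbook Winograd F(2,3): two outputs of a 3-tap filter over
    four inputs, using four multiplications in the element-wise stage.
    Illustrative only; not the paper's implementation."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (can be precomputed offline when weights are fixed).
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # Input transform followed by the four element-wise multiplications.
    m0 = (d0 - d2) * u0
    m1 = (d1 + d2) * u1
    m2 = (d2 - d1) * u2
    m3 = (d1 - d3) * u3
    # Output transform back to the two filter outputs.
    return [m0 + m1 + m2, m1 - m2 - m3]


def direct_f23(d, g):
    """Reference: the same two outputs by direct sliding dot product
    (six multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


if __name__ == "__main__":
    inputs = [1, 2, 3, 4]
    weights = [2, 4, 6]
    print(winograd_f23(inputs, weights))
    print(direct_f23(inputs, weights))
```

In this sketch, any transformed weight u0..u3 that equals zero makes its multiplication unnecessary, which is the general intuition behind combining Winograd transforms with sparsity. How WINONN actually encodes and schedules sparse data is described in the full text, not here.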

Details

Language :
English
ISSN :
0278-0070
Volume :
39
Issue :
11
Database :
Academic Search Index
Journal :
IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems
Publication Type :
Academic Journal
Accession number :
146914760
Full Text :
https://doi.org/10.1109/TCAD.2020.3012323