1. High-performance Convolutional Neural Network Accelerator Based on Systolic Arrays and Quantization
- Author
- Shengli Lu, Wei Pang, Li Yufeng, Hao Liu, and Luo Jihe
- Subjects
Contextual image classification; business.industry; Computer science; Quantization (signal processing); Clock rate; 02 engineering and technology; Convolutional neural network; Object detection; 020202 computer hardware & architecture; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Field-programmable gate array; business; Electrical efficiency; Computer hardware; Digital signal processing
- Abstract
In recent years, convolutional neural networks (CNNs) have achieved great success in computer vision tasks such as object detection, face recognition, and image classification. With the diversification of CNN models, accelerators that support only a single network can no longer meet the needs of applications. Because convolution operations are computationally intensive, implementing CNNs on FPGA platforms faces many challenges. In this paper, a convolution unit based on systolic arrays is proposed for the design of a CNN accelerator, and a fixed-point quantization method is adopted to save a large amount of storage resources and reduce the required transmission bandwidth, thus improving throughput and power efficiency. The performance density and power efficiency of our design reach 0.165 GOPS/DSP and 36.3 GOPS/W at a 100 MHz clock frequency.
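To illustrate the fixed-point quantization idea mentioned in the abstract, here is a minimal Python sketch. The abstract does not state the bit width or number format used, so the 8-bit signed format with 6 fractional bits and the helper names `quantize` and `fixed_point_dot` are assumptions chosen only for illustration, not the authors' method.

```python
def quantize(values, frac_bits=6, total_bits=8):
    """Round floats to signed fixed-point integer codes, saturating on overflow.

    frac_bits and total_bits are illustrative assumptions, not values
    taken from the paper.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]


def fixed_point_dot(a_codes, b_codes, frac_bits=6):
    """Multiply-accumulate on integer codes, then rescale to a real value."""
    acc = sum(a * b for a, b in zip(a_codes, b_codes))  # integer-only accumulation
    return acc / (1 << (2 * frac_bits))  # each product carries 2 * frac_bits fractional bits


weights = [0.5, -0.25, 0.125]
activations = [1.0, 0.5, -0.75]

w_q = quantize(weights)
a_q = quantize(activations)
approx = fixed_point_dot(w_q, a_q)
exact = sum(w * a for w, a in zip(weights, activations))
print(w_q, a_q, approx, exact)
```

Under these assumed parameters each value occupies 8 bits instead of the 32 bits of a single-precision float, which is the kind of storage and bandwidth reduction the abstract refers to; the actual savings depend on the format the authors chose.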
- Published
- 2019