An Efficient CNN Accelerator Using Inter-Frame Data Reuse of Videos on FPGAs.

Authors :: Li, Shengzhao
Wang, Qin
Jiang, Jianfei
Sheng, Weiguang
Jing, Naifeng
Mao, Zhigang
Source :: IEEE Transactions on Very Large Scale Integration (VLSI) Systems; Nov. 2022, Vol. 30, Issue 11, pp. 1587-1600 (14 pages)
Publication Year :: 2022
Abstract: Convolutional neural networks (CNNs) have had great success when applied to computer vision technology, and many application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) CNN accelerators have been proposed. These accelerators primarily focus on the acceleration of a single input, and they are not particularly optimized for video applications. In this article, we focus on the similarities between continuous inputs in video, and we propose a YOLOv3-tiny CNN FPGA accelerator using incremental operation. The accelerator can skip the convolution operation of similar data between continuous inputs. We also use the Winograd algorithm to optimize the $3\times 3$ convolution operator in the YOLOv3-tiny network to further improve the accelerator’s efficiency. Experimental results show that our accelerator achieved 74.2 frames/s on ImageNet ILSVRC2015. Compared to the original network without Winograd algorithm and incremental operation, our design provides a $4.10\times$ speedup. When compared with other YOLO network FPGA accelerators applied to video applications, our design provided a $3.13\times$ – $18.34\times$ normalized digital signal processor (DSP) efficiency and $1.10\times$ – $14.2\times$ energy efficiency. [ABSTRACT FROM AUTHOR]
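
Illustrative note: the two ideas named in the abstract can be sketched in one dimension. This is a minimal sketch under stated assumptions, not the authors' FPGA design. The first function relies on the linearity of convolution, so the output for a new frame can be obtained by updating the previous output with only the inputs that changed. The second is the standard Winograd F(2,3) minimal filtering step, the 1D building block usually nested to handle $3\times 3$ kernels. All function names, the `threshold` parameter, and the 1D simplification are hypothetical choices made for this example.

```python
def conv1d(x, w):
    """Direct valid-mode 1D correlation of signal x with kernel w."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def incremental_conv1d(prev_x, prev_y, x, w, threshold=0):
    """Update prev_y = conv1d(prev_x, w) so that it matches the new input x.

    Convolution is linear, so conv(x) = conv(prev_x) + conv(x - prev_x).
    Inputs whose change is within `threshold` (an assumed similarity test)
    are skipped. Returns (y, macs), where macs counts the multiply-accumulate
    operations actually performed.
    """
    k = len(w)
    y = list(prev_y)
    macs = 0
    for i, (new, old) in enumerate(zip(x, prev_x)):
        delta = new - old
        if abs(delta) <= threshold:
            continue  # similar data between frames: no work needed
        # Input position i feeds output position o = i - j through tap j.
        for j in range(k):
            o = i - j
            if 0 <= o < len(y):
                y[o] += delta * w[j]
                macs += 1
    return y, macs


def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap filter g over 4 inputs d.

    Uses 4 data multiplications instead of the 6 a direct computation needs;
    the filter-side sums can be precomputed once per kernel.
    """
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


# Usage: two "frames" that differ in a single sample.
frame0 = [1, 2, 3, 4, 5, 6]
frame1 = [1, 2, 3, 9, 5, 6]
kernel = [1, 0, -1]
out0 = conv1d(frame0, kernel)
out1, macs = incremental_conv1d(frame0, out0, frame1, kernel)
```

With a nonzero `threshold`, the incremental result becomes an approximation, because small changes are dropped rather than accumulated; with `threshold=0` it matches the direct computation exactly.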