An Efficient CNN Accelerator Using Inter-Frame Data Reuse of Videos on FPGAs.
- Source :
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems; Nov2022, Vol. 30 Issue 11, p1587-1600, 14p
- Publication Year :
- 2022
Abstract
- Convolutional neural networks (CNNs) have had great success in computer vision, and many application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) CNN accelerators have been proposed. These accelerators primarily focus on accelerating a single input and are not specifically optimized for video applications. In this article, we focus on the similarities between consecutive inputs in video, and we propose a YOLOv3-tiny CNN FPGA accelerator that uses incremental operation. The accelerator can skip the convolution operations for data that are similar between consecutive inputs. We also use the Winograd algorithm to optimize the $3\times 3$ convolution operator in the YOLOv3-tiny network to further improve the accelerator’s efficiency. Experimental results show that our accelerator achieves 74.2 frames/s on ImageNet ILSVRC2015. Compared to the original network without the Winograd algorithm and incremental operation, our design provides a $4.10\times$ speedup. Compared with other YOLO network FPGA accelerators applied to video applications, our design provides $3.13\times$ to $18.34\times$ the normalized digital signal processor (DSP) efficiency and $1.10\times$ to $14.2\times$ the energy efficiency. [ABSTRACT FROM AUTHOR]
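The two ideas named in the abstract can be illustrated in one dimension. The sketch below is not the authors' hardware design. It assumes that "incremental operation" relies on the linearity of convolution, so that only outputs touched by a changed input sample need updating, and it uses the standard Winograd F(2,3) minimal-filtering formula as a stand-in for the $3\times 3$ case. All function names are hypothetical.

```python
from fractions import Fraction


def conv1d(x, w):
    """Direct 'valid' 1-D correlation, the form used in CNN layers."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def incremental_conv1d(prev_x, prev_y, x, w):
    """Update prev_y = conv1d(prev_x, w) so that it equals conv1d(x, w).

    Assumption: convolution is linear, so only the difference between
    frames has to be convolved. Unchanged samples are skipped entirely.
    Returns the new output and the number of multiplications performed.
    """
    k = len(w)
    y = list(prev_y)
    mults = 0
    for p, (old, new) in enumerate(zip(prev_x, x)):
        delta = new - old
        if delta == 0:
            continue  # identical sample in both frames: no work
        for j in range(k):
            i = p - j  # output index whose window covers sample p
            if 0 <= i < len(y):
                y[i] += delta * w[j]
                mults += 1
    return y, mults


def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap filter from four inputs,
    using 4 multiplications instead of the 6 needed by the direct form."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    half = Fraction(1, 2)
    # Filter-side terms can be precomputed once per kernel.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * ((g0 + g1 + g2) * half)
    m3 = (d2 - d1) * ((g0 - g1 + g2) * half)
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    w = [1, 0, -1]
    frame0 = [1, 2, 3, 4, 5, 6]
    frame1 = [1, 2, 9, 4, 5, 6]  # only one sample differs
    y0 = conv1d(frame0, w)
    y1, mults = incremental_conv1d(frame0, y0, frame1, w)
    print(y1 == conv1d(frame1, w), mults)
    print(winograd_f23(frame0[:4], w) == conv1d(frame0[:4], w))
```

When consecutive frames differ in few samples, the incremental update performs far fewer multiplications than recomputing every output, which is the kind of saving the abstract attributes to skipping similar data. A practical design would also need a similarity threshold and handling for nonlinear layers, neither of which is modeled here.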
Details
- Language :
- English
- ISSN :
- 10638210
- Volume :
- 30
- Issue :
- 11
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems
- Publication Type :
- Academic Journal
- Accession number :
- 160688034
- Full Text :
- https://doi.org/10.1109/TVLSI.2022.3151788