A Reconfigurable Neural Network Processor With Tile-Grained Multicore Pipeline for Object Detection on FPGA
- Source :
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 29:1967-1980
- Publication Year :
- 2021
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2021.
Abstract
- To improve the computational efficiency of convolutional neural networks (CNNs) for object detection on reconfigurable platforms such as field-programmable gate arrays (FPGAs), we propose a CNN processor with hierarchical pipelining and multicore reconfigurable computing based on parallel parameter constraints. First, we propose a pipelined multicore processing architecture that can adapt to the computation and on-chip memory requirements of different convolutional layers. We present the design of a CNN processor that can configure the computing units while adjusting the interconnection of the cores to improve the utilization of reconfigurable computing resources. Second, we propose an elastic on-chip buffer and a data access approach that dynamically configures addresses to better utilize on-chip memory. We also present a cross-layer feature map fusion strategy based on computing near memory (CNM) to reduce off-chip memory accesses. Finally, we propose a scheduling algorithm for pipelined tasks to improve the computational efficiency and throughput of the proposed processor. For evaluation, three well-known object detection networks (RetinaNet-ResNet-50, MobileNetV2-SSDLite, and YOLOv3) were run on the proposed CNN processor on the ZCU102 platform and reached throughputs of 1503, 1066, and 809 GOPS and computational efficiencies of 0.79, 0.62, and 0.36 GOPS/DSP, respectively. The designed processor achieves a better tradeoff between computing efficiency and detection accuracy than recently proposed object detection CNNs on FPGAs.
- Subjects :
- Multi-core processor
Computer architecture
Hardware and Architecture
Computer science
Pipeline (computing)
System on a chip
Electrical and Electronic Engineering
Field-programmable gate array
Throughput (business)
Convolutional neural network
Software
Reconfigurable computing
Object detection
Details
- ISSN :
- 1557-9999 and 1063-8210
- Volume :
- 29
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems
- Accession number :
- edsair.doi...........98cecf970c89513edb4e051c978a1f6a
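
The abstract reports both throughput (GOPS) and computational efficiency (GOPS/DSP) for each network. As a rough reading aid, the sketch below divides one by the other to estimate how many DSP slices each figure would imply. It assumes that efficiency is defined as throughput divided by the number of DSP slices used, which the abstract does not state; the names `REPORTED`, `implied_dsp_count`, and `implied_dsp_table` are hypothetical and not taken from the paper. The results are estimates only, limited by the rounding of the published figures.

```python
# Figures quoted in the abstract: (throughput in GOPS, efficiency in GOPS/DSP).
REPORTED = {
    "RetinaNet-ResNet-50": (1503, 0.79),
    "MobileNetV2-SSDLite": (1066, 0.62),
    "YOLOv3": (809, 0.36),
}


def implied_dsp_count(throughput_gops, gops_per_dsp):
    """Estimate the DSP slice count.

    ASSUMPTION: efficiency = throughput / DSP count. The abstract does not
    define the metric, so this is an illustrative reading, not the authors'.
    """
    return throughput_gops / gops_per_dsp


def implied_dsp_table(reported=REPORTED):
    """Return the rounded implied DSP count for each network."""
    return {
        name: round(implied_dsp_count(throughput, efficiency))
        for name, (throughput, efficiency) in reported.items()
    }


if __name__ == "__main__":
    for name, dsps in implied_dsp_table().items():
        print(f"{name}: about {dsps} DSP slices (estimate)")
```

If the implied counts differ between networks, that may reflect a different core configuration per network or a different definition of the metric; the full paper would be needed to tell which.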