SNAP: An Efficient Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference
- Source :
- IEEE Journal of Solid-State Circuits. 56:636-647
- Publication Year :
- 2021
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2021.
- Abstract :
- Recent developments in deep neural network (DNN) pruning introduce data sparsity to enable deep learning applications to run more efficiently on resource- and energy-constrained hardware platforms. However, these sparse models require specialized hardware structures to exploit the sparsity for storage, latency, and efficiency improvements to the full extent. In this work, we present the sparse neural acceleration processor (SNAP) to exploit unstructured sparsity in DNNs. SNAP uses parallel associative search to discover valid weight (W) and input activation (IA) pairs from compressed, unstructured, sparse W and IA data arrays. The associative search allows SNAP to maintain a 75% average compute utilization. SNAP follows a channel-first dataflow and uses a two-level partial sum (psum) reduction dataflow to eliminate access contention at the output buffer and cut the psum writeback traffic by 22× compared with state-of-the-art DNN accelerator designs. SNAP’s psum reduction dataflow can be configured in two modes to support general convolution (CONV) layers, pointwise CONV, and fully connected layers. A prototype SNAP chip is implemented in a 16-nm CMOS technology. The 2.3-mm² test chip is measured to achieve a peak effectual efficiency of 21.55 TOPS/W (16 b) at 0.55 V and 260 MHz for CONV layers with 10% weight and activation densities. Operating on a pruned ResNet-50 network, the test chip achieves a peak throughput of 90.98 frames/s at 0.80 V and 480 MHz, dissipating 348 mW.
- Subjects :
- Pointwise
Artificial neural network
Dataflow
Computer science
Deep learning
Electrical & electronic engineering
Engineering and technology
Chip
Computational science
Convolution
Reduction (complexity)
CMOS
Electrical engineering, electronic engineering, information engineering
Artificial intelligence
Electrical and Electronic Engineering
Throughput (business)
- ISSN :
- 1558-173X and 0018-9200
- Volume :
- 56
- Database :
- OpenAIRE
- Journal :
- IEEE Journal of Solid-State Circuits
- Accession number :
- edsair.doi...........e2d898d37c53548689c6b54bd0bd1e29
- Full Text :
- https://doi.org/10.1109/jssc.2020.3043870
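
The abstract describes discovering valid weight (W) and input activation (IA) pairs from compressed, unstructured sparse arrays. The following is a minimal software sketch of that idea only, assuming each compressed array is stored as a list of channel indices plus a list of nonzero values. The function names (`compress`, `matched_pairs`, `sparse_dot`) and the storage layout are hypothetical, chosen for illustration; the chip performs the index matching with parallel hardware comparators, which this sequential dictionary lookup does not model.

```python
def compress(dense):
    """Keep only nonzero entries; return (channel indices, values).

    Assumed layout for illustration, not the chip's actual format.
    """
    idx = [i for i, v in enumerate(dense) if v != 0]
    val = [dense[i] for i in idx]
    return idx, val


def matched_pairs(w_idx, ia_idx):
    """Return positions (p, q) where w_idx[p] == ia_idx[q].

    A pair is 'valid' only when both the weight and the activation
    are nonzero at the same channel index.
    """
    ia_pos = {c: q for q, c in enumerate(ia_idx)}
    return [(p, ia_pos[c]) for p, c in enumerate(w_idx) if c in ia_pos]


def sparse_dot(w_dense, ia_dense):
    """Accumulate a partial sum over valid W-IA pairs along the channel axis.

    Returns (psum, number of multiplications actually performed).
    """
    w_idx, w_val = compress(w_dense)
    ia_idx, ia_val = compress(ia_dense)
    pairs = matched_pairs(w_idx, ia_idx)
    psum = sum(w_val[p] * ia_val[q] for p, q in pairs)
    return psum, len(pairs)
```

For example, with weights `[0, 2, 0, 3, 0, 0, 1, 0]` and activations `[4, 5, 0, 0, 0, 0, 2, 0]`, only channels 1 and 6 hold nonzero values in both arrays, so `sparse_dot` performs two multiplications instead of eight and returns the same result as the dense dot product.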