Benchmark of the Compute-in-Memory-Based DNN Accelerator With Area Constraint.
- Source :
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems; Sep 2020, Vol. 28 Issue 9, p1945-1952, 8p
- Publication Year :
- 2020
Abstract
- Compute-in-memory (CIM) is a promising computing paradigm to accelerate the inference of deep neural network (DNN) algorithms due to its high processing parallelism and energy efficiency. Prior CIM-based DNN accelerators mostly consider full custom design, which assumes that all the weights are stored on-chip. For lightweight smart edge devices, this assumption may not hold. In this article, CIM-based DNN accelerators are designed and benchmarked under different chip area constraints. First, a scheduling strategy and dataflow for DNN inference are investigated for the case where only part of the weights can be stored on-chip. Two weight reload schemes are evaluated: 1) reload partial weights and reuse the input/output feature maps and 2) load a batch of inputs and reuse the partial weights on-chip across the batch. Then, a system-level performance benchmark is performed for the inference of ResNet-18 on the ImageNet data set. The design tradeoffs with different area constraints, dataflows, and device technologies [static random access memory (SRAM) versus ferroelectric field-effect transistor (FeFET)] are discussed. [ABSTRACT FROM AUTHOR]
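The two weight-reload schemes in the abstract trade off-chip weight traffic against feature-map traffic. As an illustration only (this is not the paper's model or code), a minimal first-order Python sketch of that tradeoff follows; the layer sizes, byte counts, and function names are all hypothetical assumptions.

```python
# Minimal sketch (hypothetical, not from the paper): first-order off-chip
# traffic model contrasting the two weight-reload schemes in the abstract.

def ceil_div(a, b):
    return -(-a // b)

def traffic_scheme1(layers, batch):
    """Scheme 1: reuse input/output feature maps on-chip and reload the
    partial weights from off-chip for every input."""
    # Each input streams the full weight set from off-chip again.
    return sum(batch * w_bytes for w_bytes, _ in layers)

def traffic_scheme2(layers, batch, onchip_weight_capacity):
    """Scheme 2: keep a partial-weight tile resident on-chip and reuse it
    across the whole batch; feature maps spill off-chip between tiles."""
    total = 0
    for w_bytes, fmap_bytes in layers:
        n_tiles = ceil_div(w_bytes, onchip_weight_capacity)
        total += w_bytes                       # each weight tile loaded once
        total += n_tiles * batch * fmap_bytes  # feature maps refetched per tile
    return total

# Hypothetical (weight bytes, feature-map bytes) pairs standing in for a
# few ResNet-18-like layers; real sizes would come from the network.
layers = [(64 * 1024, 256 * 1024), (512 * 1024, 64 * 1024)]
print(traffic_scheme1(layers, batch=8))
print(traffic_scheme2(layers, batch=8, onchip_weight_capacity=256 * 1024))
```

Which scheme moves fewer bytes depends on the batch size and on how large the feature maps are relative to the weights, which is the kind of tradeoff the paper's benchmark explores.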
- Subjects :
- Static random access memory
Field-effect transistors
Details
- Language :
- English
- ISSN :
- 1063-8210
- Volume :
- 28
- Issue :
- 9
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems
- Publication Type :
- Academic Journal
- Accession number :
- 145399684
- Full Text :
- https://doi.org/10.1109/TVLSI.2020.3001526