Low Latency Implementations of CNN for Resource-Constrained IoT Devices.

Authors :
Mujtaba, Ahmed
Lee, Wai-Kong
Hwang, Seong Oun
Source :
IEEE Transactions on Circuits & Systems. Part II: Express Briefs; Dec2022, Vol. 69 Issue 12, p5124-5128, 5p
Publication Year :
2022

Abstract

Convolutional Neural Network (CNN) inference on a resource-constrained Internet-of-Things (IoT) device (i.e., ARM Cortex-M microcontroller) requires careful optimization to reduce the timing overhead. We propose two novel techniques to improve the computational efficiency of CNNs by targeting low-cost microcontrollers. Our techniques utilize on-chip memory and minimize redundant operations, yielding low-latency inference results on complex quantized models such as MobileNetV1. On the ImageNet dataset for per-layer quantization, we reduce inference latency and Multiply-and-Accumulate (MAC) per cycle by 22.4% and 22.9%, respectively, compared to the state-of-the-art mixed-precision CMix-NN library. On the CIFAR-10 dataset for per-channel quantization, we reduce inference latency and MAC per cycle by 31.7% and 31.3%, respectively. The achieved low-latency inference results can improve the user experience and save power budget in resource-constrained IoT devices. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
15497747
Volume :
69
Issue :
12
Database :
Complementary Index
Journal :
IEEE Transactions on Circuits & Systems. Part II: Express Briefs
Publication Type :
Academic Journal
Accession number :
160688948
Full Text :
https://doi.org/10.1109/TCSII.2022.3205029