
ReApprox-PIM: Reconfigurable Approximate Lookup-Table (LUT)-Based Processing-in-Memory (PIM) Machine Learning Accelerator

Authors :
Bavikadi, Sathwika
Sutradhar, Purab Ranjan
Indovina, Mark A.
Ganguly, Amlan
Dinakarrao, Sai Manoj Pudukotai
Source :
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems; August 2024, Vol. 43 Issue: 8 p2288-2300, 13p
Publication Year :
2024

Abstract

Convolutional neural networks (CNNs) have achieved significant success in various applications. Numerous hardware accelerators have been introduced to accelerate CNN execution with improved energy efficiency compared to traditional software implementations. Despite this success, deploying traditional hardware accelerators for bulky CNNs on current and emerging smart devices is impeded by limited resources, including memory, power, area, and computational capabilities. Recent works introduced processing-in-memory (PIM), a non-Von-Neumann architecture, which is a promising approach to tackle the problem of data movement between logic and memory blocks. However, as observed from the literature, existing PIM architectures cannot accommodate all the required computational operations due to limited programmability and flexibility. Furthermore, the capabilities of PIM are challenged by the limited available on-chip memory. To enable faster computations and address the limited on-chip memory constraints, this work introduces a novel reconfigurable approximate computing (AC)-based PIM, termed reconfigurable approximate PIM (ReApprox-PIM). The proposed ReApprox-PIM addresses the two challenges mentioned above in the following manner: 1) it utilizes a programmable lookup-table (LUT)-based processing architecture that can support different AC techniques via programmability and 2) it implements highly optimized AC techniques for resource-efficient, fast CNN computing. This results in an improved computing footprint, greater operational parallelism, and reduced computational latency and power consumption compared to prior PIMs relying on exact computations for CNN inference acceleration, at a minimal sacrifice of accuracy. We have evaluated the proposed ReApprox-PIM on various CNN architectures for inference applications, including standard LeNet, AlexNet, and ResNet-18, -34, and -50.
Our experimental results show that the ReApprox-PIM achieves a speedup of $1.63\times$ with $1.66\times$ lower area for the processing components compared to existing PIM architectures. Furthermore, the proposed ReApprox-PIM achieves $2.5\times$ higher energy efficiency and $1.3\times$ higher throughput compared to state-of-the-art LUT-based PIM architectures.
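ReApprox-PIM itself is a hardware architecture, but the core primitive the abstract describes, replacing exact arithmetic with programmable LUT-based approximate computation, can be illustrated in software. The sketch below is not from the paper; the truncation width and function names are illustrative assumptions. It precomputes a small lookup table over truncated operands, so a 16x16-entry table stands in for a full 8x8-bit multiplier, trading a bounded accuracy loss for a much smaller structure, which is the general trade-off approximate LUT-based PIMs exploit.

```python
# Illustrative sketch (not ReApprox-PIM's actual design): approximate
# multiplication via a lookup table over truncated operands.
# Each 8-bit operand is truncated to its top 4 bits, so one
# 16x16-entry table replaces a full 8-bit x 8-bit multiplier.

def build_lut(bits=4):
    """Precompute the products of every pair of truncated operands."""
    size = 1 << bits
    return [[ah * bh for bh in range(size)] for ah in range(size)]

LUT = build_lut()

def approx_mul(a, b, bits=4):
    """Approximate 8-bit multiply: index the LUT with the high bits."""
    shift = 8 - bits
    ah, bh = a >> shift, b >> shift
    # Shift back to restore the magnitude lost by truncating both operands.
    return LUT[ah][bh] << (2 * shift)

# Exact 200 * 100 = 20000; the LUT result is close but cheaper to obtain.
print(approx_mul(200, 100))  # -> 18432 (about 8% error)
```

A reconfigurable design in the paper's spirit would let the table contents be reprogrammed, so the same memory-resident LUT hardware can realize different approximation schemes (or different operations entirely) without new logic.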

Details

Language :
English
ISSN :
0278-0070
Volume :
43
Issue :
8
Database :
Supplemental Index
Journal :
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication Type :
Periodical
Accession number :
ejs66994852
Full Text :
https://doi.org/10.1109/TCAD.2024.3367822