A Ferroelectric-Based Volatile/Non-Volatile Dual-Mode Buffer Memory for Deep Neural Network Accelerators.

Authors :
Luo, Yandong
Luo, Yuan-Chun
Yu, Shimeng
Source :
IEEE Transactions on Computers. Sep2022, Vol. 71 Issue 9, p2088-2101. 14p.
Publication Year :
2022

Abstract

Deep neural network (DNN) inference and training produce a large amount of intermediate data. To achieve high energy efficiency, a sufficiently large on-chip buffer is preferred to reduce energy- and time-consuming off-chip DRAM accesses. However, an SRAM buffer suffers from large area cost and high standby power due to its large cell size and high leakage current. Although embedded DRAM (eDRAM) offers higher memory density, its energy consumption is high due to frequent refresh operations, which are required by its short refresh interval (40∼100 μs). In this paper, a dual-mode buffer memory based on the CMOS-compatible HfZrO2 ferroelectric material is proposed for DNN accelerators. It can operate in both volatile eDRAM mode and non-volatile ferroelectric RAM (FeRAM) mode. The functionality of the proposed dual-mode memory bit-cell design is verified using SPICE simulation with the multi-domain Preisach physical model. A data-lifetime-aware memory mode configuration protocol is proposed to optimize the buffer access energy for both DNN inference and training. Detailed circuitry and architectural support for the dual-mode memory are presented. For DNN training with ferroelectric field-effect transistor (FeFET)-based and SRAM-based compute-in-memory (CIM) accelerators, the proposed dual-mode buffer design improves the overall energy efficiency by 92.2%∼98.7%, 44.1%∼47.6%, and 12.6%∼13.0% compared to baseline designs using an SRAM buffer with the same buffer area, and eDRAM and FeRAM buffers with the same buffer capacity, respectively. For DNN inference with a tensor-processing-unit (TPU)-like systolic array, the energy efficiency during computing is improved by 40.7%∼45.6% and 18.4%∼29.6% compared to the designs with eDRAM and FeRAM buffers, respectively. By storing persistent data in the non-volatile mode, the energy efficiency of the systolic array is improved by 2.3×∼5.5× over the SRAM-based design when standby is frequent. The chip area overhead of the dual-mode buffer design is 5.2%, 4.1%, and 7.2% for FeFET-based CIM, SRAM-based CIM, and systolic-array-based accelerators using an eDRAM buffer, respectively. [ABSTRACT FROM AUTHOR]
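
The abstract does not spell out the data-lifetime-aware mode configuration protocol, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple per-bit energy comparison: data whose lifetime is short relative to the eDRAM-mode refresh interval stays in volatile mode, while long-lived data is written once in non-volatile FeRAM mode to avoid repeated refreshes. The class `BufferParams`, the functions `refresh_count` and `choose_mode`, and all energy values are hypothetical placeholders; only the 40 μs refresh interval is taken from the range quoted in the abstract.

```python
import math
from dataclasses import dataclass


@dataclass
class BufferParams:
    """Hypothetical per-bit parameters; the energy values are NOT from the paper."""
    retention_us: float = 40.0      # refresh interval; the abstract quotes 40~100 us
    edram_write_pj: float = 1.0     # assumed energy of one volatile (eDRAM-mode) write
    edram_refresh_pj: float = 1.0   # assumed energy of one refresh operation
    feram_write_pj: float = 10.0    # assumed energy of one non-volatile (FeRAM-mode) write


def refresh_count(lifetime_us: float, retention_us: float) -> int:
    """Refreshes needed to keep volatile data alive for lifetime_us.

    Data written at t=0 survives one retention period without any refresh.
    """
    return max(0, math.ceil(lifetime_us / retention_us) - 1)


def choose_mode(lifetime_us: float, p: BufferParams) -> str:
    """Pick the mode with the lower assumed storage energy for this lifetime."""
    n_refresh = refresh_count(lifetime_us, p.retention_us)
    edram_energy = p.edram_write_pj + n_refresh * p.edram_refresh_pj
    feram_energy = p.feram_write_pj  # no refresh needed in non-volatile mode
    return "eDRAM" if edram_energy <= feram_energy else "FeRAM"


if __name__ == "__main__":
    params = BufferParams()
    for lifetime in (30.0, 200.0, 1000.0):
        print(f"lifetime {lifetime} us: {choose_mode(lifetime, params)} mode")
```

Under these placeholder numbers, short-lived intermediate data is kept in eDRAM mode and data that would need many refreshes is moved to FeRAM mode; the actual break-even point depends on the measured write and refresh energies reported in the paper.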

Details

Language :
English
ISSN :
00189340
Volume :
71
Issue :
9
Database :
Academic Search Index
Journal :
IEEE Transactions on Computers
Publication Type :
Academic Journal
Accession number :
158561773
Full Text :
https://doi.org/10.1109/TC.2021.3122872