Start Over

SHARP: An adaptable, energy-efficient accelerator for recurrent neural networks

Authors :: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
Yazdani Aminabadi, Reza
Ruwase, Olatunji
Zhang, Minjia
He, Yuxiong
Arnau Montañés, José María
González Colás, Antonio María
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
Yazdani Aminabadi, Reza
Ruwase, Olatunji
Zhang, Minjia
He, Yuxiong
Arnau Montañés, José María
González Colás, Antonio María
Publication Year :: 2023
Abstract: The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures specifically tailored to the computation pattern of RNN, getting high computation efficiency for certain chosen model sizes. However, given that the dimensionality of RNNs varies a lot for different tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today’s RNN accelerators. In particular, we first show the problem of low resource utilization and low adaptiveness for the state-of-the-art RNN implementations on GPU, FPGA, and ASIC architectures. To solve these issues, we propose an intelligent tiled-based dispatching mechanism for increasing the adaptiveness of RNN computation, in order to efficiently handle the data dependencies. To do so, we propose Sharp as a hardware accelerator, which pipelines RNN computation using an effective scheduling scheme to hide most of the dependent serialization. Furthermore, Sharp employs dynamic reconfigurable architecture to adapt to the model’s characteristics. Sharp achieves 2×, 2.8×, and 82× speedups on average, considering different RNN models and resource budgets, compared to the state-of-the-art ASIC, FPGA, and GPU implementations, respectively. Furthermore, we provide significant energy reduction with respect to the previous solutions, due to the low power dissipation of Sharp (321 GFLOPS/Watt).<br />This work has been supported by the CoCoUnit ERC Advanced Grant of the EU’s Horizon 2020 program (grant No 833057), the Spanish State Research Agency (MCIN/AEI) under grant PID2020-113172RB-I00, and the ICREA Academia program.<br />Peer Reviewed<br />Postprint (author's final draft)

Details

Database :: OAIster
Notes :: application/pdf, English
Publication Type :: Electronic Resource
Accession number :: edsoai.on1390664911
Document Type :: Electronic Resource

Tools

Email
Cite

Printer

Authors Abstract Subjects Details

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

SHARP: An adaptable, energy-efficient accelerator for recurrent neural networks

Abstract

Details

Tools

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

SHARP: An adaptable, energy-efficient accelerator for recurrent neural networks

Abstract

Details

Tools

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources