Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators

Authors :: Lübeck, Konstantin
Jung, Alexander Louis-Ferdinand
Wedlich, Felix
Müller, Mika Markus
Peccia, Federico Nicolás
Thömmes, Felix
Steinmetz, Jannik
Biermaier, Valentin
Frischknecht, Adrian
Bernardo, Paul Palomero
Bringmann, Oliver
Publication Year :: 2024
Abstract: Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a challenging task that requires tailored hardware accelerator architectures and a clear understanding of their performance characteristics when executing the intended AI workload. To facilitate this, we present an automated generation approach for fast performance models to accurately estimate the latency of a DNN mapped onto systematically modeled and concisely described accelerator architectures. Using our accelerator architecture description method, we modeled representative DNN accelerators such as Gemmini, UltraTrail, Plasticine-derived, and a parameterizable systolic array. Together with DNN mappings for those modeled architectures, we perform a combined DNN/hardware dependency graph analysis, which enables us, in the best case, to evaluate only 154 loop kernel iterations to estimate the performance for 4.19 billion instructions achieving a significant speedup. We outperform regression and analytical models in terms of mean absolute percentage error (MAPE) compared to simulation results, while being several magnitudes faster than an RTL simulation.
Comment: Accepted version for: ACM Transactions on Embedded Computing Systems

Subjects :: Computer Science - Performance
Computer Science - Artificial Intelligence
Computer Science - Hardware Architecture
Computer Science - Machine Learning

Details

Database :: arXiv
Publication Type :: Report
Accession number :: edsarx.2409.08595
Document Type :: Working Paper
