Enhancing the Utilization of Dot-Product Engines in Deep Learning Accelerators

Authors:
Armin Runge
Taha Soliman
Leonardo Ecco
Source:
IPDPS Workshops
Publication Year:
2020
Publisher:
IEEE, 2020.

Abstract

In recent years, the goal of deploying Deep Neural Networks (DNNs) in the embedded domain has led to the development of several dedicated DNN Hardware Accelerators (HWAs), many of which rely on a Dot-Product Engine (DPE)-based architecture. A DPE is a hardware block that receives two input vectors of the same size and produces one scalar value. However, when the actual input vector does not match the DPE's native input size, the DPE becomes underutilized. This is particularly common for Convolutional Neural Networks (CNNs), whose kernels are often much smaller than the DPE's native vector size. In this article, we introduce kernel linearization, a technique to mitigate the underutilization of DPE-based HWAs for DNNs. To demonstrate the benefits of this method, we implemented AEXelerator, a DPE-based HWA prototyped on an FPGA and validated using automotive-relevant DNNs. Our evaluation shows that our architecture achieves high DPE utilization even in the presence of large vector-size mismatches.
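The abstract does not detail how kernel linearization works internally, but the underutilization problem it targets can be illustrated with a rough model. The sketch below assumes a hypothetical DPE with 32 multiply-accumulate lanes (the lane count and the packing of multiple kernel windows into one vector are illustrative assumptions, not the paper's actual design):

```python
import math

DPE_SIZE = 32  # assumed number of multiply-accumulate lanes in the DPE


def utilization(vector_len: int, dpe_size: int = DPE_SIZE) -> float:
    """Fraction of DPE lanes doing useful work for one dot product.

    Vectors longer than the DPE are split across ceil(len / size)
    passes; unused lanes in each pass sit idle.
    """
    passes = math.ceil(vector_len / dpe_size)
    return vector_len / (passes * dpe_size)


# A 3x3 convolution kernel over a single input channel yields a
# 9-element weight vector -- most of the 32 lanes are idle.
print(utilization(9))      # 9/32 = 0.28125

# Linearizing (packing) three 9-element kernel windows into one
# 27-element vector fills far more of the DPE per pass.
print(utilization(3 * 9))  # 27/32 = 0.84375
```

Under this toy model, a plain 3x3 kernel uses only about 28% of the DPE, while packing three windows together raises utilization to about 84%, which matches the kind of vector-size mismatch the abstract describes for CNNs.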

Details

Database:
OpenAIRE
Journal:
2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Accession number:
edsair.doi...........d804fafec99b1a8681c74fafab2b8451
Full Text:
https://doi.org/10.1109/ipdpsw50202.2020.00142