
An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA.

Authors:
Molina, Romina Soledad
Morales, Ivan Rene
Crespo, Maria Liz
Costa, Veronica Gil
Carrato, Sergio
Ramponi, Giovanni
Source:
IEEE Embedded Systems Letters, Sep 2024, Vol. 16, Issue 3, pp. 255-258
Publication Year:
2024

Abstract

Machine learning (ML) models have demonstrated discriminative and representative learning capabilities over a wide range of applications, even at the cost of high computational complexity. Owing to their parallel processing capabilities, reconfigurability, and low power consumption, systems on chip based on a field-programmable gate array (SoC/FPGA) have been used to address this challenge. Nevertheless, SoC/FPGA devices are resource-constrained, which calls for optimal use of the technology for the computation and storage operations involved in ML-based inference. Consequently, mapping a deep neural network (DNN) architecture to an SoC/FPGA requires compression strategies to obtain a hardware design with a good compromise between effectiveness, memory footprint, and inference time. This letter presents an efficient end-to-end workflow for deploying DNNs on an SoC/FPGA by integrating hyperparameter tuning through Bayesian optimization (BO) with an ensemble of compression techniques.
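
As a rough illustration of the idea the abstract describes (not the authors' implementation, whose tooling and search space the abstract does not name), the Python sketch below runs Gaussian-process Bayesian optimization over two assumed compression hyperparameters, quantization bit-width and pruning sparsity, using scikit-optimize; train_and_evaluate is a hypothetical stand-in for training a compressed model and estimating its on-chip footprint.

    # Hedged sketch -- NOT the authors' code. Assumes scikit-optimize
    # (`pip install scikit-optimize`) and a toy stand-in for training.
    from skopt import gp_minimize
    from skopt.space import Integer, Real

    def train_and_evaluate(bits, sparsity):
        # Hypothetical stand-in: a toy model of how accuracy degrades under
        # aggressive quantization/pruning and how the on-chip footprint
        # (normalized to [0, 1]) shrinks. Replace with real quantization-aware
        # training plus post-synthesis resource estimates.
        accuracy = 0.95 - 0.02 * max(0, 8 - bits) - 0.10 * max(0.0, sparsity - 0.5)
        footprint = (bits / 16.0) * (1.0 - sparsity)
        return accuracy, footprint

    def objective(params):
        bits, sparsity = params
        accuracy, footprint = train_and_evaluate(bits, sparsity)
        # Scalarize the accuracy/footprint trade-off; the 0.3 weight is an
        # illustrative choice, not a value taken from the letter.
        return -(accuracy - 0.3 * footprint)

    # Assumed search space: quantization bit-width and pruning sparsity.
    space = [Integer(2, 16, name="quantization_bits"),
             Real(0.0, 0.9, name="pruning_sparsity")]

    # Gaussian-process-based BO: each call trains and scores one candidate
    # compression configuration.
    result = gp_minimize(objective, space, n_calls=30, random_state=0)
    print("Best (bits, sparsity):", result.x, "objective:", result.fun)

In the workflow proper, the objective would be evaluated end to end, i.e., the compressed model would be trained and its accuracy weighed against the memory footprint and inference time of the resulting hardware design.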

Details

Language:
English
ISSN:
1943-0663
Volume:
16
Issue:
3
Database:
Complementary Index
Journal:
IEEE Embedded Systems Letters
Publication Type:
Academic Journal
Accession Number:
179296480
Full Text:
https://doi.org/10.1109/LES.2023.3343030