The position-based compression techniques for DNN model.
- Author
- Tang, Minghua; Russo, Enrico; Palesi, Maurizio
- Subjects
- *ARTIFICIAL neural networks, *HUFFMAN codes, *ENERGY consumption
- Abstract
In deep neural network (DNN) accelerators, transferring model parameters from main memory to the processing elements is expensive: data movement accounts for a large share of inference latency and energy consumption. In this paper, we present three position-based, lossless techniques for compressing DNN model parameters, which can yield significant energy and performance improvements. The first technique exploits the regular repetition of DNN weights to compress them. The second stores the relative distance between weights instead of the weights themselves. The third applies Huffman coding to the relative distances produced by the second technique. The proposed techniques are assessed on several DNNs. The results show that the first technique reduces latency by 38% and energy by 36%, the second reduces latency by 41% and energy by 39%, and the third reduces latency by 45% and energy by 43%. Applying Huffman coding thus achieves an additional 7% reduction in both latency and energy over the second technique. [ABSTRACT FROM AUTHOR]
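- Note
- The second and third techniques can be illustrated with a minimal, hypothetical sketch: delta-encoding a stream of quantized weights and then building a Huffman code table over the deltas. This is an assumption-laden illustration of the general idea (function names and the weight stream are invented here), not the paper's actual implementation.

```python
from collections import Counter
import heapq

def delta_encode(weights):
    """Store each weight as the difference from its predecessor
    (the idea behind the second, relative-distance technique)."""
    deltas = [weights[0]]
    for prev, cur in zip(weights, weights[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Invert delta encoding; the scheme is lossless by construction."""
    weights = [deltas[0]]
    for d in deltas[1:]:
        weights.append(weights[-1] + d)
    return weights

def huffman_code(symbols):
    """Build a Huffman code table over the delta stream
    (the idea behind the third technique)."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries are (frequency, unique id, partial code table);
    # the id breaks ties so dicts are never compared.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    uid = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (n1 + n2, uid, merged))
        uid += 1
    return heap[0][2]
```

Because quantized weights in a trained layer cluster around a few values, the deltas concentrate heavily on small symbols such as 0, which is exactly the skewed distribution Huffman coding exploits.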
- Published
- 2023