
Complete vector quantization of feedforward neural networks.

Authors :
Floropoulos, Nikolaos
Tefas, Anastasios
Source :
Neurocomputing. Nov 2019, Vol. 367, p55-63. 9p.
Publication Year :
2019

Abstract

Deep neural networks are widely used to solve difficult machine learning tasks due to their impressive performance on standard benchmark datasets. Most state-of-the-art neural architectures have many layers and a staggering number of parameters, making them computationally intensive and memory-demanding models. This prohibits their deployment on devices with limited computational resources, such as smartphones and unmanned vehicles. Recently, there has been growing interest in developing methods that can compress and accelerate these networks. In this paper, we propose the use of complete vector quantization for neural model compression and acceleration. More specifically, we show that it is possible to use product quantization with common subdictionaries to quantize both the parameters and the activations of a neural network without significantly compromising its accuracy. The proposed method removes the need for multiplications when computing the neural preactivations and provides opportunities for acceleration using lookup tables. [ABSTRACT FROM AUTHOR]
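The record does not include the paper's implementation, but the core idea in the abstract can be illustrated with a minimal sketch: product quantization where weights and activations share one common subdictionary, so every inner product reduces to lookups in a precomputed codeword-by-codeword table plus additions, with no multiplications at inference time. All names, sizes, and the randomly initialized codebook below are illustrative assumptions, not the authors' method.

```python
import numpy as np

# Hypothetical sketch of product quantization with a single shared
# ("common") subdictionary. Codebook size, dimensions, and training
# are illustrative; the paper's actual procedure may differ.

rng = np.random.default_rng(0)

d, m = 8, 4                      # vector dimension, number of subspaces
sub = d // m                     # dimension of each subvector
K = 16                           # codewords in the shared subdictionary
C = rng.normal(size=(K, sub))    # one codebook reused by all subspaces

def encode(x):
    """Assign each subvector of x to its nearest codeword index."""
    codes = np.empty(m, dtype=np.int64)
    for j in range(m):
        s = x[j * sub:(j + 1) * sub]
        codes[j] = np.argmin(((C - s) ** 2).sum(axis=1))
    return codes

# Precompute all codeword-codeword inner products once (a K x K table).
# Because weights AND activations use the same codebook, this single
# table serves both operands of every dot product.
LUT = C @ C.T

def quantized_dot(w_codes, x_codes):
    """Approximate <w, x> as a sum of m table lookups, no multiplies."""
    return LUT[w_codes, x_codes].sum()

w, x = rng.normal(size=d), rng.normal(size=d)
print("exact:", w @ x, "quantized:", quantized_dot(encode(w), encode(x)))
```

Because both operands are quantized ("complete" vector quantization), each multiply-accumulate in a preactivation collapses to one table lookup and one addition, which is the acceleration opportunity the abstract refers to.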

Details

Language :
English
ISSN :
0925-2312
Volume :
367
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession Number :
138916161
Full Text :
https://doi.org/10.1016/j.neucom.2019.08.003