
A lightweight deep neural network model and its applications based on channel pruning and group vector quantization.

Authors:
Huang, Mingzhong
Liu, Yan
Zhao, Lijie
Wang, Guogang
Source:
Neural Computing & Applications, Dec 2023, pp. 1-14.
Publication Year:
2023

Abstract

Deep convolutional neural networks (DCNNs) contain millions of parameters and require a tremendous amount of computation; as a result, they are poorly supported by resource-constrained edge devices. We propose a two-stage model compression method, channel pruning and group vector quantization (CP-GVQ), to alleviate this restriction. In the channel pruning stage, many channels of the DCNN layers are pruned, which reduces the model size and improves inference speed. In the second stage, GVQ, an extension of vector quantization (VQ), compresses the DCNN by representing the parameters of grouped layers with shared group codebooks and code matrices, which greatly reduces the model size. CP-GVQ therefore not only dramatically decreases model size but also improves inference speed. After each stage, the model is fine-tuned to recover the original accuracy. When applied to a model that classifies filament indices in microscopic images of activated sludge, classification accuracy decreased only marginally, from 0.99 to 0.97, while the model size was reduced by 99% and inference speed improved by 42%.
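The abstract describes the two stages only at a high level. The sketch below illustrates, under stated assumptions, what the two ideas could look like in Python with NumPy and scikit-learn: L1-norm channel pruning, and a shared codebook built by k-means over sub-vectors drawn from a group of layers. All names and parameters here (prune_channels, group_vector_quantize, subvec_len, codebook_size) are hypothetical illustrations, not the authors' implementation.

# Minimal, illustrative sketch of CP-GVQ-style compression (not the paper's
# code). Assumes NumPy and scikit-learn; fine-tuning after each stage, which
# the paper requires to recover accuracy, is omitted.
import numpy as np
from sklearn.cluster import KMeans

def prune_channels(weight, keep_ratio=0.5):
    # L1-norm channel pruning for a conv weight of shape (out_ch, in_ch, kh, kw):
    # filters with the smallest L1 norm are treated as least important and
    # dropped. In a full network, the matching input channels of the next
    # layer must be dropped as well.
    out_ch = weight.shape[0]
    scores = np.abs(weight).reshape(out_ch, -1).sum(axis=1)  # per-filter L1 norm
    keep = np.sort(np.argsort(scores)[-int(out_ch * keep_ratio):])
    return weight[keep], keep

def group_vector_quantize(weights, subvec_len=4, codebook_size=256):
    # Quantize the concatenated parameters of a *group* of layers at once:
    # all layers in the group share one codebook, and each layer keeps only
    # a matrix of small integer codes.
    flat = np.concatenate([w.ravel() for w in weights])
    pad = (-flat.size) % subvec_len
    flat = np.pad(flat, (0, pad))              # pad so it splits evenly
    subvecs = flat.reshape(-1, subvec_len)     # rows are sub-vectors
    km = KMeans(n_clusters=codebook_size, n_init=4, random_state=0).fit(subvecs)
    # uint8 codes assume codebook_size <= 256
    return km.cluster_centers_, km.labels_.astype(np.uint8), pad

def dequantize(codebook, codes, pad, shapes):
    # Rebuild approximate layer weights from the shared codebook and codes.
    flat = codebook[codes].ravel()
    flat = flat[:flat.size - pad]
    out, i = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        out.append(flat[i:i + n].reshape(shape))
        i += n
    return out

# Toy usage on random conv weights (in a real network, w2's input channels
# would also be cut down to the surviving `kept` channels of w1):
w1 = np.random.randn(16, 8, 3, 3).astype(np.float32)
w2 = np.random.randn(32, 16, 3, 3).astype(np.float32)
w1_pruned, kept = prune_channels(w1, keep_ratio=0.5)            # stage 1
codebook, codes, pad = group_vector_quantize([w1_pruned, w2])   # stage 2
w1_hat, w2_hat = dequantize(codebook, codes, pad,
                            [w1_pruned.shape, w2.shape])

With subvec_len=4 and a 256-entry codebook, every four float32 weights (16 bytes) collapse to one uint8 code, roughly a 16x reduction before the small codebook overhead; combined with pruning, this is the kind of accounting that could underlie the 99% size reduction the abstract reports, though the paper's exact settings are not given here.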

Details

Language:
English
ISSN:
0941-0643
Database:
Academic Search Index
Journal:
Neural Computing & Applications
Publication Type:
Academic Journal
Accession Number:
174491089
Full Text:
https://doi.org/10.1007/s00521-023-09332-z