1. TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
- Authors
- Muhammad Shafique, Saad Rehman, Ali Hassan, Muhammad Abdullah Hanif, and Dilshad Sabir
- Subjects
- General Computer Science, General Engineering, General Materials Science, Mathematics, Convolutional neural network, convolution, reduced workload, Winograd transform, quantization (signal processing), approximation algorithm, tile quantization approximation, symmetry approximation, particle of swarm convolution layer optimization, kernel (image processing), reduction (complexity), MNIST database
- Abstract
Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT) applications face stringent constraints, such as limited memory capacity and energy budgets, due to the large number of computations in convolution layers. To reduce the computational workload of these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm. The hybrid convolution combines two approximations: one that exploits the inherent symmetry of filters, termed symmetry approximation, and one that exploits the structure of the Winograd algorithm, termed tile quantization approximation. PSCLO balances workload reduction against accuracy degradation in each convolution layer by selecting fine-tuned thresholds that control the intensity of each approximation. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. They achieve a $\sim 5.28\text{x}$ multiplicative workload reduction without significant accuracy degradation ($\sim 1.08\text{x}$ less multiplicative workload than state-of-the-art Winograd CNN pruning). For LeNet, the multiplicative workload reduction is $\sim 3.87\text{x}$ on MNIST and $\sim 3.93\text{x}$ on Fashion-MNIST, with additive workload reductions of $\sim 2.5\text{x}$ and $\sim 2.56\text{x}$, respectively, and no significant accuracy loss on either dataset.
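The workload saving builds on Winograd's F(2x2, 3x3) transform, which replaces the 36 multiplications of a direct 3x3 convolution over a 2x2 output tile with 16 elementwise products in the transformed domain; the tile quantization approximation then skips products whose transformed-kernel terms fall below a per-layer threshold. The NumPy sketch below illustrates this idea only (symmetry approximation is not shown); the threshold name `tau` and the simple zeroing rule are illustrative assumptions, not the authors' published implementation.

```python
# Minimal sketch: Winograd F(2x2, 3x3) convolution with a crude
# threshold-based "tile quantization" step. Illustrative only;
# `tau` and the zeroing rule are assumptions, not the paper's method.
import numpy as np

# Standard Winograd F(2x2, 3x3) transform matrices (Lavin & Gray).
G  = np.array([[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]])
Bt = np.array([[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]])
At = np.array([[1, 1, 1, 0], [0, 1, -1, -1]])

def winograd_f2x2_3x3(tile4x4, kernel3x3, tau=0.0):
    """Convolve one 4x4 input tile with a 3x3 kernel into a 2x2 output.

    tau: threshold below which transformed-kernel entries are zeroed,
    so the corresponding multiplications can be skipped (the
    workload-reduction idea; tau=0 recovers exact Winograd).
    """
    U = G @ kernel3x3 @ G.T                 # kernel -> 4x4 Winograd domain
    V = Bt @ tile4x4 @ Bt.T                 # input tile -> Winograd domain
    U = np.where(np.abs(U) < tau, 0.0, U)   # drop small transformed terms
    M = U * V                               # 16 elementwise multiplies (vs. 36 direct)
    return At @ M @ At.T                    # inverse transform to 2x2 output

# Sanity check against direct (cross-correlation) convolution, valid mode.
d = np.random.randn(4, 4)
g = np.random.randn(3, 3)
direct = np.array([[np.sum(d[i:i+3, j:j+3] * g) for j in range(2)]
                   for i in range(2)])
print(np.allclose(winograd_f2x2_3x3(d, g), direct))  # True for tau=0
```

In a full CNN layer this tile-level routine would be applied over all 4x4 input tiles (stride 2) and all channels; PSCLO's role, per the abstract, is to choose a per-layer threshold that trades skipped multiplications against accuracy.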
- Published
- 2021