Compact and Computationally Efficient Representation of Deep Neural Networks
- Publication Year :
- 2020
Abstract
- At the core of any inference procedure in deep neural networks are dot product operations, which are the components that require the highest computational resources. A common approach to reducing the cost of inference is to reduce its memory complexity by lowering the entropy of the weight matrices of the neural network, e.g., by pruning and quantizing their elements. However, the quantized weight matrices are then usually represented in either a dense or a sparse matrix storage format, whose associated dot product complexity is not bounded by the entropy of the matrix. This means that the associated inference complexity ultimately depends on the implicit statistical assumptions that these matrix representations make about the weight distribution, which can in many cases be suboptimal. In this paper we address this issue and present new efficient representations for matrices with low-entropy statistics. These new matrix formats have the novel property that their memory and algorithmic complexity are implicitly bounded by the entropy of the matrix, which guarantees that they become more efficient as the entropy of the matrix is reduced. In our experiments we show that performing the dot product under these new matrix formats can indeed be more energy and time efficient under practically relevant assumptions. For instance, we attain up to ×42 compression ratios, ×5 speedups, and ×90 energy savings when we losslessly convert the weight matrices of state-of-the-art networks such as AlexNet, VGG-16, ResNet152, and DenseNet into the new matrix formats and benchmark their respective dot product operations. (17 pages, 14 figures)
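- The abstract's central claim is that the cost of a dot product can be bounded by the entropy of a quantized weight matrix rather than by its nonzero count. The Python snippet below is a minimal sketch of that idea under the simplest possible scheme, not the paper's actual formats: it groups each row's column indices by distinct weight value, so each row's dot product needs only one multiplication per distinct value. All function names here are hypothetical.

```python
import numpy as np

def to_shared_value_rows(W):
    """Group each row's nonzero column indices by distinct weight value.

    For a low-entropy row (few distinct values after quantization), this
    stores one (value, index-list) pair per distinct value instead of one
    entry per nonzero, as a plain CSR format would.
    """
    rows = []
    for row in W:
        groups = {}
        for j, w in enumerate(row):
            if w != 0.0:
                groups.setdefault(float(w), []).append(j)
        rows.append([(w, np.asarray(idx)) for w, idx in groups.items()])
    return rows

def shared_value_dot(rows, x):
    """Compute y = W @ x from the grouped representation.

    y_i = sum_k w_k * sum_{j in I_k} x_j, so the number of multiplications
    per row equals the number of distinct values in that row, which shrinks
    as the entropy of the matrix is reduced.
    """
    y = np.zeros(len(rows))
    for i, groups in enumerate(rows):
        y[i] = sum(w * x[idx].sum() for w, idx in groups)
    return y

# Quantized toy matrix: each row draws from only two distinct values.
W = np.array([[0.5, 0.0, 0.5, -0.25, 0.5],
              [0.0, -0.25, 0.0, 0.5, 0.0]])
x = np.random.randn(5)
assert np.allclose(shared_value_dot(to_shared_value_rows(W), x), W @ x)
```
- A real implementation would additionally entropy-code the index lists to approach the memory bound; the sketch only shows why the multiplication count tracks the number of distinct values rather than the number of nonzeros.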
- Subjects :
- FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Networks and Communications
Computer science
Machine Learning (stat.ML)
02 engineering and technology
Machine Learning (cs.LG)
Matrix (mathematics)
Deep Learning
Statistics - Machine Learning
Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
Entropy (information theory)
Neural and Evolutionary Computing (cs.NE)
Sparse matrix
Lossless compression
Artificial neural network
Computer Science - Neural and Evolutionary Computing
Dot product
Computer Science Applications
020201 artificial intelligence & image processing
Neural Networks, Computer
Algorithm
Software
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....c6042587b57c5d11623ba75377c94224