
Overview of the Neural Network Compression and Representation (NNR) Standard.

Authors :
Kirchhoffer, Heiner
Haase, Paul
Samek, Wojciech
Müller, Karsten
Rezazadegan-Tavakoli, Hamed
Cricri, Francesco
Aksu, Emre B.
Hannuksela, Miska M.
Jiang, Wei
Wang, Wei
Liu, Shan
Jain, Swayambhoo
Hamidi-Rad, Shahab
Racapé, Fabien
Bailer, Werner
Source :
IEEE Transactions on Circuits & Systems for Video Technology. May 2022, Vol. 32 Issue 5, p3203-3216. 14p.
Publication Year :
2022

Abstract

Neural Network Coding and Representation (NNR) is the first international standard for efficient compression of neural networks (NNs). The standard is designed as a toolbox of compression methods that can be combined into coding pipelines. It can be used either as an independent coding framework (with its own bitstream format) or together with external neural network formats and frameworks. To provide the highest degree of flexibility, the network compression methods operate per parameter tensor, so that proper decoding is always ensured even if no structure information is provided. The NNR standard contains compression-efficient quantization and deep context-adaptive binary arithmetic coding (DeepCABAC) as core encoding and decoding technologies, as well as neural network parameter pre-processing methods such as sparsification, pruning, low-rank decomposition, unification, local scaling, and batch norm folding. NNR achieves a compression efficiency of more than 97% for transparent coding cases, i.e., without degrading classification quality metrics such as top-1 or top-5 accuracy. This paper provides an overview of the technical features and characteristics of NNR. [ABSTRACT FROM AUTHOR]
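The following is a minimal Python sketch of the general idea described in the abstract: operating per parameter tensor, applying a pre-processing step (here, magnitude-based sparsification), quantizing, and estimating the resulting coded size. It is not the NNR codec or DeepCABAC; the keep ratio, quantization step size, and entropy-bound size estimate are illustrative assumptions only.

```python
# Illustrative per-tensor pipeline in the spirit of the abstract:
# sparsification (pruning small weights), uniform quantization, and a
# rough coded-size estimate from the empirical symbol entropy.
# NOT the NNR codec or DeepCABAC; all parameters are arbitrary choices.
import numpy as np

def sparsify(tensor: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Zero out all but the largest-magnitude fraction of weights."""
    flat = np.abs(tensor).ravel()
    k = max(1, int(keep_ratio * flat.size))
    threshold = np.partition(flat, -k)[-k]
    return np.where(np.abs(tensor) >= threshold, tensor, 0.0)

def quantize(tensor: np.ndarray, step: float = 0.01) -> np.ndarray:
    """Uniform scalar quantization to integer levels."""
    return np.round(tensor / step).astype(np.int32)

def entropy_bits(symbols: np.ndarray) -> float:
    """Lower bound on coded size (bits) from the empirical symbol entropy."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(symbols.size * -(p * np.log2(p)).sum())

# Example: process one parameter tensor and report the size reduction.
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.05, size=(256, 256)).astype(np.float32)

levels = quantize(sparsify(weights))
original_bits = weights.size * 32          # float32 storage
compressed_bits = entropy_bits(levels)     # entropy-coding lower bound
print(f"compression: {100 * (1 - compressed_bits / original_bits):.1f}%")
```

In the actual standard, the quantized levels are entropy-coded with DeepCABAC rather than bounded by a simple entropy estimate, and transparency (no loss of top-1 or top-5 accuracy) is verified on the target task; this sketch only conveys the per-tensor structure of such a pipeline.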

Details

Language :
English
ISSN :
1051-8215
Volume :
32
Issue :
5
Database :
Academic Search Index
Journal :
IEEE Transactions on Circuits & Systems for Video Technology
Publication Type :
Academic Journal
Accession number :
156718321
Full Text :
https://doi.org/10.1109/TCSVT.2021.3095970