
Understanding Convolutional Neural Networks with Information Theory: An Initial Exploration

Authors: Yu, Shujian; Wickstrøm, Kristoffer; Jenssen, Robert; Principe, Jose C.
Publication Year: 2018

Abstract

The matrix-based Rényi's α-entropy functional and its multivariate extension were recently developed in terms of the normalized eigenspectrum of a Hermitian matrix of the projected data in a reproducing kernel Hilbert space (RKHS). However, these estimators are still new, and their utility and possible applications remain largely unknown to practitioners. In this paper, we first show that our estimators enable straightforward measurement of information flow in realistic convolutional neural networks (CNNs) without any approximation. Then, we introduce the partial information decomposition (PID) framework and develop three quantities to analyze the synergy and redundancy in convolutional layer representations. Our results validate two fundamental data processing inequalities and reveal some properties concerning the training of CNNs.

Comment: Paper accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS). Code for 1) estimating information quantities, 2) plotting the information plane, and 3) selecting convolutional filters is available in MATLAB (https://drive.google.com/drive/folders/1DJYshWIiijKWrFKrztW9FgTzGfMV3D8M?usp=sharing) and Python (https://github.com/Wickstrom/InfExperiment).
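The linked repositories contain the authors' reference implementations. For orientation only, below is a minimal NumPy sketch of the estimator the abstract describes: the matrix-based Rényi entropy is S_α(A) = (1/(1−α)) log₂ Σᵢ λᵢ(A)^α, where A is a trace-normalized Gram matrix of the samples and λᵢ(A) are its eigenvalues, and the multivariate (joint) extension is taken over the normalized Hadamard product of Gram matrices. The Gaussian kernel, the bandwidth sigma, and α = 1.01 are illustrative choices here, not the paper's settings.

```python
import numpy as np

def gram_matrix(X, sigma=1.0):
    """Trace-normalized Gaussian-kernel Gram matrix for samples X (n x d)."""
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    dist = np.maximum(dist, 0.0)          # guard round-off negatives
    K = np.exp(-dist / (2.0 * sigma ** 2))
    return K / np.trace(K)

def renyi_entropy(A, alpha=1.01):
    """Matrix-based Renyi alpha-entropy from the eigenspectrum of A."""
    lam = np.linalg.eigvalsh(A)           # A is real symmetric (Hermitian)
    lam = np.clip(lam, 0.0, None)         # clip tiny negative eigenvalues
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def joint_entropy(A, B, alpha=1.01):
    """Multivariate extension via the normalized Hadamard product."""
    AB = A * B
    return renyi_entropy(AB / np.trace(AB), alpha)

def mutual_information(X, Y, sigma=1.0, alpha=1.01):
    """I(X;Y) = S(X) + S(Y) - S(X,Y), all from eigenspectra, no binning."""
    A, B = gram_matrix(X, sigma), gram_matrix(Y, sigma)
    return renyi_entropy(A, alpha) + renyi_entropy(B, alpha) - joint_entropy(A, B, alpha)
```

With X a batch of layer inputs and Y the corresponding layer activations (each flattened to an n × d array), mutual_information(X, Y) gives the kind of information-flow quantity the abstract refers to; because everything is computed from Gram-matrix eigenspectra, no density estimation or discretization of the activations is needed.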

Details

Database: arXiv
Publication Type: Report
Accession number: edsarx.1804.06537
Document Type: Working Paper