Average Redundancy for Known Sources: Ubiquitous Trees in Source Coding
- Author
- Wojciech Szpankowski
- Subjects
- complex asymptotics, Mellin transform, distribution modulo $1$, redundancy, Khodak code, Tunstall code, Huffman code, data compression, source coding, prefix codes, Kraft's inequality, Shannon lower bound, [info.info-dm] Computer Science [cs]/Discrete Mathematics [cs.dm], [math.math-ds] Mathematics [math]/Dynamical Systems [math.ds], [math.math-co] Mathematics [math]/Combinatorics [math.co], Mathematics, QA1-939
- Abstract
Analytic information theory aims at studying problems of information theory using analytic techniques of computer science and combinatorics. Following Hadamard's precept, these problems are tackled by complex analysis methods such as generating functions, the Mellin transform, Fourier series, the saddle point method, analytic poissonization and depoissonization, and singularity analysis. This approach lies at the crossroads of computer science and information theory. In this survey we concentrate on one facet of information theory (source coding, better known as data compression), namely the $\textit{redundancy rate}$ problem. The redundancy rate problem asks by how much the actual code length exceeds the optimal code length. We further restrict our interest to the $\textit{average}$ redundancy for $\textit{known}$ sources, that is, when the statistics of the information source are known. We present precise analyses of three types of lossless data compression schemes, namely fixed-to-variable (FV) length codes, variable-to-fixed (VF) length codes, and variable-to-variable (VV) length codes. In particular, we investigate the average redundancy of Huffman, Tunstall, and Khodak codes. These codes have succinct representations as $\textit{trees}$, either coding or parsing trees, and we analyze here some of their parameters (e.g., the average path length from the root to a leaf).
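To make the redundancy rate problem concrete, here is a minimal sketch (not from the survey itself; all function names are illustrative): it builds an optimal Huffman code for a small known source and computes its average redundancy, i.e., the expected code length minus the source entropy (Shannon's lower bound).

```python
import heapq
import math

def huffman_lengths(probs):
    """Return codeword lengths of an optimal (Huffman) prefix code
    for the given symbol probabilities."""
    # Each heap entry: (probability, tiebreaker, list of symbol indices).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, t, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every codeword inside them.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, t, s1 + s2))
    return lengths

def average_redundancy(probs):
    """Average redundancy: expected Huffman code length minus the
    source entropy (Shannon's lower bound on the code length)."""
    lengths = huffman_lengths(probs)
    avg_len = sum(p * l for p, l in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return avg_len - entropy

# Example: a biased binary memoryless source with P(1) = 0.9,
# coded over blocks of length 2, i.e. symbols {00, 01, 10, 11}.
p = 0.9
probs = [(1 - p) ** 2, (1 - p) * p, p * (1 - p), p ** 2]
print(round(average_redundancy(probs), 4))  # redundancy in bits per block
```

For this source the Huffman code assigns lengths 3, 3, 2, 1, giving an average length of 1.29 bits against an entropy of about 0.938 bits, so the redundancy is roughly 0.352 bits per block; the survey's precise analyses characterize how such gaps behave asymptotically.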
- Published
- 2008