A Mathematical Formalism of Infinite Coding for the Compression of Stochastic Process

Penn collection
Technical Reports (CIS)
Author
Atteson, Kevin
Abstract

As mentioned in [5, page 6], there are two basic models for sources of data in information theory: finite length sources, which produce finite length strings, and infinite length sources, which produce infinite length strings. Finite length sources provide a better model for files, for instance, since files consist of finite length strings of symbols. Infinite length sources provide a better model for communication lines, which deliver a string of symbols that, if not infinite, typically has no readily apparent end. In fact, even in some cases in which the data is finite, it is convenient to use the infinite length source model. For instance, the widely used adaptive coding techniques (see, for instance, [5]) typically use arithmetic coding, which implicitly assumes an infinite length source (although practical implementations make modifications so that it may be used with finite length strings). In this paper, we formalize the notion of encoding an infinite length source. While such infinite codes are used intuitively throughout the literature, their mathematical formalization reveals certain subtleties which might otherwise be overlooked. For instance, it turns out that the pure arithmetic code for certain sources has not only unbounded but infinite delay: in certain cases, it is necessary to see a complete infinite source string before being able to determine even one bit of the encoded string. Fortunately, such cases occur with zero probability. The formalization presented here leads to a better understanding of infinite coding and a methodology for designing better infinite codes for adaptive data compression (see [1]).
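The infinite-delay phenomenon can be illustrated with a small sketch. It is not taken from the report: it assumes an i.i.d. binary source with P(0) = 1/3, the textbook interval-narrowing form of arithmetic coding on [0, 1), and the convention that the first output bit is known only once the current interval lies entirely in [0, 1/2) or entirely in [1/2, 1). The function names are hypothetical. Under these assumptions there is one source string whose interval contains 1/2 in its interior after every symbol, so no finite prefix of it fixes the first code bit, while the probability of each prefix (the interval width) shrinks toward zero.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def narrow(lo, hi, symbol, p0):
    """Narrow [lo, hi) for one binary symbol; symbol 0 takes the lower part."""
    split = lo + (hi - lo) * p0
    return (lo, split) if symbol == 0 else (split, hi)


def first_bit(lo, hi):
    """Return the first code bit if [lo, hi) determines it, else None."""
    if hi <= HALF:
        return 0
    if lo >= HALF:
        return 1
    return None


def straddling_prefix(p0, n):
    """Build the n-symbol prefix whose interval keeps 1/2 in its interior.

    Assumes 1/2 is never a split point, which holds for p0 = 1/3 because
    every split point then has a power of 3 as its denominator.
    """
    lo, hi = Fraction(0), Fraction(1)
    symbols = []
    for _ in range(n):
        split = lo + (hi - lo) * p0
        if split == HALF:
            raise ValueError("1/2 is a split point for this p0")
        symbol = 0 if HALF < split else 1
        lo, hi = narrow(lo, hi, symbol, p0)
        symbols.append(symbol)
    return symbols, lo, hi


if __name__ == "__main__":
    symbols, lo, hi = straddling_prefix(Fraction(1, 3), 20)
    print("prefix:", "".join(map(str, symbols)))
    print("first bit known:", first_bit(lo, hi) is not None)
    print("prefix probability:", float(hi - lo))
```

Any string that leaves this path at some position has its first bit determined at that position, which is consistent with the abstract's remark that the problematic cases occur with zero probability.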

Publication date
1994-05-25
Comments
University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-94-27.