A Mathematical Formalism of Infinite Coding for the Compression of Stochastic Process

Atteson, Kevin

Files

94_27.pdf (397.31 KB)

Penn collection

Technical Reports (CIS)

Permalink

https://repository.upenn.edu/handle/20.500.14332/7265

Abstract

As mentioned in [5, page 6], there are two basic models for sources of data in information theory: finite length sources, which produce finite length strings, and infinite length sources, which produce infinite length strings. Finite length sources provide a better model for files, for instance, since files consist of finite length strings of symbols. Infinite length sources provide a better model for communication lines, which deliver a string of symbols that, if not infinite, typically has no readily apparent end. In fact, even in some cases in which the data is finite, it is convenient to use the infinite length source model. For instance, the widely used adaptive coding techniques (see, for instance, [5]) typically use arithmetic coding, which implicitly assumes an infinite length source (although practical implementations make modifications so that it may be used with finite length strings). In this paper, we formalize the notion of encoding an infinite length source. While such infinite codes are used intuitively throughout the literature, their mathematical formalization reveals certain subtleties which might otherwise be overlooked. For instance, it turns out that the pure arithmetic code for certain sources has not only unbounded but infinite delay: in certain cases, it is necessary to see a complete infinite source string before being able to determine even one bit of the encoded string. Fortunately, such cases occur with zero probability. The formalization presented here leads to a better understanding of infinite coding and a methodology for designing better infinite codes for adaptive data compression (see [1]).
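The infinite-delay remark in the abstract can be illustrated with a small sketch. It is not taken from the report: the three-symbol uniform source, the names `PROBS`, `ORDER`, `narrow`, and `first_bit_delay`, and the use of exact fractions are all assumptions made for illustration. The sketch tracks the interval that a pure arithmetic coder would assign to a source prefix and reports how many symbols must be read before the first output bit is fixed, that is, before the interval lies entirely on one side of 1/2.

```python
from fractions import Fraction

# Hypothetical toy source (an assumption, not from the report):
# three symbols, each with probability 1/3.
PROBS = {"a": Fraction(1, 3), "b": Fraction(1, 3), "c": Fraction(1, 3)}
ORDER = ["a", "b", "c"]


def narrow(low, high, symbol):
    """Return the subinterval of [low, high) assigned to `symbol`."""
    width = high - low
    cum = Fraction(0)
    for s in ORDER:
        if s == symbol:
            return low + cum * width, low + (cum + PROBS[s]) * width
        cum += PROBS[s]
    raise ValueError(f"unknown symbol: {symbol!r}")


def first_bit_delay(prefix):
    """Count the source symbols read before the first code bit is fixed.

    Returns None if the prefix leaves the first bit undetermined, i.e. the
    coding interval still straddles 1/2 after the whole prefix.
    """
    low, high = Fraction(0), Fraction(1)
    half = Fraction(1, 2)
    for n, symbol in enumerate(prefix, start=1):
        low, high = narrow(low, high, symbol)
        if high <= half or low >= half:
            return n
    return None


if __name__ == "__main__":
    print(first_bit_delay("a"))       # 1: [0, 1/3) lies below 1/2
    print(first_bit_delay("bbc"))     # 3: interval moves above 1/2 at the third symbol
    print(first_bit_delay("b" * 50))  # None: interval still straddles 1/2
```

Under these assumptions, the string consisting only of `b` keeps the interval centered on 1/2 at every step, so no finite prefix determines the first bit. The probability of its length-n prefix is (1/3)^n, which tends to zero, consistent with the abstract's statement that such cases occur with zero probability.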

Publication date

1994-05-25

Comments

University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-94-27.

Collection

Reports