Binary entropy function

From Wikipedia, the free encyclopedia

Entropy of a Bernoulli trial as a function of binary outcome probability, called the binary entropy function.

In information theory, the binary entropy function, denoted $\operatorname H(p)$ or $\operatorname H_\text{b}(p)$, is defined as the entropy of a Bernoulli process with probability $p$ of one of two values. Mathematically, the Bernoulli trial is modelled as a random variable $X$ that can take on only two values: 0 and 1, which are mutually exclusive and exhaustive.

If $\Pr(X=1) = p$, then $\Pr(X=0) = 1 - p$ and the entropy of $X$ (in shannons) is given by

$$\operatorname H(X) = \operatorname H_\text{b}(p) = -p \log_2 p - (1 - p) \log_2 (1 - p),$$

where $0 \log_2 0$ is taken to be 0. The logarithms in this formula are usually taken (as shown in the graph) to the base 2. See binary logarithm.
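As a minimal sketch (not part of the original article; the function name binary_entropy is an illustrative choice), this definition translates directly into code:

    import math

    def binary_entropy(p: float) -> float:
        """Binary entropy H_b(p) in shannons (bits), with 0*log2(0) taken as 0."""
        if p == 0.0 or p == 1.0:
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    print(binary_entropy(0.5))   # 1.0: an unbiased coin flip
    print(binary_entropy(0.75))  # roughly 0.811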

When $p = \tfrac{1}{2}$, the binary entropy function attains its maximum value. This is the case of an unbiased coin flip.

$\operatorname H_\text{b}(p)$ is distinguished from the entropy function $\operatorname H(X)$ in that the former takes a single real number as a parameter whereas the latter takes a distribution or random variable as a parameter. Sometimes the binary entropy function is also written as $\operatorname H_2(p)$. However, it is different from and should not be confused with the Rényi entropy, which is denoted as $\operatorname H_2(X)$.

Explanation

In terms of information theory, entropy is considered to be a measure of the uncertainty in a message. To put it intuitively, suppose $p = 0$. At this probability, the event is certain never to occur, and so there is no uncertainty at all, leading to an entropy of 0. If $p = 1$, the result is again certain, so the entropy is 0 here as well. When $p = 1/2$, the uncertainty is at a maximum; if one were to place a fair bet on the outcome in this case, there is no advantage to be gained with prior knowledge of the probabilities. In this case, the entropy is maximal at a value of 1 bit. Intermediate values fall between these cases; for instance, if $p = 3/4$, there is still a measure of uncertainty on the outcome, but one can still predict the outcome correctly more often than not, so the uncertainty measure, or entropy, is less than 1 full bit.
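As a quick worked value (added here for concreteness), plugging $p = 3/4$ into the definition above gives

$$\operatorname H_\text{b}(3/4) = -\tfrac{3}{4}\log_2\tfrac{3}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} \approx 0.811 \text{ bits},$$

strictly between 0 and 1, as the intuition suggests.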

Derivative

The derivative of the binary entropy function may be expressed as the negative of the logit function:

$$\frac{d}{dp}\operatorname H_\text{b}(p) = -\operatorname{logit}_2(p) = -\log_2\!\left(\frac{p}{1-p}\right).$$
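A small numerical check (an illustrative sketch, not from the article) compares a central-difference estimate of the derivative against the negative base-2 logit:

    import math

    def binary_entropy(p: float) -> float:
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def neg_logit2(p: float) -> float:
        # Negative of the base-2 logit: -log2(p / (1 - p)).
        return -math.log2(p / (1 - p))

    # Central-difference approximation of dH_b/dp at p = 0.3.
    p, h = 0.3, 1e-6
    numeric = (binary_entropy(p + h) - binary_entropy(p - h)) / (2 * h)
    print(numeric, neg_logit2(p))  # both roughly 1.2224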

Taylor series

The Taylor series of the binary entropy function in a neighborhood of 1/2 is

$$\operatorname H_\text{b}(p) = 1 - \frac{1}{2\ln 2}\sum_{n=1}^{\infty}\frac{(1-2p)^{2n}}{n(2n-1)}$$

for $0 \le p \le 1$.
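As another hedged sketch (the helper name taylor_binary_entropy is hypothetical), the partial sums of this series converge quickly to the exact value:

    import math

    def binary_entropy(p: float) -> float:
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def taylor_binary_entropy(p: float, terms: int) -> float:
        # Partial sum of the series expansion around p = 1/2.
        s = sum((1 - 2 * p) ** (2 * n) / (n * (2 * n - 1))
                for n in range(1, terms + 1))
        return 1 - s / (2 * math.log(2))

    p = 0.3
    print(binary_entropy(p))             # about 0.881291
    print(taylor_binary_entropy(p, 10))  # agrees to about 10 decimal places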
