Deriving Entropy from Self-Information

In the last post, Aliens and Self-Information, we derived the formula for self-information.

The self-information of an event $x$ with probability $p(x)$ is defined as:

$$I(x) = -\log p(x)$$

It quantifies how surprising or informative an event is. Less probable events carry more information.
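To make this concrete, here is a minimal Python sketch of the formula; the helper name self_information is just for illustration, not a standard API:

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Self-information I(x) = -log(p(x)); base 2 gives bits."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p, base)

print(self_information(0.5))    # 1.0 bit: a fair coin flip
print(self_information(0.125))  # 3.0 bits: a 1-in-8 event is more surprising
```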

Entropy as Expected Value of Self-Information

Entropy measures the average uncertainty or information content in a probability distribution over all possible outcomes of a random variable $X$. It is defined as the expected value of self-information:

$$H(X) = \mathbb{E}[I(X)] = \sum_{x} p(x)\, I(x)$$

Substituting $I(x) = -\log p(x)$:

$$H(X) = -\sum_{x} p(x) \log p(x)$$
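This sum translates directly into code. A short Python sketch (again, the helper name entropy is just for illustration):

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """Shannon entropy H(X) = -sum p(x) log p(x); p = 0 terms contribute 0."""
    return sum(-p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: a fair coin
print(entropy([0.9, 0.1]))  # ~0.469 bits: a biased coin is less uncertain
```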

Interpretation

The units depend on the base of the logarithm:

  1. Base 2: bits
  2. Base $e$: nats
  3. Base 10: hartleys
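Since $\log_2 p = \ln p / \ln 2$, the units differ only by a constant factor, for example:

$$H_{\text{bits}}(X) = \frac{H_{\text{nats}}(X)}{\ln 2} \approx 1.443\, H_{\text{nats}}(X)$$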

$H(X)$ is maximized when all outcomes are equally likely ($p(x) = 1/n$ for $n$ outcomes), giving $H(X) = \log n$.

Intuition

Examples:

  1. If $p(x) = 1$ for some outcome (certainty), then $I(x) = -\log 1 = 0$, and $H(X) = 0$.
  2. If $p(x)$ is uniform, $H(X)$ is maximized because uncertainty is highest (both cases are checked numerically below).
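Both cases can be verified with the same entropy helper sketched above:

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """Shannon entropy; terms with p = 0 contribute nothing."""
    return sum(-p * math.log(p, base) for p in probs if p > 0)

print(entropy([1.0, 0.0, 0.0]))           # 0.0    -> certainty carries no information
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0    -> log2(4), the maximum for 4 outcomes
print(entropy([0.7, 0.1, 0.1, 0.1]))      # ~1.357 -> any non-uniform p falls below the max
```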

This derivation links the concept of individual surprise (self-information) to the broader idea of the average uncertainty across an entire distribution (entropy).