Deriving Entropy from Self-Information
In the last post, *Aliens and Self-Information*, we derived the formula for self-information. For an event $x$ with probability $P(x)$,

$$I(x) = -\log P(x) = \log \frac{1}{P(x)}$$
It quantifies how surprising or informative an event is. Less probable events carry more information.
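As a quick sanity check, here is a minimal Python sketch of this formula; the function name and the base-2 default are my own choices, not something from the original post.

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Self-information I(x) = -log_base(P(x)) of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log(p, base)

# A fair coin flip is worth 1 bit of surprise; a 1-in-8 event is worth 3 bits.
print(self_information(0.5))    # 1.0
print(self_information(1 / 8))  # 3.0
```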
Entropy as Expected Value of Self-Information
Entropy is defined as the expected value of self-information over all outcomes of a random variable $X$:

$$H(X) = \mathbb{E}[I(X)] = \sum_x P(x)\, I(x)$$

Substituting $I(x) = -\log P(x)$ gives

$$H(X) = -\sum_x P(x) \log P(x)$$
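To make the sum concrete, here is a small sketch that computes entropy directly from a list of probabilities; the function name and the convention of skipping zero-probability outcomes are assumptions on my part.

```python
import math

def entropy(probs: list[float], base: float = 2.0) -> float:
    """Shannon entropy H(X) = -sum_x P(x) * log(P(x)), skipping zero-probability outcomes."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A biased coin with P(heads) = 0.9, P(tails) = 0.1:
print(entropy([0.9, 0.1]))  # ~0.469 bits, well below the 1 bit of a fair coin
```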
Interpretation
The units depend on the base of the logarithm (the two are compared in the sketch after this list):
- Base 2: Bits (common in information theory).
- Base $e$: Nats (used in natural sciences).
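As an illustration, the same distribution measured in both units differs only by the constant factor $\ln 2$; the probabilities below are arbitrary.

```python
import math

probs = [0.9, 0.1]

h_bits = -sum(p * math.log2(p) for p in probs)  # base 2 -> bits
h_nats = -sum(p * math.log(p) for p in probs)   # base e -> nats

# Converting nats to bits recovers the base-2 value: H_bits = H_nats / ln(2).
print(h_bits, h_nats, h_nats / math.log(2))
```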
Intuition
- Self-information measures the surprise of an individual event.
- Entropy aggregates this measure, weighted by the probability of each event. Highly probable events are barely surprising, so each occurrence adds little information, while very rare events are highly surprising but almost never occur; the sketch after this list makes the weighting explicit.
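A short sketch of the per-event weighting $P(x)\,I(x)$ (the probabilities chosen here are arbitrary): both near-certain and near-impossible events contribute little, and the contribution peaks in between.

```python
import math

# Per-event contribution to entropy: P(x) * I(x) = -P(x) * log2(P(x)).
for p in (0.99, 0.5, 1 / math.e, 0.01):
    contribution = -p * math.log2(p)
    print(f"P(x) = {p:.3f}  ->  contributes {contribution:.3f} bits")
```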
Examples:
- If $P(x) = 1$ (certainty), $I(x) = 0$, and $H(X) = 0$.
- If $P(x)$ is uniform, $H(X)$ is maximized because uncertainty is highest.
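Both examples can be checked numerically with the same assumed entropy helper from the earlier sketch.

```python
import math

def entropy(probs, base=2.0):
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Certainty: one outcome has probability 1, so entropy is 0.
print(entropy([1.0, 0.0, 0.0]))           # 0.0

# Uniform distribution over 4 outcomes: entropy is maximal, log2(4) = 2 bits.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0

# Any non-uniform distribution over the same 4 outcomes has lower entropy.
print(entropy([0.7, 0.1, 0.1, 0.1]))      # ~1.357
```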
This derivation links the concept of individual surprise (self-information) to the broader idea of total uncertainty in a system (entropy).