Entropy

Definition

Entropy, denoted $H(X)$, is a key metric in BayesiaLab for measuring the uncertainty associated with the probability distribution of a variable $X$.

Entropy is expressed in bits and defined as follows:

$H(X) = -\sum\limits_{x \in X} p(x) \log_2\left(p(x)\right)$

The Entropy of a variable ${X}$ can also be understood as the sum of the Expected Log-Losses of its states.
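The definition above can be sketched in a few lines of Python. This is an illustrative helper, not BayesiaLab's implementation: the function name `entropy`, the list-of-probabilities input, and the convention that zero-probability states contribute nothing are all assumptions made for this sketch.

```python
import math

def entropy(probabilities):
    """Return the entropy, in bits, of a discrete probability distribution."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Each term p * (-log2(p)) is the log-loss of one state weighted by its
    # probability. States with p = 0 are skipped (assumed convention:
    # 0 * log2(0) = 0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # two equally probable states
```

Summing the per-state terms mirrors the reading of Entropy as the sum of the Expected Log-Losses of the states.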

Example

Let's assume we have four containers, A through D, which are filled with balls that can be either blue or red.

- Container A is filled exclusively with blue balls.
- Container B has an equal number of red and blue balls.
- In Container C, 10.89% of all balls are blue, and the remainder are red.
- Container D holds only red balls.

Within each container, the order of balls is entirely random.

A volunteer who already knows the proportions of red and blue balls in each container now randomly draws one ball from each container. What is his degree of uncertainty regarding the ball color at the moment of each draw?

With Containers A and D, there is no uncertainty at all: he will draw a blue ball from Container A and a red ball from Container D with perfect certainty. What about the degree of uncertainty for Containers B and C?

The concept of Entropy can formally represent the degree of uncertainty. We use the binary variable $X$ to represent the color of the ball. Using the definition of Entropy from above, we can compute the Entropy value applicable to each draw.
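As a rough check of the four draws, the following sketch applies the Entropy formula to the proportions stated for Containers A through D. The helper `entropy` and the dictionary layout are assumptions made for illustration; only the proportions come from the example.

```python
import math

def entropy(probabilities):
    """Entropy in bits; zero-probability states are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# [P(blue), P(red)] for each container, taken from the description above.
containers = {
    "A": [1.0, 0.0],
    "B": [0.5, 0.5],
    "C": [0.1089, 0.8911],
    "D": [0.0, 1.0],
}

container_entropy = {name: entropy(dist) for name, dist in containers.items()}
for name, h in container_entropy.items():
    print(f"Container {name}: H(X) = {h:.4f} bits")
```

Containers A and D come out at 0 bits, Container B at 1 bit, and Container C at roughly 0.5 bits, i.e., about half the uncertainty of the 50/50 mix.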

We can also plot Entropy as a function of the probability of drawing a red ball.

We see that Entropy reaches its maximum value for $P(X = \text{red}) = P(X = \text{blue}) = 0.5$, i.e., when drawing a red or a blue ball is equally probable. A 50/50 mix of red and blue balls is indeed the situation with the highest possible degree of uncertainty.
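The shape of this curve can be reproduced numerically. The sketch below evaluates the Entropy of a two-state variable on a grid of probabilities and locates the peak; the function name `binary_entropy` and the grid resolution are arbitrary choices made for this illustration.

```python
import math

def binary_entropy(p_red):
    """Entropy, in bits, of a two-state variable with P(X = red) = p_red."""
    return sum(-p * math.log2(p) for p in (p_red, 1.0 - p_red) if p > 0)

# Evaluate the curve at 101 evenly spaced probabilities from 0 to 1.
grid = [i / 100 for i in range(101)]
curve = [binary_entropy(p) for p in grid]

# Find the grid point with the highest Entropy.
peak_index = max(range(len(grid)), key=lambda i: curve[i])
print(f"peak at P(X = red) = {grid[peak_index]}, H(X) = {curve[peak_index]} bits")
```

The values in `grid` and `curve` can be passed to any plotting tool to draw the curve, which is symmetric around 0.5 and falls to 0 at both ends.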

Maximum Entropy as a Function of the Number of States

This was an example of a variable with two states only. As we introduce more possible states, e.g., another ball color, the maximum possible Entropy increases.

More specifically, the maximum value of Entropy increases logarithmically with the number of states of the variable $X$ and is reached when all states are equally probable:

$H_{\max}(X) = \log_2(S_X)$

where $S_X$ is the number of states of the variable $X$.
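A quick numerical check of this relationship, assuming a uniform distribution over the states (the case in which Entropy is at its maximum), might look as follows. The helper `entropy` is an illustrative function, not part of BayesiaLab.

```python
import math

def entropy(probabilities):
    """Entropy in bits; zero-probability states are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

for n_states in (2, 3, 4, 8):
    uniform = [1.0 / n_states] * n_states  # all states equally probable
    print(n_states, entropy(uniform), math.log2(n_states))
```

For each number of states, the two printed values agree up to floating-point rounding, e.g., 1 bit for two states and 3 bits for eight states.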

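As a result, one cannot directly compare the Entropy values of variables with different numbers of states: a variable with eight states can reach 3 bits, whereas a binary variable can never exceed 1 bit.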

To make Entropy comparable, the Normalized Entropy metric is available, which takes into account the Maximum Entropy.

Normalized Entropy

Normalized Entropy is a metric that takes into account the maximum possible value of Entropy and returns a normalized measure of the uncertainty associated with the variable:

$H_N(X) = \frac{H(X)}{\log_2(S_X)}$

Example

In this new example, we now compare the variables X1 and X2, which each represent ball colors:

X1 ∈ {blue, red}
X2 ∈ {blue, red, green, yellow, purple, orange, brown, black}

Normalized Entropy allows us to compare the degree of uncertainty of these two variables even though they have different numbers of states, i.e., two versus eight.
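To make the comparison concrete, here is a minimal sketch that computes Entropy and Normalized Entropy for X1 and X2. The probability distributions are hypothetical and chosen purely for illustration; they are not taken from the demo network. Returning 0 for a single-state variable is likewise an assumption of this sketch.

```python
import math

def entropy(probabilities):
    """Entropy in bits; zero-probability states are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

def normalized_entropy(probabilities):
    """Entropy divided by its maximum, log2 of the number of states."""
    n_states = len(probabilities)
    if n_states < 2:
        return 0.0  # assumed convention: one state means no uncertainty
    return entropy(probabilities) / math.log2(n_states)

# Hypothetical distributions (assumptions, not values from the source).
x1_uniform = [0.5, 0.5]                # blue, red equally probable
x2_uniform = [0.125] * 8               # all eight colors equally probable
x2_two_colors = [0.5, 0.5] + [0.0] * 6 # only blue and red ever occur

for label, dist in [("X1 uniform", x1_uniform),
                    ("X2 uniform", x2_uniform),
                    ("X2 two colors", x2_two_colors)]:
    print(f"{label}: H = {entropy(dist):.3f} bits, "
          f"H_N = {normalized_entropy(dist):.3f}")
```

Under these assumed distributions, the two uniform variables differ in Entropy (1 bit versus 3 bits) but share a Normalized Entropy of 1. The eight-state variable that only ever takes two colors has the same 1 bit of Entropy as X1 but a Normalized Entropy of about 0.333.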

Usage

In BayesiaLab, the values of Entropy and Normalized Entropy can be accessed in a number of ways:

- In Validation Mode, with the Information Mode activated, hovering over a Monitor with your cursor will bring up a Tooltip that includes Entropy and Normalized Entropy.
- You can also sort the Monitors in the Monitor Panel according to their Normalized Entropy via Monitor Context Menu > Sort > Normalized Entropy.
- The Normalized Entropy is also available as a Node Analysis metric for Size and Color in the 2D and 3D Mapping Tools.
- In Function Nodes, Entropy and Normalized Entropy are available as Inference Functions in the Equation tab:
  - Entropy: Entropy(?X1?, False)
  - Normalized Entropy: Entropy(?X1?, True)

Demo Network

NormalizedEntropy.xbl
