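Normalized Entropy is a metric that takes into account the maximum possible value of Entropy and returns a normalized measure of the uncertainty associated with the variable:

$$H_N(X) = \frac{H(X)}{\log_2(S_X)}$$

where $H(X)$ is the Entropy of the variable $X$ and $S_X$ is its number of states.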
In this example, we compare the variables X1 and X2, which each represent ball colors:
X1 ∈ {blue, red}
X2 ∈ {blue, red, green, yellow, purple, orange, brown, black}
Normalized Entropy allows us to compare the degree of uncertainty even though these two variables have different numbers of states, i.e., two versus eight states.
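The comparison can be sketched in a few lines of Python. This is only an illustration: the distributions assigned to X1 and X2 below are hypothetical (the text does not give them), and the helper names `entropy` and `normalized_entropy` are made up for this sketch and are not BayesiaLab functions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def normalized_entropy(probs):
    """Entropy divided by its maximum, log2(number of states).

    Requires at least two states, otherwise the denominator is zero.
    """
    return entropy(probs) / math.log2(len(probs))

# Hypothetical distributions, chosen only for illustration.
x1 = [0.5, 0.5]                                # blue, red
x2 = [0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0]  # eight colors, four never drawn

for name, probs in (("X1", x1), ("X2", x2)):
    print(name, entropy(probs), normalized_entropy(probs))
```

Under these assumed distributions, X2 has the higher Entropy (2 bits versus 1 bit), but X1 has the higher Normalized Entropy (1 versus about 0.67), because X1 is as uncertain as a two-state variable can be.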
In BayesiaLab, the values of Entropy and Normalized Entropy can be accessed in a number of ways:
You can also sort the Monitors in the Monitor Panel according to their Normalized Entropy via Monitor Context Menu > Sort > Normalized Entropy.
The Normalized Entropy is also available as a Node Analysis metric for Size and Color in the 2D and 3D Mapping Tools.
In Function Nodes, Entropy and Normalized Entropy are available as Inference Functions in the Equation tab.
Entropy: Entropy(?X1?, False)
Normalized Entropy: Entropy(?X1?, True)
In Validation Mode, with the Information Mode activated, hovering over a Monitor with your cursor will bring up a Tooltip that includes Entropy and Normalized Entropy.
Entropy, denoted $H(X)$, is a key metric in BayesiaLab for measuring the uncertainty associated with the probability distribution of a variable $X$.
Entropy is expressed in bits and defined as follows:

$$H(X) = -\sum_{x \in X} p(x) \log_2 p(x)$$
The Entropy of a variable can also be understood as the sum of the Expected Log-Losses of its states.
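As a minimal sketch of this reading, assume that the Expected Log-Loss of a state is its probability multiplied by its Log-Loss, i.e., p(x) times -log2 p(x). The three-state distribution and the function name `expected_log_losses` are hypothetical and serve only as illustration.

```python
import math

def expected_log_losses(probs):
    """Per-state expected log-loss in bits: p(x) * -log2 p(x).

    A state with zero probability contributes 0 by convention.
    """
    return [p * -math.log2(p) if p > 0 else 0.0 for p in probs]

# Hypothetical three-state distribution, chosen only for illustration.
probs = [0.5, 0.25, 0.25]

losses = expected_log_losses(probs)
total = sum(losses)  # the sum over all states is the Entropy

print(losses)  # contribution of each state
print(total)   # Entropy in bits
```

Each state contributes 0.5 bits here, so the Entropy of this assumed distribution is 1.5 bits.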
Let's assume we have four containers, A through D, which are filled with balls that can be either blue or red.
Container A is filled exclusively with blue balls.
Container B has an equal amount of red and blue balls.
In Container C, 10.89% of all balls are blue, and the remainder is red.
Container D only holds red balls.
Within each container, the order of balls is entirely random.
A volunteer who already knows the proportions of red and blue balls in each container now randomly draws one ball from each container. What is his degree of uncertainty regarding the ball color at the moment of each draw?
Needless to say, with Containers A and D, there is no uncertainty at all. From Containers A and D, he will draw a blue and red ball, respectively, with perfect certainty. What about the degree of certainty or, rather, uncertainty for Containers B and C?
The concept of Entropy can formally represent the degree of uncertainty.
We use the binary variable $X \in \{\text{blue}, \text{red}\}$ to represent the color of the ball.
Using the definition of Entropy from above, we can compute the Entropy value applicable to each draw.
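The calculation for the four containers can be sketched as follows, assuming the usual definition of Entropy in bits (the negative sum of p(x) times log2 p(x) over all states). The proportions come from the container descriptions above. The dictionary layout and the function name `entropy` are choices made for this sketch only.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Proportions of blue and red balls as described for each container.
containers = {
    "A": {"blue": 1.0, "red": 0.0},
    "B": {"blue": 0.5, "red": 0.5},
    "C": {"blue": 0.1089, "red": 0.8911},
    "D": {"blue": 0.0, "red": 1.0},
}

entropies = {name: entropy(dist.values()) for name, dist in containers.items()}

for name, h in entropies.items():
    print(f"Container {name}: {h:.4f} bits")
```

Containers A and D yield an Entropy of 0 bits, Container B yields 1 bit, and Container C yields about 0.497 bits, i.e., roughly half the uncertainty of the 50/50 mix.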
We can also plot Entropy as a function of the probability of drawing a red ball.
We see that Entropy reaches its maximum value of 1 bit for $p(X = \text{red}) = 0.5$, i.e., when drawing a red or a blue ball is equally probable. A 50/50 mix of red and blue balls is indeed the situation with the highest possible degree of uncertainty.
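The shape of this curve can be checked numerically with a short sketch. It evaluates the Entropy of a two-state variable on a grid of probabilities and looks up where the maximum occurs. The grid resolution and the function name `binary_entropy` are assumptions made for this illustration.

```python
import math

def binary_entropy(p_red):
    """Entropy in bits of a two-state variable with P(red) = p_red."""
    return -sum(p * math.log2(p) for p in (p_red, 1.0 - p_red) if p > 0)

# Evaluate the curve at P(red) = 0.00, 0.01, ..., 1.00.
grid = [i / 100 for i in range(101)]
curve = [(p, binary_entropy(p)) for p in grid]

# Locate the grid point with the highest Entropy.
p_at_max, h_max = max(curve, key=lambda point: point[1])

print(p_at_max, h_max)
```

The pairs in `curve` can be passed to any plotting tool. The maximum is found at a probability of 0.5, and the Entropy falls to 0 at both ends of the range.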
This was an example of a variable with two states only. As we introduce more possible states, e.g., another ball color, the maximum possible Entropy increases.
More specifically, the maximum value of Entropy increases logarithmically with the number of states of node $X$:

$$H_{\max}(X) = \log_2(S_X)$$

where $S_X$ is the number of states of the variable $X$.
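The logarithmic growth can be illustrated with a small sketch. It relies on the standard result that Entropy is highest when all states are equally probable, and compares the Entropy of uniform distributions with the base-2 logarithm of the number of states. The chosen state counts and the names used are illustrative only.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Entropy of a uniform distribution for several state counts.
max_entropies = {}
for n_states in (2, 4, 8):
    uniform = [1.0 / n_states] * n_states
    max_entropies[n_states] = entropy(uniform)
    print(n_states, max_entropies[n_states], math.log2(n_states))
```

Going from two to eight states raises the maximum from 1 bit to 3 bits, which is why raw Entropy values of such variables are not on the same scale.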
As a result, one cannot compare the Entropy values of variables with different numbers of states.
To make Entropy comparable, the Normalized Entropy metric is available, which takes into account the Maximum Entropy.