Information Entropy
\[ H(X) = -\sum_{i=1}^{m}{P(x_{i})\cdot\log_{2}{P(x_{i})}} \]
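As a minimal sketch, the empirical entropy of a sample can be computed directly from this formula (the function name `entropy` and the list-based interface are my own choices for illustration):

```python
from collections import Counter
from math import log2

def entropy(values):
    """H(X) = -sum_i P(x_i) * log2 P(x_i), with P estimated by counting."""
    n = len(values)
    # Counter gives the frequency of each distinct outcome
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())
```

For a fair coin, `entropy([0, 1])` gives 1.0 bit, the maximum for two outcomes.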
Conditional Entropy
\[ \begin{aligned} &H(Y|X)\\ &= \sum_{x\in X}{P(x)\cdot H(Y|X=x)}\\ &= -\sum_{x\in X}\sum_{y\in Y}{P(x, y)\cdot \log_{2}{P(y|x)}} \end{aligned} \]
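The first line of this derivation, H(Y|X) = Σ P(x)·H(Y|X=x), translates directly into code: group the samples by the value of X, take the entropy of Y within each group, and weight by P(x). A sketch (names are my own, not from the text):

```python
from collections import Counter, defaultdict
from math import log2

def conditional_entropy(xs, ys):
    """H(Y|X) = sum_x P(x) * H(Y | X=x), estimated from paired samples."""
    n = len(xs)
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)  # collect the Y values observed for each x

    total = 0.0
    for ys_given_x in groups.values():
        p_x = len(ys_given_x) / n
        m = len(ys_given_x)
        # entropy of Y restricted to this group: H(Y | X=x)
        h = -sum((c / m) * log2(c / m) for c in Counter(ys_given_x).values())
        total += p_x * h
    return total
```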
Information Gain
Information gain measures how much the uncertainty in Y is reduced once the condition X is known.
\[ InfoGain(Y, X) = H(Y) - H(Y|X) \]
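Putting the two quantities together, information gain is just the difference H(Y) − H(Y|X). A self-contained sketch (the helper `_h` and the function name `info_gain` are my own):

```python
from collections import Counter
from math import log2

def _h(values):
    """Plain entropy of a sequence of outcomes."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def info_gain(xs, ys):
    """InfoGain(Y, X) = H(Y) - H(Y|X), estimated from paired samples."""
    n = len(xs)
    h_y_given_x = 0.0
    for x_val in set(xs):
        subset = [y for x, y in zip(xs, ys) if x == x_val]
        h_y_given_x += (len(subset) / n) * _h(subset)  # P(x) * H(Y|X=x)
    return _h(ys) - h_y_given_x
```

This is the splitting criterion used by ID3-style decision trees: pick the attribute X that maximizes the gain.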
Worked Example
For example, given the following data:
| X | Y |
|---|---|
| 1 | 2 |
| 1 | 2 |
| 2 | 3 |
| 2 | 4 |
H(X), H(Y)
\[ \begin{aligned} &H(X) = -(0.5\log_{2}{0.5} + 0.5\log_{2}{0.5}) = 1\\ &H(Y) = -(0.5\log_{2}{0.5} + 0.25\log_{2}{0.25}\times2) = 1.5 \end{aligned} \]
H(Y|X)
- Method 1
\[ \begin{aligned} &H(Y|X=1) = 0, H(Y|X=2) = 1\\ &H(Y|X) = 0.5\times0 + 0.5\times1 = 0.5 \end{aligned} \]
- Method 2
\[ \begin{aligned} &P(x=1, y=2)\log_{2}{P(y=2|x=1)} = 0.5 \times 0 = 0\\ &P(x=2, y=3)\log_{2}{P(y=3|x=2)} = 0.25 \times (-1) = -0.25\\ &P(x=2, y=4)\log_{2}{P(y=4|x=2)} = 0.25 \times (-1) = -0.25\\ &H(Y|X) = -(0 - 0.25 - 0.25) = 0.5 \end{aligned} \]
InfoGain
\[ InfoGain(Y, X) = H(Y) - H(Y|X) = 1.5 - 0.5 = 1 \]
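The whole example can be checked numerically using Method 2's joint-probability form, P(y|x) = P(x,y)/P(x). A sketch (variable names are my own):

```python
from collections import Counter
from math import log2

pairs = [(1, 2), (1, 2), (2, 3), (2, 4)]  # the (X, Y) rows of the table
n = len(pairs)
p_xy = {k: c / n for k, c in Counter(pairs).items()}                # joint P(x, y)
p_x = {k: c / n for k, c in Counter(x for x, _ in pairs).items()}   # marginal P(x)
p_y = {k: c / n for k, c in Counter(y for _, y in pairs).items()}   # marginal P(y)

h_y = -sum(p * log2(p) for p in p_y.values())
# Method 2: H(Y|X) = -sum_{x,y} P(x, y) * log2 P(y|x)
h_y_given_x = -sum(p * log2(p / p_x[x]) for (x, y), p in p_xy.items())
gain = h_y - h_y_given_x
```

Running this reproduces the hand calculation: H(Y) = 1.5, H(Y|X) = 0.5, and a gain of 1 bit.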