Information Entropy

Information Entropy

\[ H(X) = -\sum_{i=1}^{m}{P(x_{i})\cdot\log_{2}{P(x_{i})}} \]
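As a quick sketch, the entropy formula above translates directly into Python (the function name `entropy` is my own; it takes a probability distribution as a list):

```python
import math

def entropy(probs):
    # Shannon entropy in bits: H = -sum p * log2(p).
    # Terms with zero probability contribute 0 by convention.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # a fair coin has 1 bit of entropy
```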

Conditional Entropy

\[ \begin{aligned} &H(Y|X)\\ &= \sum_{x\in X}{P(x)\cdot H(Y|X=x)}\\ &= -\sum_{x\in X}\sum_{y\in Y}{P(x, y)\cdot \log_{2}{P(y|x)}} \end{aligned} \]
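The double-sum form of H(Y|X) can be estimated from a list of (x, y) samples, with P(x, y) and P(y|x) replaced by empirical frequencies (this helper and its name are illustrative, not from the original text):

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    # H(Y|X) = -sum_{x,y} P(x, y) * log2 P(y|x),
    # with probabilities estimated from (x, y) sample counts.
    n = len(pairs)
    joint = Counter(pairs)                  # counts of each (x, y) pair
    marg_x = Counter(x for x, _ in pairs)   # counts of each x
    return -sum(
        (c / n) * math.log2(c / marg_x[x])  # P(x,y) * log2 P(y|x)
        for (x, y), c in joint.items()
    )
```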

Information Gain

Information gain measures how much the uncertainty about Y decreases once X is known.

\[ InfoGain(Y, X) = H(Y) - H(Y|X) \]

Worked Example

For example, given the data:

X Y
1 2
1 2
2 3
2 4

H(X), H(Y)

\[ \begin{aligned} &H(X) = -(0.5\log_{2}{0.5} + 0.5\log_{2}{0.5}) = 1\\ &H(Y) = -(0.5\log_{2}{0.5} + 0.25\log_{2}{0.25}\times2) = 1.5 \end{aligned} \]

H(Y|X)

  • Method 1

\[ \begin{aligned} &H(Y|X=1) = 0, H(Y|X=2) = 1\\ &H(Y|X) = 0.5\times0 + 0.5\times1 = 0.5 \end{aligned} \]

  • Method 2

\[ \begin{aligned} &P(x=1, y=2)\log_{2}{P(y=2|x=1)} = 0.5 \times 0 = 0\\ &P(x=2, y=3)\log_{2}{P(y=3|x=2)} = 0.25 \times (-1) = -0.25\\ &P(x=2, y=4)\log_{2}{P(y=4|x=2)} = 0.25 \times (-1) = -0.25\\ &H(Y|X) = -(0 - 0.25 - 0.25) = 0.5 \end{aligned} \]

InfoGain

\[ InfoGain(Y, X) = H(Y) - H(Y|X) = 1.5 - 0.5 = 1 \]
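The whole worked example can be checked numerically. The sketch below follows Method 1 (a probability-weighted average of the entropies of the Y-slices); the names `entropy` and `info_gain` are my own:

```python
import math
from collections import Counter

def entropy(values):
    # Empirical Shannon entropy (in bits) of a list of outcomes.
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(xs, ys):
    # InfoGain(Y, X) = H(Y) - H(Y|X), with H(Y|X) computed as
    # sum_x P(x) * H(Y | X = x)  (Method 1 above).
    n = len(xs)
    h_cond = sum(
        (cx / n) * entropy([y for x, y in zip(xs, ys) if x == v])
        for v, cx in Counter(xs).items()
    )
    return entropy(ys) - h_cond

xs = [1, 1, 2, 2]
ys = [2, 2, 3, 4]
print(info_gain(xs, ys))  # H(Y) - H(Y|X) = 1.5 - 0.5 = 1.0
```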