> [!definition]
>
> Let $X$ and $Y$ be two random [[Variable|variables]] whose ranges have the same size $n$, with probabilities $p_j = P(X = x_j)$ and $q_j = P(Y = y_j)$ for $j = 1, \dots, n$. Their relative [[Entropy|entropy]], or [[Information|information]]-theoretic distance, is a measure of the difference between them. Define the relative entropy $D(X, Y)$ of $Y$ from $X$[^1] by
> $
> D(X, Y) = \sum_{j = 1}^{n}p_j \log_2(p_j) - \sum_{j = 1}^{n}p_j \log_2(q_j)
> = \sum_{j = 1}^{n}p_j \log_2\paren{\frac{p_j}{q_j}}
> $

> [!theorem]
>
> $D(X, Y) \ge 0$ with equality if and only if $X$ and $Y$ are identically [[Probability Distribution|distributed]].
>
> *Proof*. Using [[Gibbs Inequality]],
> $
> \begin{align*}
> -\sum_{j = 1}^{n}p_j \log_2(p_j) &\le -\sum_{j = 1}^{n}p_j \log_2(q_j) \\
> 0 &\le \sum_{j = 1}^{n}p_j \log_2(p_j) - \sum_{j = 1}^{n}p_j \log_2(q_j) \\
> D(X, Y) &\ge 0
> \end{align*}
> $
> Equality holds in Gibbs' inequality exactly when $p_j = q_j$ for all $j$, i.e. when $X$ and $Y$ are identically distributed.

> [!theorem]
>
> If $Y$ has a [[Uniform Distribution|uniform distribution]], then
> $
> D(X, Y) = \log_2(n) - H(X)
> $
>
> *Proof*. Since $q_j = \frac{1}{n}$ for all $j$,
> $
> \begin{align*}
> D(X, Y) &= \sum_{j = 1}^{n}p_j \log_2(p_j) - \sum_{j = 1}^{n}p_j \log_2(q_j) \\
> D(X, Y) &= -H(X) - \sum_{j = 1}^{n}p_j \log_2\paren{\frac{1}{n}} \\
> D(X, Y) &= -H(X) - \log_2\paren{\frac{1}{n}}\sum_{j = 1}^{n}p_j \\
> D(X, Y) &= -H(X) + \log_2(n) \\
> D(X, Y) &= \log_2(n) - H(X)
> \end{align*}
> $

> [!theorem]
>
> Let $W$ be the random [[Vector|vector]] $(X, Y)$, with its distribution being the [[Joint Distribution|joint distribution]] of $X$ and $Y$, and let $Z$ be the random vector whose distribution is the joint distribution $X$ and $Y$ would have if they were independent ($q_{jk} = p_j q_k$). Then the relative entropy of $Z$ from $W$ is the [[Mutual Information|mutual information]] of $X$ and $Y$.
> $
> D(W, Z) = I(X, Y)
> $
>
> *Proof*. Using the marginals $\sum_{k = 1}^{m} p_{jk} = p_j$ and $\sum_{j = 1}^{n} p_{jk} = q_k$, and the identity $H(X, Y) = H(X) + H_X(Y)$,
> $
> \begin{align*}
> D(W, Z) &= \sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk} \log_2(p_{jk})
> - \sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk} \log_2(q_{jk}) \\
> &= \sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk} \log_2(p_{jk})
> - \sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk} \log_2(p_j q_k) \\
> &= -H(X, Y)
> - \sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk} \log_2(p_j)
> - \sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk} \log_2(q_k) \\
> &= -H(X, Y) + H(X) + H(Y) \\
> &= -H_X(Y) - H(X) + H(X) + H(Y) \\
> &= H(Y) - H_X(Y) = I(X, Y)
> \end{align*}
> $

[^1]: This measure is not symmetric! In general, $D(X, Y) \ne D(Y, X)$.
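As a quick numerical sanity check of the definition and the theorems above, here is a minimal Python sketch. The example distributions, the $2 \times 2$ joint table, and the helper names `relative_entropy`/`entropy` are illustrative choices, not from this note. It evaluates $D(X, Y)$ directly from the definition, checks the uniform-$Y$ case against $\log_2(n) - H(X)$, illustrates the asymmetry noted in the footnote, and confirms $D(W, Z) = I(X, Y)$ on a small joint distribution.

```python
import math

def relative_entropy(p, q):
    """D(X, Y) = sum_j p_j log2(p_j / q_j); terms with p_j = 0 are skipped (0 * log 0 = 0)."""
    return sum(pj * math.log2(pj / qj) for pj, qj in zip(p, q) if pj > 0)

def entropy(p):
    """H(X) = -sum_j p_j log2(p_j)."""
    return -sum(pj * math.log2(pj) for pj in p if pj > 0)

# Illustrative distributions on a range of size n = 4; Y is uniform.
p = [0.7, 0.1, 0.1, 0.1]
q = [0.25, 0.25, 0.25, 0.25]
n = len(p)

print(relative_entropy(p, q))          # ~0.643, and always >= 0
print(math.log2(n) - entropy(p))       # ~0.643, matching the uniform-Y theorem
print(relative_entropy(q, p))          # ~0.620, so D(X, Y) != D(Y, X) in general

# Mutual-information theorem, on an illustrative 2x2 joint distribution p_jk.
joint = [[0.4, 0.1],
         [0.2, 0.3]]
px = [sum(row) for row in joint]                   # marginal distribution of X
py = [sum(col) for col in zip(*joint)]             # marginal distribution of Y
w = [pjk for row in joint for pjk in row]          # W: the joint distribution, flattened
z = [pj * qk for pj in px for qk in py]            # Z: the product of the marginals

d_wz = relative_entropy(w, z)
i_xy = entropy(px) + entropy(py) - entropy(w)      # I(X, Y) = H(X) + H(Y) - H(X, Y)
print(d_wz, i_xy)                                  # both ~0.1245, so D(W, Z) = I(X, Y)
```

Skipping terms with $p_j = 0$ mirrors the usual $0 \log 0 = 0$ convention; the sketch assumes every $q_j > 0$, as required for Gibbs' inequality.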