> [!definition]
>
> The *joint entropy* $H(X, Y)$ of two [[Random Variable|random variables]] $X$ and $Y$ defined on the same [[Probability|probability]] space is the combined [[Entropy|entropy]] (uncertainty) due to the ignorance of both variables. Writing $p_{jk}$ for the joint probability $P(X = x_j, Y = y_k)$, it is calculated as:
> $
> H(X, Y) = -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{jk})
> $
>
> If the random variables are [[Probabilistic Independence|dependent]], the joint entropy is less than the sum of their individual entropies, since each variable encodes [[Information|information]] about the other.

> [!definition]
>
> $
> H(X_0, \cdots, X_N) = -\sum_{i_0, \cdots, i_N}p(i_0, \cdots, i_N)\log_2(p(i_0, \cdots, i_N))
> $
>
> The joint entropy of multiple random variables is their combined entropy due to ignorance of all of them.

> [!theorem]
>
> The joint entropy of two random variables is the sum of the entropy of one and the [[Conditional Entropy|conditional entropy]] of the other:
> $
> H(X, Y) = H(X) + H_X(Y)
> $
>
> *Proof*. Write $p_{jk} = p_j\,p_j(y_k)$, where $p_j = \sum_{k}p_{jk}$ is the marginal probability of $X = x_j$ and $p_j(y_k)$ is the conditional probability of $Y = y_k$ given $X = x_j$. Then
> $
> \begin{align*}
> H(X, Y) &= -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{jk}) \\
> &= -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{j}\,p_j(y_k))\\
> &= -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{j}) -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_j(y_k))\\
> &= -\sum_{j = 1}^{n}p_{j}\log_2(p_{j}) -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_j(y_k))\\
> &= H(X) + H_X(Y)
> \end{align*}
> $
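
As a quick numerical check of both the definition and the theorem, the sketch below computes $H(X, Y)$, $H(X)$, and $H_X(Y)$ for a small joint distribution; the probability table is an illustrative assumption (any non-negative table summing to 1 would do), not taken from the note.

```python
import numpy as np

# Illustrative joint distribution p_{jk} = P(X = x_j, Y = y_k)
# (hypothetical numbers; any table of non-negative entries summing to 1 works).
p = np.array([
    [0.10, 0.20, 0.10],
    [0.30, 0.15, 0.15],
])

def entropy(dist):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    dist = dist[dist > 0]
    return -np.sum(dist * np.log2(dist))

# Joint entropy H(X, Y) = -sum_{j,k} p_{jk} log2(p_{jk})
H_XY = entropy(p.ravel())

# Marginal p_j = sum_k p_{jk} and the entropy H(X)
p_x = p.sum(axis=1)
H_X = entropy(p_x)

# Conditional entropy H_X(Y) = -sum_{j,k} p_{jk} log2(p_j(y_k)),
# where p_j(y_k) = p_{jk} / p_j
mask = p > 0
cond = p / p_x[:, None]
H_Y_given_X = -np.sum(p[mask] * np.log2(cond[mask]))

print(f"H(X, Y)       = {H_XY:.4f} bits")
print(f"H(X) + H_X(Y) = {H_X + H_Y_given_X:.4f} bits")  # equal, per the theorem
```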