> [!definition]
>
> The *joint entropy* $H(X, Y)$ of two [[Random Variable|random variables]] $X$ and $Y$ defined on the same [[Probability|probability]] space is the combined [[Entropy|entropy]] (uncertainty) arising from ignorance of both variables, calculated as:
> $
> H(X, Y) = -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{jk})
> $
> where $p_{jk} = P(X = x_j, Y = y_k)$ is the joint probability mass function, with $X$ taking $n$ values and $Y$ taking $m$ values.
>
> If the random variables are [[Probabilistic Independence|dependent]], the joint entropy is strictly less than $H(X) + H(Y)$, since each variable encodes [[Information|information]] about the other; if they are independent, $H(X, Y) = H(X) + H(Y)$.
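
As a quick numerical check, the definition can be evaluated directly over a joint probability table. This is a minimal sketch using NumPy; the `joint_entropy` helper and the joint distribution below are made up for illustration.

```python
import numpy as np

def joint_entropy(p_xy: np.ndarray) -> float:
    """Joint entropy H(X, Y) in bits, given the joint table p_xy[j, k] = P(X = x_j, Y = y_k)."""
    p = p_xy[p_xy > 0]                 # drop zero entries, treating 0 * log2(0) as 0
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint distribution of two dependent binary variables
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

print(joint_entropy(p_xy))             # ≈ 1.722 bits, less than H(X) + H(Y) = 2 bits
```
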
> [!definition]
>
> More generally, the joint entropy of several random variables $X_0, \cdots, X_N$ is their combined entropy due to ignorance of all of them:
> $
> H(X_0, \cdots, X_N) = -\sum_{i_0, \cdots, i_N}p(i_0, \cdots, i_N)\log_2(p(i_0, \cdots, i_N))
> $
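
The same computation extends to any number of variables by summing over the full joint table. A minimal sketch, generalizing the hypothetical helper above (the three-variable table is made up for illustration):

```python
import numpy as np

def joint_entropy_nd(p: np.ndarray) -> float:
    """Joint entropy in bits of any number of variables, given their full joint table."""
    q = p[p > 0]                       # treat 0 * log2(0) as 0
    return float(-np.sum(q * np.log2(q)))

# Hypothetical joint table for three independent uniform binary variables (shape 2 x 2 x 2)
p_xyz = np.full((2, 2, 2), 1 / 8)
print(joint_entropy_nd(p_xyz))         # 3.0 bits = H(X_0) + H(X_1) + H(X_2)
```
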
> [!theorem]
>
> The joint entropy of two random variables is the sum of the entropy of one and the [[Conditional Entropy|conditional entropy]] of the other given the first, where $H_X(Y)$ denotes the conditional entropy of $Y$ given $X$:
> $
> H(X, Y) = H(X) + H_X(Y)
> $
> *Proof*. Write $p_j = P(X = x_j)$ for the marginal distribution of $X$ and $p_j(y_k) = P(Y = y_k \mid X = x_j)$ for the conditional distribution of $Y$, so that $p_{jk} = p_j\,p_j(y_k)$.
>
> $
> \begin{align*}
> H(X, Y) &= -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{jk}) \\
> &= -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{j}p_j(y_k))\\
> &= -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_{j}) -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_j(y_k))\\
> &= -\sum_{j = 1}^{n}p_{j}\log_2(p_{j}) -\sum_{j = 1}^{n}\sum_{k = 1}^{m}p_{jk}\log_2(p_j(y_k))\\
> &= H(X) + H_X(Y)
> \end{align*}
> $
>
> The fourth line uses $\sum_{k = 1}^{m}p_{jk} = p_{j}$ (marginalizing over $Y$), and the last line applies the definitions of entropy and conditional entropy.
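
The identity $H(X, Y) = H(X) + H_X(Y)$ can also be checked numerically. A minimal sketch, reusing the hypothetical joint table from the first example (entropies in bits; the table has no zero entries, so the conditional logarithms are well defined):

```python
import numpy as np

def entropy(p: np.ndarray) -> float:
    """Entropy in bits of a probability vector or table, treating 0 * log2(0) as 0."""
    q = p[p > 0]
    return float(-np.sum(q * np.log2(q)))

# Hypothetical joint distribution p_xy[j, k] = P(X = x_j, Y = y_k)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1)                            # marginal distribution of X
p_y_given_x = p_xy / p_x[:, None]                 # row j holds P(Y = y_k | X = x_j)
h_x_of_y = -np.sum(p_xy * np.log2(p_y_given_x))   # conditional entropy H_X(Y)

print(entropy(p_xy), entropy(p_x) + h_x_of_y)     # both ≈ 1.722 bits
```
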