> [!definition]
>
> ![[gibbs.png]]
> $
> P(X = x_j) = \frac{\exp(\mu^\prime x_j)}{Z(\mu^\prime)} \quad \mu^\prime = \mu\ln(2) \quad Z(\mu^\prime) = \sum_{j = 1}^{n}\exp(\mu^\prime x_j)
> $
> The Gibbs distribution is the *assumed* [[Probability Distribution|probability distribution]] of a [[Random Variable|random variable]] when *only* its range and its [[Expectation|expected value]] are [[Information|known]]; it is derived from the [[Principle of Maximum Entropy|principle of maximum entropy]]. The parameter $\mu$ ($\mu^\prime = \mu \ln(2)$) controls the skewness of the distribution towards larger or smaller outcomes, and as it [[Limit|approaches]] $\pm\infty$, all of the probability mass concentrates on the most extreme outcome.
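> 
> As a quick numerical illustration (a minimal sketch, not part of the definition above; the range and $\mu$ values are arbitrary and NumPy is assumed), the following computes the Gibbs probabilities and shows the mass concentrating on the extreme outcome as $\mu$ grows:
> ```python
> import numpy as np
> 
> def gibbs(x, mu):
>     """Gibbs distribution over the outcomes x for a given mu (mu' = mu * ln 2)."""
>     mu_prime = mu * np.log(2)
>     w = np.exp(mu_prime * np.asarray(x, dtype=float))
>     return w / w.sum()              # dividing by the partition function Z(mu')
> 
> x = [0, 1, 2, 3, 4]                 # arbitrary example range
> for mu in (-2, 0, 2, 10):
>     print(mu, np.round(gibbs(x, mu), 4))
> # mu = 0 gives the uniform distribution; as |mu| grows the mass
> # concentrates on the largest (mu > 0) or smallest (mu < 0) outcome.
> ```
> 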
> [!theorem] Reasoning
>
> Let $X$ be a random variable with range $\{x_1, x_2, \cdots, x_n\}$ and unknown probability distribution $\{p_1, p_2, \cdots, p_n\}$. Suppose that the only information given about $X$ is its [[Expectation|expected value]] $\ev(X) = E$. If $E \ne \frac{1}{n}\sum_{j = 1}^{n}x_j$, then $X$ cannot have a [[Uniform Distribution|uniform distribution]].
>
> In this case, the distribution prescribed by the principle of maximum entropy is found by maximising $H(X)$ subject to the following constraints:
> - $\vec{p}$ is a probability distribution ($\sum_{j = 1}^{n}p_j - 1 = 0$)
> - $\ev(X) = E \Leftrightarrow \sum_{j = 1}^{n}p_j x_j - E = 0$.
>
> Using [[Lagrange Multipliers|Lagrange multipliers]], create the Lagrangian function:
> $
> \mathcal{L}(\vec{p}, \lambda, \mu)
> = -\sum_{j = 1}^{n}p_j \log_2(p_j) + \lambda\paren{\sum_{j = 1}^{n}p_j - 1} + \mu\paren{\sum_{j = 1}^{n}p_j x_j - E}
> $
>
> Solve the following $(n + 2)$ equations corresponding to $\nabla \mathcal{L} = \vec{0}$:
> $
> \begin{align*}
> \frac{\partial \mathcal{L}}{\partial p_j} &=
> -\frac{\ln p_j + 1}{\ln 2} + \lambda + \mu x_j = 0 \\
> \frac{\partial \mathcal{L}}{\partial \lambda} &= \sum_{j = 1}^{n}p_j - 1 = 0\\
> \frac{\partial \mathcal{L}}{\partial \mu} &= \sum_{j = 1}^{n}p_j x_j - E = 0
> \end{align*}
> $
>
> Isolating $p_j$ in each of the first $n$ equations gives:
> $
> \begin{align*}
> \frac{\ln p_j + 1}{\ln 2} &= \lambda + \mu x_j \\
> \ln p_j + 1 &= \lambda\ln 2 + \mu x_j\ln 2 \\
> \ln p_j &= \lambda\ln 2 + \mu x_j\ln 2 - 1 \\
> p_j &= \exp\paren{\lambda\ln 2 + \mu x_j\ln 2 - 1} \\
> p_j &= \exp\paren{\lambda^\prime + \mu^\prime x_j} \quad \lambda^\prime = \lambda\ln 2 - 1 \quad \mu^\prime = \mu \ln2\\
> \end{align*}
> $
>
> Solving for $\lambda^\prime$ using the first constraint:
> $
> \begin{align*}
> \sum_{j = 1}^{n}p_j &= 1 \\
> \sum_{j = 1}^{n}\exp(\lambda^\prime + \mu^\prime x_j) &= 1 \\
> \exp(\lambda^\prime)\sum_{j = 1}^{n}\exp(\mu^\prime x_j) &= 1 \\
> \exp(\lambda^\prime) &= \frac{1}{\sum_{j = 1}^{n}\exp(\mu^\prime x_j)} \\
> \lambda^\prime &= \ln\frac{1}{\sum_{j = 1}^{n}\exp(\mu^\prime x_j)} \\
> \lambda^\prime &= -\ln\sum_{j = 1}^{n}\exp(\mu^\prime x_j) \\
> \lambda^\prime &= -\ln Z(\mu^\prime) \quad Z(\mu^\prime) = \sum_{j = 1}^{n}\exp(\mu^\prime x_j)
> \end{align*}
> $
>
> Substituting this result back into the expression for $p_j$ gives:
> $
> \begin{align*}
> p_j &= \exp\paren{\lambda^\prime + \mu^\prime x_j} \\
> &= \exp(\lambda^\prime)\exp(\mu^\prime x_j) \\
> &= \exp(-\ln Z(\mu^\prime))\exp(\mu^\prime x_j) \\
> &= \frac{\exp(\mu^\prime x_j)}{Z(\mu^\prime)}
> \end{align*}
> $
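> 
> As a sanity check on this result (a minimal sketch, assuming NumPy and SciPy are available; the range and the target $E = 1.2$ are arbitrary illustrative values), maximising the entropy numerically under the two constraints should reproduce the same Gibbs form:
> ```python
> import numpy as np
> from scipy.optimize import minimize
> 
> x = np.array([0., 1., 2., 3., 4.])   # arbitrary example range
> E = 1.2                              # target expected value (not the uniform mean 2)
> n = len(x)
> 
> # Maximise H(p) = -sum p_j log2(p_j) subject to sum p_j = 1 and sum p_j x_j = E.
> neg_entropy = lambda p: np.sum(p * np.log2(p))
> res = minimize(neg_entropy, np.full(n, 1.0 / n), method="SLSQP",
>                bounds=[(1e-12, 1.0)] * n,
>                constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0},
>                             {"type": "eq", "fun": lambda p: p @ x - E}])
> p_numeric = res.x
> 
> # The maximiser should match exp(mu' x_j) / Z(mu'); recover mu' from a ratio:
> # ln(p_i / p_j) = mu' (x_i - x_j).
> mu_prime = np.log(p_numeric[1] / p_numeric[0]) / (x[1] - x[0])
> p_gibbs = np.exp(mu_prime * x) / np.exp(mu_prime * x).sum()
> print(np.round(p_numeric, 4))
> print(np.round(p_gibbs, 4))          # the two rows should agree closely
> ```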
>
> The value of $\mu^\prime$ can then be determined from $E$ by solving the second constraint, $\sum_{j = 1}^{n}p_j x_j = E$, for $\mu^\prime$.
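> 
> A short numerical sketch of that step (assuming NumPy; the range and $E = 1.2$ are arbitrary): since $E$ is strictly increasing in $\mu^\prime$, a simple bisection recovers $\mu^\prime$ from the constraint $\sum_{j} p_j x_j = E$:
> ```python
> import numpy as np
> 
> def expected_value(mu_prime, x):
>     """E as a function of mu' under the Gibbs distribution on x."""
>     w = np.exp(mu_prime * x)
>     return (x * w).sum() / w.sum()
> 
> def solve_mu_prime(E, x, lo=-50.0, hi=50.0, tol=1e-12):
>     """Bisection on the monotone map mu' -> E(mu');
>     E must lie strictly between min(x) and max(x)."""
>     for _ in range(200):
>         mid = 0.5 * (lo + hi)
>         if expected_value(mid, x) < E:
>             lo = mid
>         else:
>             hi = mid
>         if hi - lo < tol:
>             break
>     return 0.5 * (lo + hi)
> 
> x = np.array([0., 1., 2., 3., 4.])            # arbitrary example range
> mu_prime = solve_mu_prime(1.2, x)
> print(mu_prime, expected_value(mu_prime, x))  # second value should be ~1.2
> ```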
> [!theoremb] Special case where $x_j = j$ for all $0 \le j \le n$
>
> While the resulting expression is very messy for random variables with arbitrary ranges, if the range is specifically $\{0, 1, \cdots, n\}$, the sums simplify substantially.
>
> Start with the partition function $Z(\mu^\prime)$, which is a [[Geometric Series|geometric series]] and can be simplified using the partial sum formula:
> $
> Z(\mu^\prime) = \sum_{j = 0}^{n}\exp(\mu^\prime j) =
> \sum_{j = 0}^{n}\exp(\mu^\prime)^j =
> \frac{1 - \exp((n + 1)\mu^\prime)}{1 - \exp(\mu^\prime)}
> $
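> 
> A quick check of this closed form (an illustrative sketch with arbitrary $n$ and $\mu^\prime$, assuming NumPy):
> ```python
> import numpy as np
> 
> n, mu_prime = 6, 0.7            # arbitrary illustrative values
> direct = sum(np.exp(mu_prime * j) for j in range(n + 1))
> closed = (1 - np.exp((n + 1) * mu_prime)) / (1 - np.exp(mu_prime))
> print(direct, closed)           # both evaluate to the same partition sum
> ```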
>
> Move on to the equation for the expected value:
> $
> \begin{align*}
> E &= \sum_{j = 0}^{n}x_j p_j \\
> &= \sum_{j = 0}^{n}x_j\frac{\exp(\mu^\prime x_j)}{Z(\mu^\prime)} \\
> &= \frac{1}{Z(\mu^\prime)}\sum_{j = 0}^{n}j \exp(j\mu^\prime) \\
> &= \frac{1}{Z(\mu^\prime)}\sum_{j = 0}^{n}\frac{d}{d\mu^\prime} \exp(j\mu^\prime) \\
> &= \frac{1}{Z(\mu^\prime)}\frac{d}{d\mu^\prime}\sum_{j = 0}^{n} \exp(j\mu^\prime) \\
> &= \frac{1}{Z(\mu^\prime)}\frac{d Z(\mu^\prime)}{d\mu^\prime} \\
> &= \frac{1 - \exp(\mu^\prime)}{1 - \exp((n + 1)\mu^\prime)}
> \frac{n\exp((n + 2)\mu^\prime) - (n + 1)\exp((n + 1)\mu^\prime) + \exp(\mu^\prime)}{(1 - \exp(\mu^\prime))^2} \\
> E &= \frac{n\exp((n + 2)\mu^\prime) - (n + 1)\exp((n + 1)\mu^\prime) + \exp(\mu^\prime)}{(1 - \exp((n + 1)\mu^\prime))(1 - \exp(\mu^\prime))}
> \end{align*}
> $
>
> Since $\mu^\prime = \mu \ln 2$, the expression can be rewritten with $2$ as the base of the exponentials:
> $
> E = \frac{n\cdot2^{(n + 2)\mu} - (n + 1)2^{(n + 1)\mu} + 2^\mu}{(1 - 2^{(n + 1)\mu})(1 - 2^\mu)}
> $
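> 
> As a consistency check (an illustrative sketch with arbitrary $n$ and $\mu$, assuming NumPy), the closed form can be compared against the direct sum $\sum_{j} j\, p_j$:
> ```python
> import numpy as np
> 
> n, mu = 6, 0.4                  # arbitrary illustrative values
> j = np.arange(n + 1)
> p = 2.0 ** (mu * j)
> p /= p.sum()                    # Gibbs distribution with x_j = j
> direct = (j * p).sum()
> closed = (n * 2 ** ((n + 2) * mu) - (n + 1) * 2 ** ((n + 1) * mu) + 2 ** mu) \
>          / ((1 - 2 ** ((n + 1) * mu)) * (1 - 2 ** mu))
> print(direct, closed)           # the two values should agree up to rounding
> ```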
>
> The resulting function $E(\mu)$ is smooth and S-shaped, most closely resembling the [[Hyperbolic Tangent|hyperbolic tangent]], which makes the inverse hyperbolic tangent a candidate approximation for $\mu$ in terms of $E$. The precise parameters in terms of $n$ are still unknown.
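> 
> A quick tabulation of $E$ against $\mu$ (an illustrative sketch, assuming NumPy; $n = 10$ is arbitrary) makes the S-shape visible:
> ```python
> import numpy as np
> 
> def E_of_mu(mu, n):
>     j = np.arange(n + 1)
>     p = 2.0 ** (mu * j)
>     return (j * p).sum() / p.sum()
> 
> n = 10
> for mu in np.linspace(-2, 2, 9):
>     print(round(mu, 2), round(E_of_mu(mu, n), 3))
> # E rises monotonically from near 0 to near n and flattens at both ends,
> # which is the S-shape noted above.
> ```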