> [!definition]
>
> $
> p(x) = (1-p)^{x - 1}p
> $
>
> The geometric distribution $G(p)$ is the [[Probability Distribution|probability distribution]] of a geometric [[Random Variable|random variable]] $X$ that represents the trial number on which the first success occurs in a [[Bernoulli Process|Bernoulli process]] with success probability $p$. It has the above [[Probability Mass Function|probability mass function]].

> [!theorem] Derivation
>
> $X$ is a geometric random variable because the [[Sequence|sequence]] $p(1)$, $p(2)$, $p(3)$, $\cdots$ forms a [[Geometric Series|geometric sequence]] with common ratio $(1 - p)$. The sum of all probabilities $p(x)$ is an infinite geometric series with common ratio $(1 - p)$, so it converges to $\frac{p}{1 - (1 - p)} = \frac{p}{p} = 1$. More generally:
> $
> \begin{align*}
> P(X > k) &= \sum_{x = k + 1}^{\infty}(1 - p)^{x - 1}p = \frac{(1 - p)^k p}{p} = (1 - p)^k \\
> P(X \le k) &= 1 - (1 - p)^k
> \end{align*}
> $
>
> Alternatively, the probability of not getting a success within $k$ tries equals the probability of getting $k$ failures in a row, $(1 - p)^k$. This means that the probability of getting a success within $k$ tries is $1 - (1 - p)^k$.

> [!theorem]
>
> $
> \ev(X) = \frac{1}{p} \quad \var X = \frac{1 - p}{p^2}
> $
>
> *Proof*.
>
> The [[Expectation|expected value]] of a geometric distribution can be calculated using the geometric series, writing $k$ as $\sum_{j = 1}^{k} 1$ and interchanging the order of summation:
> $
> \begin{align*}
> E(X) &= \sum_{k = 0}^{\infty}k(1 - p)^{k - 1}p \\
> &= \sum_{k = 1}^{\infty}k(1 - p)^{k - 1}p \\
> &= p\sum_{k = 1}^{\infty}k(1 - p)^{k - 1} \\
> &= p\sum_{j = 1}^{\infty}\paren{\sum_{k = j}^{\infty}(1 - p)^{k - 1}} \\
> &= p\sum_{j = 0}^{\infty}\frac{(1 - p)^j}{1 - (1 - p)} \\
> &= p\sum_{j = 0}^{\infty}\frac{(1 - p)^j}{p} \\
> &= \sum_{j = 0}^{\infty}(1 - p)^j \\
> &= \frac{1}{1 - (1 - p)} \\
> &= \frac{1}{p}
> \end{align*}
> $
>
> The [[Variance|variance]] of a geometric distribution can be calculated with a similar procedure, writing $q = 1 - p$ and $\mu = E(X) = \frac{1}{p}$:
> $
> \begin{align*}
> Var(X) &= E(X^2) - \mu^2 \\
> &= \sum_{k = 0}^{\infty}k^2q^{k - 1}p - \mu^2 \\
> &= p\sum_{k = 1}^{\infty}k^2q^{k - 1} - \mu^2 \\
> &= p\paren{\sum_{k = 1}^{\infty}k(k-1)q^{k - 1} + \sum_{k = 1}^{\infty}kq^{k - 1}} - \mu^2 \\
> &= p\paren{\frac{d}{dq}\sum_{k = 1}^{\infty}(k-1)q^{k} + \sum_{k = 1}^{\infty}kq^{k - 1}} - \mu^2 \\
> &= p\paren{\frac{d}{dq}q^2\sum_{k = 2}^{\infty}(k-1)q^{k-2} + \sum_{k = 1}^{\infty}kq^{k - 1}} - \mu^2 \\
> &= p\paren{\frac{d}{dq}q^2\paren{\frac{d}{dq}\sum_{k = 1}^{\infty}q^{k}} + \sum_{k = 1}^{\infty}kq^{k - 1}} - \mu^2 \\
> &= p\paren{\frac{d}{dq}q^2\paren{\frac{d}{dq}\paren{\frac{1}{1 - q} - 1}} + \sum_{k = 1}^{\infty}kq^{k - 1}} - \mu^2 \\
> &= p\paren{\frac{d}{dq}\frac{q^2}{(1 - q)^2} + \sum_{k = 1}^{\infty}kq^{k - 1}} - \mu^2 \\
> &= p\paren{\frac{2q}{(1 - q)^3} + \frac{1}{p^2}} - \mu^2 \\
> &= p\paren{\frac{2q}{p^3} + \frac{1}{p^2}} - \mu^2 \\
> &= \frac{2q}{p^2} + \frac{1}{p} - \frac{1}{p^2} \\
> &= \frac{1 - p}{p^2}
> \end{align*}
> $
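
A quick numeric check of the formulas above (an illustrative example, not from the source: waiting for the first six when rolling a fair die, so $p = \frac{1}{6}$):
$
\begin{align*}
P(X \le 3) &= 1 - \paren{\tfrac{5}{6}}^3 = 1 - \tfrac{125}{216} = \tfrac{91}{216} \approx 0.421 \\
\ev(X) &= \tfrac{1}{1/6} = 6 \qquad \var X = \tfrac{5/6}{(1/6)^2} = 30
\end{align*}
$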
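
Both moments can also be checked empirically. A minimal Monte Carlo sketch (my own illustration, assuming Python with only the standard library; the inverse-transform sampler follows directly from $P(X > k) = (1 - p)^k$):

```python
import math
import random

def sample_geometric(p, rng):
    """Draw X ~ G(p) by inverse-transform sampling.

    Since P(X > k) = (1 - p)^k, X = ceil(ln(u) / ln(1 - p))
    for u uniform on (0, 1] is geometrically distributed.
    """
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p)))

rng = random.Random(0)
p = 1 / 6          # e.g. waiting for the first six on a fair die
n = 200_000
samples = [sample_geometric(p, rng) for _ in range(n)]

mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / n

print(f"mean = {mean:.3f}  (theory 1/p = {1 / p:.3f})")
print(f"var  = {var:.3f}  (theory (1-p)/p^2 = {(1 - p) / p ** 2:.3f})")
```

The sample mean should land near $\frac{1}{p} = 6$ and the sample variance near $\frac{1 - p}{p^2} = 30$.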