> [!theorem] Choosing Codewords
>
> Given a [[Binary Symmetric Channel|binary symmetric channel]] with error [[Probability|probability]] $\varepsilon < \frac{1}{2}$. Assume that the source $S$ follows the [[Bernoulli Random Variable|Bernoulli distribution]]. Sending [[Code|codewords]] of $d$ binary digits provides $2^d$ possible codewords. If only $M$ of these possibilities will be used, then they can be selected using *random coding* by imposing a [[Uniform Distribution|uniform distribution]], such that each codeword is chosen with probability $2^{-d}$. The probability of selecting $M$ codewords would be $2^{-dM}$.
> [!theorem] Average [[Hamming Distance]]
>
> Suppose that a certain codeword $y$ is received. Let $X_j$ be the [[Random Variable|random variable]] that the $j$-th *symbol* in $y$ is an error, so $X$ is Bernoulli with $p = \varepsilon$. The total number of errors in $y$ would be [[Binomial Random Variable|binomial]]
> $
> S(d) = \sum_{j = 0}^{d}X_j = B(d, \varepsilon)
> $
> Meaning that
> $
> \ev(S(d)) = \varepsilon d
> $
> [!theorem] Decision Rule
>
> To choose a [[Decision Rule|decision rule]] for decoding, see $x$ (codeword sent) and $y$ (codeword received) as [[Vector|vectors]] in a $d$-dimensional space. Consider the sphere of radius $\varepsilon d$ centred on $y$, and expand the sphere by a small amount $vd$, where $v$ is an arbitrary small number ($r = d(\varepsilon + v)$). Adopt the following decision rule: if there is only one codeword $z$ inside this sphere, then decode $y$ as $z$ (either the distortion is small, or it's so large that it went to another word). Otherwise, declare that an error has been made (the codeword sent has been distorted beyond recognition).
> [!theorem] Lemma
>
> Let $0 \le \varepsilon \lt \frac{1}{2}$ and $m \in \nat$, then
> $
> \sum_{k = 0}^{[m\varepsilon]}{m \choose k} \le 2^{mH_b(\varepsilon)}
> $
> *Proof*. Firstly, plugging $1 + 1$ into the [[Binomial Theorem|Binomial theorem]] gives
> $
> \sum_{k = 0}^{N}{m \choose k} = 2^N
> $
> Since
> $
> -\log_2(x) \ge x \quad \forall 0 \lt x \lt 0.5
> $
> $H_b(\varepsilon) \ge \varepsilon$ for $0 \le \varepsilon \lt \frac{1}{2}$.
>
> Meaning that $m\varepsilon \le mH_b(\varepsilon)$. Substituting $N$ for them gives the inequality.
> [!theorem] Bounding $P(E)$
>
> Let $A$ be the event that there are no codewords inside the sphere, and $B$ be the event that there are two or more. For the average error probability $P(E)$,
> $
> \begin{align*}
> P(E) &= P(A \cup B) \\
> &= P(A) + P(B) - P(A \cap B) \\
> &\le P(A) + P(B)
> \end{align*}
> $
>
> Let $\delta_1 \gt 0$ be a fixed number. Then if $d$ is chosen to be sufficiently large, then every codeword would be included in the sphere.
> $
> P(A) \le \delta_1
> $
>
> Let $\rho$ and $\delta_2$ be fixed, non-negative numbers, and suppose that $M = 2^{d(C - \rho)}$. Then for sufficiently large $d$, no two codewords would be in the same sphere.
> $
> P(B) \le \delta_2
> $
>
> *Proof*. Suppose that $B$ happens and there are two or more codewords in the expanded sphere. Let $x_i \in \mathcal{C}$ be the codeword with the minimum hamming distance from $y$, then
> $
> \begin{align*}
> P(B) &= P\paren{(x_i \in \text{Sphere}) \cup \paren{\bigcup_{j \ne i}(x_j \in \text{Sphere})}} \\
> &\le P\paren{\bigcup_{j \ne i}(x_j \in \text{Sphere})} \\
> &\le \sum_{j = 1, j \ne i}^{M}P(x_j \in \text{Sphere}) \\
> &= (M - 1)P(x_j \in \text{Sphere})
> \end{align*}
> $
>
> Now if $x_j \in \text{Sphere}$ if $x_j$ has up to $[r]$ errors. So for each $1 \le k \le M$, as the probability of $k$ errors is $\frac{1}{2^d}{d \choose k}$,
> $
> \begin{align*}
> P(x_j \in \text{Sphere}) &= \frac{1}{2^d}\sum_{k = 0}^{r}{d \choose k} \\
> &\le \frac{1}{2^d}H_b(\varepsilon + v) = 2^{-d(1-H_b(\varepsilon + v))}
> \end{align*}
> $
> Combining with previous equations,
> $
> P(B) \le (M - 1)2^{-d(1 - H_b(\varepsilon + v))}
> \le M2^{-d(1 - H_b(\varepsilon + v))}
> $
> Since $P(B)$ is a decreasing exponential function of $d$, setting it to be sufficient large will satisfy any upper bound.
> [!theoremb] Theorem
>
> Given $\delta, \rho \gt 0$ (however small), there exists a code such that if the transmission rate of a binary symmetric channel $R = C - \rho$, then
> $
> P(E) \lt \delta
> $