> [!definition]
>
> ![[bscp.png|300]]
>
> | Send\\Receive | 0 | 1 |
> | ------------- | ----------------- | ----------------- |
> | 0 | $1 - \varepsilon$ | $\varepsilon$ |
> | 1 | $\varepsilon$ | $1 - \varepsilon$ |
>
> A binary symmetric channel ($\text{BSC}_{\varepsilon}$) is a [[Communication Channel|communication channel]] where the transmitter sends a bit ($0$ or $1$) and the receiver receives that bit, which has a [[Probability|probability]] $\varepsilon$ of being flipped (noise).
>
> The bit sent ($X$) and the bit received ($Y$) are both [[Bernoulli Random Variable|Bernoulli]] [[Random Variable|random variables]], with probabilities of success $p$ and $q$ respectively.
>
>
> | Notation | Probability that... |
> | ----------------------------------- | ------------------------- |
> | $P(1) = p$ | 1 is sent |
> | $P(0) = 1 - p$ | 0 is sent |
> | $Q(1) = q$ | 1 is received |
> | $Q(0) = 1 - q$ | 0 is received |
> | $P_0(1)$                            | 1 is received, given 0 was sent |
> | $P_1(0)$                            | 0 is received, given 1 was sent |
> | $P_1(1)$                            | 1 is received, given 1 was sent |
> | $P_0(0)$                            | 0 is received, given 0 was sent |
> | $\varepsilon = P_0(1) = P_1(0)$     | Unsuccessful Transmission (bit flipped) |
> | $1 - \varepsilon = P_0(0) = P_1(1)$ | Successful Transmission   |
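>
> A minimal simulation of this definition (a sketch assuming NumPy is available; the parameter values are illustrative, not from the source):
>
> ```python
> import numpy as np
>
> rng = np.random.default_rng(0)
> p, eps = 0.3, 0.1            # illustrative values: P(1 sent) and flip probability
> n = 100_000
>
> x = rng.random(n) < p        # bits sent, Bernoulli(p)
> noise = rng.random(n) < eps  # flip indicators, Bernoulli(eps)
> y = x ^ noise                # bits received
>
> print("flip rate:", np.mean(x != y))   # ~ eps
> print("Q(1):     ", np.mean(y))        # ~ p + eps - 2*eps*p (see the next theorem)
> ```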

> [!theorem] Calculating $q$
>
> If the receiver knows $p$ and $\varepsilon$, they can calculate $q$:
> $
> \begin{align*}
> Q(1) &= \varepsilon + p - 2\varepsilon p \\
> Q(0) &= 1 - \varepsilon - p + 2\varepsilon p
> \end{align*}
> $
>
> *Proof*.
> $
> \begin{align*}
> Q(1) &= P_0(1)P(0) + P_1(1)P(1) \\
> &= \varepsilon(1 - p) + (1 - \varepsilon)p \\
> &= \varepsilon - \varepsilon p + p - \varepsilon p\\
> &= p + \varepsilon - 2\varepsilon p
> \end{align*}
> $
> $
> \begin{align*}
> Q(0) &= P_1(0)P(1) + P_0(0)P(0) \\
> &= \varepsilon p + (1 - \varepsilon)(1 - p)\\
> &= \varepsilon p + 1 - \varepsilon - p + \varepsilon p\\
> &= 1 - \varepsilon - p + 2\varepsilon p
> \end{align*}
> $
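>
> These formulas can be sanity-checked numerically (a sketch; the values of $p$ and $\varepsilon$ are arbitrary):
>
> ```python
> p, eps = 0.3, 0.1
>
> # Law of total probability over the bit sent:
> q1 = eps * (1 - p) + (1 - eps) * p        # Q(1) = P_0(1)P(0) + P_1(1)P(1)
> q0 = eps * p + (1 - eps) * (1 - p)        # Q(0) = P_1(0)P(1) + P_0(0)P(0)
>
> assert abs(q1 - (p + eps - 2 * eps * p)) < 1e-12
> assert abs(q0 - (1 - eps - p + 2 * eps * p)) < 1e-12
> assert abs(q0 + q1 - 1) < 1e-12
> ```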

> [!theorem] Channel Capacity
>
> $
> C = 1 - H_b(\varepsilon)
> $
> where $H_b(\varepsilon)$ is the [[Entropy|entropy]] of a [[Bernoulli Random Variable|Bernoulli]] random variable with probability of success $\varepsilon$.
>
> *Proof*. Since $C = \max(H(R) - H_S(R))$, where $S$ is the bit sent, $R$ is the bit received, and the maximum is taken over the input distribution, first calculate $H(R)$ and $H_S(R)$.
> $
> \begin{align*}
> H(R) &= H_b(q) = -q\log_2(q) - (1 - q)\log_2(1 - q)
> \end{align*}
> $
> Using the definition of [[Conditional Entropy|conditional entropy]],
> $
> \begin{align*}
> H_S(R) &= -\sum_{j = 0}^{1}\sum_{k = 0}^{1}p_{jk}\log_2(p_{j}(k)) \\
> &= -p_{00}\log_2(p_0(0))-p_{01}\log_2(p_0(1))
> -p_{10}\log_2(p_1(0))-p_{11}\log_2(p_1(1)) \\
> &= -(1 - p)(1 - \varepsilon)\log_2(1 - \varepsilon)
> -(1 - p)\varepsilon\log_2(\varepsilon) \\
> &\quad -p(1 - \varepsilon)\log_2(1 - \varepsilon) - p\varepsilon\log_2(\varepsilon)\\
> &= -(1 - \varepsilon)\paren{(1 - p)\log_2(1 - \varepsilon) + p\log_2(1 - \varepsilon)}\\
> &\quad -\varepsilon\paren{(1 - p)\log_2(\varepsilon) + p\log_2(\varepsilon)}\\
> &= -(1 - \varepsilon)\log_2(1 - \varepsilon) - \varepsilon\log_2(\varepsilon)\\
> &= H_b(\varepsilon)
> \end{align*}
> $
> This makes sense: the only information about $R$ that remains unknown after knowing $S$ is the noise.
>
> Together,
> $
> C = \max(H(R) - H_S(R)) = \max(H_b(q) - H_b(\varepsilon))
> $
> Since we are maximising with respect to $q$ (we cannot control the noise $\varepsilon$), and $H_b(q)$ attains its maximum value of $1$ at $q = \frac{1}{2}$ ([[Uniform Distribution|uniform distribution]]), which is achieved by choosing $p = \frac{1}{2}$,
> $
> C = \max(H_b(q)) - H_b(\varepsilon) = 1 - H_b(\varepsilon)
> $
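>
> A numerical check of this maximisation (a sketch; `binary_entropy` and the grid over $p$ are illustrative choices, not from the source):
>
> ```python
> import numpy as np
>
> def binary_entropy(t):
>     """H_b(t) = -t*log2(t) - (1 - t)*log2(1 - t)."""
>     t = np.clip(t, 1e-12, 1 - 1e-12)   # avoid log2(0)
>     return -t * np.log2(t) - (1 - t) * np.log2(1 - t)
>
> eps = 0.1
> ps = np.linspace(0.001, 0.999, 999)          # candidate input distributions P(1) = p
> qs = ps + eps - 2 * eps * ps                 # resulting Q(1)
> mutual_info = binary_entropy(qs) - binary_entropy(eps)   # H(R) - H_S(R)
>
> print(ps[np.argmax(mutual_info)])            # ~ 0.5
> print(mutual_info.max(), 1 - binary_entropy(eps))        # both ~ 0.531
> ```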
>

> [!theorem] Correlation Between Sent and Received
>
> Let $X$, $Y$, and $E$ be [[Bernoulli Random Variable|Bernoulli]] random variables that represent the bit sent, the bit received, and whether the transmission is unsuccessful (the bit is flipped), respectively. The [[Pearson Correlation Coefficient|correlation coefficient]] $\rho$ between $X$ and $Y$ can be calculated in terms of $p$ and $\varepsilon$.
> $
> \begin{align*}
> \rho(X, Y) = \frac{(1 - 2\varepsilon)\sigma_X}{\sqrt{\sigma_X^2 + \sigma_E^2 - 4\sigma_X^2\sigma_E^2}}
> \end{align*}
> $
>
> *Proof.* The formula for the correlation coefficient $\rho(X, Y) = \frac{\cov{X, Y}}{\sigma_X \sigma_Y}$ requires the [[Covariance|covariance]] and the [[Standard Deviation|standard deviations]] of $X$ and $Y$.
>
> Beginning with the covariance using the calculation trick $\cov{X, Y} = \ev(XY) - \ev(X)\ev(Y)$, consider the [[Joint Distribution|joint distribution]] of $X$ and $Y$:
> $
> \begin{align*}
> p_{00} &= P_0(0)P(0) = (1 - p)(1 - \varepsilon) \\
> p_{01} &= P_0(1)P(0) = (1 - p)\varepsilon \\
> p_{10} &= P_1(0)P(1) = p\varepsilon \\
> p_{11} &= P_1(1)P(1) = p(1 - \varepsilon)
> \end{align*}
> $
>
> Calculate the [[Expectation|expected value]] of $XY$:
> $
> \begin{align*}
> \ev(XY) &= \sum_{i = 0}^{1}\sum_{j = 0}^{1}ij\,p_{ij} \\
> &= 0 \cdot 0 \cdot p_{00} + 0 \cdot 1 \cdot p_{01} + 1 \cdot 0 \cdot p_{10} + 1 \cdot 1 \cdot p_{11}\\
> &= p_{11} = p(1 - \varepsilon)
> \end{align*}
> $
>
> Calculate the expected values of $X$ and $Y$ (note that $\mu_Y = Q(1)$, calculated earlier):
> $
> \mu_X = p \quad \mu_Y = p + \varepsilon - 2\varepsilon p
> $
>
> Combine them to calculate the covariance:
> $
> \begin{align*}
> \cov{X, Y} &= \ev(XY) - \ev(X)\ev(Y) \\
> &= p(1 - \varepsilon) - p(p + \varepsilon - 2\varepsilon p)\\
> &= p - \varepsilon p - p^2 - \varepsilon p + 2\varepsilon p^2\\
> &= p - p^2 - 2\varepsilon p + 2\varepsilon p^2
> \end{align*}
> $
> Collect the terms that form $(p - p^2)$ and factor out $\sigma_X^2$:
> $
> \begin{align*}
> \cov{X, Y} &= \sigma_X^2 - 2\varepsilon \sigma_X^2 \\
> &= (1 - 2\varepsilon)\sigma_X^2
> \end{align*}
> $
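>
> A quick numeric check of this covariance identity (a sketch with arbitrary values of $p$ and $\varepsilon$):
>
> ```python
> p, eps = 0.3, 0.1
>
> e_xy = p * (1 - eps)                       # E(XY) = p_11
> mu_x, mu_y = p, p + eps - 2 * eps * p
> cov = e_xy - mu_x * mu_y
>
> var_x = p * (1 - p)                        # sigma_X^2
> assert abs(cov - (1 - 2 * eps) * var_x) < 1e-12
> ```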
>
> Moving on to the standard deviations, calculate the [[Variance|variance]] of each random variable:
> $
> \begin{align*}
> \sigma_X^2 &= p(1 - p) \\
> \sigma_E^2 &= \varepsilon(1 - \varepsilon) \\
> \sigma_Y^2 &= \mu_Y(1 - \mu_Y)
> \end{align*}
> $
>
> Expand out the variance of $Y$:
> $
> \begin{align*}
> \sigma_Y^2 &= (p + \varepsilon - 2\varepsilon p)(1 - p - \varepsilon + 2\varepsilon p) \\
> &= -4\varepsilon^2p^2 + 4\varepsilon p^2 - p^2 + 4\varepsilon^2p - 4\varepsilon p + p - \varepsilon^2 + \varepsilon
> \end{align*}
> $
> Collect the terms that form $(p - p^2)$ and factor out $\sigma_X^2$:
> $
> \begin{align*}
> \sigma_Y^2 &= p - p^2 + 4\varepsilon^2p-4\varepsilon^2p^2 + 4\varepsilon p^2 - 4\varepsilon p + \varepsilon - \varepsilon^2 \\
> &= \sigma_X^2 + 4\varepsilon^2\sigma_X^2 - 4\varepsilon\sigma_X^2 + \varepsilon - \varepsilon^2
> \end{align*}
> $
> Then collect the terms that form $(\varepsilon - \varepsilon^2)$ and factor out $\sigma_E^2$:
> $
> \begin{align*}
> \sigma_Y^2 &= \sigma_X^2 + \sigma_E^2 - 4\sigma_X^2\sigma_E^2
> \end{align*}
> $
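>
> The same kind of check confirms the variance identity (again with arbitrary $p$ and $\varepsilon$):
>
> ```python
> p, eps = 0.3, 0.1
>
> mu_y = p + eps - 2 * eps * p
> var_x, var_e = p * (1 - p), eps * (1 - eps)
> var_y = mu_y * (1 - mu_y)
>
> assert abs(var_y - (var_x + var_e - 4 * var_x * var_e)) < 1e-12
> ```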
>
> Combine the covariance and standard deviation to calculate the correlation coefficient:
> $
> \begin{align*}
> \rho(X, Y) &= \frac{\cov{X, Y}}{\sigma_X \sigma_Y} \\
> &= \frac{(1 - 2\varepsilon)\sigma_X^2}{\sigma_X \sqrt{\sigma_X^2 + \sigma_E^2 - 4\sigma_X^2\sigma_E^2}} \\
> &= \frac{(1 - 2\varepsilon)\sigma_X}{\sqrt{\sigma_X^2 + \sigma_E^2 - 4\sigma_X^2\sigma_E^2}}
> \end{align*}
> $
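>
> Putting the pieces together, the closed form agrees with a Monte Carlo estimate of the correlation (a sketch assuming NumPy; the sample size and parameter values are arbitrary):
>
> ```python
> import numpy as np
>
> rng = np.random.default_rng(0)
> p, eps, n = 0.3, 0.1, 1_000_000
>
> x = rng.random(n) < p                      # bits sent
> y = x ^ (rng.random(n) < eps)              # bits received
>
> var_x, var_e = p * (1 - p), eps * (1 - eps)
> rho_formula = (1 - 2 * eps) * np.sqrt(var_x) / np.sqrt(var_x + var_e - 4 * var_x * var_e)
> rho_sample = np.corrcoef(x, y)[0, 1]
>
> print(rho_formula, rho_sample)             # the two values should agree closely
> ```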

> [!theorem]
>
> For a binary symmetric channel where $0 \le \varepsilon < \frac{1}{2}$, the minimum [[Hamming Distance|Hamming distance]] and the [[Maximum Likelihood Rule|maximum likelihood rule]] are identical [[Decision Rule|decision rules]].
>
> *Proof*. Suppose that a word $y$ of length $m$ is received. For a codeword $x$ with $d(x, y) = d$, the [[Probability|probability]] of receiving $y$ given that $x$ was sent is
> $
> P_{x\ \text{sent}}(y) = \varepsilon^{d}(1 - \varepsilon)^{m - d}
> = (1 - \varepsilon)^m\paren{\frac{\varepsilon}{1 - \varepsilon}}^d
> $
>
> As $\varepsilon < \frac{1}{2}$, $\frac{\varepsilon}{1 - \varepsilon} < 1$, so $P_{x\ \text{sent}}(y)$ decreases as $d$ increases. By Bayes' theorem, the probability that $x$ was sent given that $y$ was received is
> $
> P_y(x\ \text{sent}) = \frac{P_{x\ \text{sent}}(y)P(x\ \text{sent})}{P(y)}
> $
>
> Minimising $d$ maximises $P_{x\ \text{sent}}(y)$ (and hence the probability above), so the codeword at minimum Hamming distance from $y$ is the same codeword chosen by the maximum likelihood rule.
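>
> As a concrete illustration (a sketch; the 3-bit repetition code and the received word below are arbitrary choices, not from the source), ranking codewords by likelihood picks the same codeword as ranking them by Hamming distance:
>
> ```python
> eps, m = 0.1, 3
> codewords = ["000", "111"]                 # 3-bit repetition code
> received = "010"
>
> def hamming(a, b):
>     return sum(u != v for u, v in zip(a, b))
>
> def likelihood(x, y):
>     d = hamming(x, y)
>     return eps**d * (1 - eps) ** (m - d)   # P_{x sent}(y)
>
> by_distance = min(codewords, key=lambda c: hamming(c, received))
> by_likelihood = max(codewords, key=lambda c: likelihood(c, received))
> print(by_distance, by_likelihood)          # both "000"
> ```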