A hypothesis test is an [[Statistics|inferential statistics]] method for deciding whether or not to reject a [[Statistical Hypothesis|statistical hypothesis]].
The statistical significance of a test result is determined by comparing its [[p-value]] with the **significance level** of the test.
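As a minimal sketch of that comparison, assuming the test statistic is a standardised $z$ value with a standard normal null distribution: the function names `p_value` and `is_significant` are hypothetical, and the convention of calling a result significant when the p-value is at most $\alpha$ is an assumption (some texts use a strict inequality).

```python
from statistics import NormalDist


def p_value(z: float, tail: str) -> float:
    """p-value of a standardised statistic z under a standard normal null."""
    phi = NormalDist().cdf  # standard normal cumulative distribution function
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1.0 - phi(z)
    if tail == "two":
        return 2.0 * (1.0 - phi(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


def is_significant(z: float, alpha: float, tail: str) -> bool:
    """Assumed convention: significant when the p-value is at most alpha."""
    return p_value(z, tail) <= alpha
```

For instance, `is_significant(2.5, 0.05, "right")` compares the right-tail area beyond $z = 2.5$ with $\alpha = 0.05$.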
# Procedure
1. Establish the [[Null Hypothesis|null hypothesis]] $H_0$.
2. Establish an alternative [[Statistical Hypothesis|statistical hypothesis]] $H_1$ that corresponds to a change or difference.
3. Using a given level of confidence, set up the rejection (critical) region for $H_0$.
4. Compute the value of the [[Z-Score|standardised]] [[Estimator|test statistic]] from [[Sample|sample]] [[Data|data]].
5. If the standardised test statistic falls in the rejection region, then there is sufficient evidence to justify the rejection of $H_0$. Otherwise, there is not enough evidence to reject $H_0$.
6. State conclusion.
# Errors
| | $H_0$ is true | $H_1$ is true |
| ------------------- | ---------------- | ----------------- |
| Do not reject $H_0$ | Correct Decision | [[Type II Error]] |
| Reject $H_0$ | [[Type I Error]] | Correct Decision |
The probability of a Type I ($\alpha$) error, $\alpha$, is known as the **significance level** of the test. The probability of a Type II ($\beta$) error is $\beta$, and the probability of avoiding it, $1 - \beta$, is known as the **power** of the test.
Increasing the sample size reduces the [[Standard Deviation|standard deviation]] of the [[Estimator|estimator]], which allows the [[Probability|probability]] of both types of errors to be decreased. With a fixed sample size, the probability of $\alpha$ errors can be decreased by choosing a smaller $\alpha$ (which makes the rejection region smaller), whereas the probability of $\beta$ errors can be decreased by choosing a larger $\alpha$ (which makes the nonrejection region smaller). This means that for a given sample size, reducing one type of error comes at the cost of increasing the other.
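The trade-off can be illustrated with a small sketch, assuming a right-tailed $z$-test of a mean with known $\sigma$ and a specific true mean `mu1` larger than the hypothesised `mu0`. The function name and all parameter names are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha: float, mu0: float, mu1: float, sigma: float, n: int) -> float:
    """Probability of a Type II error (beta) for a right-tailed z-test.

    Assumes H0: mu = mu0 is tested against H1: mu > mu0, sigma is known,
    and the true mean is mu1 > mu0.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)   # right-tail critical value
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # true mean in standard-error units
    # H0 is not rejected when z <= z_alpha; under the true mean, z is centred at `shift`.
    return std_normal.cdf(z_alpha - shift)


def power(alpha: float, mu0: float, mu1: float, sigma: float, n: int) -> float:
    """Power of the same test, 1 - beta."""
    return 1.0 - type_ii_error(alpha, mu0, mu1, sigma, n)
```

Calling `type_ii_error` with a larger `alpha` at fixed `n`, or with a larger `n` at fixed `alpha`, returns a smaller $\beta$, which matches the behaviour described above.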
# Signs
![[tails.png]]
| | Left-Tailed | Two-Tailed | Right-Tailed |
| ---------------- | -------------------- | ----------- | -------------------- |
| $H_0$ | $x \ge x_0$ or $x = x_0$ | $x = x_0$ | $x \le x_0$ or $x = x_0$ |
| $H_1$ | $x < x_0$ | $x \ne x_0$ | $x > x_0$ |
| Rejection Region | Left Tail | Both Tails | Right Tail |
When investigating change through a hypothesis test, the change in question is the alternative hypothesis. If the change corresponds to an increase, then $H_1: x > x_0$. If the change corresponds to a decrease, then $H_1: x < x_0$. If investigating the presence of any change, then $H_1: x \ne x_0$.
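The three rejection regions can be sketched as follows, assuming a standardised statistic $z$ with a standard normal null distribution. The function name `in_rejection_region` and the tail labels are hypothetical.

```python
from statistics import NormalDist


def in_rejection_region(z: float, alpha: float, tail: str) -> bool:
    """Check whether z falls in the rejection region for the given tail."""
    inv = NormalDist().inv_cdf  # standard normal quantile function
    if tail == "left":
        return z < inv(alpha)                  # critical value is -z_alpha
    if tail == "right":
        return z > inv(1.0 - alpha)            # critical value is z_alpha
    if tail == "two":
        return abs(z) > inv(1.0 - alpha / 2)   # critical values are +/- z_{alpha/2}
    raise ValueError("tail must be 'left', 'right' or 'two'")
```

Note that the same statistic can lead to different decisions depending on the tail: at $\alpha = 0.05$, $z = 1.8$ lies in the right-tailed rejection region but not in the two-tailed one.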
# One-Population Tests
### $z$-Test
For large [[Count|sample size]] $n$ where the [[Central Limit Theorem|central limit theorem]] applies, or when the population is [[Normal Distribution|normally distributed]] with known [[Standard Deviation|standard deviation]], use a $z$-test for hypothesised [[Mean|means]] and [[Bernoulli Process|proportions]]. The critical values will be $z_0 = z_\alpha$ for one-tailed tests (negated for a left-tailed test), and $z_0 = \pm z_{\alpha / 2}$ for two-tailed tests.
* For mean: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
* For proportions: $z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$, where $q_0 = 1 - p_0$
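The two statistics above can be computed directly. This is a minimal sketch with hypothetical function names, and the numbers in the usage lines are made up purely for illustration.

```python
from math import sqrt


def z_for_mean(x_bar: float, mu0: float, sigma: float, n: int) -> float:
    """z = (x_bar - mu0) / (sigma / sqrt(n)), with known population sigma."""
    return (x_bar - mu0) / (sigma / sqrt(n))


def z_for_proportion(p_hat: float, p0: float, n: int) -> float:
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), with q0 = 1 - p0."""
    q0 = 1.0 - p0
    return (p_hat - p0) / sqrt(p0 * q0 / n)


# Illustrative (made-up) inputs; both values come out close to 2.
z_mean = z_for_mean(x_bar=52.0, mu0=50.0, sigma=6.0, n=36)
z_prop = z_for_proportion(p_hat=0.6, p0=0.5, n=100)
```

The standard error for proportions uses the hypothesised $p_0$ rather than $\hat{p}$, because the statistic is computed under the assumption that $H_0$ is true.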
### $t$-Test
For a small sample size with unknown population standard deviation, use the $t$-test with the [[t-distribution]] with $n - 1$ degrees of freedom for hypothesised means. The critical values will be $t_0 = t_\alpha$ for one-tailed tests (negated for a left-tailed test), and $t_0 = \pm t_{\alpha / 2}$ for two-tailed tests.
* $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
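A minimal sketch of the $t$ statistic, using only the Python standard library. The function name `t_for_mean` is hypothetical. The standard library has no $t$-distribution, so the critical value is assumed to come from a $t$-table or another library and is not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def t_for_mean(sample: list[float], mu0: float) -> tuple[float, int]:
    """Return (t, degrees of freedom) for a one-sample t-test of the mean."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1
```

The returned degrees of freedom, $n - 1$, select which row of the $t$-table supplies the critical value.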
# Two-Population Tests
[[Statistic|Statistics]] of different [[Population|populations]] may be compared using a hypothesis test.
### Two-Samples $z$-Test
$$
\begin{align*}
z &= \frac{\text{Sample Difference} - \text{Hypothesis Difference}}{\text{Sample Difference Standard Deviation}} \\
&= \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}\ \text{where}\ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \\
&= \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}} \\
z &= \frac{(\hat p_1 - \hat p_2) - (p_1 - p_2)}{\sqrt{\bar p \bar q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \\
&= \frac{\hat p_1 - \hat p_2}{\sqrt{\bar p \bar q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\ \text{where}\ \bar{p} = \frac{x_1+x_2}{n_1+n_2}
\end{align*}
$$
For large ($n_1 \ge 30$, $n_2 \ge 30$) [[Probabilistic Independence|independent]] [[Sample|samples]], use a two-sample $z$-test for the difference between means or proportions, with the same critical values as the one-sample $z$-test. For proportions, the test relies on the normal approximation, so $n_1 \bar{p}$, $n_1 \bar{q}$, $n_2 \bar{p}$, $n_2 \bar{q}$, where $\bar{q} = 1 - \bar{p}$, should all be at least $10$ for the approximation to be effective.
The **test statistic** in this case is the standardised difference between sample means $\bar{x}_1 - \bar{x}_2$.
Since the null hypothesis in this case is that there is no difference, $H_0: \mu_1 = \mu_2$, the hypothesised difference $\mu_1 - \mu_2$ is $0$ and drops out of the numerator.
Since the samples are independent, the [[Variance|variances]] of the two sample means add, so the standard deviation of the difference is the square root of the sum of $\sigma_1^2 / n_1$ and $\sigma_2^2 / n_2$. The standard deviations themselves cannot be added directly.
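Both two-sample $z$ statistics can be sketched as below, assuming $H_0$ states that there is no difference so the hypothesised difference is $0$. The function names are hypothetical, and `x1` and `x2` are taken to be the success counts in each sample, as in the definition of $\bar{p}$.

```python
from math import sqrt


def z_two_means(x_bar1: float, x_bar2: float,
                sigma1: float, sigma2: float, n1: int, n2: int) -> float:
    """z for the difference between two means, hypothesised difference 0."""
    # Variances of the two sample means add for independent samples.
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (x_bar1 - x_bar2) / se


def z_two_proportions(x1: int, n1: int, x2: int, n2: int) -> float:
    """z for the difference between two proportions, using the pooled p_bar."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # pooled proportion of successes
    q_bar = 1.0 - p_bar
    se = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / se
```

The pooled proportion is used because, under $H_0: p_1 = p_2$, both samples estimate the same proportion.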
### Two-Samples $t$-Test
$$
t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{\bar{d}}{s_d / \sqrt{n}}
$$
For two [[Normal Distribution|normally distributed]], dependent (paired) samples, use a two-sample $t$-test for the difference between the two populations, with $n - 1$ degrees of freedom, where $n$ is the number of data pairs.
Let $d = x_1 - x_2$ be the difference between entries for a data pair, $\mu_d$ be the hypothesised mean difference between the paired data, $\bar{d}$ be the sample mean of the difference, and $s_d$ be the sample standard deviation of the difference. The **test statistic** in this case is the standardised value of $\bar{d}$.
Since the null hypothesis is that there is no difference, $H_0: \mu_d = 0$.
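A minimal sketch of the paired statistic under $H_0: \mu_d = 0$, using only the standard library. The function name `paired_t` and its parameter names are hypothetical, and the critical value is assumed to come from a $t$-table with $n - 1$ degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first: list[float], second: list[float]) -> tuple[float, int]:
    """Return (t, degrees of freedom) for paired samples under H0: mu_d = 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    d = [x1 - x2 for x1, x2 in zip(first, second)]  # per-pair differences
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation of the differences
    return d_bar / (s_d / sqrt(n)), n - 1
```

Working on the differences reduces the problem to a one-sample $t$-test on $d$, which is why only one standard deviation, $s_d$, appears.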
# Conclusion
If the original claim to be tested contains an equality, then it is the null hypothesis.
* If $H_0$ is rejected, state that *there is sufficient evidence to **reject** the original claim*.
* If $H_0$ is not rejected, state that *there is not sufficient evidence to reject the original claim.*
If the original claim does not contain an equality, then it is the alternative hypothesis.
* If $H_0$ is rejected, state that *there is sufficient evidence to **support** the original claim*.
* If $H_0$ is not rejected, state that *there is not sufficient evidence to support the original claim*.
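The four wordings above depend on only two yes/no facts, which can be sketched as a small lookup. The function name `conclusion` and its parameters are hypothetical.

```python
def conclusion(claim_has_equality: bool, reject_h0: bool) -> str:
    """Pick the concluding sentence from the claim type and the test decision."""
    if claim_has_equality:
        # The original claim is the null hypothesis.
        if reject_h0:
            return "There is sufficient evidence to reject the original claim."
        return "There is not sufficient evidence to reject the original claim."
    # The original claim is the alternative hypothesis.
    if reject_h0:
        return "There is sufficient evidence to support the original claim."
    return "There is not sufficient evidence to support the original claim."
```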