A hypothesis test is an [[Statistics|inferential statistics]] method for deciding whether to reject or not reject a [[Statistical Hypothesis|statistical hypothesis]]. The statistical significance of a test result is determined by comparing its [[p-value]] with the **significance level** $\alpha$: the result is significant when the p-value is at most $\alpha$.

# Procedure
1. Establish the [[Null Hypothesis|null hypothesis]] $H_0$.
2. Establish an alternative [[Statistical Hypothesis|statistical hypothesis]] $H_1$ that corresponds to a change or difference.
3. Using the given significance level $\alpha$, set up the rejection (critical) region for $H_0$.
4. Compute the value of the [[Z-Score|standardised]] [[Estimator|test statistic]] from [[Sample|sample]] [[Data|data]].
5. If the standardised test statistic falls in the rejection region, there is sufficient evidence to reject $H_0$. Otherwise, there is not enough evidence to reject $H_0$.
6. State the conclusion.

# Errors
| | $H_0$ is true | $H_1$ is true |
| ------------------- | ---------------- | ----------------- |
| Do not reject $H_0$ | Correct Decision | [[Type II Error]] |
| Reject $H_0$ | [[Type I Error]] | Correct Decision |

The probability of a Type I error, $\alpha$, is known as the **significance level** of the test. The probability of not making a Type II error, $1 - \beta$, is known as the **power** of the test.

Increasing the sample size reduces the [[Standard Deviation|standard deviation]] of the [[Estimator|estimator]], which lowers the [[Probability|probability]] of a Type II error for a fixed $\alpha$ and so allows both error probabilities to be made smaller.

With the same sample size, the probability of a Type I error can be decreased by decreasing $\alpha$ (which makes the rejection region smaller), whereas the probability of a Type II error can be decreased by increasing $\alpha$ (which makes the non-rejection region smaller). This means that for a given sample size, reducing one type of error comes at the cost of increasing the other.
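The trade-off between $\alpha$ and $\beta$ can be illustrated numerically. The following is a minimal sketch, assuming a right-tailed $z$-test of $H_0: \mu = \mu_0$ with known $\sigma$; the function name `type_ii_error` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta for a right-tailed z-test of H0: mu = mu0 when the true mean is mu1.

    Assumes a known population standard deviation sigma (illustrative only).
    """
    std_normal = NormalDist()                   # standard normal, mean 0, sd 1
    z_alpha = std_normal.inv_cdf(1 - alpha)     # critical value of the rejection region
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # true effect measured in standard errors
    # H0 is not rejected when z <= z_alpha; under mu1, z is normal with mean `shift`.
    return std_normal.cdf(z_alpha - shift)


# Hypothetical scenario: mu0 = 100, true mean 103, sigma = 10.
for n in (25, 100):
    for alpha in (0.10, 0.05, 0.01):
        beta = type_ii_error(alpha, mu0=100, mu1=103, sigma=10, n=n)
        print(f"n={n:3d}  alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

For a fixed `n`, the printed `beta` grows as `alpha` shrinks, and for a fixed `alpha` it shrinks as `n` grows, which matches the two statements above.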
# Signs
![[tails.png]]

| | Left-Tailed | Two-Tailed | Right-Tailed |
| ---------------- | -------------------- | ----------- | -------------------- |
| $H_0$ | $x \ge x_0$ or $x = x_0$ | $x = x_0$ | $x \le x_0$ or $x = x_0$ |
| $H_1$ | $x < x_0$ | $x \ne x_0$ | $x > x_0$ |
| Rejection Region | Left Tail | Both Tails | Right Tail |

When investigating change through a hypothesis test, the change in question is the alternative hypothesis. If the change corresponds to an increase, then $H_1: x > x_0$; if it corresponds to a decrease, then $H_1: x < x_0$; if investigating the presence of any change, then $H_1: x \ne x_0$.

# One-Population Tests
### $z$-Test
For a large [[Count|sample size]] $n$ where the [[Central Limit Theorem|central limit theorem]] applies, or when the population is [[Normal Distribution|normally distributed]] with known [[Standard Deviation|standard deviation]], use a $z$-test for hypothesised [[Mean|means]] and [[Bernoulli Process|proportions]]. The critical values are $z_0 = z_\alpha$ for a right-tailed test, $z_0 = -z_\alpha$ for a left-tailed test, and $z_0 = \pm z_{\alpha / 2}$ for a two-tailed test.
* For means: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
* For proportions: $z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$, where $q_0 = 1 - p_0$

### $t$-Test
For a small sample drawn from an approximately normal population with unknown population standard deviation, use the $t$-test with the [[t-distribution]] ($n - 1$ degrees of freedom) for hypothesised means. The critical values are $t_0 = t_\alpha$ for a right-tailed test, $t_0 = -t_\alpha$ for a left-tailed test, and $t_0 = \pm t_{\alpha / 2}$ for a two-tailed test.
* $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

# Two-Population Tests
[[Statistic|Statistics]] of different [[Population|populations]] may be compared using a hypothesis test.
### Two-Samples $z$-Test
For means:

$$
\begin{align*}
z &= \frac{\text{Sample Difference} - \text{Hypothesised Difference}}{\text{Standard Deviation of the Sample Difference}} \\
&= \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}\ \text{where}\ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \\
&= \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}
\end{align*}
$$

For proportions:

$$
\begin{align*}
z &= \frac{(\hat p_1 - \hat p_2) - (p_1 - p_2)}{\sqrt{\bar p \bar q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \\
&= \frac{\hat p_1 - \hat p_2}{\sqrt{\bar p \bar q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\ \text{where}\ \bar{p} = \frac{x_1+x_2}{n_1+n_2}
\end{align*}
$$

Here $x_1$ and $x_2$ are the numbers of successes in each sample, and $\bar{q} = 1 - \bar{p}$.

For large ($n_1 \ge 30$, $n_2 \ge 30$) [[Probabilistic Independence|independent]] [[Sample|samples]], use a two-sample $z$-test for the difference between means or proportions, with the same critical values as the one-sample $z$-test. For proportions, because the normal distribution is used as an approximation, $n_1 \bar{p}$, $n_1 \bar{q}$, $n_2 \bar{p}$, $n_2 \bar{q}$ should all be at least $10$ for the approximation to be effective.

For means, the **test statistic** is based on the difference between the sample means $\bar{x}_1 - \bar{x}_2$. Since the null hypothesis is that there is no difference, $H_0: \mu_1 = \mu_2$, the hypothesised difference $\mu_1 - \mu_2$ is always $0$, which gives the last line of each derivation above (likewise $p_1 - p_2 = 0$ for proportions). Because the [[Variance|variances]] of independent samples add, the standard deviation of the difference is obtained by squaring the standard deviation of each sample mean to get its variance, adding the two variances, and taking the square root of the sum.

### Two-Samples $t$-Test
$$
t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{\bar{d}}{s_d / \sqrt{n}}
$$

For two [[Normal Distribution|normally distributed]], dependent (paired) samples, use a two-sample $t$-test for the difference between the two populations.
Let $d = x_1 - x_2$ be the difference between the entries of a data pair, $\mu_d$ the hypothesised mean difference between the paired data, $\bar{d}$ the sample mean of the differences, $s_d$ the sample standard deviation of the differences, and $n$ the number of pairs. The **test statistic** in this case is $\bar{d}$. Since the null hypothesis is that there is no difference, $H_0: \mu_d = 0$.

# Conclusion
If the original claim to be tested contains an equality, then it is the null hypothesis.
* If $H_0$ is rejected, state that *there is sufficient evidence to **reject** the original claim*.
* If $H_0$ is not rejected, state that *there is not sufficient evidence to reject the original claim*.

If the original claim does not contain an equality, then it is the alternative hypothesis.
* If $H_0$ is rejected, state that *there is sufficient evidence to **support** the original claim*.
* If $H_0$ is not rejected, state that *there is not sufficient evidence to support the original claim*.
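The wording rules above can be combined with a test decision in a short sketch. It assumes a one-sample $z$-test for a mean with known $\sigma$; the helper names `z_test_mean` and `conclusion`, and the numbers in the usage lines, are hypothetical and not part of any library.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sigma, n, alpha, tail="two"):
    """Return (z, reject) for a one-sample z-test of a hypothesised mean mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))       # standardised test statistic
    std_normal = NormalDist()
    if tail == "left":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z, reject


def conclusion(claim_has_equality, reject_h0):
    """Phrase the conclusion according to whether the claim is H0 or H1."""
    if claim_has_equality:
        # The original claim is the null hypothesis.
        if reject_h0:
            return "There is sufficient evidence to reject the original claim."
        return "There is not sufficient evidence to reject the original claim."
    # The original claim is the alternative hypothesis.
    if reject_h0:
        return "There is sufficient evidence to support the original claim."
    return "There is not sufficient evidence to support the original claim."


# Hypothetical claim: "the mean is greater than 100" (no equality, so it is H1).
z, reject = z_test_mean(x_bar=103, mu0=100, sigma=10, n=64, alpha=0.05, tail="right")
print(f"z = {z:.2f}:", conclusion(claim_has_equality=False, reject_h0=reject))
```

Keeping the decision (`reject`) separate from the wording makes it harder to confuse "reject $H_0$" with "reject the claim", which only coincide when the claim contains an equality.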