> [!definition]
>
> A [[Stochastic Process|stochastic process]] $(X_n, n \in \integer^+)$ is known as a **Markov chain** if the [[Conditional Probability|conditional probabilities]] satisfy
> $
> P(X_{n + 1} = k_{n + 1}|X_n = k_n, \cdots, X_1 = k_1, X_0 = k_0) =
> P(X_{n + 1} = k_{n + 1}|X_{n} = k_n)
> $
> for all $k_0, \cdots, k_{n + 1} \in S$ and $n \in \nat$.
>
> $P(X_{n + 1} = j|X_n = i)$ is called the one-step **transition probability**, denoted as
> $
> P_{ij} = P_{ij}^{n, n+1} = P(X_{n + 1} = j|X_n = i)
> $
> When it is the same for all values of $n$, so that the change with each step does not depend on time, the Markov chain is known as *homogeneous*.
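>
> For instance, a two-state chain on $S = \{0, 1\}$ that flips its state with probability $p$ at every step is homogeneous: $P_{01} = P_{10} = p$ and $P_{00} = P_{11} = 1 - p$ for every $n$.
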
> [!definition]
>
> A square [[Matrix|matrix]] that collects together all the $P_{ij}$s is known as the **transition probability matrix** of the Markov chain. Since each entry in the matrix is a [[Probability|probability]],
> - $P_{ij} \ge 0 \quad \forall i, j \in S$
> - $\sum_{j \in S}P_{ij} = 1 \quad \forall i \in S$
>
> Any square matrix that satisfies the above conditions is known as a *stochastic* matrix, as it can be used to model a Markov chain.
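
The two conditions above translate directly into code. A minimal sketch, assuming an arbitrary example matrix:

```python
import numpy as np

# Check that a candidate transition matrix is stochastic:
# non-negative entries and every row summing to one.
P = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.5, 0.3],
    [0.0, 0.0, 1.0],  # row 2 makes state 2 absorbing: P[2, 2] = 1
])

is_stochastic = np.all(P >= 0) and np.allclose(P.sum(axis=1), 1.0)
print(is_stochastic)  # True
```
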
> [!definition]
>
> If for some state $s \in S$, $P_{ss} = 1$, then $s$ is an *absorbing* state: once entered, it does not provide a way out.

> [!theorem]
>
> If $(X_n, n \in \nat)$ is a Markov chain, then
> $
> P(X_0 = i_0, X_1 = i_1, \cdots, X_n = i_n) =
> \pi_{i_0}\prod_{k = 0}^{n - 1}P_{i_{k}, i_{k + 1}}
> $
> where $P(X_0 = i_0) = \pi_{i_0}$ denotes the probability of the initial state.
>
> *Proof*. By the chain rule for conditional probabilities, collapsing each conditional to the previous state using the Markov property.
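
A minimal sketch of this path-probability formula, assuming an arbitrary example chain and initial distribution:

```python
import numpy as np

# Probability of a specific path i_0 -> i_1 -> ... -> i_n as the
# initial probability times the one-step transition probabilities.
P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])
pi = np.array([0.5, 0.5])  # initial distribution

def path_probability(path):
    prob = pi[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a, b]
    return prob

print(path_probability([0, 0, 1, 1]))  # pi_0 * P[0,0] * P[0,1] * P[1,1]
```
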
> [!theoremb] Chapman-Kolmogorov Theorem
>
> The $n$-step transition matrix $P^{(n)}$, with entries $P^{(n)}_{ij} = P(X_{m + n} = j|X_m = i)$, encodes the transition probabilities of the chain after $n$ steps and can be calculated as
> $
> P^{(n)} = P^n
> $
> using [[Matrix|matrix]] multiplication. Entrywise, this is $P^{(m + n)}_{ij} = \sum_{k \in S}P^{(m)}_{ik}P^{(n)}_{kj}$: to go from $i$ to $j$ in $m + n$ steps, the chain must pass through some intermediate state $k$ after $m$ steps.
>
> *Reasoning.* A known state $X_m = j$ can be represented as an indicator column vector $\langle \cdots, 1, \cdots \rangle$ with a one in position $j$, which, under matrix multiplication, extracts the $j$-th column of any matrix with compatible size. If we [[Matrix Transpose|transpose]] the transition probability matrix, each column of $P^\top$ is the [[Probability Distribution|probability distribution]] of the next state given the current one, so the [[Matrix Vector Product|matrix vector product]] $P^\top v$ advances the chain by one step.
>
> This is not limited to indicator vectors: for any column vector $v$ representing a probability distribution over states, $P^\top v$ is the distribution of the next state, including the column vectors of $P^\top$ itself. Applying this $n$ times gives $(P^\top)^n v$, and since $((P^\top)^n)^\top = P^n$, the $n$-step transition matrix is simply the $n$-th power of $P$, which can be computed very efficiently.
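
A minimal sketch of the $n$-step computation, assuming an arbitrary example matrix; it checks the matrix power against stepping a distribution forward one step at a time:

```python
import numpy as np

P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])
n = 5

# n-step transition matrix as the n-th matrix power.
P_n = np.linalg.matrix_power(P, n)

# Row i of P^n is the distribution of X_n given X_0 = i.
row = np.array([1.0, 0.0])  # start in state 0 with certainty
for _ in range(n):
    row = row @ P
print(np.allclose(row, P_n[0]))  # True
```
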
> [!definition]
>
> Let $X = (X_n, n \in \integer^+)$ be a Markov chain with transition probability matrix $P$. A probability [[Vector|vector]] $\rho$ is an *invariant distribution* for $X$ if
> $
> \rho = \rho P
> $
> which can be found as a left [[Eigenvalue and Eigenvector|eigenvector]] of the transition probability matrix with eigenvalue $1$ (equivalently, an eigenvector of $P^\top$), normalised so its entries sum to one.
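
A minimal sketch of the eigenvector approach, assuming an arbitrary example matrix; it takes the eigenvector of $P^\top$ whose eigenvalue is numerically closest to $1$ and normalises it into a probability vector:

```python
import numpy as np

P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])

eigvals, eigvecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(eigvals - 1.0))  # eigenvalue closest to 1
rho = np.real(eigvecs[:, k])
rho /= rho.sum()                      # normalise to sum to one

print(rho)                        # [0.8, 0.2] for this P
print(np.allclose(rho @ P, rho))  # True: rho = rho P
```
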
> [!definition]
>
> Suppose that $\gamma$ is the initial distribution of a Markov chain. The chain is in *detailed balance* if the probability of starting at state $i$ and then moving to $j$ is the same as the reverse:
> $
> \gamma_i P_{ij} = \gamma_j P_{ji} \quad \forall i, j \in S
> $
> which represents an equilibrium. Summing both sides over $i$ shows that a distribution in detailed balance is automatically invariant: $\sum_i \gamma_i P_{ij} = \gamma_j \sum_i P_{ji} = \gamma_j$.
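
A minimal sketch of the detailed-balance check, assuming an example chain together with its invariant distribution; the condition holds exactly when the probability-flow matrix with entries $\gamma_i P_{ij}$ is symmetric:

```python
import numpy as np

P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])
gamma = np.array([0.8, 0.2])  # invariant distribution of P

flow = gamma[:, None] * P  # flow[i, j] = gamma_i * P_ij
print(np.allclose(flow, flow.T))  # True: the chain is in detailed balance
```
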
> [!definition]
>
> Let $X = (X_n, n \in \integer^+)$ be a Markov chain. If, regardless of the initial distribution $\pi_0$,
> $
> \exists \pi^{\infty} = \limv{n}\pi_0 P^{n} \quad \forall \pi_0
> $
> that is, the probabilities converge to the same distribution as time goes on, then this Markov chain has a *limiting distribution* $\pi^\infty$.
>
> $\pi^\infty$ is then a probability vector representing the long-term distribution of the chain. Naturally, when the limit exists, $\pi^\infty$ is an invariant distribution, since $\pi^\infty P = \limv{n}\pi_0 P^{n + 1} = \pi^\infty$.
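
A minimal sketch of this convergence, assuming an arbitrary example matrix; several different initial distributions are pushed through a large matrix power and land on the same limit:

```python
import numpy as np

P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])
P_100 = np.linalg.matrix_power(P, 100)

for pi0 in ([1.0, 0.0], [0.0, 1.0], [0.3, 0.7]):
    print(np.array(pi0) @ P_100)  # each prints approximately [0.8, 0.2]
```
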
> [!definition]
>
> A Markov chain is *irreducible* if for any $i, j \in S$, $\exists m_1, m_2 > 0$ where $P_{ij}^{m_1} > 0$ and $P_{ji}^{m_2} > 0$. In other words, it is possible to go from any state to any other state.

> [!definition]
>
> A Markov chain is *aperiodic* if for every state $i \in S$, $\exists r > 0$ such that $P^n_{ii} > 0$ for all $n \ge r$.
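
A minimal sketch of a sufficient check for both properties on a finite chain, assuming an example matrix: if some power of $P$ is entrywise positive (such a matrix is often called *regular*), the chain is irreducible and aperiodic. The search bound below is a crude illustrative choice:

```python
import numpy as np

P = np.array([
    [0.0, 1.0],
    [0.5, 0.5],
])

Q = np.eye(len(P))
for r in range(1, 2 * len(P) ** 2):  # crude search bound
    Q = Q @ P
    if np.all(Q > 0):
        print(f"all entries of P^{r} are positive")  # here r = 2
        break
```
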
> [!theorem]
>
> Let $X = (X_n, n \in \integer^+)$ be a stationary Markov chain with transition matrix $P$ and initial distribution $\pi_0$. Then, the entropy rate
> $
> h(X) = H(X_1|X_0)
> $
> *Proof*. Since the process is stationary, $\pi_0$ is invariant and every step looks statistically the same, so the [[Information|information]] gained at each step is $H(X_{n + 1}|X_n, \cdots, X_0) = H(X_{n + 1}|X_n) = H(X_1|X_0)$ by the Markov property, which equals $h(X)$ by definition.
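
A minimal sketch of this formula, $h(X) = H(X_1|X_0) = -\sum_i \pi_i \sum_j P_{ij}\log_2 P_{ij}$, assuming an example chain with its invariant distribution:

```python
import numpy as np

P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])
pi = np.array([0.8, 0.2])  # invariant: pi = pi P

# Conditional entropy H(X_1 | X_0): pi-weighted entropy of each row.
row_entropies = -np.sum(np.where(P > 0, P * np.log2(P), 0.0), axis=1)
h = pi @ row_entropies
print(h)  # entropy rate in bits per step (about 0.569 here)
```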