> [!definition]
>
> Let $\cx, \cy$ be [[Banach Space|Banach spaces]], $U \subset \cx$ be [[Open Set|open]], and $f: U \to \cy$ be a [[Derivative|differentiable]] function. Then
> $$
> Df: U \to L(\cx, \cy)
> $$
> Since the space $L(\cx, \cy)$ of [[Bounded Linear Map|bounded linear maps]] is complete, it is itself a Banach space, and differentiating $Df$ (when it is differentiable) yields the **second derivative**
> $$
> D^2f: U \to L(\cx, L(\cx, \cy))
> $$
> Each $T \in L(\cx, L(\cx, \cy))$ corresponds to the bilinear map $(v, w) \mapsto T(v)(w)$, which is continuous, so we identify $L(\cx, L(\cx, \cy)) = L^2(\cx, \cy)$ with the space of continuous [[Multilinear Map|bilinear]] maps.

> [!theorem]
>
> Let $p \in \nat$, and define inductively the $p$-th derivative
> $$
> D^pf(x) = D(D^{p - 1}f)(x)
> $$
> where $D^pf(x) \in L^p(\cx, \cy)$ is a continuous [[Multilinear Map|multilinear]] map. If
> $$
> D^kf: U \to L^k(\cx, \cy)
> $$
> exists and is continuous for each $k \le p$, then we say that $f$ is of class $C^p$, written $f \in C^p$.

> [!theorem]
>
> Let $\seqf{v_k}$ be fixed elements of $\cx$. If $f$ is $n$ times differentiable on $U$, and
> $$
> g(x) = D^{n - 1}f(x)(v_2, \cdots, v_n)
> $$
> then $g$ is differentiable on $U$, and
> $$
> Dg(x)(v) = D^nf(x)(v, v_2, \cdots, v_n)
> $$
> *Proof*. Consider $g$ as the composition $\lambda \circ D^{n - 1}f$ of
> $$
> D^{n - 1}f: U \to L^{n - 1}(\cx, \cy) \quad \lambda: L^{n - 1}(\cx, \cy) \to \cy
> $$
> where $\lambda$ is the evaluation map at $(v_2, \cdots, v_n)$. The map $\lambda$ is continuous and linear, so it is its own derivative, and the chain rule gives
> $$
> D(\lambda \circ D^{n - 1}f)(x) = \lambda \circ D^{n}f(x)
> $$
> Therefore
> $$
> Dg(x)(v) = \lambda(D^nf(x)(v)) = (D^nf(x)(v))(v_2, \cdots, v_n) = D^nf(x)(v, v_2, \cdots, v_n)
> $$

> [!theorem]
>
> Let $f: U \to \cy$ be $p$ times differentiable and $\lambda: \cy \to \mathcal{Z}$ be a bounded linear map. Then for any $x \in U$,
> $$
> D^p(\lambda \circ f)(x) = \lambda \circ D^pf(x)
> $$
> *Proof*. By induction on $p$. The case $p = 1$ is the chain rule, since a bounded linear map is its own derivative. For the inductive step, the map $\lambda_*: L^{p - 1}(\cx, \cy) \to L^{p - 1}(\cx, \mathcal{Z})$, $T \mapsto \lambda \circ T$ is again bounded and linear, and $D^{p - 1}(\lambda \circ f) = \lambda_* \circ D^{p - 1}f$ by the inductive hypothesis. Differentiating once more pulls out $\lambda_*$, which gives $D^p(\lambda \circ f)(x) = \lambda \circ D^pf(x)$.

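
> [!example]
>
> A worked example may help fix the identification $L(\cx, L(\cx, \cy)) = L^2(\cx, \cy)$. This is only an illustrative sketch: the bounded bilinear map $B: \cx \times \cx \to \cy$ is an assumption introduced here, not part of the definitions above. Set $f(x) = B(x, x)$. Expanding by bilinearity,
> $$
> f(x + v) - f(x) = B(x, v) + B(v, x) + B(v, v), \quad \norm{B(v, v)} \le \norm{B}\norm{v}^2
> $$
> so the remainder is $o(\norm{v})$ and $Df(x)(v) = B(x, v) + B(v, x)$. The map $x \mapsto Df(x)$ is bounded and linear in $x$, hence it is its own derivative, and
> $$
> D^2f(x)(w, v) = (D(Df)(x)(w))(v) = B(w, v) + B(v, w)
> $$
> Here $D^2f(x)$ does not depend on $x$ and is a bounded bilinear map, so $f \in C^2$.
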
# Higher Derivatives are Symmetric

> [!theorem]
>
> Let $U \subset \cx$ be open, $f: U \to \cy$ be twice differentiable with $D^2f$ being [[Continuity|continuous]]. Then for each $x \in U$, the bilinear map $D^2f(x)$ is symmetric: $D^2f(x)(v, w) = D^2f(x)(w, v)$ for all $v, w \in \cx$.
>
> *Proof*. Let $r > 0$ be such that $B(x, 2r) \subset U$, and let $v, w \in \cx$ with $\norm{v}, \norm{w} < r$. For $y \in B(x, r)$, denote
> $$
> g(y) = f(y + v) - f(y)
> $$
> Then
> $$
> \begin{align*}
> &f(x + v + w) - f(x + w) - f(x + v) + f(x) \\
> &= g(x + w) - g(x) \\
> &= \int_0^1Dg(x + tw)(w) dt \\
> &= \int_0^1\braks{Df(x + v + tw) - Df(x + tw)}(w)dt \\
> &= \int_0^1 \int_0^1 D^2f(x + sv + tw)(v, w) ds dt
> \end{align*}
> $$
> by applying the integral form of the [[Mean Value Theorem|mean value theorem]] twice. Let
> $$
> \psi(sv, tw) = D^2f(x + sv + tw) - D^2f(x)
> $$
> then
> $$
> \begin{align*}
> g(x + w) - g(x) &= \int_0^1 \int_0^1 D^2f(x + sv + tw)(v, w) ds dt \\
> &= \int_0^1\int_0^1D^2f(x)(v, w)dsdt \\
> &+ \int_0^1\int_0^1\psi(sv, tw)(v, w)dsdt \\
> &= D^2f(x)(v, w) + \underbrace{\int_0^1\int_0^1\psi(sv, tw)(v, w)dsdt}_{\phi(v, w)}
> \end{align*}
> $$
> where
> $$
> \norm{\phi(v, w)} \le \sup_{s, t \in [0, 1]}\norm{\psi(sv, tw)} \cdot \norm{v} \cdot \norm{w}
> $$
> Swapping the roles of $v$ and $w$ in the above argument, we can work with
> $$
> g_w(y) = f(y + w) - f(y)
> $$
> and
> $$
> \begin{align*}
> &f(x + v + w) - f(x + w) - f(x + v) + f(x) \\
> &= g_w(x + v) - g_w(x) \\
> &= D^2f(x)(w, v) + \phi_w(v, w)
> \end{align*}
> $$
> where the remainder $\phi_w$ is built from the same differences $D^2f(x + sv + tw) - D^2f(x)$, so that
> $$
> \norm{\phi_w(v, w)} \le \sup_{s, t \in [0, 1]} \norm{\psi(sv, tw)} \cdot \norm{v} \cdot \norm{w}
> $$
> The two ways of writing the same expression yield
> $$
> D^2f(x)(v, w) - D^2f(x)(w, v) = \phi_w(v, w) - \phi(v, w)
> $$
> and hence
> $$
> \norm{D^2f(x)(v, w) - D^2f(x)(w, v)} \le 2\sup_{s, t \in [0, 1]}\norm{\psi(sv, tw)} \cdot \norm{v} \cdot \norm{w}
> $$
> Now let $v, w \in \cx$ be arbitrary and apply this estimate to $(\tau v, \tau w)$ for $\tau > 0$ small enough that $\norm{\tau v}, \norm{\tau w} < r$. Both sides are homogeneous of degree $2$ in $\tau$, so dividing by $\tau^2$ gives
> $$
> \norm{D^2f(x)(v, w) - D^2f(x)(w, v)} \le 2\sup_{s, t \in [0, 1]}\norm{\psi(s\tau v, t\tau w)} \cdot \norm{v} \cdot \norm{w}
> $$
> Since $D^2f$ is continuous at $x$, the supremum tends to $0$ as $\tau \to 0$, while the left-hand side does not depend on $\tau$. Therefore $D^2f(x)(v, w) - D^2f(x)(w, v) = 0$.

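
> [!example]
>
> The second difference used in the proof can be computed exactly in a simple case. This is a sketch under an added assumption: $B: \cx \times \cx \to \cy$ is a hypothetical bounded bilinear map, not necessarily symmetric, and $f(x) = B(x, x)$, for which $D^2f(x)(v, w) = B(v, w) + B(w, v)$. Expanding by bilinearity,
> $$
> \begin{align*}
> f(x + v + w) - f(x + w) &= B(v, x + w) + B(x + w, v) + B(v, v) \\
> f(x + v) - f(x) &= B(v, x) + B(x, v) + B(v, v)
> \end{align*}
> $$
> and subtracting,
> $$
> f(x + v + w) - f(x + w) - f(x + v) + f(x) = B(v, w) + B(w, v) = D^2f(x)(v, w)
> $$
> In this case the remainder term $\phi(v, w)$ vanishes identically, and the second derivative is symmetric in $(v, w)$ even when $B$ itself is not.
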
> [!theorem]
>
> Let $f \in C^p$ on $U$. Then for each $x \in U$, the map $D^pf(x)$ is symmetric.
>
> *Proof*. By induction on $p$. The case $p = 1$ is trivial and the case $p = 2$ is the previous theorem. Let $p \ge 3$, suppose that $D^{p - 1}f(y)$ is symmetric for every $y \in U$, and let $g = D^{p - 2}f$. Since $g$ is of class $C^2$, the previous theorem gives
> $$
> D^2g(x)(v, w) = D^2g(x)(w, v)
> $$
> Since $D^pf = D^2D^{p - 2}f$,
> $$
> \begin{align*}
> D^pf(x)(v_1, \cdots, v_p) &= (D^2D^{p - 2}f(x))(v_1, v_2) \cdot (v_3, \cdots, v_p) \\
> &= (D^2D^{p - 2}f(x))(v_2, v_1) \cdot (v_3, \cdots, v_p) \\
> &= D^pf(x)(v_2, v_1, \cdots, v_p)
> \end{align*}
> $$
> so we can swap the first two inputs. Moreover, $D^pf(x)(v_1, v_2, \cdots, v_p) = Dh(x)(v_1)$ where $h(y) = D^{p - 1}f(y)(v_2, \cdots, v_p)$. By the inductive hypothesis, $h$ is unchanged by any permutation of $v_2, \cdots, v_p$, so we can also permute the last $p - 1$ inputs.
>
> As $S_p$ is generated by the transposition $(12)$ together with the permutations of $\{2, \cdots, p\}$, permutations of the inputs do not affect the value of $D^pf(x)$.
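
> [!example]
>
> A small case with $p = 3$ shows the symmetry explicitly. This is an illustrative sketch: the bounded trilinear map $T: \cx \times \cx \times \cx \to \cy$ is an assumption introduced here, and it is not assumed to be symmetric. Set $f(x) = T(x, x, x)$. Differentiating one slot at a time,
> $$
> \begin{align*}
> Df(x)(v) &= T(v, x, x) + T(x, v, x) + T(x, x, v) \\
> D^3f(x)(v_1, v_2, v_3) &= \sum_{\sigma \in S_3} T(v_{\sigma(1)}, v_{\sigma(2)}, v_{\sigma(3)})
> \end{align*}
> $$
> The third derivative is the sum of $T$ over all orderings of its inputs, so permuting $(v_1, v_2, v_3)$ only reorders the terms of the sum and leaves $D^3f(x)$ unchanged.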