18.4 Higher Derivatives

Definition 18.4.1 ($n$-Fold Differentiability). Let $E, F$ be TVSs over $K \in \RC$ with $F$ being separated, $\sigma \subset B(E)$ be an upward-directed family that contains all finite sets, $\mathcal{H}\subset B_{\sigma}(E; F)$ be a subspace, and $\mathcal{R}_{\sigma} = \mathcal{R}_{\sigma}(E; F)$.

Let $U \subset E$ be open, $f: U \to F$, $x_{0} \in U$, and $n > 1$, then $f$ is $n$-fold $\sigma$-differentiable at $x_{0}$ if

  1. There exists $V \in \cn_{E}(x_{0})$ with $V \subset U$ such that $f$ is $(n-1)$-fold $\sigma$-differentiable on $V$.

  2. The derivative $D_{\sigma}^{n-1}f: V \to B^{n-1}_{\sigma}(E; F)$ is $\sigma$-differentiable at $x_{0}$.

In which case, $D_{\sigma}(D_{\sigma}^{n-1}f)(x_{0}) \in L(E; B^{n-1}_{\sigma}(E; F))$ is the $n$-fold $\sigma$-derivative of $f$ at $x_{0}$.

The mapping $f: U \to F$ is $n$-fold $\sigma$-differentiable on $U$ if it is $n$-fold $\sigma$-differentiable at every point in $U$. In that case, under the identification $B_{\sigma}(E; B^{n-1}_{\sigma}(E; F)) = B_{\sigma}^{n}(E; F)$ given by Proposition 8.11.3,

\[D_{\sigma}^{n}f: U \to B^{n}_{\sigma}(E; F)\]

is the $n$-fold $\sigma$-derivative of $f$.
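
For instance, when $n = 2$ the definition can be unwound as follows. This is only a sketch: it assumes that the identification of Proposition 8.11.3 acts by evaluating one argument at a time, as the corresponding identification for $L^{n}(E; F)$ does in the proof of Theorem 18.4.2, and the vectors $h, k \in E$ are illustrative.

\[D_{\sigma}^{2}f(x_{0})(h, k) = \big[D_{\sigma}(D_{\sigma}f)(x_{0})(h)\big](k)\]

Under that assumption, $D_{\sigma}f: V \to B_{\sigma}(E; F)$ is first differentiated at $x_{0}$ in the direction $h$, and the resulting element of $B_{\sigma}(E; F)$ is then evaluated at $k$.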

Theorem 18.4.2. Let $E, F$ be Banach spaces, $U \subset E$ be open, $n \in \natp$, and $f: U \to F$ be a function $n$-times Fréchet-differentiable at $x \in U$, then $D^{n}f(x) \in L^{n}(E; F)$ is symmetric.

Proof. First suppose that $n = 2$. Let $r > 0$ such that $B(x, 2r) \subset U$, and define $A: B_{E}(0, r) \times B_{E}(0, r) \to F$ by

\[A(h, k) = f(x + h + k) - f(x + h) - f(x + k) + f(x)\]

then there exists $r_{1} \in \mathcal{R}_{B(E)}$ such that

\begin{align*}A(h, k)&= Df(x + h)(k) - Df(x)(k) \\&+ [f(x + h + k) - f(x + h) - Df(x + h)(k)] \\&- [f(x + k) - f(x) - Df(x)(k)] \\&= D^{2}f(x)(h, k) + r_{1}(h)(k) \\&+ [f(x + h + k) - f(x + h) - Df(x + h)(k)] \\&- [f(x + k) - f(x) - Df(x)(k)]\end{align*}

Let $B_{h}: B_{E}(0, r) \to F$ be defined by

\[B_{h}(k) = f(x + h + k) - f(x + k) - Df(x + h)(k) + Df(x)(k)\]

then

\[B_{h}(k) - B_{h}(0) = f(x + h + k) - f(x + k) - Df(x + h)(k) + Df(x)(k) -f(x + h) + f(x)\]

Now, since $Df$ is differentiable at $x$, there exists $r_{2} \in \mathcal{R}_{B(E)}$ with $Df(x + u) = Df(x) + D^{2}f(x)(u) + r_{2}(u)$ for all $u \in B_{E}(0, 2r)$, so for any $k \in B(0, r)$,

\begin{align*}DB_{h}(k)&= Df(x + h + k) - Df(x + k) - Df(x + h) + Df(x) \\&= D^{2}f(x)(h + k) - D^{2}f(x)(h) - D^{2}f(x)(k) + r_{2}(h + k) - r_{2}(k) - r_{2}(h) \\&= r_{2}(h + k) - r_{2}(k) - r_{2}(h)\end{align*}

By the Mean Value Theorem,

\[\norm{B_h(k) - B_h(0)}_{F} \le \norm{k}_{E} \cdot o(\norm{k}_{E} + \norm{h}_{E})\]

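Since $B_{h}(k) - B_{h}(0) = A(h, k) - [Df(x + h) - Df(x)](k)$ and $Df(x + h) - Df(x) = D^{2}f(x)(h) + o(\norm{h}_{E})$, it follows that

\[\norm{A(h, k) - D^{2}f(x)(h, k)}_{F} \le \norm{k}_{E} \cdot o(\norm{k}_{E} + \norm{h}_{E})\]

As $A(h, k) = A(k, h)$, exchanging the roles of $h$ and $k$ in the above argument gives

\[\norm{D^2f(x)(h, k) - D^2f(x)(k, h)}_{F} \le (\norm{h}_{E} + \norm{k}_{E}) \cdot o(\norm{k}_{E} + \norm{h}_{E})\]

Replacing $(h, k)$ by $(th, tk)$, dividing by $t^{2}$, and letting $t \downarrow 0$ shows that $D^{2}f(x)(h, k) - D^{2}f(x)(k, h) = 0$ for all $h, k \in E$.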

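Now let $n \ge 3$ and suppose that the theorem holds for all orders less than $n$. Identify $L^{n}(E; F) = L^{2}(E; L^{n-2}(E; F))$, so that $D^{n}f(x) = D^{2}(D^{n-2}f)(x)$. By the case $n = 2$ applied to $D^{n-2}f$, for any $\seqf[n]{x_j}\subset E$,

\[D^{n}f(x)(x_{1}, \cdots, x_{n}) = D^{n}f(x)(x_{1}, x_{2})(x_{3}, \cdots, x_{n}) = D^{n}f(x)(x_{2}, x_{1})(x_{3}, \cdots, x_{n}) = D^{n}f(x)(x_{2}, x_{1}, x_{3}, \cdots, x_{n})\]

By the inductive hypothesis, $D^{n-1}f(y)$ is symmetric for all $y$ in a neighborhood of $x$, so $D^{n}f(x)(x_{1}) = D(D^{n-1}f)(x)(x_{1})$, being a limit of symmetric maps, is symmetric in $(x_{2}, \cdots, x_{n})$. Since $S_{n}$ is generated by the transposition $(1\,2)$ together with the permutations that fix $1$, $D^{n}f(x)$ is symmetric.$\square$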
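
As an illustration of the second difference $A(h, k)$ used in the proof, consider the following hypothetical example, which is not part of the argument: let $T \in L^{2}(E; F)$ be a symmetric continuous bilinear map and $f(y) = T(y, y)$. Expanding by bilinearity,

\begin{align*}A(h, k)&= T(x + h + k, x + h + k) - T(x + h, x + h) - T(x + k, x + k) + T(x, x) \\&= 2T(h, k)\end{align*}

For this $f$ one has $Df(y)(k) = 2T(y, k)$ and hence $D^{2}f(x)(h, k) = 2T(h, k)$, so $A(h, k) = D^{2}f(x)(h, k)$ with no remainder at all. In general the two differ by the error terms estimated in the proof, but the symmetry $A(h, k) = A(k, h)$ is what forces $D^{2}f(x)$ to be symmetric.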

Theorem 18.4.3. Let $E$ be a topological vector space over $K \in \RC$, $\sigma \subset B(E)$ be an upward-directed system that includes all bounded sets contained in finite-dimensional subspaces, $F$ be a separated locally convex space over $K$, $U \subset E$ be open, and $f: U \to F$ be $n$-fold $\sigma$-differentiable at $x_{0} \in U$, then $D_{\sigma}^{n}f(x_{0}) \in B_{\sigma}^{n}(E; F)$ is symmetric.

Proof. Let $\seqf{h_j}\subset E$, $E_{0}$ be the subspace generated by $x_{0}$ and $\seqf{h_j}$, and $g = f|_{E_0 \cap U}: E_{0} \cap U \to F$. Since $\sigma$ includes all bounded sets contained in finite-dimensional subspaces, for any $\phi \in F^{*}$, the mapping $\phi \circ g: E_{0} \cap U \to K$ is $n$-times Fréchet-differentiable at $x_{0}$, with

\[D_{B(E_0)}^{n}(\phi \circ g)(x_{0}) = \phi \circ D_{\sigma}^{n} f(x_{0})|_{E_{0}^{n}}\]

by the chain rule. By Theorem 18.4.2, $\phi \circ D_{\sigma}^{n} f(x_{0})|_{E_{0}^{n}} \in L^{n}(E_{0}; K)$ is symmetric. As this holds for any $\seqf{h_j}\subset E$ and $\phi \in F^{*}$, and $F^{*}$ separates the points of $F$ by the Hahn-Banach theorem, $D_{\sigma}^{n} f(x_{0}) \in B_{\sigma}^{n}(E; F)$ is symmetric.$\square$

Proposition 18.4.4 (Power Rule). Let $E$ be a topological vector space, $\sigma \subset B(E)$ be an upward-directed family that includes all bounded sets contained in finite-dimensional subspaces, $F$ be a Hausdorff locally convex space, and

\[T \in \underbrace{L(E; L(E; \cdots L(E; F) \cdots ))}_{n \text{ times}}\subset B^{n}(E; F)\]

be symmetric. For any $x \in E$ and $1 \le k \le n$, let $x^{(k)}$ denote the tuple of $x$ repeated $k$ times, then:

  1. The mapping $f: E \to F \quad x \mapsto T(x^{(n)})$ is infinitely $\sigma$-differentiable on $E$.

  2. For each $1 \le k \le n$ and $x, h_{1}, \cdots, h_{k} \in E$,

    \[D^{k}_{\sigma}f(x)(h_{1}, \cdots, h_{k}) = \frac{n!}{(n-k)!}T(x^{(n-k)}, h_{1}, \cdots, h_{k})\]

    In particular, $D^{n}_{\sigma}f = n! \cdot T$.

  3. For each $k > n$ and $x \in E$, $D^{k}_{\sigma}f(x) = 0$.
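
As a sanity check, the formula for the $k$-th derivative can be compared with the scalar case. The following is a sketch under the assumption that $E = F = K$ and that $\sigma$-differentiability then reduces to ordinary differentiability. Take $T(x_{1}, \cdots, x_{n}) = x_{1} \cdots x_{n}$, so that $f(x) = x^{n}$ and

\[\frac{n!}{(n-k)!}T(x^{(n-k)}, h_{1}, \cdots, h_{k}) = \frac{n!}{(n-k)!}x^{n-k} \cdot h_{1} \cdots h_{k}\]

With $h_{1} = \cdots = h_{k} = 1$ this is the familiar $\frac{d^{k}}{dx^{k}}x^{n} = \frac{n!}{(n-k)!}x^{n-k}$. For $k = n$ the right-hand side is the constant $n!$, and every further derivative vanishes, matching (3).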

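Proof. (1), (2): With the convention $D^{0}_{\sigma}f = f$, (2) holds for $k = 0$. Suppose inductively that (2) holds for some $0 \le k < n$. Let $G = B^{k}_{\sigma}(E; F)$, then by (2), $D^{k}_{\sigma} f$ is the restriction to the diagonal of an element of $B^{n-k}_{\sigma}(E; G)$, also denoted $D^{k}_{\sigma} f$, under the identification $B^{n}_{\sigma}(E; F) = B^{n-k}_{\sigma}(E; B^{k}_{\sigma}(E; F))$ in Proposition 8.11.3. Since $T$ is symmetric, $D^{k}_{\sigma} f$ is also symmetric, so using the Binomial formula,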

\begin{align*}D^{k}_{\sigma} f(x + h)&= \sum_{j = 0}^{n-k}{n - k \choose j}D^{k}_{\sigma} f(x^{(n-k-j)}, h^{(j)}) \\&= D^{k}_{\sigma} f(x) + (n-k)D^{k}_{\sigma} f(x^{(n-k-1)}, h) \\&+ \underbrace{\sum_{j = 2}^{n-k}{n - k \choose j}D^k_\sigma f(x^{(n-k-j)}, h^{(j)})}_{r(h)}\end{align*}

For each $2 \le j \le n - k$, let $A \in \sigma$ and $U \in \cn_{G}(0)$, then since $D^{k}_{\sigma} f \in B^{n-k}_{\sigma}(E; G)$, there exists $t > 0$ such that

\[\frac{D^{k}_{\sigma} f(x^{(n-k-j)}, (sA)^{(j)})}{s}= s^{j-1}D^{k}_{\sigma} f(x^{(n-k-j)}, A^{(j)}) \subset U\]

for all $s \in (0, t)$. Hence $r \in \mathcal{R}_{\sigma}(E; G)$, and

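\[D^{k}_{\sigma} f(x + h) = D^{k}_{\sigma} f(x) + \frac{n!}{(n-k-1)!}T(x^{(n-k-1)}, h, \cdot) + r(h)\]

by the inductive hypothesis, where $T(x^{(n-k-1)}, h, \cdot) \in G$ denotes the map $(h_{1}, \cdots, h_{k}) \mapsto T(x^{(n-k-1)}, h, h_{1}, \cdots, h_{k})$. Thus $D^{k}_{\sigma} f$ is $\sigma$-differentiable at $x$ and (2) holds for $k + 1$.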

(3): Since $D^{n}_{\sigma} f$ is constant, $D^{k}_{\sigma} f = 0$ for all $k > n$.$\square$
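
To make the inductive step concrete, here is the case $n = 3$, $k = 1$ written out; the choice of $n$ and $k$ is only illustrative. By (2) with $k = 1$, $D_{\sigma}f(x) = 3T(x, x, \cdot)$, and expanding by multilinearity and symmetry of $T$,

\begin{align*}D_{\sigma}f(x + h)&= 3T(x + h, x + h, \cdot) \\&= 3T(x, x, \cdot) + 6T(x, h, \cdot) + 3T(h, h, \cdot)\end{align*}

The first term is $D_{\sigma}f(x)$, the second is linear in $h$, and the remainder $r(h) = 3T(h, h, \cdot)$ is quadratic in $h$. Reading off the linear term gives $D^{2}_{\sigma}f(x)(h_{1}, h_{2}) = 6T(x, h_{1}, h_{2})$, in agreement with $\frac{3!}{(3-2)!} = 6$.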