> [!Quote]
> Using the chain rule is like peeling an onion: you have to deal with one layer at a time, and if it is too big you will start crying.

> [!theorem]
>
> Let $E, F, G$ be [[Banach Space|Banach spaces]], $U \subset E$ and $V \subset F$ be [[Open Set|open]], and $f: U \to V$, $g: V \to G$ be maps. Let $x \in U$. If $f$ is [[Derivative|differentiable]] at $x$ and $g$ is differentiable at $f(x)$, then $g \circ f$ is differentiable at $x$ and
> $$
> D(g \circ f)(x) = Dg(f(x)) \circ Df(x)
> $$
> *Proof*. For small $y \in E$, denote
> $$
> k(y) = f(x + y) - f(x) = Df(x)y + o_f(y)
> $$
> then
> $$
> \begin{align*}
> &g(f(x + y)) - g(f(x)) \\
> &= g(f(x) + k(y)) - g(f(x)) \\
> &= Dg(f(x))k(y) + o_g(k(y)) \\
> &= Dg(f(x))(Df(x)y + o_f(y)) + o_g(k(y)) \\
> &= Dg(f(x)) \circ Df(x) y + \underbrace{Dg(f(x))(o_f(y)) + o_g(k(y))}_{o(y)}
> \end{align*}
> $$
> The remainder is $o(y)$ because $Dg(f(x))$ is a bounded linear map, so $Dg(f(x))(o_f(y)) = o(y)$, and because $\|k(y)\| \le C\|y\|$ for some constant $C$ and all small $y$, so $o_g(k(y)) = o(y)$.

> [!Definition]
>
> If $g$ is [[Derivative|differentiable]] at $x$, and $f$ is differentiable at $g(x)$, then the composite [[Function|function]] $f(g(x))$ is differentiable at $x$, and its [[Derivative|derivative]] is given by the product
>
> $$
> \frac{d}{dx}{\left(f(g(x))\right)} = \left(\frac{d}{dx}f(x)\right)(g(x)) \cdot \frac{d}{dx}g(x)
> $$
>
> Equivalently, if $y = f(u)$ and $u = g(x)$ are both differentiable functions, then:
>
> $$
> \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}
> $$

> [!theorem] Proof
>
> If $y = f(x)$ and $x$ changes from $a$ to $a + \Delta{x}$, the increment of $y$, $\Delta{y}$, is defined as:
>
> $$
> \Delta{y} = f(a + \Delta{x}) - f(a)
> $$
>
> Thus, the derivative of $f$ at $a$ is:
>
> $$
> \lim_{\Delta{x} \to 0}{\frac{\Delta{y}}{\Delta{x}}} = \frac{d}{dx}f(a)
> $$
>
> Denote by $\varepsilon$ the difference between the difference quotient and the derivative. Then:
>
> $$
> \lim_{\Delta{x} \to 0}\varepsilon = \lim_{\Delta{x} \to 0}\left(\frac{\Delta{y}}{\Delta{x}} - \frac{d}{dx}f(a)\right) = \frac{d}{dx}f(a) - \frac{d}{dx}f(a) = 0
> $$
>
> Rearranging the definition of $\varepsilon$:
>
> $$
> \begin{align*}
> \varepsilon &= \frac{\Delta{y}}{\Delta{x}} - \frac{d}{dx}f(a) \\
> \varepsilon\Delta{x} &= \Delta{y} - \Delta{x}\frac{d}{dx}f(a) \\
> \Delta{y} &= \Delta{x}\frac{d}{dx}f(a) + \varepsilon\Delta{x}
> \end{align*}
> $$
>
> Defining $\varepsilon = 0$ when $\Delta{x} = 0$ makes $\varepsilon$ a continuous function of $\Delta{x}$, and the last identity then also holds for $\Delta{x} = 0$. Thus, for a differentiable function $f$:
>
> $$
> \Delta{y} = \Delta{x}\frac{d}{dx}f(a) + \varepsilon\Delta{x} \quad \text{where} \ \varepsilon \to 0 \ \text{as} \ \Delta{x} \to 0
> $$
>
> Suppose $u = g(x)$ is differentiable at $a$, and $y = f(u)$ is differentiable at $b = g(a)$. If $\Delta{x}$ is an increment in $x$, and $\Delta{u}$ and $\Delta{y}$ are the corresponding increments in $u$ and $y$, then:
>
> $$
> \Delta{u} = \left(\frac{d}{dx}g(x)\right)(a) \cdot \Delta{x} + \varepsilon_1 \Delta{x}
> = \Delta{x}\left(\left(\frac{d}{dx}g(x)\right)(a) + \varepsilon_1\right)
> \quad \text{where} \ \varepsilon_1 \to 0 \ \text{as} \ \Delta{x} \to 0
> $$
>
> Similarly:
>
> $$
> \Delta{y} = \left(\frac{d}{dx}f(x)\right)(b) \cdot \Delta{u} + \varepsilon_2 \Delta{u}
> = \Delta{u}\left(\left(\frac{d}{dx}f(x)\right)(b) + \varepsilon_2\right)
> \quad \text{where} \ \varepsilon_2 \to 0 \ \text{as} \ \Delta{u} \to 0
> $$
>
> Substituting the expression for $\Delta{u}$:
>
> $$
> \Delta{y} = \Delta{x}\left(\left(\frac{d}{dx}g(x)\right)(a) + \varepsilon_1\right)
> \left(\left(\frac{d}{dx}f(x)\right)(b) + \varepsilon_2\right)
> $$
>
> So:
>
> $$
> \frac{\Delta{y}}{\Delta{x}} = \left(\left(\frac{d}{dx}g(x)\right)(a) + \varepsilon_1\right)
> \left(\left(\frac{d}{dx}f(x)\right)(b) + \varepsilon_2\right)
> $$
>
> By the expression for $\Delta{u}$, $\Delta{u} \to 0$ as $\Delta{x} \to 0$. Thus, both $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$ as $\Delta{x} \to 0$, and:
>
> $$
> \begin{align*}
> \frac{dy}{dx}
> &= \lim_{\Delta{x} \to 0}{\frac{\Delta{y}}{\Delta{x}}} \\
> &= \lim_{\Delta{x} \to 0}{\left(\left(\frac{d}{dx}g(x)\right)(a) + \varepsilon_1\right)
> \left(\left(\frac{d}{dx}f(x)\right)(b) + \varepsilon_2\right)} \\
> &= \left(\left(\frac{d}{dx}g(x)\right)(a) + 0\right)
> \left(\left(\frac{d}{dx}f(x)\right)(b) + 0\right) \\
> &= \left(\frac{d}{dx}g(x)\right)(a) \cdot
> \left(\frac{d}{dx}f(x)\right)(b) \\
> &= \left(\frac{d}{dx}g(x)\right)(a) \cdot
> \left(\frac{d}{dx}f(x)\right)(g(a))
> \end{align*}
> $$

> [!definition]
>
> Let $f(x_1, \dots, x_n): \mathbb{R}^n \to \mathbb{R}$ be a differentiable function of $n$ variables, where each $x_k = g_k(t)$ is a differentiable function of $t$. Then
> $$
> \frac{df}{dt} = \sum_{k = 1}^{n}\frac{\partial f}{\partial x_k}\frac{dg_k}{dt}
> $$

> [!definition]
>
> Let $f(x_1, \dots, x_n): \mathbb{R}^n \to \mathbb{R}$ be a differentiable function of $n$ variables, where each $x_k = g_k(t_1, \dots, t_m)$ is a differentiable function of $m$ variables. Then, for each $j = 1, \dots, m$,
> $$
> \frac{\partial f}{\partial t_j} = \sum_{k = 1}^{n}\frac{\partial f}{\partial x_k}\frac{\partial g_k}{\partial t_j}
> $$
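
The one-parameter formula $\frac{df}{dt} = \sum_k \frac{\partial f}{\partial x_k}\frac{dg_k}{dt}$ can be sanity-checked numerically. The sketch below is only an illustration, not part of the proofs: the functions `f` and `g` and the evaluation point are hypothetical choices, with $f(x_1, x_2) = x_1^2 x_2$ and $g(t) = (\cos t, \sin t)$. It compares the chain-rule sum, using hand-computed partial derivatives, against a central finite difference of $t \mapsto f(g(t))$.

```python
import math


def f(x1, x2):
    """Outer function f: R^2 -> R (hypothetical example): f = x1^2 * x2."""
    return x1 * x1 * x2


def g(t):
    """Inner curve t -> (g_1(t), g_2(t)) (hypothetical example)."""
    return math.cos(t), math.sin(t)


def chain_rule_derivative(t):
    """df/dt as sum_k (df/dx_k) * (dg_k/dt), with partials computed by hand."""
    x1, x2 = g(t)
    df_dx1 = 2.0 * x1 * x2   # partial of x1^2 * x2 with respect to x1
    df_dx2 = x1 * x1         # partial of x1^2 * x2 with respect to x2
    dg1_dt = -math.sin(t)    # derivative of cos(t)
    dg2_dt = math.cos(t)     # derivative of sin(t)
    return df_dx1 * dg1_dt + df_dx2 * dg2_dt


def finite_difference_derivative(t, h=1e-6):
    """Central-difference approximation of d/dt f(g(t))."""
    return (f(*g(t + h)) - f(*g(t - h))) / (2.0 * h)


if __name__ == "__main__":
    t = 0.7  # arbitrary evaluation point
    print("chain rule:       ", chain_rule_derivative(t))
    print("finite difference:", finite_difference_derivative(t))
```

The two printed values should agree to several decimal places. The small residual gap comes from the truncation and rounding error of the finite difference, not from the chain rule itself.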