Theorem 18.3.5 (Fenchel’s Inequality).label Let $\dpn{E, F}{\lambda}$ be a duality over $\real$ and $f: E \to (-\infty, \infty]$ with $f \ne \infty$, then for any $x \in E$ and $\phi \in F$,
\[\dpn{x, \phi}{\lambda}\le f(x) + f^{*}(\phi)\]
with equality if and only if $\phi \in \partial f(x)$.
Proof. Let $x \in E$ and $\phi \in F$. Assume without loss of generality that $f(x) < \infty$, then
\[f(x) + f^{*}(\phi) \ge f(x) + \dpn{x, \phi}{\lambda}- f(x) = \dpn{x, \phi}{\lambda}\]
Now let $x \in E$ and $\phi \in F$ such that $\dpn{x, \phi}{\lambda}= f(x) + f^{*}(\phi)$, then $f(x), f^{*}(\phi) < \infty$ and
\begin{align*}f(x) + f^{*}(\phi)&= \dpn{x, \phi}{\lambda}\\ f^{*}(\phi)&= \dpn{x, \phi}{\lambda}- f(x)\end{align*}
By definition,
\[\dpn{x, \phi}{\lambda}- f(x) = f^{*}(\phi) = \sup_{h \in E}\dpn{x+h, \phi}{\lambda}- f(x+h)\]
so for every $h \in E$,
\begin{align*}\dpn{x, \phi}{\lambda}- f(x)&\ge \dpn{x+h, \phi}{\lambda}- f(x+h) \\ f(x + h) - f(x)&\ge \dpn{h, \phi}{\lambda}\end{align*}
Thus $\phi$ satisfies the subgradient inequality, and $\phi \in \partial f(x)$.
On the other hand, if $f(x) < \infty$ and $\phi \in \partial f(x)$, then $f(x + h) - f(x) \ge \dpn{h, \phi}{\lambda}$ for all $h \in E$, and
\begin{align*}f^{*}(\phi)&= \sup_{h \in E}\dpn{x+h, \phi}{\lambda}- f(x+h) = \dpn{x, \phi}{\lambda}- f(x) \\ f^{*}(\phi) + f(x)&= \dpn{x, \phi}{\lambda}\end{align*}
$\square$
Comments
Lemma 0.0.1 (Fenchel’s Inequality).label Let $f: E\to (-\infty,\infty]$ be proper. Then, for every $(x,\phi)\in E\times E^{*}$,
Moreover, equality holds iff $x\in \operatorname{dom}(f)$ and $\phi\in \partial f(x)$.
Proof. (a) We prove Fenchel’s inequality for arbitrary $(x,\phi)\in E\times F$.
Nitpick: Observe that $f\neq\infty$ implies $f^{*}(\phi)\neq -\infty$ for every $\phi\in F$, so the addition $f(x) + f^{*}\phi)$ makes sense in $(-\infty,\infty]$.
Step 1.
Remark 0.0.1.label Let $X,Y$ be non-empty subsets, and let $\theta:X\times Y\to \real$ and $\eta: X\times Y\to (-\infty,\infty]$. If our goal is to prove the pointwise everywhere inequality
then it suffices to consider all $(x,y)\in \operatorname{dom}(\eta)$.
This means, we may assume $(x,\phi)\in \operatorname{dom}(f)\times \operatorname{dom}(f^{*})$. Now, the effective domain of $f^{*}$ may actually be empty, and in that case, there is nothing to prove.
Step 2. Assume $\operatorname{dom}(f^{*})$ is non-empty, and $(x,\phi)\in \operatorname{dom}(f)\times \operatorname{dom}(f^{*})$. Prove Fenchel’s inequality.
(b) We first derive equivalent conditions for equality within (Fenchel) under the assumption
Nitpick: Since $f(x), f^{*}(\phi)>-\infty$, both $f(x), f^{*}(\phi)$ are real numbers, which allow for subtraction. In what follows ES = Equivalent Statement
(ES 1.) Since equalities in $\real$ are preserved under multiplication by $-1$, we may swap supremum for infimum (over a non-empty set) to get
(ES 2.) Using the lower-bound property of the infimum, this is equivalent to
(ES 3.) Now, because $\phi(y) \in \real$ for every $y\in E$, we may subtract to obtain
Thus, we have proven:
Completing the Proof. The proof is NOT complete here, but it will be complete once we show
If $(x,\phi)\in E\times E^{*}$, satisfies $\phi(x) = f(x) + f^{*}(\phi)$, then (A) holds.
For every $x\in \operatorname{dom}(f)$, if $\phi$ is a subgradient at $f(x)$, then $(x,\phi)$ satisfies (A).
The first bullet is obvious: if we have an equality of reals, and that the summation on the right hand side makes sense ($f^{*}(\phi)>-\infty$), then both $f(x)$ and $f^{*}(\phi)$ must also be finite.
As for the second bullet: Observe that the notion of a subgradient is only defined for those $x\in \operatorname{dom}(f)$. The rest follows from rearranging the subgradient inequality to see
$\square$