Theorem 18.3.5: Fenchel's Inequality | Jerry's Digital Garden

Theorem 18.3.5 (Fenchel’s Inequality).label Let $\dpn{E, F}{\lambda}$ be a duality over $\real$ and $f: E \to (-\infty, \infty]$ with $f \ne \infty$, then for any $x \in E$ and $\phi \in F$,

\[\dpn{x, \phi}{\lambda}\le f(x) + f^{*}(\phi)\]

with equality if and only if $\phi \in \partial f(x)$.

Proof. Let $x \in E$ and $\phi \in F$. Assume without loss of generality that $f(x) < \infty$, then

\[f(x) + f^{*}(\phi) \ge f(x) + \dpn{x, \phi}{\lambda}- f(x) = \dpn{x, \phi}{\lambda}\]

Now let $x \in E$ and $\phi \in F$ such that $\dpn{x, \phi}{\lambda}= f(x) + f^{*}(\phi)$, then $f(x), f^{*}(\phi) < \infty$ and

\begin{align*}f(x) + f^{*}(\phi)&= \dpn{x, \phi}{\lambda}\\ f^{*}(\phi)&= \dpn{x, \phi}{\lambda}- f(x)\end{align*}

By definition,

\[\dpn{x, \phi}{\lambda}- f(x) = f^{*}(\phi) = \sup_{h \in E}\dpn{x+h, \phi}{\lambda}- f(x+h)\]

so for every $h \in E$,

\begin{align*}\dpn{x, \phi}{\lambda}- f(x)&\ge \dpn{x+h, \phi}{\lambda}- f(x+h) \\ f(x + h) - f(x)&\ge \dpn{h, \phi}{\lambda}\end{align*}

Thus $\phi$ satisfies the subgradient inequality, and $\phi \in \partial f(x)$.

On the other hand, if $f(x) < \infty$ and $\phi \in \partial f(x)$, then $f(x + h) - f(x) \ge \dpn{h, \phi}{\lambda}$ for all $h \in E$, and

\begin{align*}f^{*}(\phi)&= \sup_{h \in E}\dpn{x+h, \phi}{\lambda}- f(x+h) = \dpn{x, \phi}{\lambda}- f(x) \\ f^{*}(\phi) + f(x)&= \dpn{x, \phi}{\lambda}\end{align*}

$\square$

Comments

cce

Yesterday at 12:52

Lemma 0.0.1 (Fenchel’s Inequality).label Let $f: E\to (-\infty,\infty]$ be proper. Then, for every $(x,\phi)\in E\times E^{*}$,

\begin{equation}\phi(x) \leq f(x) + f^{*}(\phi) \tag{Fenchel}\end{equation}

Moreover, equality holds iff $x\in \operatorname{dom}(f)$ and $\phi\in \partial f(x)$.

Proof. (a) We prove Fenchel’s inequality for arbitrary $(x,\phi)\in E\times F$.

Nitpick: Observe that $f\neq\infty$ implies $f^{*}(\phi)\neq -\infty$ for every $\phi\in F$, so the addition $f(x) + f^{*}\phi)$ makes sense in $(-\infty,\infty]$.

Step 1.

Remark 0.0.1.label Let $X,Y$ be non-empty subsets, and let $\theta:X\times Y\to \real$ and $\eta: X\times Y\to (-\infty,\infty]$. If our goal is to prove the pointwise everywhere inequality

\[\theta(x,y) \leq \eta(x,y),\]

then it suffices to consider all $(x,y)\in \operatorname{dom}(\eta)$.

This means, we may assume $(x,\phi)\in \operatorname{dom}(f)\times \operatorname{dom}(f^{*})$. Now, the effective domain of $f^{*}$ may actually be empty, and in that case, there is nothing to prove.

Step 2. Assume $\operatorname{dom}(f^{*})$ is non-empty, and $(x,\phi)\in \operatorname{dom}(f)\times \operatorname{dom}(f^{*})$. Prove Fenchel’s inequality.

(b) We first derive equivalent conditions for equality within (Fenchel) under the assumption

\begin{equation}(x,\phi)\in \operatorname{dom}(f)\times \operatorname{dom}(f^{*})\tag{A}\end{equation}

Nitpick: Since $f(x), f^{*}(\phi)>-\infty$, both $f(x), f^{*}(\phi)$ are real numbers, which allow for subtraction. In what follows ES = Equivalent Statement

(ES 1.) Since equalities in $\real$ are preserved under multiplication by $-1$, we may swap supremum for infimum (over a non-empty set) to get

\[\phi(-x) = \inf_{y\in E}\bigl(f(y) - f(x) - \phi(y)\bigr)\]

(ES 2.) Using the lower-bound property of the infimum, this is equivalent to

\[\forall y\in E,\: \phi(-x) \leq \bigl( f(y) - f(x) \bigr) - \phi(y).\]

(ES 3.) Now, because $\phi(y) \in \real$ for every $y\in E$, we may subtract to obtain

\[\forall y\in E,\: \phi(y-x) \leq f(y) - f(x).\]

Thus, we have proven:

\[\forall (x,\phi)\in \operatorname{dom}(f)\times \operatorname{dom}(f^{*}),\quad \phi\in \partial f(x) \iff \phi(x) = f(x) + f^{*}(x).\]

Completing the Proof. The proof is NOT complete here, but it will be complete once we show

If $(x,\phi)\in E\times E^{*}$, satisfies $\phi(x) = f(x) + f^{*}(\phi)$, then (A) holds.
For every $x\in \operatorname{dom}(f)$, if $\phi$ is a subgradient at $f(x)$, then $(x,\phi)$ satisfies (A).

The first bullet is obvious: if we have an equality of reals, and that the summation on the right hand side makes sense ($f^{*}(\phi)>-\infty$), then both $f(x)$ and $f^{*}(\phi)$ must also be finite.

As for the second bullet: Observe that the notion of a subgradient is only defined for those $x\in \operatorname{dom}(f)$. The rest follows from rearranging the subgradient inequality to see

\[\phi\in \partial f(x) \iff \forall y\in E,\: f(y) \geq \phi(y) - \biggl(\phi(x) - f(x)\biggr).\]

$\square$

Jerry's Digital Garden

Comments

Post a Comment

Jerry's Digital Garden