4. Lecture 4: Convex Conjugates and Marginal Duality
Lecture 3 showed strong duality first in the polyhedral world of linear
programming. Lecture 4 now passes from linear inequality certificates to the
duality of convex functions and their epigraphs. The main geometric idea is
that the conjugate records affine functions lying below f. For each
\xi\in E^*, the affine function
x\mapsto \langle \xi,x\rangle-f^*(\xi)
is the tightest such affine lower bound with slope \xi. Therefore the
biconjugate
f^{**}(x)=\sup_{\xi\in E^*}\{\langle \xi,x\rangle-f^*(\xi)\}
is the supremum of all affine functions lying below f. Geometrically, those
affine lower bounds correspond exactly to the non-vertical closed halfspaces
that contain \operatorname{epi}(f), so Fenchel--Moreau will be proved below
from the fact that a closed convex set is the intersection of the closed
halfspaces containing it.
4.1. Conjugates and Biconjugation
Let f:E\to \mathbb{R}\cup\{+\infty\} be proper. Its convex conjugate is
defined by
\forall \xi\in E^*,\qquad f^*(\xi):=\sup_{x\in E}\{\langle \xi,x\rangle-f(x)\}.
Because f is proper, there is some x\in E with f(x)<+\infty, so
f^*(\xi)\ge \langle \xi,x\rangle-f(x)>-\infty and f^* takes values in
\mathbb{R}\cup\{+\infty\}. If one allowed an improper function such as
f\equiv +\infty, then the same formula would give f^*\equiv -\infty,
which lies outside the current extended-value convention.
Let
\iota_E:E\to E^{**},
\qquad
(\iota_E(x))(\xi):=\langle \xi,x\rangle.
Since E is finite-dimensional, \iota_E is an isomorphism. We therefore
view the biconjugate back on E through this natural identification and write
f^{**}(x):=(f^*)^*(\iota_E(x))
=
\sup_{\xi\in E^*}\{\langle \xi,x\rangle-f^*(\xi)\}
\qquad (x\in E).
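As a numerical illustration of the definition (a minimal sketch, not part of the formal development), the supremum defining f^* can be approximated by a maximum over a finite grid. The grid bounds, the test function f(x)=x^2/2, and the helper name conjugate_on_grid are assumptions chosen for illustration; for this f the conjugate is again \xi^2/2.

```python
# Sketch: approximate f*(xi) = sup_x { xi*x - f(x) } by a maximum over a grid.
# Grid, test function, and helper name are illustrative assumptions.

def conjugate_on_grid(f_vals, xs, slopes):
    """Return [max_x (s*x - f(x)) for s in slopes], with x restricted to xs."""
    return [max(s * x - fx for x, fx in zip(xs, f_vals)) for s in slopes]

xs = [i / 100 for i in range(-500, 501)]     # grid on [-5, 5]
slopes = [j / 10 for j in range(-20, 21)]    # slopes in [-2, 2]
f_vals = [0.5 * x * x for x in xs]           # f(x) = x^2 / 2

f_star = conjugate_on_grid(f_vals, xs, slopes)

# Compare with the closed form f*(xi) = xi^2 / 2.
max_err = max(abs(v - 0.5 * s * s) for v, s in zip(f_star, slopes))
print(max_err)
```

Every maximizer x=\xi lies on this grid, so the reported error is at rounding level. On a coarser or shorter grid the discrete maximum is only a lower bound for f^*.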
Let f:E\to \mathbb{R}\cup\{+\infty\}. The closure of f, denoted
\operatorname{cl} f, is the function characterized by
\operatorname{epi}(\operatorname{cl} f)=\operatorname{cl}(\operatorname{epi}(f)).
We say that f is closed if \operatorname{epi}(f) is a closed subset of
E\times \mathbb{R}, equivalently if \operatorname{cl} f=f.
In finite dimensions, this is equivalent to lower semicontinuity.
Lemma 4.1 (Fenchel--Young inequality). Let f:E\to \mathbb{R}\cup\{+\infty\} be proper. Then
\forall x\in E,\ \forall \xi\in E^*,\qquad
f(x)+f^*(\xi)\ge \langle \xi,x\rangle.
If in addition f is closed and convex, then for every x\in E and every
\xi\in E^*,
f(x)+f^*(\xi)=\langle \xi,x\rangle
\iff
\xi\in \partial f(x).
Proof
By definition of the conjugate,
f^*(\xi)=\sup_{z\in E}\{\langle \xi,z\rangle-f(z)\}\ge \langle \xi,x\rangle-f(x),
which rearranges to
f(x)+f^*(\xi)\ge \langle \xi,x\rangle.
Assume now that f is proper, closed, and convex. Equality holds if and only
if
f^*(\xi)=\langle \xi,x\rangle-f(x).
By the definition of f^*, this is equivalent to
\forall z\in E,\qquad \langle \xi,z\rangle-f(z)\le \langle \xi,x\rangle-f(x),
that is,
\forall z\in E,\qquad f(z)\ge f(x)+\langle \xi,z-x\rangle.
This is exactly the statement that \xi\in \partial f(x).
Formal Statement and Proof
Lean theorems: Lecture04.lem_l4_fy, Lecture04.lem_l4_fy_eq_iff_subgradient.
theorem Lecture04_lem_l4_fy
{E : Type*} [NormedAddCommGroup E] [NormedSpace ℝ E]
{f : E → EReal} (hf : IsProper f) (x : E) (ξ : Covector E) :
f x + fenchelConjugate f ξ ≥ (((ξ x : ℝ) : EReal)) :=
theorem Lecture04_lem_l4_fy_eq_iff_subgradient
{E : Type*} [NormedAddCommGroup E] [NormedSpace ℝ E]
{f : E → EReal} (hf : IsProper f) {x : E} {ξ : Covector E} :
f x + fenchelConjugate f ξ = (((ξ x : ℝ) : EReal)) ↔
Lecture02.IsERealSubgradient f x ξ :=
By definition, f^*(\xi) is the smallest shift that makes the affine
function z\mapsto \langle \xi,z\rangle-f^*(\xi) lie below f for every
z. Evaluating at z=x gives the inequality, and equality means that this
affine lower bound touches f at x, which is exactly the subgradient
condition.
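The inequality and its equality case can be checked numerically on a smooth example. The following is a minimal sketch under assumptions not taken from the text: it uses f(x)=\cosh x, whose conjugate has the closed form \xi\,\operatorname{asinh}(\xi)-\sqrt{1+\xi^2} (the supremum is attained at x=\operatorname{asinh}\xi), and whose only subgradient at x is \sinh x. The function names are illustrative.

```python
import math

def f(x):
    return math.cosh(x)

def f_star(xi):
    # Closed form of sup_x { xi*x - cosh(x) }, attained at x = asinh(xi).
    return xi * math.asinh(xi) - math.sqrt(1.0 + xi * xi)

def fenchel_young_gap(x, xi):
    """f(x) + f*(xi) - <xi, x>, which the lemma says is nonnegative."""
    return f(x) + f_star(xi) - xi * x

pts = [k / 4 for k in range(-12, 13)]        # sample points in [-3, 3]

# Smallest gap over all sampled pairs (x, xi).
min_gap = min(fenchel_young_gap(x, xi) for x in pts for xi in pts)

# Gap along the subgradient curve xi = f'(x) = sinh(x).
touch_gap = max(abs(fenchel_young_gap(x, math.sinh(x))) for x in pts)
print(min_gap, touch_gap)
```

The first number should be nonnegative and the second should vanish up to rounding, matching the two halves of the lemma.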
Lemma 4.2. Let f:E\to \mathbb{R}\cup\{+\infty\} be proper. Then
\forall x\in E,\qquad f^{**}(x)\le f(x).
Proof
For every x\in E and every \xi\in E^*, Lemma 4.1 gives
\langle \xi,x\rangle-f^*(\xi)\le f(x).
Taking the supremum over \xi yields
f^{**}(x)\le f(x)
\qquad \forall x\in E.
Formal Statement and Proof
Lean theorem: Lecture04_lem_l4_biconj_le.
theorem Lecture04_lem_l4_biconj_le
{E : Type*} [NormedAddCommGroup E] [NormedSpace ℝ E]
{f : E → EReal} (hf : IsProper f) (x : E) :
fenchelBiconjugate f x ≤ f x :=
have hfy :
(((evalAtCovector x ξ : ℝ) : EReal))
≤ f x + fenchelConjugate f ξ :=
This can be read as a weak-duality statement for biconjugation:
f^{**} always gives a lower bound on f. Unlike Theorem 4.3, this
direction does not require convexity or closedness.
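To see that the inequality can be strict without convexity, one can compute a discrete biconjugate of a nonconvex function. This is a sketch under illustrative assumptions: the double-well function f(x)=(x^2-1)^2, the grids, and the helper name conjugate_on_grid are not from the text, and the grid maxima only approximate the true suprema.

```python
# Sketch: discrete conjugate and biconjugate of a nonconvex function.

def conjugate_on_grid(vals, pts, duals):
    """Return [max_p (d*p - vals(p)) for d in duals]."""
    return [max(d * p - v for p, v in zip(pts, vals)) for d in duals]

xs = [i / 50 for i in range(-100, 101)]       # primal grid on [-2, 2]
slopes = [j / 4 for j in range(-120, 121)]    # slope grid on [-30, 30]

f_vals = [(x * x - 1.0) ** 2 for x in xs]     # double well, not convex
f_star = conjugate_on_grid(f_vals, xs, slopes)
f_bistar = conjugate_on_grid(f_star, slopes, xs)

# f** <= f at every grid point (up to rounding).
worst_violation = max(b - v for b, v in zip(f_bistar, f_vals))

# At x = 0 the function has a bump that the biconjugate flattens.
i0 = xs.index(0.0)
print(worst_violation, f_vals[i0], f_bistar[i0])
```

At x=0 the function value is 1 while the discrete biconjugate is 0: the affine minorants cannot see the bump between the two wells.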
Theorem 4.3 (Fenchel--Moreau). Let E be a finite-dimensional real normed space, and let
f:E\to \mathbb{R}\cup\{+\infty\} be proper, closed, and convex. Then
\forall x\in E,\qquad f^{**}(x)=f(x).
Proof
By Lemma 4.2, it remains to prove the reverse inequality
f(x)\le f^{**}(x)
\qquad \forall x\in E.
Fix x_0\in E and t_0<f(x_0). Then (x_0,t_0)\notin \operatorname{epi}(f).
Because f is proper, closed, and convex, the epigraph
\operatorname{epi}(f) is a nonempty closed convex subset of
E\times \mathbb{R}. By Theorem 2.6, there exists a closed halfspace
containing \operatorname{epi}(f) but not (x_0,t_0).
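Since \operatorname{epi}(f) is upward closed in the t-direction, the normal of
such a halfspace has nonnegative t-component. If the halfspace is vertical
(zero t-component, which forces x_0\notin \operatorname{dom} f), separate
instead a point (x_1,t_1) with x_1\in \operatorname{dom} f and t_1<f(x_1);
that halfspace is non-vertical, and adding to its defining inequality a
sufficiently large positive multiple of the vertical one yields a non-vertical
halfspace that still contains \operatorname{epi}(f) and excludes (x_0,t_0).
In either case we may take the halfspace in the form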
H_{\xi,a}:=\{(x,t)\in E\times \mathbb{R}: t\ge \langle \xi,x\rangle-a\}
for some \xi\in E^* and a\in \mathbb{R}. Thus
\operatorname{epi}(f)\subseteq H_{\xi,a}
\qquad\text{and}\qquad
t_0<\langle \xi,x_0\rangle-a.
The containment \operatorname{epi}(f)\subseteq H_{\xi,a} is equivalent to
saying that the affine function
\ell_{\xi,a}(x):=\langle \xi,x\rangle-a
is a global affine lower bound for f, that is,
\ell_{\xi,a}(x)\le f(x)
\qquad \forall x\in E.
For fixed slope \xi, the smallest admissible intercept is exactly
f^*(\xi), because
\ell_{\xi,a}(x)\le f(x)\ \forall x
\iff
a\ge \sup_{x\in E}\{\langle \xi,x\rangle-f(x)\}=f^*(\xi).
Hence every affine lower bound of slope \xi lies below the tight affine
function of the same slope,
x\mapsto \langle \xi,x\rangle-f^*(\xi).
In particular,
t_0<\langle \xi,x_0\rangle-a\le \langle \xi,x_0\rangle-f^*(\xi)\le f^{**}(x_0).
Since this holds for every t_0<f(x_0), we conclude that
f(x_0)\le f^{**}(x_0).
Combined with Lemma 4.2, this gives
\forall x\in E,\qquad f^{**}(x)=f(x).
Formal Statement and Proof
Lean theorem: Lecture04.thm_l4_fm.
theorem Lecture04_thm_l4_fm
{E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E] [FiniteDimensional ℝ E]
(f : E → EReal) (hf_proper : IsProper f)
(hf_closed : LowerSemicontinuous f)
(hf_convex : EConvexOn Set.univ f) :
fenchelBiconjugate f = f :=
More generally, if f is proper and convex, then
(\operatorname{cl} f)^*=f^*.
Applying Theorem 4.3 to \operatorname{cl} f therefore gives
f^{**}=(\operatorname{cl} f)^{**}=\operatorname{cl} f.
4.2. Basic Conjugate Pairs
With the structural theorem in place, it is useful to collect a few conjugate pairs that will recur later in the course.
Let p,q>1 satisfy \frac1p+\frac1q=1, and define
f(x):=\frac{|x|^p}{p} on \mathbb{R}. Then
f^*(\xi)=\frac{|\xi|^q}{q} for \xi\in \mathbb{R}. Thus
\left(\frac{|x|^p}{p}\right)^*=\frac{|\xi|^q}{q}.
This is the basic one-dimensional power-law conjugate pair. Applying Lemma 4.1 to this pair gives
x\xi\le \frac{|x|^p}{p}+\frac{|\xi|^q}{q}
for all x,\xi\in \mathbb{R}, and in particular
ab\le \frac{a^p}{p}+\frac{b^q}{q}
for a,b\ge 0. This is the classical Young inequality, which explains the
Young part of the name Fenchel--Young inequality.
Proof
By definition,
f^*(\xi)=\sup_{x\in \mathbb{R}}\left\{x\xi-\frac{|x|^p}{p}\right\}.
The maximizer has the same sign as \xi, so it is enough to consider
x\ge 0 and \xi\ge 0. For fixed \xi\ge 0, set
\varphi_\xi(x):=x\xi-\frac{x^p}{p} on x\ge 0. Since
\varphi_\xi'(x)=\xi-x^{p-1} is decreasing, \varphi_\xi is concave and its
unique critical point x=\xi^{1/(p-1)}=\xi^{q-1} is the maximizer. Substituting gives
f^*(\xi)
=
\xi\cdot \xi^{q-1}-\frac{\xi^{p(q-1)}}{p}
=
\xi^q-\frac{\xi^q}{p}
=
\frac{\xi^q}{q}.
By symmetry, f^*(\xi)=\frac{|\xi|^q}{q} for all
\xi\in \mathbb{R}. Applying Lemma 4.1 to this pair then gives the classical
Young inequality stated above.
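The power-law pair can be sanity-checked numerically. The following is a minimal sketch with illustrative choices that are not from the text: p=3 (so q=3/2), a uniform grid on [-3,3], and a handful of slopes whose maximizers |\xi|^{q-1} fall inside the grid.

```python
# Sketch: compare a grid approximation of (|x|^p / p)^* with |xi|^q / q.

p = 3.0
q = p / (p - 1.0)            # conjugate exponent, so that 1/p + 1/q = 1

xs = [i / 1000 for i in range(-3000, 3001)]   # grid on [-3, 3]
slopes = [j / 2 for j in range(-8, 9)]        # slopes in [-4, 4]

def conj_numeric(xi):
    """Grid maximum of xi*x - |x|^p / p."""
    return max(xi * x - abs(x) ** p / p for x in xs)

max_err = max(abs(conj_numeric(xi) - abs(xi) ** q / q) for xi in slopes)

# Young's inequality a*b <= a^p/p + b^q/q on a few nonnegative pairs.
pairs = [(a / 3, b / 3) for a in range(10) for b in range(10)]
young_ok = all(a * b <= a ** p / p + b ** q / q + 1e-12 for a, b in pairs)
print(max_err, young_ok)
```

The grid maximum can only underestimate the supremum, and the discrepancy here is governed by the grid spacing.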
Let C\subseteq E. The indicator function of C is the function
\delta_C:E\to \mathbb{R}\cup\{+\infty\} defined by
\delta_C(x):=
\begin{cases}
0,&x\in C,\\
+\infty,&x\notin C.
\end{cases}
Let C\subseteq E be nonempty, and define its support function by
\sigma_C(\xi):=\sup_{x\in C}\langle \xi,x\rangle
\qquad (\xi\in E^*).
Then
(\delta_C)^*=\sigma_C.
In one dimension, if C=[-1,1]\subseteq \mathbb{R}, then
\sigma_C(\xi)=\sup_{x\in[-1,1]} \xi x=|\xi|.
Thus
(\delta_{[-1,1]})^*=|\cdot|.
Since \delta_{[-1,1]} is proper, closed, and convex, Theorem 4.3 then gives
|\cdot|^*=(\delta_{[-1,1]})^{**}=\delta_{[-1,1]}.
So the pair |x| and \delta_{[-1,1]} is the one-dimensional special case
of the general indicator/support-function correspondence.
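For a polytope the support function is easy to evaluate, because a linear functional attains its maximum over a polytope at a vertex. The sketch below relies on that standard fact; the helper name support and the sample slopes are illustrative assumptions. It checks \sigma_{[-1,1]}(\xi)=|\xi| and, for the square [-1,1]^2, the analogous formula |\xi_1|+|\xi_2|.

```python
# Sketch: support function of C = conv(points), evaluated at the vertices.

def support(points, xi):
    """sigma_C(xi) = max over the listed vertices of <xi, vertex>."""
    return max(sum(a * b for a, b in zip(xi, pt)) for pt in points)

interval_vertices = [(-1.0,), (1.0,)]
square_vertices = [(sx, sy) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0)]

# One-dimensional case: sigma_C(xi) should equal |xi|.
slopes_1d = [(-2.5,), (0.0,), (0.75,)]
err_1d = max(abs(support(interval_vertices, xi) - abs(xi[0]))
             for xi in slopes_1d)

# Square: sigma_C(xi) should equal |xi_1| + |xi_2|.
slopes_2d = [(1.0, -2.0), (0.5, 0.25), (0.0, 0.0)]
err_2d = max(abs(support(square_vertices, xi) - (abs(xi[0]) + abs(xi[1])))
             for xi in slopes_2d)
print(err_1d, err_2d)
```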
Let \|\cdot\| be a norm on E, and let \|\cdot\|_* be its dual norm on
E^*, defined by
\|\xi\|_*:=\sup_{\|x\|\le 1}\langle \xi,x\rangle.
Then
\|\cdot\|^*=\delta_{B_*},
\qquad
B_*:=\{\xi\in E^*:\|\xi\|_*\le 1\}.
Indeed, if \|\xi\|_*\le 1, then
\langle \xi,x\rangle-\|x\|\le \|\xi\|_*\|x\|-\|x\|\le 0
\qquad \forall x\in E,
with equality at x=0, so \|\cdot\|^*(\xi)=0. If \|\xi\|_*>1, the definition of the dual norm gives x_0\in E such that
\langle \xi,x_0\rangle>\|x_0\|; then
\langle \xi,tx_0\rangle-\|tx_0\|
=
t\bigl(\langle \xi,x_0\rangle-\|x_0\|\bigr)\to +\infty
\qquad (t\to+\infty),
so \|\cdot\|^*(\xi)=+\infty.
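The dichotomy can be observed numerically by truncating the supremum to a box and letting the box grow. This is a sketch under illustrative assumptions: E=\mathbb{R}^2 with the \ell_1 norm (whose dual norm is the \ell_\infty norm), a coarse grid on the box, and a hypothetical helper truncated_conjugate_l1.

```python
# Sketch: sup of <xi, x> - ||x||_1 over a grid on the box [-radius, radius]^2.

def truncated_conjugate_l1(xi, radius, steps=20):
    grid = [radius * k / steps for k in range(-steps, steps + 1)]
    return max(xi[0] * a + xi[1] * b - (abs(a) + abs(b))
               for a in grid for b in grid)

radii = (1.0, 10.0, 100.0)

inside = (0.5, -1.0)     # dual (l-infinity) norm 1.0, inside the dual ball
outside = (1.5, 0.25)    # dual (l-infinity) norm 1.5, outside the dual ball

vals_inside = [truncated_conjugate_l1(inside, r) for r in radii]
vals_outside = [truncated_conjugate_l1(outside, r) for r in radii]
print(vals_inside)
print(vals_outside)
```

For the slope inside the dual ball the truncated value stays at 0 for every radius, while for the slope outside it grows linearly with the radius, which is the finite-box shadow of the value +\infty.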
Let f(x)=e^x on \mathbb{R}. Then
f^*(\xi)=
\begin{cases}
\xi\log \xi-\xi,&\xi\ge 0,\\
+\infty,&\xi<0,
\end{cases}
with the convention 0\log 0:=0. This is the basic exponential/entropy
conjugate pair.
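A quick numerical check of this pair, as a minimal sketch: the grid on [-20,5], the sample slopes, and the helper names are illustrative assumptions, and the grid maximum only approximates the supremum.

```python
import math

def entropy_conjugate(xi):
    """Closed form from the text, with the convention 0*log(0) = 0."""
    if xi < 0:
        return math.inf
    if xi == 0:
        return 0.0
    return xi * math.log(xi) - xi

xs = [i / 1000 for i in range(-20000, 5001)]   # grid on [-20, 5]

def conj_numeric(xi, grid):
    """Grid maximum of xi*x - exp(x)."""
    return max(xi * x - math.exp(x) for x in grid)

slopes = [0.0, 0.5, 1.0, 2.0, 5.0]
max_err = max(abs(conj_numeric(xi, xs) - entropy_conjugate(xi))
              for xi in slopes)

# For a negative slope the truncated supremum keeps growing as the grid
# is extended to the left, consistent with the value +infinity.
short_grid_value = conj_numeric(-1.0, [x for x in xs if x >= -10])
long_grid_value = conj_numeric(-1.0, xs)
print(max_err, short_grid_value, long_grid_value)
```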
The local Verso page contains a one-dimensional conjugate animation. The
reader can edit a convex function f:\mathbb{R}\to\mathbb{R} directly while
watching the supporting line and the conjugate graph update numerically.
4.3. Marginal Duality and Applications
For subsets A,B\subseteq Y, write
A-B:=\{a-b:a\in A,\ b\in B\}=A+(-B).
Theorem 4.4. Let X and U be finite-dimensional real vector spaces, and let
\Phi:X\times U\to \mathbb{R}\cup\{+\infty\}
be convex. Define the marginal value function of \Phi as
p:U\to \mathbb{R}\cup\{\pm\infty\},
\qquad
p(u):=\inf_{x\in X}\Phi(x,u).
Here p may take the value -\infty, even though \Phi itself only takes
values in \mathbb{R}\cup\{+\infty\}. Then the following hold.
(1) For every y\in U^*,
p^*(y)=\Phi^*(0,y).
Consequently, for every u\in U and every y\in U^*,
p(u)\ge -\Phi^*(0,y)+\langle y,u\rangle.
In particular,
\sup_{y\in U^*}\{-\Phi^*(0,y)\}\le p(0).
(2) If p(0)=-\infty, then
\sup_{y\in U^*}\{-\Phi^*(0,y)\}=-\infty.
(3) If p(0)>-\infty and \partial p(0)\neq\varnothing, then
p(0)=\max_{y\in U^*}\{-\Phi^*(0,y)\}.
In particular, strong duality and dual attainment hold.
(4) Define
D:=\{u\in U:\exists x\in X\text{ such that }\Phi(x,u)<+\infty\}=\operatorname{dom} p.
If p(0)\in \mathbb{R} and 0\in \operatorname{ri}(D), then
p(0)=\max_{y\in U^*}\{-\Phi^*(0,y)\}.
Proof
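Because \Phi is convex, its strict epigraph is a convex subset of
X\times U\times \mathbb{R}. An infimum lies below t exactly when some value
does, so the strict epigraph of p is the projection
\{(u,t)\in U\times \mathbb{R}:p(u)<t\}=\{(u,t)\in U\times \mathbb{R}:\exists x\in X\text{ with }\Phi(x,u)<t\}.
Projections preserve convexity, and a function with convex strict epigraph is
convex, so p is convex. (The full epigraph of p need not be a projection of
\operatorname{epi}(\Phi), because the infimum may fail to be attained.)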
Fix y\in U^*. By definition,
\begin{aligned}
p^*(y)
&=\sup_{u\in U}\{\langle y,u\rangle-p(u)\} \\
&=\sup_{u\in U}\sup_{x\in X}\{\langle y,u\rangle-\Phi(x,u)\} \\
&=\sup_{x\in X,\ u\in U}\{\langle 0,x\rangle+\langle y,u\rangle-\Phi(x,u)\} \\
&=\Phi^*(0,y).
\end{aligned}
This proves the identity in part (1). For every u\in U, the definition of
p^*(y) gives
p^*(y)=\sup_{v\in U}\{\langle y,v\rangle-p(v)\}\ge \langle y,u\rangle-p(u).
Using p^*(y)=\Phi^*(0,y), we obtain
p(u)+\Phi^*(0,y)\ge \langle y,u\rangle,
which is exactly
p(u)\ge -\Phi^*(0,y)+\langle y,u\rangle.
Assume now that p(0)=-\infty. For every M>0, choose x_M\in X such that
\Phi(x_M,0)\le -M.
Then, for every y\in U^*,
\Phi^*(0,y)\ge \langle 0,x_M\rangle+\langle y,0\rangle-\Phi(x_M,0)\ge M.
Since M is arbitrary, \Phi^*(0,y)=+\infty for every y, and hence
\sup_{y\in U^*}\{-\Phi^*(0,y)\}=-\infty.
This proves part (2).
Assume next that p(0)>-\infty and choose y\in \partial p(0). Then
\forall u\in U,\qquad p(u)\ge p(0)+\langle y,u\rangle.
Equivalently,
\forall u\in U,\qquad \langle y,u\rangle-p(u)\le -p(0).
Taking the supremum over u gives
p^*(y)\le -p(0).
On the other hand, evaluating at u=0 gives
p^*(y)\ge \langle y,0\rangle-p(0)=-p(0).
Hence
p^*(y)=-p(0).
Using part (1) of Theorem 4.4,
p(0)=-p^*(y)=-\Phi^*(0,y).
Combined with the weak-duality inequality from part (1), this yields
p(0)=\max_{v\in U^*}\{-\Phi^*(0,v)\},
with the maximum attained at the chosen y. This proves part (3).
Finally, define
D:=\{u\in U:\exists x\in X\text{ such that }\Phi(x,u)<+\infty\}.
If u\in D, then there exists x\in X such that \Phi(x,u)<+\infty, so
p(u)=\inf_{x'\in X}\Phi(x',u)\le \Phi(x,u)<+\infty,
hence u\in \operatorname{dom} p. Conversely, if
u\in \operatorname{dom} p, then p(u)<+\infty, so not all values
\Phi(x,u) can equal +\infty; therefore there exists x\in X such that
\Phi(x,u)<+\infty, and hence u\in D. Thus
D=\operatorname{dom} p.
Assume now that p(0)\in \mathbb{R} and
0\in \operatorname{ri}(D)=\operatorname{ri}(\operatorname{dom} p). By
convexity of p, it is enough to show that p never takes the value
-\infty on \operatorname{dom} p. Fix u\in \operatorname{dom} p. Since
0\in \operatorname{ri}(\operatorname{dom} p), there exists
\varepsilon>0 such that -\varepsilon u\in \operatorname{dom} p. Writing 0
as a convex combination of u and -\varepsilon u, convexity gives
p(0)\le \frac{\varepsilon}{1+\varepsilon}p(u)+\frac{1}{1+\varepsilon}p(-\varepsilon u).
Since p(0)>-\infty and p(-\varepsilon u)<+\infty, this forces
p(u)>-\infty. Hence
p:U\to \mathbb{R}\cup\{+\infty\} is proper and convex, with
0\in \operatorname{ri}(\operatorname{dom} p). By Theorem 2.10, we obtain
\partial p(0)\neq\varnothing. Part (3) now applies and proves part (4).
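The identity p^*(y)=\Phi^*(0,y) from part (1) can be illustrated on a toy example. This is a sketch under assumptions not taken from the text: X=U=\mathbb{R}, \Phi(x,u)=x^2/2+(x+u)^2/2, and uniform grids. For this \Phi one computes by hand p(u)=u^2/4 and p^*(y)=y^2.

```python
# Sketch: check p*(y) = Phi*(0, y) on a grid for a quadratic perturbation.

def phi(x, u):
    return 0.5 * x * x + 0.5 * (x + u) ** 2

xs = [i / 20 for i in range(-120, 121)]   # x grid on [-6, 6]
us = [i / 20 for i in range(-120, 121)]   # u grid on [-6, 6]
ys = [j / 2 for j in range(-4, 5)]        # dual variables in [-2, 2]

# Marginal value p(u) = inf_x Phi(x, u), restricted to the x grid.
p_vals = [min(phi(x, u) for x in xs) for u in us]

# Conjugate of p, and the joint conjugate of Phi evaluated at (0, y).
p_star = [max(y * u - pu for u, pu in zip(us, p_vals)) for y in ys]
phi_star_0y = [max(y * u - phi(x, u) for x in xs for u in us) for y in ys]

identity_err = max(abs(a - b) for a, b in zip(p_star, phi_star_0y))
closed_form_err = max(abs(a - y * y) for a, y in zip(p_star, ys))
print(identity_err, closed_form_err)
```

The first error reflects the interchange of the two suprema used in the proof; the second compares against the hand computation p^*(y)=y^2, which is exact here because the maximizers u=2y and x=-y lie on the grids.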
Formal Statement and Proof
Lean theorems: Lecture04.thm_l4_marginal_duality_part1,
Lecture04.thm_l4_marginal_duality_part2,
Lecture04.thm_l4_marginal_duality_part3,
Lecture04.thm_l4_marginal_duality.
theorem Lecture04_thm_l4_marginal_duality_part1
{X Y : Type*} [NormedAddCommGroup X] [NormedSpace ℝ X]
[NormedAddCommGroup Y] [NormedSpace ℝ Y]
(Φ : X × Y → EReal) :
(∀ y : Covector Y,
fenchelConjugate (marginal (X := X) (Y := Y) Φ) y =
fenchelConjugate Φ (prodSndCovector (X := X) y)) ∧
(∀ u : Y, ∀ y : Covector Y,
affineMinus y (marginal (X := X) (Y := Y) Φ) u
≤ fenchelConjugate Φ (prodSndCovector (X := X) y)) ∧
marginalDualValue (X := X) (Y := Y) Φ ≤
marginal (X := X) (Y := Y) Φ (0 : Y) :=
theorem Lecture04_thm_l4_marginal_duality_part2
{X Y : Type*} [NormedAddCommGroup X] [NormedSpace ℝ X]
[NormedAddCommGroup Y] [NormedSpace ℝ Y]
(Φ : X × Y → EReal)
(hprimal : marginal (X := X) (Y := Y) Φ (0 : Y) = ⊥) :
marginalDualValue (X := X) (Y := Y) Φ = ⊥ :=
theorem Lecture04_thm_l4_marginal_duality_part3
{X Y : Type*} [NormedAddCommGroup X] [NormedSpace ℝ X]
[NormedAddCommGroup Y] [NormedSpace ℝ Y]
(Φ : X × Y → EReal)
(hzero_bot : marginal (X := X) (Y := Y) Φ (0 : Y) ≠ ⊥)
(hzero_top : marginal (X := X) (Y := Y) Φ (0 : Y) ≠ ⊤)
{y : Covector Y}
(hsub : Lecture02.IsERealSubgradient
(marginal (X := X) (Y := Y) Φ) (0 : Y) y) :
marginal (X := X) (Y := Y) Φ (0 : Y) =
marginalDualValue (X := X) (Y := Y) Φ ∧
marginalDualValue (X := X) (Y := Y) Φ =
marginalDualObjective (X := X) (Y := Y) Φ y :=
theorem Lecture04_thm_l4_marginal_duality
{X Y : Type*} [NormedAddCommGroup X] [NormedSpace ℝ X]
[NormedAddCommGroup Y] [NormedSpace ℝ Y] [FiniteDimensional ℝ Y]
(Φ : X × Y → EReal) (hΦ_convex : EConvexOn Set.univ Φ)
(hzero_bot : marginal (X := X) (Y := Y) Φ (0 : Y) ≠ ⊥)
(hri : (0 : Y) ∈ intrinsicInterior ℝ
(effectiveDomain (marginal (X := X) (Y := Y) Φ))) :
∃ y : Covector Y,
marginal (X := X) (Y := Y) Φ (0 : Y) =
marginalDualValue (X := X) (Y := Y) Φ ∧
marginalDualValue (X := X) (Y := Y) Φ =
marginalDualObjective (X := X) (Y := Y) Φ y :=
One especially useful specialization of Theorem 4.4 is the template
\inf_{x\in X}\{f(x)+g(Ax)\}.
We postpone that reduction to Exercise 4.1. The worked applications below instead compute the relevant perturbation conjugates directly from Theorem 4.4.
Consider the linear program
\inf\{c^\top x:Ax\ge b,\ x\ge 0\},
with A\in \mathbb{R}^{m\times n}, b\in \mathbb{R}^m, and
c\in \mathbb{R}^n. Define the perturbation
\Phi(x,u):=c^\top x+\delta_{\mathbb{R}_+^n}(x)+\delta_{\mathbb{R}_+^m}(Ax-b-u),
\qquad (x,u)\in \mathbb{R}^n\times \mathbb{R}^m.
Then
p(u):=\inf_{x\in \mathbb{R}^n}\Phi(x,u)
=
\inf\{c^\top x:Ax\ge b+u,\ x\ge 0\}.
For y\in \mathbb{R}^m,
\Phi^*(0,y)=
\begin{cases}
-b^\top y,&A^\top y\le c,\ y\ge 0,\\
+\infty,&\text{otherwise}.
\end{cases}
Therefore Theorem 4.4 recovers the dual formula
\sup\{b^\top y:A^\top y\le c,\ y\ge 0\}
\le
\inf\{c^\top x:Ax\ge b,\ x\ge 0\},
and, if 0\in \operatorname{ri}(\operatorname{dom} p) and the primal value is
finite, it gives
\inf\{c^\top x:Ax\ge b,\ x\ge 0\}
=
\max\{b^\top y:A^\top y\le c,\ y\ge 0\}.
This recovers the usual LP dual from the general perturbation template. Lecture 3 proves a sharper strong-duality statement using polyhedral structure, without this extra relative-interior hypothesis.
Proof
The formula for p(u) is immediate from the definition of \Phi. For
y\in \mathbb{R}^m,
\begin{aligned}
\Phi^*(0,y)
&=\sup_{x\in \mathbb{R}^n,\ u\in \mathbb{R}^m}
\{y^\top u-c^\top x-\delta_{\mathbb{R}_+^n}(x)-\delta_{\mathbb{R}_+^m}(Ax-b-u)\}.
\end{aligned}
If some coordinate of y is negative, then the constraint
Ax-b-u\in \mathbb{R}_+^m only imposes an upper bound
u\le Ax-b, so sending the corresponding coordinate of u to
-\infty shows that \Phi^*(0,y)=+\infty. Hence we may restrict to
y\ge 0. For such y, the supremum over u is attained at the largest
feasible choice u=Ax-b, so
\Phi^*(0,y)
=\sup_{x\ge 0}\{y^\top(Ax-b)-c^\top x\}
=-b^\top y+\sup_{x\ge 0}\{x^\top(A^\top y-c)\}.
If A^\top y\le c, then x^\top(A^\top y-c)\le 0 for every x\ge 0, so
the supremum is 0, attained at x=0. If instead some coordinate of
A^\top y-c is positive, then scaling the corresponding basis vector shows
that the supremum is +\infty. This proves the displayed formula for
\Phi^*(0,y).
Part (1) of Theorem 4.4 then gives
\sup_{y\in \mathbb{R}^m}\{-\Phi^*(0,y)\}\le p(0),
which is exactly the weak-duality inequality
\sup\{b^\top y:A^\top y\le c,\ y\ge 0\}
\le
\inf\{c^\top x:Ax\ge b,\ x\ge 0\}.
If in addition 0\in \operatorname{ri}(\operatorname{dom} p) and
p(0)\in \mathbb{R}, then part (4) of Theorem 4.4 yields the claimed
equality and dual attainment.
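A toy instance makes the primal/dual pair concrete. The data below (A=[1\ 1], b=1, c=(1,2)) and the brute-force grid search are illustrative assumptions, not a general LP method: the primal is \inf\{x_1+2x_2:x_1+x_2\ge 1,\ x\ge 0\} and the dual is \sup\{y:y\le 1,\ y\le 2,\ y\ge 0\}.

```python
# Sketch: weak and strong duality on a tiny LP, by grid search.

A = [[1.0, 1.0]]
b = [1.0]
c = [1.0, 2.0]

def primal_feasible(x):
    """x >= 0 and A x >= b."""
    return all(xj >= 0 for xj in x) and all(
        sum(a * xj for a, xj in zip(row, x)) >= bi
        for row, bi in zip(A, b))

def dual_feasible(y):
    """y >= 0 and A^T y <= c."""
    cols = list(zip(*A))
    return all(yi >= 0 for yi in y) and all(
        sum(a * yi for a, yi in zip(col, y)) <= cj
        for col, cj in zip(cols, c))

grid = [k / 4 for k in range(13)]     # 0, 0.25, ..., 3

primal_vals = [c[0] * x1 + c[1] * x2
               for x1 in grid for x2 in grid
               if primal_feasible((x1, x2))]
dual_vals = [b[0] * y1 for y1 in grid if dual_feasible((y1,))]

# Weak duality: every dual value is at most every primal value.
weak_ok = all(d <= pv + 1e-12 for d in dual_vals for pv in primal_vals)
best_primal = min(primal_vals)
best_dual = max(dual_vals)
print(weak_ok, best_primal, best_dual)
```

Here both optimal values equal 1, attained at x=(1,0) and y=1, so the grid search also exhibits equality for this instance.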
Let X and Y be finite-dimensional real normed spaces, let A:X\to Y be
linear, let \lambda>0, and let
f:X\to \mathbb{R}\cup\{+\infty\} be proper, closed, and convex. Consider
\inf_{x\in X}\{f(x)+\lambda\|Ax\|\}.
Then
\inf_{x\in X}\{f(x)+\lambda\|Ax\|\}
=
\max_{\substack{y\in Y^*\\ \|y\|_*\le \lambda}}
\{-f^*(-A^*y)\}.
Proof
Set
g(u):=\lambda\|u\|,
\qquad
\Phi(x,u):=f(x)+g(Ax+u).
Define the marginal value function
p(u):=\inf_{x\in X}\Phi(x,u)=\inf_{x\in X}\{f(x)+\lambda\|Ax+u\|\}.
Because g is finite everywhere on Y, one has \operatorname{dom} g=Y.
Since f is proper, there exists x_0\in X with f(x_0)<+\infty, so for
every u\in Y,
p(u)\le f(x_0)+\lambda\|Ax_0+u\|<+\infty.
Therefore
\operatorname{dom} p=Y,
\qquad
0\in \operatorname{ri}(\operatorname{dom} p).
For y\in Y^*,
\begin{aligned}
\Phi^*(0,y)
&=\sup_{x\in X,\ u\in Y}\{\langle y,u\rangle-f(x)-g(Ax+u)\} \\
&=\sup_{x\in X,\ z\in Y}\{\langle y,z-Ax\rangle-f(x)-g(z)\} \\
&=\sup_{x\in X}\{\langle -A^*y,x\rangle-f(x)\}
+\sup_{z\in Y}\{\langle y,z\rangle-g(z)\} \\
&=f^*(-A^*y)+g^*(y).
\end{aligned}
We now compute g^*. For y\in Y^*,
g^*(y)=\sup_{u\in Y}\{\langle y,u\rangle-\lambda\|u\|\}.
If \|y\|_*\le \lambda, then for every u\in Y,
\langle y,u\rangle\le \|y\|_*\|u\|\le \lambda\|u\|,
so
\langle y,u\rangle-\lambda\|u\|\le 0.
Taking u=0 shows that the supremum is exactly 0.
If instead \|y\|_*>\lambda, then by the definition of the dual norm there
exists u_0\in Y such that
\langle y,u_0\rangle>\lambda\|u_0\|.
For every t>0,
\langle y,tu_0\rangle-\lambda\|tu_0\|
=
t\bigl(\langle y,u_0\rangle-\lambda\|u_0\|\bigr)\to +\infty.
Hence g^*(y)=+\infty. Therefore
g^*(y)=
\begin{cases}
0,&\|y\|_*\le \lambda,\\
+\infty,&\|y\|_*>\lambda.
\end{cases}
If p(0)=-\infty, then part (2) of Theorem 4.4 gives
\sup_{y\in Y^*}\{-\Phi^*(0,y)\}=-\infty.
Since 0\in Y^* and \|0\|_*=0, the feasible set
\{y\in Y^*:\|y\|_*\le \lambda\} is nonempty, so the supremum is then a
maximum, and the desired formula follows.
If instead p(0)>-\infty, then p(0)\in \mathbb{R} because
p(0)<+\infty, and part (4) of Theorem 4.4 yields
p(0)=\max_{y\in Y^*}\{-\Phi^*(0,y)\}.
In this case, since
p(0)=\inf_{x\in X}\{f(x)+\lambda\|Ax\|\}, substituting the formula for
\Phi^*(0,y) and then the formula for g^*(y) yields
\inf_{x\in X}\{f(x)+\lambda\|Ax\|\}
=
\max_{\substack{y\in Y^*\\ \|y\|_*\le \lambda}}
\{-f^*(-A^*y)\}.
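A one-dimensional instance illustrates the formula. The sketch assumes X=Y=\mathbb{R}, A the identity, f(x)=(x-1)^2/2, and \lambda=0.4; these choices and the grids are illustrative and not from the text. For this f one has f^*(\xi)=\xi+\xi^2/2, so the dual objective is y-y^2/2 on |y|\le\lambda, and both sides equal \lambda-\lambda^2/2.

```python
# Sketch: primal and dual values of inf_x { f(x) + lam*|x| } in one dimension.

lam = 0.4

def primal_objective(x):
    return 0.5 * (x - 1.0) ** 2 + lam * abs(x)

def f_star(xi):
    # Conjugate of f(x) = (x - 1)^2 / 2; the supremum is attained at x = 1 + xi.
    return xi + 0.5 * xi * xi

def dual_objective(y):
    # -f*(-A* y) with A the identity on R, so A* y = y.
    return -f_star(-y)

xs = [i / 1000 for i in range(-2000, 2001)]      # primal grid on [-2, 2]
ys = [lam * k / 400 for k in range(-400, 401)]   # dual grid on [-lam, lam]

primal_value = min(primal_objective(x) for x in xs)
dual_value = max(dual_objective(y) for y in ys)
expected = lam - 0.5 * lam * lam
print(primal_value, dual_value, expected)
```

The primal minimizer is the soft-thresholded point x=1-\lambda, and the dual maximizer sits on the boundary y=\lambda of the dual-norm ball.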
4.4. Exercises
Exercise 4.1. Fenchel--Rockafellar specialization of Theorem 4.4.
Let X and Y be finite-dimensional real vector spaces, let A:X\to Y be linear, and let
f:X\to \mathbb{R}\cup\{+\infty\},
\qquad
g:Y\to \mathbb{R}\cup\{+\infty\}
be proper, closed, and convex. Do the following.
(1) Define
\Phi(x,u):=f(x)+g(Ax+u)
and prove that \Phi is convex on X\times Y.
(2) Define
p(u):=\inf_{x\in X}\Phi(x,u)=\inf_{x\in X}\{f(x)+g(Ax+u)\}.
Show that
p(0)=\inf_{x\in X}\{f(x)+g(Ax)\}.
(3) Prove that for every y\in Y^*,
\Phi^*(0,y)=f^*(-A^*y)+g^*(y).
(4) Show that
\operatorname{dom} p=\operatorname{dom} g-A(\operatorname{dom} f)
=-\bigl(A(\operatorname{dom} f)-\operatorname{dom} g\bigr).
Deduce that
0\in \operatorname{ri}\bigl(A(\operatorname{dom} f)-\operatorname{dom} g\bigr)
\iff
0\in \operatorname{ri}(\operatorname{dom} p).
(5) Use part (1) of Theorem 4.4 to prove
\sup_{y\in Y^*}\{-f^*(-A^*y)-g^*(y)\}
\le
\inf_{x\in X}\{f(x)+g(Ax)\}.
(6) Assume additionally that
0\in \operatorname{ri}\bigl(A(\operatorname{dom} f)-\operatorname{dom} g\bigr)
and that the primal infimum is finite. Use part (4) of Theorem 4.4 to prove
\inf_{x\in X}\{f(x)+g(Ax)\}
=
\max_{y\in Y^*}\{-f^*(-A^*y)-g^*(y)\}.
Exercise 4.2. Dual norm twice.
Let \|\cdot\| be a norm on E, and let \|\cdot\|_* be its dual norm on E^*.
Prove that, under the natural identification E\simeq E^{**}, the dual norm of
\|\cdot\|_* is the original norm:
\|x\|_{**}=\|x\|
\qquad \forall x\in E.