3. Lecture 3: Linear Programming
Lecture 2 develops separation for convex sets and cones. Lecture 3 shows how that geometry specializes to the most classical optimization model. Linear programming is the first place where one can see dual certificates, infeasibility certificates, and optimality certificates all at once.
Let x_1 denote the number of research-focused work hours this week, and let
x_2 denote the number of course-focused work hours. Suppose the week must
deliver at least 6 units of research progress and 9 units of course
preparation. The per-hour contributions, read off from the constraints below, are: each research-focused hour yields 2 units of research progress and 1 unit of course preparation, while each course-focused hour yields 1 unit of research progress and 2 units of course preparation.
If the corresponding strain costs are 5 and 4, respectively, then the
resulting linear program is
\min\{5x_1+4x_2: 2x_1+x_2\ge 6,\ x_1+2x_2\ge 9,\ x_1\ge 0,\ x_2\ge 0\}.
It turns out the optimal solution is (x_1,x_2)=(1,4); it is feasible and
has objective value
5\cdot 1+4\cdot 4=21.
To certify that no feasible point can do better, take a positive linear combination of the two requirement inequalities:
2(2x_1+x_2\ge 6)+(x_1+2x_2\ge 9).
This gives
5x_1+4x_2\ge 21.
Thus every feasible point has objective value at least 21, and since
(1,4) attains this bound, it is optimal. The coefficients in this
certificate are already hinting at something dual: the first requirement
carries weight 2, while the second carries weight 1.
Geometrically, the level line 5x_1+4x_2=21 is the lowest level set of the objective that still touches the feasible region, and it first meets the region at the vertex (1,4).
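These numbers are small enough to check mechanically. The following plain-Python sketch (variable names and layout are ours, purely illustrative) confirms that (1,4) is feasible, that the weights (2,1) recombine the requirement rows into the objective coefficients, and that both sides of the resulting bound equal 21:

```python
# Data of the example LP: min 5x1 + 4x2 s.t. 2x1 + x2 >= 6, x1 + 2x2 >= 9, x >= 0.
A = [[2, 1], [1, 2]]
b = [6, 9]
c = [5, 4]

x = [1, 4]  # candidate optimal point
y = [2, 1]  # weights on the two requirement inequalities

# Feasibility of x: each requirement is met and x is nonnegative.
assert all(sum(A[i][j] * x[j] for j in range(2)) >= b[i] for i in range(2))
assert all(xj >= 0 for xj in x)

# The weighted combination of the rows reproduces the objective coefficients:
# 2*(2, 1) + 1*(1, 2) = (5, 4).
combined = [sum(y[i] * A[i][j] for i in range(2)) for j in range(2)]
assert combined == c

# The weighted right-hand sides give the lower bound 21, attained by x.
bound = sum(y[i] * b[i] for i in range(2))
value = sum(c[j] * x[j] for j in range(2))
assert bound == 21 and value == 21
```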
Let A\in\mathbb{R}^{m\times n}, b\in\mathbb{R}^m, and
c\in\mathbb{R}^n. The canonical primal LP is
\inf\{c^\top x: Ax\ge b,\ x\ge 0 \}.
Its dual is
\sup\{b^\top y: A^\top y\le c,\ y\ge 0 \}.
Let x\in\mathbb{R}^n satisfy Ax\ge b and x\ge 0, and let
y\in\mathbb{R}^m satisfy A^\top y\le c and y\ge 0. Then
b^\top y\le c^\top x.
Every dual-feasible vector y therefore gives a valid lower bound on the
primal objective. In the graduate-student example, the positive linear
combination certificate is exactly of this kind: dual feasibility guarantees
that the weighted requirements never exceed the objective coefficients, and the
resulting right-hand side gives a global lower bound on every primal-feasible
point.
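The weak-duality inequality can be tested numerically for any pair of feasible points. A minimal sketch, with helper names of our own choosing:

```python
def is_primal_feasible(A, b, x):
    """Check Ax >= b and x >= 0 coordinatewise."""
    Ax = [sum(Ai[j] * x[j] for j in range(len(x))) for Ai in A]
    return all(Ax[i] >= b[i] for i in range(len(b))) and all(xj >= 0 for xj in x)

def is_dual_feasible(A, c, y):
    """Check A^T y <= c and y >= 0 coordinatewise."""
    Aty = [sum(A[i][j] * y[i] for i in range(len(y))) for j in range(len(c))]
    return all(Aty[j] <= c[j] for j in range(len(c))) and all(yi >= 0 for yi in y)

# The running example: any dual-feasible y lower-bounds any primal-feasible x.
A = [[2, 1], [1, 2]]; b = [6, 9]; c = [5, 4]
x = [1, 4]; y = [2, 1]
assert is_primal_feasible(A, b, x) and is_dual_feasible(A, c, y)

lower = sum(b[i] * y[i] for i in range(2))  # b^T y
upper = sum(c[j] * x[j] for j in range(2))  # c^T x
assert lower <= upper  # weak duality; here both equal 21
```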
Proof
Because y\ge 0 and Ax\ge b, one has
b^\top y\le y^\top Ax.
Transposing the scalar gives
y^\top Ax=x^\top A^\top y.
Since A^\top y\le c coordinatewise and x\ge 0 coordinatewise,
x^\top A^\top y\le x^\top c.
Therefore
b^\top y\le c^\top x.
Formal Statement and Proof
Lean theorem: Lecture03.lem_l3_weak_duality.
theorem Lecture03_lem_l3_weak_duality {m n : ℕ}
    {A : Matrix (Fin m) (Fin n) ℝ} {b : Fin m → ℝ} {c : Fin n → ℝ}
    {x : Fin n → ℝ} {y : Fin m → ℝ}
    (hx : lpReqPrimalFeasible A b x) (hy : lpReqDualFeasible A c y) :
    dotProduct b y ≤ dotProduct c x
The formal proof is a calc chain mirroring the informal argument step by step:
b ⬝ᵥ y ≤ A.mulVec x ⬝ᵥ y = y ⬝ᵥ A.mulVec x = A.transpose.mulVec y ⬝ᵥ x = x ⬝ᵥ A.transpose.mulVec y ≤ x ⬝ᵥ c = c ⬝ᵥ x.
The two inequalities are termwise estimates over Finset.univ, using y ≥ 0 for the first and x ≥ 0 for the last; the middle equalities are commutativity of ⬝ᵥ together with the identity Matrix.vecMul y A = A.transpose.mulVec y.
Let
P:=\{c^\top x: Ax\ge b,\ x\ge 0\},\qquad
D:=\{b^\top y: A^\top y\le c,\ y\ge 0\}.
Then P and D are closed subsets of \mathbb R.
This is the extra input needed to upgrade weak duality and the Farkas cutoff argument into full attainment. Each objective-value set is an affine slice of a finitely generated cone, so closedness comes from finite-dimensional cone geometry rather than compactness.
Proof
For the primal side, introduce slack variables s\ge 0 and rewrite
Ax\ge b,\qquad x\ge 0,\qquad c^\top x=r
as
Ax-s=b,\qquad x\ge 0,\qquad s\ge 0,\qquad c^\top x=r.
Thus r\in P iff (b,r) lies in the image of the nonnegative orthant under
the linear map
(x,s)\longmapsto \binom{Ax-s}{c^\top x}.
Equivalently, P is an affine slice of a finitely generated cone, hence is
closed. The dual side is identical after introducing slack variables
t\ge 0 for the inequalities A^\top y\le c.
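The change of variables in this proof can be illustrated on the running example. In this sketch, image_point is a hypothetical name of ours for the linear map from the proof:

```python
A = [[2, 1], [1, 2]]; b = [6, 9]; c = [5, 4]

def image_point(x, s):
    """The linear map (x, s) |-> (Ax - s, c^T x) from the proof."""
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return [Ax[i] - s[i] for i in range(2)], sum(c[j] * x[j] for j in range(2))

# A feasible x with its natural slack s = Ax - b >= 0 maps to (b, c^T x),
# exhibiting the objective value c^T x as a member of the set P of the lemma.
x = [1, 4]
Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
s = [Ax[i] - b[i] for i in range(2)]
assert all(si >= 0 for si in s)
lhs, r = image_point(x, s)
assert lhs == b and r == 21
```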
Formal Statement and Proof
Lean theorem: Lecture03.lem_l3_objective_values_closed.
theorem Lecture03_lem_l3_objective_values_closed {m n : ℕ}
    (A : Matrix (Fin m) (Fin n) ℝ) (b : Fin m → ℝ) (c : Fin n → ℝ) :
    IsClosed (lpReqPrimalObjectiveValues A b c) ∧
    IsClosed (lpReqDualObjectiveValues A b c)
Assume that the primal LP is feasible and has finite optimal value. Then the dual LP is feasible, has finite optimal value, and
\inf\{c^\top x: Ax\ge b,\ x\ge 0 \}
=
\sup\{b^\top y: A^\top y\le c,\ y\ge 0 \}.
Moreover, primal and dual optimal solutions exist.
Theorem 3.3 answers the main question left open by Lemma 3.1: how tight can a dual lower bound be? It says that, for linear programs, the best dual certificate is exactly tight, and moreover that both the primal and dual optima are attained.
Proof
Let
p^*:=\inf\{c^\top x: Ax\ge b,\ x\ge 0\}.
By hypothesis, p^*\in\mathbb{R} and the primal feasible set is nonempty.
Fix any scalar \mu<p^*. Then the system
Ax\ge b,\qquad x\ge 0,\qquad c^\top x\le \mu
is infeasible. Rewrite this as
\begin{pmatrix}
A\\
-c^\top
\end{pmatrix}x
\ge
\binom{b}{-\mu}.
By Theorem 3.4, there exists (y,\alpha)\in\mathbb{R}^m\times\mathbb{R} such
that
\begin{pmatrix}
A\\
-c^\top
\end{pmatrix}^\top
\binom{y}{\alpha}
\le 0,\qquad
\binom{y}{\alpha}\ge 0,\qquad
\binom{b}{-\mu}^\top\binom{y}{\alpha}>0.
Equivalently,
A^\top y-\alpha c\le 0,\qquad
y\ge 0,\qquad
\alpha\ge 0,\qquad
b^\top y-\alpha\mu>0.
We claim that \alpha>0. Indeed, if \alpha=0, then
A^\top y\le 0, y\ge 0, and b^\top y>0, which contradicts primal
feasibility by Theorem 3.4. Thus \alpha>0. Define
\bar y:=\frac{y}{\alpha}.
Then
A^\top \bar y\le c,\qquad \bar y\ge 0,
so \bar y is dual feasible, and
b^\top \bar y=\frac{b^\top y}{\alpha}>\mu.
Because \mu<p^* was arbitrary, this shows that for every \mu<p^* there
exists a dual-feasible point with objective value strictly larger than
\mu. Therefore the dual optimal value d^* satisfies
d^*\ge p^*.
On the other hand, weak duality gives d^*\le p^*. Hence
p^*=d^*.
To conclude attainment, let
P:=\{c^\top x:Ax\ge b,\ x\ge 0\},\qquad
D:=\{b^\top y:A^\top y\le c,\ y\ge 0\}.
By Lemma 3.2, both P and D are closed subsets of \mathbb R. The set
P is nonempty and bounded below, so p^*=\inf P\in P; hence there exists
a primal-optimal point x^\star. The previous cutoff argument shows that for
every \mu<p^* there exists a dual-feasible point with objective value
strictly larger than \mu, so d^*=p^*=\sup D. Weak duality implies
D\subseteq(-\infty,p^*], so D is also bounded above. Since D is
closed and nonempty, its supremum belongs to D. Therefore there exists a
dual-optimal point y^\star satisfying
A^\top y^\star\le c,\qquad
y^\star\ge 0,\qquad
b^\top y^\star=p^*.
Thus the dual optimum is attained.
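For the running example the theorem can be witnessed by brute force: enumerating the vertices of the dual polyhedron (by intersecting constraint pairs and keeping the feasible intersections) recovers the common optimal value 21. A rough illustrative sketch, not an LP solver:

```python
from itertools import combinations

A = [[2, 1], [1, 2]]; b = [6, 9]; c = [5, 4]

# Dual constraints written uniformly as g . y <= h: A^T y <= c and -y <= 0.
rows = [([A[0][0], A[1][0]], c[0]), ([A[0][1], A[1][1]], c[1]),
        ([-1, 0], 0), ([0, -1], 0)]

def solve2(r1, r2):
    """Intersect two constraint boundaries g1.y = h1, g2.y = h2 (Cramer's rule)."""
    (g1, h1), (g2, h2) = r1, r2
    det = g1[0] * g2[1] - g1[1] * g2[0]
    if det == 0:
        return None
    return ((h1 * g2[1] - h2 * g1[1]) / det, (g1[0] * h2 - g2[0] * h1) / det)

feasible_vertices = []
for r1, r2 in combinations(rows, 2):
    y = solve2(r1, r2)
    if y and all(g[0] * y[0] + g[1] * y[1] <= h + 1e-9 for g, h in rows):
        feasible_vertices.append(y)

# The dual optimum over the vertices matches the primal optimum 5*1 + 4*4 = 21,
# attained at the vertex (2, 1).
d_star = max(b[0] * y[0] + b[1] * y[1] for y in feasible_vertices)
assert abs(d_star - 21) < 1e-9
```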
Formal Statement and Proof
Lean theorem: Lecture03.thm_l3_lp_strong.
theorem Lecture03_thm_l3_lp_strong {m n : ℕ}
    {A : Matrix (Fin m) (Fin n) ℝ} {b : Fin m → ℝ} {c : Fin n → ℝ}
    (hnonempty : (lpReqPrimalObjectiveValues A b c).Nonempty)
    (hbounded : BddBelow (lpReqPrimalObjectiveValues A b c)) :
    ∃ x : Fin n → ℝ, ∃ y : Fin m → ℝ,
      lpReqPrimalOptimal A b c x ∧
      lpReqDualOptimal A b c y ∧
      sInf (lpReqPrimalObjectiveValues A b c) = sSup (lpReqDualObjectiveValues A b c) ∧
      sInf (lpReqPrimalObjectiveValues A b c) = dotProduct c x ∧
      sSup (lpReqDualObjectiveValues A b c) = dotProduct b y
Let A\in\mathbb{R}^{m\times n} and b\in\mathbb{R}^m. Exactly one of the
following two statements holds:
-
there exists x\in\mathbb{R}^n such that Ax\ge b and x\ge 0;
-
there exists y\in\mathbb{R}^m such that A^\top y\le 0, y\ge 0, and b^\top y>0.
This is the infeasibility-certificate theorem behind the proof of strong
duality. It says that if the system Ax\ge b, x\ge 0 has no solution,
then there is a nonnegative vector y such that A^\top y\le 0 and
b^\top y>0. Any feasible x would force
b^\top y\le y^\top Ax=x^\top A^\top y\le 0,
so such a vector y explicitly rules out feasibility.
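A tiny concrete instance may help: the one-variable system x_1 \ge 1, -x_1 \ge 0 (with x_1 \ge 0) is clearly infeasible, and y=(1,1) is a certificate of exactly the advertised form. A quick illustrative check in plain Python:

```python
# An infeasible instance of Ax >= b, x >= 0: the rows demand x1 >= 1 and -x1 >= 0.
A = [[1], [-1]]
b = [1, 0]

# A Farkas certificate: y >= 0 with A^T y <= 0 and b^T y > 0.
y = [1, 1]
Aty = [sum(A[i][j] * y[i] for i in range(2)) for j in range(1)]
assert all(v <= 0 for v in Aty)               # A^T y = (0,) <= 0
assert all(yi >= 0 for yi in y)               # y >= 0
assert sum(b[i] * y[i] for i in range(2)) > 0  # b^T y = 1 > 0

# The certificate rules out feasibility: any x >= 0 with Ax >= b would give
# 0 < b^T y <= y^T A x = x^T (A^T y) <= 0, a contradiction.
```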
Proof
We first show that the two statements cannot hold simultaneously. Assume that
there exist x\in\mathbb{R}^n and y\in\mathbb{R}^m such that
Ax\ge b,\quad x\ge 0,\quad A^\top y\le 0,\quad y\ge 0,\quad b^\top y>0.
Then
b^\top y\le y^\top Ax=x^\top A^\top y\le 0,
which contradicts b^\top y>0. Thus at most one statement can hold.
We now show that at least one statement holds. Define
K:=-A\mathbb{R}_+^n+\mathbb{R}_+^m
=\{-Ax+s:x\in\mathbb{R}_+^n,\ s\in\mathbb{R}_+^m\}.
Because \mathbb{R}_+^n\times\mathbb{R}_+^m is a finitely generated cone and
(x,s)\mapsto -Ax+s is linear, K is a closed convex cone. Statement (1)
is equivalent to b\in A\mathbb{R}_+^n-\mathbb{R}_+^m, hence to
-b\in K. If statement (1) fails, then -b\notin K.
Applying the cone separation theorem from Lecture 2 to the closed convex cone
K and the point -b\notin K, we obtain a vector y\in\mathbb{R}^m such
that
\forall z\in K,\qquad \langle y,z\rangle\ge 0,
\qquad\text{and}\qquad
\langle y,-b\rangle<0.
For each standard basis vector e_i\in\mathbb{R}^m, one has
e_i\in\mathbb{R}_+^m\subseteq K, and therefore
\langle y,e_i\rangle=y_i\ge 0.
Thus y\ge 0. Moreover, for every x\in\mathbb{R}_+^n, one has
-Ax\in K, so
\langle y,-Ax\rangle\ge 0.
Equivalently,
x^\top A^\top y\le 0 \qquad \forall x\in\mathbb{R}_+^n.
Taking x=e_j shows (A^\top y)_j\le 0 for every j, so
A^\top y\le 0. Finally,
b^\top y=-\langle y,-b\rangle>0.
Thus statement (2) holds. This proves that exactly one of the two statements is true.
Formal Statement and Proof
Lean theorem: Lecture03.thm_l3_farkas.
theorem Lecture03_thm_l3_farkas {m n : ℕ}
    (A : Matrix (Fin m) (Fin n) ℝ) (b : Fin m → ℝ) :
    ((∃ x : Fin n → ℝ, lpReqPrimalFeasible A b x) ∨
     (∃ y : Fin m → ℝ, lpReqDualFeasible A 0 y ∧ 0 < dotProduct b y)) ∧
    ¬ ((∃ x : Fin n → ℝ, lpReqPrimalFeasible A b x) ∧
       (∃ y : Fin m → ℝ, lpReqDualFeasible A 0 y ∧ 0 < dotProduct b y))
In the formal proof, mutual exclusivity is obtained by instantiating lem_l3_weak_duality with the zero objective: weak duality gives b ⬝ᵥ y ≤ 0, contradicting 0 < b ⬝ᵥ y.
Let x^\star\in\mathbb{R}^n and y^\star\in\mathbb{R}^m. The following are
equivalent:
-
x^\star is primal feasible, y^\star is dual feasible, and both are optimal;
-
x^\star is primal feasible, y^\star is dual feasible, and
\forall i\in\{1,\dots,m\},\qquad y_i^\star\bigl((Ax^\star)_i-b_i\bigr)=0,
and
\forall j\in\{1,\dots,n\},\qquad x_j^\star\bigl(c_j-(A^\top y^\star)_j\bigr)=0.
Finally, Theorem 3.5 characterizes exactly when equality holds in weak duality. At optimality, every positive dual multiplier must sit on a tight primal requirement, and every positive primal variable must sit on a tight dual inequality. This is the first instance of a pattern that will recur throughout the course: optimality is not only about matching objective values, but also about matching the geometry of the active constraints to the geometry of the dual certificate.
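On Example 3.1 the characterization is easy to verify: both requirements are tight at x^\star=(1,4) and both dual inequalities are tight at y^\star=(2,1), so every complementary-slackness product vanishes. An illustrative check:

```python
A = [[2, 1], [1, 2]]; b = [6, 9]; c = [5, 4]
x = [1, 4]  # primal optimal point from Example 3.1
y = [2, 1]  # dual optimal point (the certificate weights)

Ax  = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]  # Ax
Aty = [sum(A[i][j] * y[i] for i in range(2)) for j in range(2)]  # A^T y

# Complementary slackness: every product in item (2) vanishes.
assert all(y[i] * (Ax[i] - b[i]) == 0 for i in range(2))
assert all(x[j] * (c[j] - Aty[j]) == 0 for j in range(2))

# Consequently the duality gap c^T x - b^T y is zero.
gap = sum(c[j] * x[j] for j in range(2)) - sum(b[i] * y[i] for i in range(2))
assert gap == 0
```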
Proof
Assume first that item (1) holds. Then x^\star is primal feasible,
y^\star is dual feasible, and both are optimal. By weak duality,
b^\top y^\star\le c^\top x^\star.
Because both points are optimal, equality must hold:
c^\top x^\star=b^\top y^\star.
Using dual feasibility,
c^\top x^\star-b^\top y^\star
=
x^{\star\top}(c-A^\top y^\star)+y^{\star\top}(Ax^\star-b).
Equivalently,
c^\top x^\star-b^\top y^\star
=
\sum_{j=1}^n x_j^\star\bigl(c_j-(A^\top y^\star)_j\bigr)
\;+\;
\sum_{i=1}^m y_i^\star\bigl((Ax^\star)_i-b_i\bigr).
Each term in the sum is nonnegative because x^\star\ge 0,
c-A^\top y^\star\ge 0, y^\star\ge 0, and Ax^\star-b\ge 0. Since the
sum equals zero, every term must be zero. Thus item (2) holds.
Assume now that item (2) holds. Then x^\star is primal feasible,
y^\star is dual feasible, and the complementary-slackness products vanish.
Therefore
c^\top x^\star-b^\top y^\star
=
x^{\star\top}(c-A^\top y^\star)+y^{\star\top}(Ax^\star-b)=0.
Hence
b^\top y^\star=c^\top x^\star.
By weak duality, every dual-feasible objective value is at most every
primal-feasible objective value. Since equality is achieved by
(x^\star,y^\star), both points are optimal. Thus item (1) holds.
Formal Statement and Proof
Lean theorem: Lecture03.thm_l3_cs.
theorem Lecture03_thm_l3_cs {m n : ℕ}
    {A : Matrix (Fin m) (Fin n) ℝ} {b : Fin m → ℝ} {c : Fin n → ℝ}
    {x : Fin n → ℝ} {y : Fin m → ℝ}
    (hx : lpReqPrimalFeasible A b x) (hy : lpReqDualFeasible A c y) :
    (lpReqPrimalOptimal A b c x ∧ lpReqDualOptimal A b c y) ↔
    ((∀ j, x j * (c j - (A.transpose.mulVec y) j) = 0) ∧
     (∀ i, y i * ((A.mulVec x) i - b i) = 0))
In the forward direction, the formal proof first derives the zero duality gap c ⬝ᵥ x = b ⬝ᵥ y via lpReq_zero_gap_of_primalOptimal_dualOptimal and then forces each nonnegative complementary-slackness term to vanish.
3.1. Value Functions and Shadow Prices
Fix A\in\mathbb{R}^{m\times n} and c\in\mathbb{R}^n. The
requirement-perturbation value function associated with the canonical LP is
V:\mathbb{R}^m\to \mathbb{R}\cup\{+\infty\},
\qquad
V(u):=\inf\{c^\top x: Ax\ge u,\ x\ge 0\}.
The function V in Definition 3.2 is convex.
Proof
We record two equivalent proofs.
First proof: direct from the definition of convexity. Fix
u^1,u^2\in\mathbb{R}^m and \theta\in[0,1]. If either V(u^1) or
V(u^2) is +\infty, then the convexity inequality is immediate, so assume
both are finite. Let \varepsilon>0. Choose x^1,x^2\in\mathbb{R}^n such
that
Ax^1\ge u^1,\quad x^1\ge 0,\quad c^\top x^1\le V(u^1)+\varepsilon,
and
Ax^2\ge u^2,\quad x^2\ge 0,\quad c^\top x^2\le V(u^2)+\varepsilon.
Then
\bar x:=\theta x^1+(1-\theta)x^2
satisfies \bar x\ge 0 and
A\bar x=\theta Ax^1+(1-\theta)Ax^2
\ge
\theta u^1+(1-\theta)u^2.
Hence \bar x is feasible for the right-hand side
\theta u^1+(1-\theta)u^2. Therefore
V(\theta u^1+(1-\theta)u^2)
\le
c^\top \bar x
=
\theta c^\top x^1+(1-\theta)c^\top x^2
\le
\theta V(u^1)+(1-\theta)V(u^2)+\varepsilon.
Since \varepsilon>0 was arbitrary, it follows that
V(\theta u^1+(1-\theta)u^2)
\le
\theta V(u^1)+(1-\theta)V(u^2).
Second proof: geometric viewpoint. Consider the set
\mathcal E:=\{(u,x,t)\in \mathbb{R}^m\times\mathbb{R}^n\times\mathbb{R}:
Ax\ge u,\ x\ge 0,\ c^\top x\le t\}.
This is a convex set: each defining constraint is affine in (u,x,t), so
\mathcal E is an intersection of affine halfspaces. By definition,
(u,t)\in \pi_{u,t}(\mathcal E)
\quad\Longleftrightarrow\quad
\exists x\ge 0 \text{ such that } Ax\ge u \text{ and } c^\top x\le t,
which is exactly the condition V(u)\le t. Hence
\operatorname{epi}(V)=\pi_{u,t}(\mathcal E).
In other words, taking the partial infimum over x is exactly the same as
projecting the epigraph onto the (u,t)-coordinates. Projections preserve
convexity, so \operatorname{epi}(V) is convex, and thus V is convex.
Formal Statement and Proof
Lean theorem: Lecture03.lem_l3_value_function_convex.
theorem Lecture03_lem_l3_value_function_convex {m n : ℕ}
    (A : Matrix (Fin m) (Fin n) ℝ) (c : Fin n → ℝ) :
    EConvexOn Set.univ (lpReqValueFunction A c)
Now weak duality applies not only to the original LP, but simultaneously to
the whole perturbed family. If y\in\mathbb{R}^m satisfies
A^\top y\le c and y\ge 0, then the same calculation as in Lemma 3.1
shows that
V(u)\ge u^\top y.
Thus every dual-feasible vector y gives an affine lower bound on the value
function V, not just on the single number V(b). In this enlarged
picture, the meaning of the dual variable becomes much easier to see: it
measures how the optimal cost responds to changes in the requirement vector,
which is exactly the usual economic interpretation of a shadow price. If
strong duality holds at a chosen base requirement vector b and
y^\star is dual optimal there, then
V(b)=b^\top y^\star.
So an optimal dual point is not just an algebraic multiplier: it is a
supporting covector of the convex value function V at the base point
b. In particular, if V happens to be differentiable at b, then this
supporting covector is unique and
y^\star=\nabla V(b).
This is the cleanest economic interpretation of a shadow price: the component
y^\star_i is the marginal increase in optimal cost per unit increase in the
i-th requirement, at least to first order near the base point b. When
V is not differentiable, one should replace the gradient by a supporting
covector, equivalently a subgradient. Lecture 4 abstracts exactly this LP
picture into the general marginal value function p(u)=\inf_x \Phi(x,u).
Return to Example 3.1, but now replace the fixed requirement vector (6,9) by
a variable pair (u_1,u_2)\in\mathbb{R}^2. The resulting value function is
V(u_1,u_2)
:=
\inf\{5x_1+4x_2: 2x_1+x_2\ge u_1,\ x_1+2x_2\ge u_2,\ x_1\ge 0,\ x_2\ge 0\}.
The dual feasible region is the polygon
\{(p_1,p_2)\in\mathbb{R}^2:
2p_1+p_2\le 5,\ p_1+2p_2\le 4,\ p_1\ge 0,\ p_2\ge 0\},
whose vertices are
(0,0),\qquad \left(\frac52,0\right),\qquad (2,1),\qquad (0,2).
By LP strong duality, V(u_1,u_2) is therefore the pointwise maximum of the
corresponding affine functions:
V(u_1,u_2)=\max\{0,\ \tfrac52 u_1,\ 2u_1+u_2,\ 2u_2\}.
At the base point (u_1,u_2)=(6,9), the active piece is
2u_1+u_2,
so
V(6,9)=2\cdot 6+9=21.
In this example, V is differentiable at (6,9), and
\nabla V(6,9)=(2,1).
So near (6,9), increasing the first requirement by one unit raises the
optimal cost by about 2, while increasing the second requirement by one
unit raises it by about 1. This is the easiest case to interpret
economically: the shadow price is literally the gradient of the value
function. In general, however, LP value functions are only piecewise linear
and need not be differentiable at kinks; then the correct general object is
not a unique gradient, but a supporting covector, equivalently a subgradient.
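The closed form of V above makes these claims checkable numerically. A small sketch, in which the function V hard-codes the four affine pieces coming from the dual vertices:

```python
def V(u1, u2):
    """Value function of Example 3.1: max of the dual-vertex affine pieces."""
    return max(0.0, 2.5 * u1, 2 * u1 + u2, 2 * u2)

# At the base requirement vector (6, 9), the active piece 2*u1 + u2 gives 21.
assert V(6, 9) == 21

# Shadow prices: unit increases in each requirement near the base point.
assert V(7, 9) - V(6, 9) == 2   # first requirement: marginal cost 2
assert V(6, 10) - V(6, 9) == 1  # second requirement: marginal cost 1

# A convexity spot-check at the midpoint of two requirement vectors.
u, v = (6, 9), (0, 20)
mid = V((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)
assert mid <= 0.5 * V(*u) + 0.5 * V(*v)
```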
3.2. Dependency and Proof Sketch
-
Lemma 3.1 is one line:
b^\top y \le y^\top Ax = x^\top A^\top y \le x^\top c = c^\top x
because y\ge 0, Ax\ge b, x\ge 0, and A^\top y\le c. -
Lemma 3.2 supplies the attainment input: once the primal and dual objective-value sets are closed subsets of
\mathbb R, finite optimal values are automatically achieved. -
Theorem 3.3 is the first clean specialization of duality to a model class. It says that the best lower bound produced by weak duality is exactly tight. The proof derives it from Theorem 3.4 by applying Farkas to the augmented infeasible system
Ax\ge b,\qquad x\ge 0,\qquad c^\top x\le \mu. -
Theorem 3.4 is the infeasibility-certificate theorem for linear inequalities. Its visible content is simple: if the system
Ax\ge b, x\ge 0 has no solution, then there is a simple linear certificate ruling it out. The proof uses the cone -A\mathbb{R}_+^n+\mathbb{R}_+^m together with the Lecture 2 cone-separation theorem to produce exactly such a certificate: y\ge 0, A^\top y\le 0, b^\top y>0. -
Theorem 3.5 is weak duality plus the observation that the duality gap is exactly the complementary-slackness sum
c^\top x^\star-b^\top y^\star= \sum_{i=1}^m y_i^\star((Ax^\star)_i-b_i) +\sum_{j=1}^n x_j^\star(c_j-(A^\top y^\star)_j). -
Lemma 3.6 is a direct convexification argument: feasible points for right-hand sides
u^1 and u^2 can be averaged, so the intermediate right-hand side inherits a feasible point with the corresponding convex-combination objective value.
3.3. Exercises
-
Consider the resource-allocation LP
\max\{d^\top z: Bz\le h,\ z\ge 0\}. First prove that it is equivalent to an instance of the canonical LP in Definition 3.1. Then derive its dual in the natural
(B,h,d) variables. -
Consider the equality-constrained LP
\min\{c^\top x: Ax=b,\ x\ge 0\}. First prove that it is equivalent to an instance of the canonical LP in Definition 3.1. Then derive its dual and explain why the equality constraint in the primal leads to a dual variable with no sign restriction.
-
Consider the free-variable LP
\min\{c^\top x: Ax\ge b,\ x\in\mathbb{R}^n\}. First prove that it is equivalent to an instance of the canonical LP in Definition 3.1 by writing each free variable as a difference of two nonnegative variables. Then derive its dual and explain why the lack of a sign restriction on
x turns the dual inequality A^\top y\le c into an equality. -
Consider the bounded-variable LP
\min\{c^\top x: Ax\ge b,\ 0\le x\le u\}. First prove that it is equivalent to an instance of the canonical LP in Definition 3.1. Then derive its dual in a form that keeps track of which dual variables correspond to the requirement constraints
Ax\ge b and which correspond to the upper bounds x\le u. -
Let
C\subseteq \mathbb{R}^{m+n}, let K,L\subseteq \mathbb{R}^n be convex sets, and let T:\mathbb{R}^n\to\mathbb{R}^p be affine. Show that the coordinate projection
\pi(C):=\{u\in\mathbb{R}^m:\exists x\in\mathbb{R}^n\text{ such that }(u,x)\in C\}
is convex; that the Minkowski sum
K+L:=\{x+y:x\in K,\ y\in L\}
is convex; that the affine image T(K) is convex; and that the affine preimage
T^{-1}(L):=\{x\in\mathbb{R}^n:T(x)\in L\}
is convex. Which part of Lemma 3.6 can be reinterpreted using the projection statement?