Optimization 2

5 minute read


Optimization (Primal and Dual Problems)

Standard Support Vector Machine

\[\begin{aligned} \min\limits_{w,b,\xi_A, \xi_B} C(\mathbf 1^T\xi_A+ \mathbf 1^T\xi_B) & +\frac{1}{2}\Vert w\Vert^2_2 \\ \text{s.t.}\quad (Aw+\mathbf 1 b)+\xi_A &\geq \mathbf 1 \\ (Bw+\mathbf 1 b)-\xi_B &\leq -\mathbf 1 \\ \xi_A,\xi_B &\geq 0 \end{aligned}\]
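As a concrete check, this primal is just a convex QP and can be handed to a general-purpose solver. The sketch below is an illustration, not the standard SVM training path: the toy dataset, the value of `C`, and the use of `scipy.optimize.minimize` with SLSQP are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Toy separable data (assumed for illustration): class A near (2,2), class B near (-2,-2)
rng = np.random.default_rng(0)
A = rng.normal(loc=2.0, scale=0.5, size=(10, 2))
B = rng.normal(loc=-2.0, scale=0.5, size=(10, 2))
C = 1.0
mA, mB, n = len(A), len(B), A.shape[1]

def unpack(z):
    # z stacks the decision variables: w (n), b (scalar), xi_A (mA), xi_B (mB)
    return z[:n], z[n], z[n + 1:n + 1 + mA], z[n + 1 + mA:]

def objective(z):
    w, b, xiA, xiB = unpack(z)
    return C * (xiA.sum() + xiB.sum()) + 0.5 * w @ w

constraints = [
    # (Aw + 1b) + xi_A >= 1
    {"type": "ineq", "fun": lambda z: A @ unpack(z)[0] + unpack(z)[1] + unpack(z)[2] - 1},
    # (Bw + 1b) - xi_B <= -1  <=>  -(Bw + 1b) + xi_B - 1 >= 0
    {"type": "ineq", "fun": lambda z: -(B @ unpack(z)[0] + unpack(z)[1]) + unpack(z)[3] - 1},
]
bounds = [(None, None)] * (n + 1) + [(0, None)] * (mA + mB)  # xi_A, xi_B >= 0

res = minimize(objective, np.zeros(n + 1 + mA + mB),
               method="SLSQP", bounds=bounds, constraints=constraints)
w, b, xiA, xiB = unpack(res.x)
print("w =", w, "b =", b)
```

Since the toy classes are well separated, the optimal slacks come out near zero and the resulting hyperplane classifies both classes correctly.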

Farkas’ Lemma

For any matrix $A\in \mathbb R^{m \times n}$ and any vector $b\in \mathbb R^n$, exactly one of the following two systems has a solution. Either

\begin{equation} Ax \leq 0, \quad b^Tx >0 \quad \text{has a solution} \label{1} \end{equation}

or

\begin{equation} A^T\alpha=b, \quad \alpha \geq 0 \quad \text{has a solution.} \label{2} \end{equation}

From the algebraic view, suppose $\eqref{1}$ and $\eqref{2}$ hold simultaneously. Then

\[0 < b^Tx = (A^T\alpha)^Tx = \alpha^T(Ax) \leq 0,\]

since $\alpha \geq 0$ and $Ax \leq 0$. This is a contradiction, so at most one of the two systems can have a solution.

From the geometric view, $\eqref{2}$ says that

\[\begin{aligned} b &\in \text{row space of }A \quad \text{with coefficients}\quad \alpha \geq 0 \\ \implies b &\in \text{the cone generated by the rows of }A . \end{aligned}\]

On the other hand, $\eqref{1}$ says that

\[\begin{aligned} b^Tx>0 &\implies b \text{ lies strictly on the positive side of the hyperplane } H_x=\{y : x^Ty=0\} \text{ with normal vector }x \\ 0 \geq Ax=\left[ \begin{matrix} A_1^Tx \\ \vdots \\ A_m^Tx \end{matrix} \right] &\implies \text{the angle between } x \text{ and each row } A_i \text{ is at least } \pi/2 \implies \text{every } A_i \text{ lies on the nonpositive side of } H_x \end{aligned}\]

In other words, $\eqref{1}$ says that \(b \notin \text{the cone generated by the rows } A_i,\) since $x$ defines a hyperplane separating $b$ from that cone.
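The alternative can be checked numerically: system $\eqref{2}$ is a linear feasibility problem, so an LP solver decides which case holds. A minimal sketch, where the matrix, the test vectors, and the function name are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def farkas_alternative(A, b):
    """Return 'cone' if b = A^T alpha for some alpha >= 0 (system (2)),
    else 'separating' (system (1) must then have a solution x)."""
    m, n = A.shape
    # Feasibility LP for system (2): A^T alpha = b, alpha >= 0, zero objective
    res = linprog(c=np.zeros(m), A_eq=A.T, b_eq=b,
                  bounds=[(0, None)] * m, method="highs")
    return "cone" if res.status == 0 else "separating"

A = np.array([[1.0, 0.0], [0.0, 1.0]])   # rows generate the nonnegative quadrant
print(farkas_alternative(A, np.array([2.0, 3.0])))   # b inside the cone
print(farkas_alternative(A, np.array([-1.0, 0.5])))  # b outside the cone
```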

Minimization Problem vs. Kuhn-Tucker Stationary-point Problem

MP:

\[\min\limits_{x\in \Omega}f(x) \quad \text{s.t.} \quad g(x) \leq 0 \quad \text{and} \quad h(x)=0\]

KTSP: Find

\[\bar x \in \Omega , \bar \alpha \in \mathbb R^m, \bar \beta_{+} \in \mathbb R^r, \bar \beta_{-} \in \mathbb R^r \quad \text{s.t.}\] \[\begin{aligned} \nabla f(\bar x)+\bar \alpha^T \nabla g(\bar x)+(\bar \beta_{+}-\bar \beta_{-})^T \nabla h(\bar x) &=0\\ \bar \alpha^T g(\bar x) =0 ,\quad\bar \beta_{+}^T h(\bar x) =0,\quad\bar \beta_{-}^T h(\bar x) &=0\\ g(\bar x) \leq 0,\quad h(\bar x)&=0\\ \bar \alpha ,\bar\beta_{+} ,\bar\beta_{-} &\geq 0 \end{aligned}\]

Generalized Lagrangian Function

\[\mathcal{L}(x,\alpha,\beta)=f(x)+\alpha^Tg(x)+\beta^Th(x)\]

where $\alpha \geq 0$

  • If $f(x),g(x)$ are convex, $h(x)$ is linear, and $\alpha \geq 0$, then $\mathcal{L}$ is convex in $x$.
  • For fixed $\alpha \geq 0$ and $\beta$, if $\bar x \in \arg\min\{\mathcal{L}(x,\alpha,\beta) \mid x \in \mathbb R^n\}$, then
\[\frac{\partial \mathcal L (x,\alpha,\beta)}{\partial x} \bigg\vert_{x=\bar x} =\nabla f(\bar x)+\alpha^T \nabla g(\bar x)+\beta^T \nabla h(\bar x) =0\]
  • Conversely, this first-order condition is sufficient for $\bar x$ to be a minimizer when $\mathcal{L}(x,\alpha,\beta)$ is convex in $x$.

Lagrangian Dual Problem

\[\max\limits_{\alpha, \beta}\min\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta) \quad \text{s.t.} \quad \alpha \geq 0 \iff \max\limits_{\alpha, \beta} \theta(\alpha,\beta)\quad \text{s.t.} \quad \alpha \geq 0\]

where

\[\theta(\alpha,\beta)=\inf\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta)\]

Weak Duality Theorem

Let $\bar x \in \Omega$ be a feasible solution of the primal problem and $(\alpha,\beta)$ a feasible solution of the dual problem. Then

\[f(\bar x) \geq \mathcal{L}(\bar x,\alpha,\beta) \geq \inf\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta) = \theta(\alpha, \beta),\]

where the first inequality holds because $\alpha^Tg(\bar x) \leq 0$ and $h(\bar x)=0$ at any primal-feasible $\bar x$.

Corollary 1:

\[\sup\{\theta(\alpha, \beta) |\alpha \geq 0\}\leq \inf\{f(x)|g(x) \leq 0, h(x)=0 \}\]

Corollary 2:

If $f(x^*)=\theta(\alpha^*, \beta^*)$ where $\alpha^* \geq 0$ and $g(x^*) \leq 0$, $h(x^*)=0$, then $x^*$ and $(\alpha^*, \beta^*)$ solve the primal and dual problem respectively. In this case,

\[\alpha^* \bot g(x^*) \iff \alpha_i^* g_i(x^*)=0 \quad \forall i\]

Proof:

\[\mathcal{L}(x,\alpha,\beta)=f(x)+\alpha^Tg(x)+\beta^Th(x)\]

If $\hat x$ is primal feasible, then $g(\hat x) \leq 0$ and $h(\hat x)=0$, so $\alpha\geq 0$ implies $\alpha^Tg(\hat x) \leq 0$ and hence

\begin{equation} \theta(\alpha, \beta)=\inf\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta) \leq \mathcal{L}(\hat x,\alpha,\beta) \leq f(\hat x). \label{eq:theta} \end{equation}

The last inequality holds with equality if and only if

\begin{equation} \alpha^Tg(\hat x) = 0 . \label{eq:hold_eq} \end{equation}

Hence, taking the supremum over $\alpha \geq 0$ on the left-hand side and the infimum over primal-feasible $x$ on the right-hand side gives

\[\sup\{\theta(\alpha, \beta) |\alpha \geq 0\}\leq \inf\{f(x)|g(x) \leq 0, h(x)=0 \}\]

This establishes Corollary 1. If $x^*,\alpha^*,\beta^*$ satisfy $f(x^*)=\theta(\alpha^*,\beta^*)$ with $x^*$ primal feasible and $\alpha^* \geq 0$, then by Corollary 1,

\[f(x^*)=\theta(\alpha^*,\beta^*)=\max\limits_{\alpha,\beta}\min\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta)\]

so $x^*$ is primal optimal and $(\alpha^*, \beta^*)$ is dual optimal. Equality then holds throughout $\eqref{eq:theta}$, so $\eqref{eq:hold_eq}$ gives

\[\alpha^* \bot g(x^*) \iff \alpha^*_i g_i(x^*)=0 \quad \forall i\]

This proves Corollary 2.
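To make the weak duality chain concrete, here is a 1-D instance (the problem is an assumption for illustration, not from the text): minimize $f(x)=x^2$ subject to $g(x)=1-x\leq 0$. The Lagrangian $\mathcal L(x,\alpha)=x^2+\alpha(1-x)$ is minimized over $x$ at $x=\alpha/2$, giving $\theta(\alpha)=\alpha-\alpha^2/4$:

```python
import numpy as np

# f(x) = x^2, constraint g(x) = 1 - x <= 0, so feasible x >= 1
def f(x):
    return x ** 2

# theta(alpha) = inf_x [x^2 + alpha*(1 - x)] = alpha - alpha^2/4, attained at x = alpha/2
def theta(alpha):
    return alpha - alpha ** 2 / 4.0

alphas = np.linspace(0.0, 5.0, 101)   # dual-feasible points (alpha >= 0)
xs = np.linspace(1.0, 3.0, 101)       # primal-feasible points (x >= 1)

# Weak duality: every dual value is below every primal value
assert theta(alphas).max() <= f(xs).min() + 1e-12
print("best dual value:  ", theta(alphas).max())   # attained near alpha = 2
print("best primal value:", f(xs).min())           # attained at x = 1
```

Here the two optimal values coincide (both equal $1$), illustrating Corollary 2: at $x^*=1$, $\alpha^*=2$ the constraint is active, so $\alpha^* g(x^*)=0$.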

Dual Problem of Linear Program

Primal LP:

\[\begin{aligned} &\min\limits_{x \in \mathbb R^n} p^Tx \\ \text{subject to } \quad &Ax \geq b, x\geq 0 \end{aligned}\]

Dual LP:

\[\begin{aligned} &\max\limits_{\alpha \in \mathbb R^m} b^T\alpha \\ \text{subject to } \quad &A^T\alpha \leq p, \alpha \geq 0 \end{aligned}\]

Write down the Lagrangian $\mathcal L (x,\alpha_1,\alpha_2)$, where $\alpha_1$ and $\alpha_2$ are the multipliers for $Ax \geq b$ and $x \geq 0$ respectively:

\[\begin{aligned} \mathcal L (x,\alpha_1,\alpha_2) &= p^Tx+\alpha_1^T(b-Ax)+\alpha_2^T(-x)\\ &= (p-A^T\alpha_1-\alpha_2)^Tx+\alpha_1^Tb \\ \frac{\partial \mathcal L (x,\alpha_1,\alpha_2)}{\partial x} &=p-A^T\alpha_1-\alpha_2 =0 \\ \end{aligned}\]

Hence,

\[\begin{aligned} \theta(\alpha_1,\alpha_2)&=\min\limits_{x \in \mathbb R^n}\mathcal L (x,\alpha_1,\alpha_2)=\alpha_1^Tb \\ p-A^T\alpha_1-\alpha_2 &=0,\ \alpha_2 \geq 0 \iff A^T\alpha_1 \leq p \end{aligned}\]

which, together with $\alpha_1 \geq 0$, is exactly the dual form.
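The primal and dual LPs can be solved side by side to confirm that their optimal values coincide. A small sketch with `scipy.optimize.linprog` (the instance data $p$, $A$, $b$ are assumptions for illustration; `linprog` minimizes, so the dual objective is negated):

```python
import numpy as np
from scipy.optimize import linprog

# Assumed example instance
p = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0], [1.0, 2.0]])
b = np.array([3.0, 4.0])

# Primal: min p^T x  s.t.  Ax >= b, x >= 0   (linprog wants A_ub x <= b_ub)
primal = linprog(c=p, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2, method="highs")

# Dual: max b^T alpha  s.t.  A^T alpha <= p, alpha >= 0   (negate c to maximize)
dual = linprog(c=-b, A_ub=A.T, b_ub=p, bounds=[(0, None)] * 2, method="highs")

print("primal optimum:", primal.fun)   # 7.0 at x = (2, 1)
print("dual optimum:  ", -dual.fun)    # 7.0 at alpha = (1, 1), matching the primal
```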

Summary of LP Duality

| Primal | Dual |
|---|---|
| $\min\; p^Tx$ | $\max\; \alpha^Tb$ |
| **Primal constraints** | **Dual variables** |
| $Ax\geq b$ | $\alpha\geq 0$ |
| $Ax = b$ | $\alpha$ free |
| $Ax\leq b$ | $\alpha\leq 0$ |
| **Primal variables** | **Dual constraints** |
| $x\geq 0$ | $A^T\alpha\leq p$ |
| $x$ free | $A^T\alpha = p$ |
| $x\leq 0$ | $A^T\alpha\geq p$ |

Application of LP Duality

LSQ: For any matrix $A \in \mathbb R^{m\times n}$ and any vector $b\in \mathbb R^{m},$ consider

\[\min\limits_{x} \Vert Ax-b\Vert_2^2\] \[x^* \in \arg\min\limits_x\{\Vert Ax-b\Vert_2^2\} \iff A^TAx^*=A^Tb\]

Claim:

\[A^TAx=A^Tb\]

always has a solution.
To see this via LP duality, consider the primal LP:

\[\min\limits_{x}\ \mathbf 0^Tx\quad \text{s.t.} \quad A^TAx=A^Tb\]

Dual:

\[\max\limits_{\alpha}\ b^TA\alpha\quad \text{s.t.} \quad (A^TA)^T\alpha=\mathbf 0\]

For any dual-feasible $\alpha$, $(A^TA)\alpha = 0$ implies $\alpha^TA^TA\alpha = \Vert A\alpha\Vert_2^2 = 0$, hence $A\alpha = 0$ and the dual objective $b^TA\alpha$ is identically $0$. The dual is therefore feasible (e.g. $\alpha=0$) and bounded, so it has a solution; by LP duality the primal then has a solution as well, i.e. the primal is feasible, which means $A^TAx=A^Tb$ always has a solution.
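The claim is easy to probe numerically even when $A^TA$ is singular. The sketch below uses an assumed rank-deficient matrix (its third column is the sum of the first two) and checks that a least-squares minimizer, which must satisfy the normal equations, still exists:

```python
import numpy as np

# Assumed rank-deficient example: third column = first column + second column
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
b = np.array([1.0, 2.0, 0.0, 5.0])

# A^T A is singular here, so the normal equations cannot be solved by inversion
assert np.linalg.matrix_rank(A.T @ A) < A.shape[1]

# lstsq returns a minimizer of ||Ax - b||^2; any such minimizer
# satisfies the normal equations A^T A x = A^T b
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(A.T @ A @ x, A.T @ b))   # True
```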

Dual Problem of a Strictly Convex Quadratic Program

Primal QP

\[\min\limits_{x\in \mathbb R^n} \frac{1}{2}x^TQx+p^Tx \quad \text{s.t.} \quad Ax \leq b\]

Setting $\nabla_x \mathcal L = Qx + p + A^T\alpha = 0$ gives $x = -Q^{-1}(A^T\alpha + p)$; since $Q \succ 0$ by the strict convexity assumption, $Q^{-1}$ exists, and substituting back into $\mathcal L$ yields the
Dual QP

\[\max \left\{-\frac{1}{2}(p^T+\alpha^TA)Q^{-1}(A^T\alpha +p)-\alpha^Tb \right\}\quad \text{s.t.}\quad \alpha \geq 0\]
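The primal-dual pair can be checked numerically on a tiny instance (the data $Q$, $p$, $A$, $b$ and the solver choice are assumptions for illustration): solve the primal with SLSQP, evaluate the dual objective on a grid of $\alpha \geq 0$, and confirm weak duality, with the best dual value matching the primal optimum.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed example instance with Q positive definite
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
p = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])   # single constraint: x1 + x2 <= 1
b = np.array([1.0])
Qinv = np.linalg.inv(Q)

# Primal: min (1/2) x^T Q x + p^T x  s.t.  Ax <= b
primal = minimize(lambda x: 0.5 * x @ Q @ x + p @ x, x0=np.zeros(2),
                  method="SLSQP",
                  constraints=[{"type": "ineq", "fun": lambda x: b - A @ x}])

# Dual objective from the boxed formula above
def dual_obj(alpha):
    v = A.T @ alpha + p
    return -0.5 * v @ Qinv @ v - alpha @ b

alphas = np.linspace(0.0, 5.0, 501)
dual_vals = np.array([dual_obj(np.array([a])) for a in alphas])

assert np.all(dual_vals <= primal.fun + 1e-5)   # weak duality
print("primal optimum: ", primal.fun)           # -3 at x = (0, 1)
print("best dual value:", dual_vals.max())      # also -3, near alpha = 2
```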