Optimization 2
- Optimization(Primal and Dual problem)
- Standard Support Vector Machine
- Farkas’ Lemma
- Minimization Problem vs. Kuhn-Tucker Stationary-point Problem
- Generalized Lagrangian Function
- Lagrangian Dual Problem
- Weak Duality Theorem
- Corollary 1:
- Corollary 2:
- Dual Problem of Linear Program
- Summary of LP Duality
- Application of LP Duality
- Dual Problem of strictly Convex Quadratic Program
Optimization(Primal and Dual problem)
- Basically, these are my notes from the NTCU open course on Machine Learning. I record the key points here for my own reference.
Standard Support Vector Machine
\[\begin{aligned} \min\limits_{w,b,\xi_A,\xi_B} C(\mathbf 1^T\xi_A+ \mathbf 1^T\xi_B) & +\frac{1}{2}\Vert w\Vert^2_2 \\ (Aw+\mathbf 1 b)+\xi_A &\geq \mathbf 1 \\ (Bw+\mathbf 1 b)-\xi_B &\leq -\mathbf 1 \\ \xi_A,\xi_B &\geq 0 \end{aligned}\]Farkas’ Lemma
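The formulation above is a convex QP, so a general-purpose solver can handle small instances. Below is a minimal sketch (assuming `scipy` is available) that solves it with SLSQP rather than a dedicated QP solver; the toy points and the value of $C$ are made up for illustration.

```python
# Minimal numeric sketch of the soft-margin SVM above, solved with a
# general-purpose solver (SLSQP) instead of a dedicated QP solver.
# The toy data and C are illustrative only.
import numpy as np
from scipy.optimize import minimize

A = np.array([[2.0, 2.0], [3.0, 3.0]])   # class +1 points (rows)
B = np.array([[0.0, 0.0], [1.0, 0.0]])   # class -1 points (rows)
C = 10.0

# Variable vector z = [w1, w2, b, xi_A (2 entries), xi_B (2 entries)]
def objective(z):
    w, xi = z[:2], z[3:]
    return C * xi.sum() + 0.5 * w @ w

constraints = [
    # (Aw + 1b) + xi_A >= 1
    {"type": "ineq", "fun": lambda z: A @ z[:2] + z[2] + z[3:5] - 1},
    # (Bw + 1b) - xi_B <= -1  <=>  -(Bw + 1b) + xi_B - 1 >= 0
    {"type": "ineq", "fun": lambda z: -(B @ z[:2] + z[2]) + z[5:7] - 1},
    # slack variables are non-negative
    {"type": "ineq", "fun": lambda z: z[3:]},
]

res = minimize(objective, np.zeros(7), method="SLSQP", constraints=constraints)
w, b = res.x[:2], res.x[2]
print("w =", w, "b =", b)   # separating hyperplane w^T x + b = 0
```

Since the toy data is separable, the slacks come out near zero and both margin constraints are active at the support vectors.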
For any matrix $A\in \mathbb R^{m \times n}$ and any vector $b\in \mathbb R^n$, either
\begin{equation} Ax \leq 0, \quad b^Tx >0 \quad \text{has a solution} \label{1} \end{equation}
or
\begin{equation} A^T\alpha=b, \quad \alpha \geq 0 \quad \text{has a solution} \label{2} \end{equation}
From the algebraic view, assume both $\eqref{1}$ and $\eqref{2}$ hold. Then
\(0 < b^Tx = (A^T\alpha)^Tx = \alpha^T(Ax) \leq 0,\)
since $\alpha \geq 0$ and $Ax \leq 0$. This is a contradiction, so at most one of the two systems can have a solution.
From the geometric view, $\eqref{2}$ says that
\[\begin{aligned} b &\in \text{row space of }A \quad \text{and}\quad \alpha \geq 0 \\ \implies b &\in \text{the cone generated by the rows of }A . \end{aligned}\]On the other hand, $\eqref{1}$ says that
\[\begin{aligned} b^Tx>0 &\implies b \text{ lies strictly on the positive side of the hyperplane } H_x \text{ with normal vector } x \\ 0 \geq Ax=\left[ \begin{matrix} A_1^Tx \\ \vdots \\ A_m^Tx \end{matrix} \right] &\implies \text{the angle between } x \text{ and each } A_i \text{ is} \geq \frac{\pi}{2} \implies \text{every } A_i \text{ lies on the non-positive side of } H_x \end{aligned}\]In other words, $\eqref{1}$ says that the hyperplane $H_x$ separates $b$ from the rows $A_i$, i.e. \(b \notin \text{the cone generated by the rows } A_i.\)
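The alternative can also be seen numerically: projecting $b$ onto the cone generated by the rows of $A$ is a non-negative least-squares problem, and the projection either has zero residual (system $\eqref{2}$) or its residual direction is a certificate $x$ for system $\eqref{1}$. A small sketch, assuming `scipy` is available; the matrix and vectors are illustrative only.

```python
# Numeric illustration of Farkas' lemma: project b onto the cone
# generated by the rows of A via non-negative least squares (NNLS).
# Zero residual -> system (2) holds; otherwise the projection residual
# x = b - A^T alpha certifies system (1).  Example data is illustrative.
import numpy as np
from scipy.optimize import nnls

A = np.array([[1.0, 0.0], [0.0, 1.0]])   # rows generate the first-quadrant cone

# Case 1: b inside the cone -> system (2): A^T alpha = b with alpha >= 0.
b_in = np.array([1.0, 2.0])
alpha1, resid1 = nnls(A.T, b_in)         # projection of b_in onto the cone
print(alpha1, resid1)                    # alpha = [1, 2], residual ~ 0

# Case 2: b outside the cone -> system (1); the projection residual
# x = b - A^T alpha satisfies Ax <= 0 and b^T x > 0 (Moreau decomposition).
b_out = np.array([-1.0, 1.0])
alpha2, _ = nnls(A.T, b_out)
x = b_out - A.T @ alpha2
print(A @ x, b_out @ x)                  # Ax = [-1, 0] <= 0,  b^T x = 1 > 0
```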
Minimization Problem vs. Kuhn-Tucker Stationary-point Problem
MP:
\[\min\limits_{x\in \Omega}f(x) \quad \text{s.t.} \quad g(x) \leq 0 \quad \text{and} \quad h(x)=0\]KTSP: Find
\[\bar x \in \Omega , \bar \alpha \in \mathbb R^m, \bar \beta_{+} \in \mathbb R^r, \bar \beta_{-} \in \mathbb R^r \quad \text{s.t.}\] \[\begin{aligned} \nabla f(\bar x)+\bar \alpha^T \nabla g(\bar x)+(\bar \beta_{+}-\bar \beta_{-})^T \nabla h(\bar x) &=0\\ \bar \alpha^T g(\bar x) =0 ,\quad\bar \beta_{+}^T h(\bar x) =0,\quad\bar \beta_{-}^T h(\bar x) &=0\\ g(\bar x) \leq 0,\quad h(\bar x)&=0\\ \bar \alpha ,\bar \beta_{+} ,\bar \beta_{-} &\geq 0 \end{aligned}\]Generalized Lagrangian Function
\[\mathcal{L}(x,\alpha,\beta)=f(x)+\alpha^Tg(x)+\beta^Th(x)\]where $\alpha \geq 0$ and $\beta$ is free ($\beta$ multiplies the equality constraints $h(x)=0$)
- If $f(x),g(x)$ are convex and $h(x)$ is linear, then $\mathcal{L}$ is convex in $x$ for every fixed $\alpha \geq 0$ and $\beta$.
- For fixed $\alpha \geq 0$ and $\beta$, if $\bar x \in \text{argmin}\{\mathcal{L}(x,\alpha,\beta) \mid x \in \mathbb R^n\}$, then $\nabla_x\mathcal{L}(\bar x,\alpha,\beta)=0$.
- The converse also holds, i.e. $\nabla_x\mathcal{L}(\bar x,\alpha,\beta)=0$ is sufficient for $\bar x$ to be a minimizer, when $\mathcal{L}(x,\alpha,\beta)$ is convex in $x$.
Lagrangian Dual Problem
\[\max\limits_{\alpha, \beta}\min\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta) \quad \text{s.t.} \quad \alpha \geq 0 \iff \max\limits_{\alpha, \beta} \theta(\alpha,\beta)\quad \text{s.t.} \quad \alpha \geq 0\]where
\[\theta(\alpha,\beta)=\inf\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta)\]Weak Duality Theorem
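For a concrete feel of $\theta$, take the toy problem $\min x^2$ subject to $x \geq 1$, so $g(x)=1-x \leq 0$ and there are no equality constraints. Then $\mathcal{L}(x,\alpha)=x^2+\alpha(1-x)$, the inner minimum is at $x=\alpha/2$, and $\theta(\alpha)=\alpha-\alpha^2/4$, which is maximized at $\alpha=2$ with $\theta(2)=1=f(1)$. A quick numeric check (the problem is made up for illustration):

```python
# Toy check of the Lagrangian dual: min x^2 s.t. x >= 1.
# g(x) = 1 - x, L(x, a) = x^2 + a*(1 - x); for fixed a >= 0 the inner
# minimum is at x = a/2, giving theta(a) = a - a^2/4.
import numpy as np

def theta(a):
    x_bar = a / 2.0                 # argmin_x L(x, a), from dL/dx = 2x - a = 0
    return x_bar**2 + a * (1.0 - x_bar)

alphas = np.linspace(0.0, 4.0, 4001)
vals = theta(alphas)
a_star = alphas[np.argmax(vals)]
print(a_star, vals.max())           # ~2.0 and ~1.0: max theta = min f = f(1)
```

Here the dual optimum equals the primal optimum, so the weak-duality inequality below holds with equality for this problem.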
Let $\bar x \in \Omega$ be a feasible solution of the primal problem and $(\alpha,\beta)$ a feasible solution of the dual problem (so $\alpha \geq 0$). Then
\[\begin{aligned} f(\bar x) &\geq \theta(\alpha, \beta), \\ \text{since} \quad \theta(\alpha, \beta)=\inf\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta) \leq \mathcal{L}(\bar x,\alpha,\beta) &= f(\bar x)+\alpha^Tg(\bar x)+\beta^Th(\bar x) \leq f(\bar x). \end{aligned}\]Corollary 1:
\[\sup\{\theta(\alpha, \beta) |\alpha \geq 0\}\leq \inf\{f(x)|g(x) \leq 0, h(x)=0 \}\]Corollary 2:
If $f(x^*)=\theta(\alpha^*, \beta^*)$ where $\alpha^* \geq 0$ and $g(x^*) \leq 0$, $h(x^*)=0$, then $x^*$ and $(\alpha^*, \beta^*)$ solve the primal and dual problem respectively. In this case,
\[\alpha^* \bot g(x^*) \iff \alpha_i^* g_i(x^*)=0 \quad \forall i\]proof:
\[\mathcal{L}(x,\alpha,\beta)=f(x)+\alpha^Tg(x)+\beta^Th(x)\]If $\hat x$ is primal feasible, then $g(\hat x) \leq 0$ and $\alpha\geq 0$ imply $\alpha^Tg(\hat x) \leq 0$, hence
\begin{equation} \theta(\alpha, \beta)=\inf\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta) \leq \mathcal{L}(\hat x,\alpha,\beta) \leq f(\hat x). \label{eq:theta} \end{equation}
The last inequality holds with equality if and only if
\begin{equation} \alpha^Tg(\hat x) = 0 . \label{eq:hold_eq} \end{equation}
Hence, taking the supremum over $\alpha \geq 0$ on the left-hand side and the infimum over feasible $x$ on the right-hand side gives
\[\sup\{\theta(\alpha, \beta) |\alpha \geq 0\}\leq \inf\{f(x)|g(x) \leq 0, h(x)=0 \}\]which proves Corollary 1. Now suppose $x^*,\alpha^*,\beta^*$ are feasible and satisfy $f(x^*)=\theta(\alpha^*,\beta^*)$. By Corollary 1,
\[f(x^*)=\theta(\alpha^*,\beta^*)=\max\limits_{\alpha,\beta}\min\limits_{x \in \Omega} \mathcal{L}(x,\alpha,\beta),\]so $x^*$ solves the primal problem and $(\alpha^*,\beta^*)$ solves the dual problem. Moreover, equality must hold throughout $\eqref{eq:theta}$, so $\eqref{eq:hold_eq}$ gives
\[\alpha^* \bot g(x^*) \iff \alpha^*_i g_i(x^*)=0 \quad \forall i.\]This proves Corollary 2.
Dual Problem of Linear Program
Primal LP:
\[\begin{aligned} &\min\limits_{x \in \mathbb R^n} p^Tx \\ \text{subject to } \quad &Ax \geq b, x\geq 0 \end{aligned}\]Dual LP:
\[\begin{aligned} &\max\limits_{\alpha \in \mathbb R^m} b^T\alpha \\ \text{subject to } \quad &A^T\alpha \leq p, \alpha \geq 0 \end{aligned}\]To derive it, write the Lagrangian $\mathcal L (x,\alpha_1,\alpha_2)$ with multipliers $\alpha_1 \geq 0$ for $Ax \geq b$ and $\alpha_2 \geq 0$ for $x \geq 0$:
\[\begin{aligned} \mathcal L (x,\alpha_1,\alpha_2) &= p^Tx+\alpha_1^T(b-Ax)+\alpha_2^T(-x)\\ &= (p-A^T\alpha_1-\alpha_2)^Tx+\alpha_1^Tb \\ \frac{\partial \mathcal L (x,\alpha_1,\alpha_2)}{\partial x} \bigg\vert_{x=\bar x} &=p-A^T\alpha_1-\alpha_2 =0 \\ \end{aligned}\]Hence,
\[\begin{aligned} \theta(\alpha_1,\alpha_2)&=\min\limits_{x \in \mathbb R^n}\mathcal L (x,\alpha_1,\alpha_2)=\alpha_1^Tb \\ \exists\, \alpha_2 \geq 0:\; p-A^T\alpha_1-\alpha_2 =0 &\iff A^T\alpha_1 \leq p, \end{aligned}\]which (together with $\alpha_1 \geq 0$) is exactly the dual form above.
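The equality of primal and dual optimal values can be checked numerically. A sketch assuming `scipy` is available; the data $A$, $b$, $p$ is made up for illustration.

```python
# Numeric check of LP duality with an off-the-shelf solver.
# Primal: min p^T x  s.t. Ax >= b, x >= 0  (linprog uses <=, so negate).
# Dual:   max b^T a  s.t. A^T a <= p, a >= 0 (maximize by minimizing -b^T a).
# Data is illustrative only.
import numpy as np
from scipy.optimize import linprog

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([8.0, 9.0])
p = np.array([3.0, 2.0])

primal = linprog(c=p, A_ub=-A, b_ub=-b)   # bounds default to x >= 0
dual = linprog(c=-b, A_ub=A.T, b_ub=p)    # alpha >= 0 by default

print(primal.fun, -dual.fun)              # equal at the optimum (strong LP duality)
```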
Summary of LP Duality
| Primal | Dual |
|---|---|
| $\min\; p^Tx$ | $\max\; b^T\alpha$ |

| Primal Constraint | Dual Variable |
|---|---|
| $Ax\geq b$ | $\alpha\geq 0$ |
| $Ax= b$ | $\alpha$ free |
| $Ax\leq b$ | $\alpha\leq 0$ |

| Primal Variable | Dual Constraint |
|---|---|
| $x\geq 0$ | $A^T \alpha \leq p$ |
| $x$ free | $A^T \alpha= p$ |
| $x\leq 0$ | $A^T \alpha \geq p$ |
Application of LP Duality
LSQ: For any matrix $A \in \mathbb R^{m\times n}$ and any vector $b\in \mathbb R^{m},$ consider
\[\min\limits_{x \in \mathbb R^n} \Vert Ax-b\Vert_2^2\] \[x^* \in \arg\min\{\Vert Ax-b\Vert_2^2\} \iff A^TAx^*=A^Tb\]Claim:
\[A^TAx=A^Tb\]always has a solution.
Using LP duality, encode the feasibility of the normal equations as a linear program.
Primal:
\[\min\limits_{x \in \mathbb R^n} 0^Tx \quad \text{s.t.} \quad (A^TA)x=A^Tb,\quad x \text{ free}\]Dual:
\[\max\limits_{\alpha \in \mathbb R^n} b^TA\alpha\quad \text{s.t.} \quad (A^TA)^T\alpha=0\]Any dual-feasible $\alpha$ satisfies $A^TA\alpha=0 \implies \alpha^TA^TA\alpha=\Vert A\alpha\Vert_2^2=0 \implies A\alpha=0$, so the dual objective is $b^TA\alpha=0$ at every feasible point; in particular, $\alpha=0$ is optimal. The dual is therefore feasible and bounded, so by LP duality the primal is also feasible (with the same optimal value $0$). Primal feasibility is exactly the claim that $A^TAx=A^Tb$ always has a solution.
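The claim holds even when $A$ is rank-deficient: any least-squares minimizer satisfies the normal equations. A quick `numpy` check with illustrative data:

```python
# The normal equations A^T A x = A^T b are consistent even when A is
# rank-deficient: any least-squares minimizer satisfies them.
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])         # rank 1: second column = 2 * first
b = np.array([1.0, 0.0, 1.0])

x, *_ = np.linalg.lstsq(A, b, rcond=None)   # a least-squares solution
print(np.allclose(A.T @ A @ x, A.T @ b))    # normal equations hold
```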
Dual Problem of strictly Convex Quadratic Program
Primal QP
\[\min\limits_{x\in \mathbb R^n} \frac{1}{2}x^TQx+p^Tx \quad \text{s.t.} \quad Ax \leq b\]With the strict convexity assumption ($Q$ symmetric positive definite), the Lagrangian $\mathcal L(x,\alpha)=\frac{1}{2}x^TQx+p^Tx+\alpha^T(Ax-b)$ is minimized in $x$ where
\[\nabla_x \mathcal L(x,\alpha)=Qx+p+A^T\alpha=0 \iff x=-Q^{-1}(p+A^T\alpha).\]Substituting this $x$ back into $\mathcal L$, we have the
Dual QP
\[\max\limits_{\alpha \geq 0} \;\theta(\alpha)=-\frac{1}{2}(p+A^T\alpha)^TQ^{-1}(p+A^T\alpha)-b^T\alpha\]
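For strictly convex $Q$, the inner minimizer is $x=-Q^{-1}(p+A^T\alpha)$ and $\theta(\alpha)=-\frac{1}{2}(p+A^T\alpha)^TQ^{-1}(p+A^T\alpha)-b^T\alpha$; this algebra can be sanity-checked numerically. A `numpy` sketch with made-up $Q$, $p$, $A$, $b$:

```python
# Check the dual-QP objective: for fixed alpha >= 0, min_x L(x, alpha)
# with L = 0.5 x^T Q x + p^T x + alpha^T (A x - b) is attained at
# x = -Q^{-1}(p + A^T alpha) and equals
# theta(alpha) = -0.5 (p + A^T alpha)^T Q^{-1} (p + A^T alpha) - b^T alpha.
# Data is illustrative only.
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric positive definite
p = np.array([1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
alpha = np.array([0.7])                  # any fixed alpha >= 0

u = p + A.T @ alpha
x_star = -np.linalg.solve(Q, u)          # stationary point of L in x
L = 0.5 * x_star @ Q @ x_star + p @ x_star + alpha @ (A @ x_star - b)
theta = -0.5 * u @ np.linalg.solve(Q, u) - b @ alpha
print(L, theta)                          # identical up to rounding
```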
