# # On the importance of Dualization

#md # [](@__BINDER_ROOT_URL__/generated/Getting started/dualization.ipynb)
#md # [](@__NBVIEWER_ROOT_URL__/generated/Getting started/dualization.ipynb)

using Test #src
using DynamicPolynomials
using SumOfSquares

# Sum-of-Squares programs are usually solved by semidefinite programming (SDP) solvers.
# These programs can be represented in two different formats:
# either the *standard conic form*, also known as *kernel form*:
# ```math
# \begin{aligned}
# \min\limits_{Q \in \mathbb{S}_n} & \langle C, Q \rangle\\
# \text{subject to:} & \langle A_i, Q \rangle = b_i, \quad i=1,2,\ldots,m\\
# & Q \succeq 0,
# \end{aligned}
# ```
# or the *geometric conic form*, also known as *image form*:
# ```math
# \begin{aligned}
# \max\limits_{y \in \mathbb{R}^m} & \langle b, y \rangle\\
# \text{subject to:} & C \succeq \sum_{i=1}^m A_i y_i\\
# & y\ \mathsf{free},
# \end{aligned}
# ```

# In this tutorial, we investigate in which of these two forms a Sum-of-Squares
# constraint should be written.
# Consider the simple example of trying to determine whether the following univariate
# polynomial is a sum of squares:

import SCS
@polyvar x
p = (x + 1)^2 * (x + 2)^2
model_scs = Model(SCS.Optimizer)
con_ref = @constraint(model_scs, p in SOSCone())
optimize!(model_scs)

# As we can see in the log, SCS reports `6` variables and `11` constraints.
# We can also choose to dualize the problem before it is
# passed to SCS as follows:

using Dualization
model_dual_scs = Model(dual_optimizer(SCS.Optimizer))
@objective(model_dual_scs, Max, 0.0)
con_ref = @constraint(model_dual_scs, p in SOSCone())
optimize!(model_dual_scs)

# This time, SCS reports `5` variables and `6` constraints.

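# Either way, once the solve is done we can inspect the certificate found by the
# solver. The query below is just an illustrative check (assuming SCS terminated
# with an optimal solution): `gram_matrix` returns the positive semidefinite
# matrix found by the solver, together with the vector of monomials it is indexed
# by, i.e., the matrix `Q` appearing in the kernel form above.

gram_matrix(con_ref)
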
# ## Bridges operating behind the scenes
#
# The difference comes from the fact that, when designing the JuMP interface of
# SCS, it was decided that the model would be read in the image form.
# SCS therefore declares that it only supports free variables, represented in
# JuMP as variables in `MOI.Reals`, and affine semidefinite constraints,
# represented in JuMP as
# `MOI.VectorAffineFunction`-in-`MOI.PositiveSemidefiniteConeTriangle`
# constraints.
# On the other hand, SumOfSquares gave the model in kernel form, so the
# positive semidefinite (PSD) variables were reformulated as free variables
# constrained to be PSD by an affine PSD constraint.
#
# This transformation is done transparently, without warning, but it can be
# inspected using `print_active_bridges`.
# As shown below, we can see
# `Unsupported variable: MOI.PositiveSemidefiniteConeTriangle` and
# `adding as constraint`,
# indicating that PSD variables are not supported and that they are added as free
# variables.
# Then we have `Unsupported constraint: MOI.VectorOfVariables-in-MOI.PositiveSemidefiniteConeTriangle`,
# indicating that SCS does not support constraining variables in the PSD cone,
# so the constraint is converted into an affine expression in the PSD cone.
# Of course, this is equivalent, but it means that SCS cannot exploit this
# particular structure of the problem, so solving might be less efficient.

print_active_bridges(model_scs)

# With the dual version, we can see that variables in the PSD cone are supported
# directly, so we don't need that extra conversion.

print_active_bridges(model_dual_scs)

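# Whichever form is passed to SCS, both models solve equivalent problems and
# should therefore reach the same conclusion. As a quick sanity check (assuming
# both solves converged), we can compare their termination statuses:

termination_status(model_scs), termination_status(model_dual_scs)
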
# ## In more detail
#
# Consider a polynomial
# ```math
# p(x) = \sum_{\alpha} p_\alpha x^\alpha,
# ```
# a vector of monomials `b(x)` and the set
# ```math
# \mathcal{A}_\alpha = \{\,(\beta, \gamma) \in b(x)^2 \mid x^\beta x^\gamma = x^\alpha\,\}.
# ```
# The constraint encoding the existence of a PSD matrix `Q` such that `p(x) = b(x)' * Q * b(x)`
# can be written in standard conic form as follows:
# ```math
# \begin{aligned}
# \langle \sum_{(\beta, \gamma) \in \mathcal{A}_\alpha} e_\beta e_\gamma^\top, Q \rangle & = p_\alpha, \quad \forall \alpha\\
# Q & \succeq 0
# \end{aligned}
# ```
# Given an arbitrary choice of element ``(\beta_\alpha, \gamma_\alpha) \in \mathcal{A}_\alpha``
# in each set ``\mathcal{A}_\alpha``, the constraint can equivalently be written in
# the geometric conic form as follows:
# ```math
# \begin{aligned}
# \sum_{\alpha} p_\alpha e_{\beta_\alpha} e_{\gamma_\alpha}^\top +
# \sum_{\alpha} \sum_{(\beta, \gamma) \in \mathcal{A}_\alpha \setminus \{(\beta_\alpha, \gamma_\alpha)\}}
# y_{\beta,\gamma} (e_\beta e_\gamma^\top - e_{\beta_\alpha} e_{\gamma_\alpha}^\top)
# & \succeq 0\\
# y_{\beta,\gamma} & \text{ free}
# \end{aligned}
# ```
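#
# To make the sets ``\mathcal{A}_\alpha`` more concrete, here is a small
# illustrative sketch (the names `b` and `A` below are not part of any package)
# that enumerates, for the monomial vector of degree at most `2` used for our
# example `p`, the index pairs of `b` whose product equals each monomial ``x^\alpha``:

b = monomials([x], 0:2)  # the three monomials of degree at most 2
A = Dict()               # A[x^α] collects the index pairs (i, j) such that b[i] * b[j] == x^α
for i in eachindex(b), j in eachindex(b)
    push!(get!(A, b[i] * b[j], Tuple{Int,Int}[]), (i, j))
end
A
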
#
# ## Should I dualize or not?
#
# Let's study the evolution of the dimensions `m` and `n` of the semidefinite
# program in two extreme examples and then try to extrapolate from these.
#
# ### Univariate case
#
# Suppose `p` is a univariate polynomial of degree ``2d``.
# Then `n` will be equal to `d + 1` for both the standard and geometric conic forms.
# On the other hand, `m` will be equal to `2d + 1` for the standard conic form and
# `(d + 1)(d + 2)/2 - (2d + 1)`, that is `d(d - 1)/2`, for the geometric form case.
# So `m` grows **linearly** for the kernel form but **quadratically** for the image form!
#
# ### Quadratic case
#
# Suppose `p` is a quadratic form in `d` variables.
# Then `n` will be equal to `d` for both the standard and geometric conic forms.
# On the other hand, `m` will be equal to `d(d + 1)/2` for the standard conic form and
# `0` for the geometric form case.
# So `m` grows **quadratically** for the kernel form but is zero for the image form!
#
# ### In general
#
# In general, if ``s_d`` is the dimension of the space of polynomials of degree at most `d`, then
# ``m = s_{2d}`` for the kernel form and ``m = s_{d}(s_{d} + 1)/2 - s_{2d}`` for the image form.
# As a rule of thumb, the kernel form tends to have a smaller `m` when `p` has few variables
# and a high degree (as in the univariate case), while the image form tends to have a smaller `m`
# when `p` has many variables and a low degree (as in the quadratic case).
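#
# As a quick sanity check of these formulas, here is a small illustrative sketch
# (the helpers `s`, `m_kernel` and `m_image` below are not part of any package)
# that evaluates both values of `m` for a few numbers of variables `n` and half-degrees `d`:

s(n, d) = binomial(n + d, d)  # dimension of the space of polynomials of degree at most d in n variables
m_kernel(n, d) = s(n, 2d)
m_image(n, d) = div(s(n, d) * (s(n, d) + 1), 2) - s(n, 2d)
[(n, d, m_kernel(n, d), m_image(n, d)) for (n, d) in [(1, 2), (1, 10), (10, 1), (20, 1)]]
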
# Of course, you can always try with and without Dualization and see which one works best.