# L2ODLL.jl

!!! warning
    This documentation is a work in progress.
    Please open an issue if content is missing or erroneous.

L2ODLL.jl implements the Dual Lagrangian Learning (DLL) method of [Tanneau and Van Hentenryck (2024)](https://arxiv.org/pdf/2402.03086) using JuMP.

## Installation

```julia
import Pkg
Pkg.add(; url="https://github.com/LearningToOptimize/L2ODLL.jl")
```

## Usage

This package simplifies the implementation of DLL: it takes a primal JuMP model as input and automatically generates the dual projection and completion functions used to train and evaluate DLL models. Basic usage is as follows.

#### Define your (primal) model using JuMP

For the purposes of this example, we'll use a portfolio optimization problem.
```julia
using JuMP, LinearAlgebra

model = Model()

# define constant problem data
Σ = [166 34 58; 34 64 4; 58 4 100] / 100^2
N = size(Σ, 1)

# define variables
@variable(model, x[1:N])
set_lower_bound.(x, 0) # we explicitly set upper and lower bounds
set_upper_bound.(x, 1) # in order to use the BoundDecomposition

# define parametric problem data
μ0 = randn(N)
γ0 = rand()
@variable(model, μ[1:N] in MOI.Parameter.(μ0))
@variable(model, γ in MOI.Parameter(γ0))

# define constraints
@constraint(model, simplex, sum(x) == 1)
@constraint(model, risk, [γ; cholesky(Σ).L * x] in SecondOrderCone())

# define objective
@objective(model, Max, dot(μ, x))
```
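
Optionally, you can sanity-check the primal model by solving it directly. This assumes you have a conic solver installed; Clarabel is used here as an arbitrary choice:
```julia
using Clarabel

set_optimizer(model, Clarabel.Optimizer)
optimize!(model)
value.(x)
```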

#### Decompose and build the functions

Since all the variables have finite bounds, L2ODLL will automatically pick the `BoundDecomposition`.
```julia
using L2ODLL

L2ODLL.decompose!(model)
```

Now, L2ODLL has automatically generated the dual projection and completion layer. To compute the dual objective value and its gradient with respect to the prediction, use:
```julia
param_value = ... # some values for μ and γ
y_predicted = nn(param_value) # e.g. neural network inference

dobj = L2ODLL.dual_objective(model, y_predicted, param_value)
dobj_wrt_y = L2ODLL.dual_objective_gradient(model, y_predicted, param_value)
```

This also works with batches, using broadcasting:
```julia
dobj = L2ODLL.dual_objective.(model, y_predicted_batch, param_value_batch)
dobj_wrt_y = L2ODLL.dual_objective_gradient.(model, y_predicted_batch, param_value_batch)
```

!!! warning
    These functions currently run on the CPU. A batched GPU-friendly version is coming soon.

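For training, one typically takes gradient steps that improve the dual bound. Below is a minimal self-supervised sketch with a toy linear model, assuming the max-dual convention of the Math Background section; the flat parameter layout `θ = [μ; γ]`, the sampling scheme, and the step size are illustrative assumptions, not part of the L2ODLL API:
```julia
n_θ = N + 1                        # assumed flat layout θ = [μ; γ]
n_y = length(y_predicted)          # prediction size fixed by the decomposition
W = 0.01 * randn(n_y, n_θ)         # toy linear "network" y = W * θ
η = 1e-2                           # step size

for iter in 1:1000
    θ = vcat(randn(N), rand())     # sample a parameter instance
    y = W * θ                      # predict the duals
    g = L2ODLL.dual_objective_gradient(model, y, θ)  # ∂(dual objective)/∂y
    global W += η * g * θ'         # chain rule through y = W * θ; ascend the dual bound
end
```
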
## Math Background

### Decomposition

In DLL, the primal constraints (i.e., their associated dual variables) are decomposed into a predicted set and a completed set.
Consider the primal-dual pair:
```math
\begin{equation}
\begin{aligned}
& \min\nolimits_{x} & c^\top x
\\
& \;\;\text{s.t.} & Ax + b \in \mathcal{C}
\\
& & x \in \mathbb{R}^n
\end{aligned}
\quad\quad\quad\quad
\begin{aligned}
& \max\nolimits_{y} & - b^\top y
\\
& \;\;\text{s.t.} & A^\top y = c
\\
& & y \in \mathcal{C}^*
\end{aligned}
\end{equation}
```
After the decomposition, we have:
```math
\begin{equation}
\begin{aligned}
& \min\nolimits_{x} & c^\top x
\\
& \;\;\text{s.t.} & Ax + b \in \mathcal{C}
\\
& \;\;\phantom{\text{s.t.}} & Hx + h \in \mathcal{K}
\\
& & x \in \mathbb{R}^n
\end{aligned}
\quad\quad\quad\quad
\begin{aligned}
& \max\nolimits_{y,z} & - b^\top y - h^\top z
\\
& \;\;\text{s.t.} & A^\top y + H^\top z = c
\\
& & y \in \mathcal{C}^*,\; z \in \mathcal{K}^*
\end{aligned}
\end{equation}
```

Then, the completion model is:

```math
\begin{equation}
\begin{aligned}
& \max\nolimits_{z} & - h^\top z - b^\top y
\\
& \;\;\text{s.t.} & H^\top z = c - A^\top y
\\
& & z \in \mathcal{K}^*
\end{aligned}
\end{equation}
```

To train the neural network, we need the gradient of the optimal value with respect to the predicted $y$. This is $\nabla_y = -b - Ax$, where $x$ is the optimal dual solution associated with the affine constraints in the completion model. In the special cases below, we give only the expression for $x$ in this formula.

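The formula follows from the Lagrangian of the completion model: attaching a multiplier $x$ to its equality constraint gives

```math
\mathcal{L}(z, x) = - h^\top z - b^\top y + x^\top \left( c - A^\top y - H^\top z \right),
\qquad
\nabla_y \mathcal{L} = - b - Ax,
```

and, by the envelope theorem, evaluating at the optimal $(z, x)$ yields the gradient of the optimal value.
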
#### Bounded Decomposition

When all primal variables have finite upper and lower bounds, a natural way to decompose the constraints is to have $z$ correspond to the bound constraints, and $y$ correspond to the main constraints, i.e.

```math
\begin{equation}
\begin{aligned}
& \min\nolimits_{x} & c^\top x
\\
& \;\;\text{s.t.} & Ax + b \in \mathcal{C}
\\
& & l \leq x \leq u
\end{aligned}
\quad\quad\quad\quad
\begin{aligned}
& \max\nolimits_{y,z_l,z_u} & - b^\top y + l^\top z_l + u^\top z_u
\\
& \;\;\text{s.t.} & A^\top y + I z_l + I z_u = c
\\
& & y \in \mathcal{C}^*,\; z_l \in \mathbb{R}_+^n,\; z_u \in \mathbb{R}_-^n
\end{aligned}
\end{equation}
```

Then, the completion model is:

```math
\begin{equation}
\begin{aligned}
& \max\nolimits_{z_l,z_u} & l^\top z_l + u^\top z_u - b^\top y
\\
& \;\;\text{s.t.} & I z_l + I z_u = c - A^\top y
\\
& & z_l \in \mathbb{R}_+^n,\; z_u \in \mathbb{R}_-^n
\end{aligned}
\end{equation}
```

This model admits a closed-form solution: elementwise, $z_l = \max(c - A^\top y, 0)$ and $z_u = \min(c - A^\top y, 0)$. Furthermore, the $x$ that defines the (sub-)gradient is given elementwise by $l$ where $z_l > 0$, by $u$ where $z_u < 0$, and by any $x \in [l, u]$ otherwise.

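As a standalone illustration of this closed form (generic data, not the L2ODLL API):
```julia
using LinearAlgebra

# generic problem data for illustration only
m, n = 2, 3
A = randn(m, n); b = randn(m); c = randn(n)
l = zeros(n); u = ones(n)
y = randn(m)                           # a predicted dual for the main constraints

r = c - A' * y                         # residual of the dual equality constraint
z_l = max.(r, 0)                       # multipliers of x ≥ l (nonnegative)
z_u = min.(r, 0)                       # multipliers of x ≤ u (nonpositive)
dobj = -b' * y + l' * z_l + u' * z_u   # completed dual objective
x = @. ifelse(z_l > 0, l, ifelse(z_u < 0, u, (l + u) / 2))  # any x ∈ [l, u] where r = 0
grad_y = -b - A * x                    # (sub)gradient of the dual value w.r.t. y
```
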
#### (Strictly) Convex QP

In the convex QP case, the primal has a strictly convex quadratic objective function, i.e. $Q \succ 0$. In that case, it is natural to use the main constraints as the predicted set and to complete the quadratic slack dual variables.

```math
\begin{equation}
\begin{aligned}
& \min\nolimits_{x} & \tfrac{1}{2} x^\top Q x + c^\top x
\\
& \;\;\text{s.t.} & Ax + b \in \mathcal{C}
\\
& & x \in \mathbb{R}^n
\end{aligned}
\quad\quad\quad\quad
\begin{aligned}
& \max\nolimits_{y,z} & - b^\top y - \tfrac{1}{2} z^\top Q z
\\
& \;\;\text{s.t.} & A^\top y + Qz = c
\\
& & y \in \mathcal{C}^*,\; z \in \mathbb{R}^n
\end{aligned}
\end{equation}
```

Then, the completion model is:

```math
\begin{equation}
\begin{aligned}
& \max\nolimits_{z} & - \tfrac{1}{2} z^\top Q z - b^\top y
\\
& \;\;\text{s.t.} & Q z = c - A^\top y
\\
& & z \in \mathbb{R}^n
\end{aligned}
\end{equation}
```

This model admits a closed-form solution, $z = Q^{-1}(c - A^\top y)$. Furthermore, the $x$ that defines the gradient is $x = -z = Q^{-1}(A^\top y - c)$, by stationarity of the primal Lagrangian.
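
Again as a standalone illustration (generic data, not the L2ODLL API):
```julia
using LinearAlgebra

# generic strictly convex QP data for illustration only
m, n = 2, 3
B = randn(n, n); Q = B' * B + I        # Q ≻ 0
A = randn(m, n); b = randn(m); c = randn(n)
y = randn(m)                           # a predicted dual for the main constraints

z = Q \ (c - A' * y)                   # closed-form completion
dobj = -b' * y - 0.5 * dot(z, Q * z)   # completed dual objective
x = -z                                 # multiplier that defines the gradient
grad_y = -b - A * x                    # gradient of the dual value w.r.t. y
```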