|
| 1 | +# [Feasibility and optimality](@id kkt) |
| 2 | + |
| 3 | +Mathematically, continuous optimization problems have exact feasibility |
| 4 | +and optimality conditions. However, since solvers cannot always |
| 5 | +satisfy these conditions exactly when using floating-point arithmetic, |
| 6 | +they do so to within tolerances. As explored below, some solvers aim |
| 7 | +to satisfy those tolerances absolutely, and others aim to satisfy |
| 8 | +tolerances relative to problem data. When tolerances are satisfied |
| 9 | +relatively, they are generally not satisfied absolutely. The use of |
| 10 | +tolerances relative to problem data is not consistent across solvers, |
| 11 | +and can give a misleading claim of optimality. To achieve consistency, |
| 12 | +HiGHS reassesses the optimal solution claimed by such a solver in a |
| 13 | +reasonable and uniform manner. |
| 14 | + |
| 15 | +### Feasibility and optimality conditions |
| 16 | + |
| 17 | +To discuss tolerances and their use in different solvers, consider the |
| 18 | +standard form linear programming (LP) problem with ``n`` variables and |
| 19 | +``m`` equations (``n\ge m``). |
| 20 | + |
| 21 | +```math |
| 22 | +\begin{aligned} |
| 23 | +\textrm{minimize} \quad & c^T\! x \\ |
| 24 | +\textrm{subject to} \quad & Ax = b \\ |
| 25 | + & x \ge 0, |
| 26 | +\end{aligned} |
| 27 | +``` |
| 28 | + |
| 29 | +The feasibility and optimality conditions (KKT conditions) are that, at |
| 30 | +a point ``x``, there exist (row) dual values ``y`` and reduced costs |
| 31 | +(column dual values) ``s`` such that |
| 32 | + |
| 33 | +```math |
| 34 | +\begin{aligned} |
| 35 | +Ax=b&\qquad\textrm{Primal~equations}\\ |
| 36 | +A^Ty+s=c&\qquad\textrm{Dual~equations}\\ |
| 37 | +x\ge0&\qquad\textrm{Primal~feasibility}\\ |
| 38 | +s\ge0&\qquad\textrm{Dual~feasibility}\\ |
| 39 | +c^Tx-b^Ty=0&\qquad\textrm{Optimality} |
| 40 | +\end{aligned} |
| 41 | +``` |
| 42 | + |
| 43 | +The optimality condition is equivalent to the complementarity |
| 44 | +condition that ``x^Ts=0``. Since any LP problem can be transformed |
| 45 | +into standard form, the following discussion loses no generality. This |
| 46 | +discussion also largely applies to quadratic programming (QP) |
| 47 | +problems, with the differences explored below. |
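
As a minimal sketch of the conditions above, the following pure-Python snippet evaluates each KKT quantity for a tiny standard-form LP. The problem data and the helper names (`dot`, `mat_vec`, `mat_t_vec`) are illustrative assumptions and are not part of HiGHS.

```python
# Tiny standard-form LP (illustrative data, not taken from HiGHS):
#   minimize x1 + 2*x2  subject to  x1 + x2 = 1,  x >= 0
A = [[1.0, 1.0]]  # m = 1 equation, n = 2 variables
b = [1.0]
c = [1.0, 2.0]

# Candidate optimal point with its row duals and reduced costs
x = [1.0, 0.0]
y = [1.0]
s = [0.0, 1.0]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(M, v):
    """Compute M v for a dense row-wise matrix M."""
    return [dot(row, v) for row in M]


def mat_t_vec(M, v):
    """Compute M^T v without forming the transpose."""
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]


# Primal equations: Ax - b
primal_residual = [ri - bi for ri, bi in zip(mat_vec(A, x), b)]
# Dual equations: c - A^T y - s
dual_residual = [cj - aj - sj for cj, aj, sj in zip(c, mat_t_vec(A, y), s)]
# Optimality: c^T x - b^T y
objective_gap = dot(c, x) - dot(b, y)
# Complementarity: x^T s
complementarity = dot(x, s)

print(primal_residual, dual_residual, objective_gap, complementarity)
```

For this data every residual, the objective gap and ``x^Ts`` are exactly zero, and both ``x`` and ``s`` are non-negative, so the point satisfies all five conditions.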
| 48 | + |
| 49 | +### The HiGHS feasibility and optimality tolerances |
| 50 | + |
| 51 | +HiGHS has separate tolerances for the following, listed with convenient mathematical notation: |
| 52 | + |
| 53 | +- [Primal feasibility](@ref option-primal-feasibility-tolerance) (``\epsilon_P``) |
| 54 | +- [Dual feasibility](@ref option-dual-feasibility-tolerance) (``\epsilon_D``) |
| 55 | +- Residual errors in the [primal equations](@ref option-primal-residual-tolerance) (``\epsilon_R``) |
| 56 | +- Residual errors in the [dual equations](@ref option-dual-residual-tolerance) (``\epsilon_C``) |
| 57 | +- [Optimality](@ref option-optimality-tolerance) (``\epsilon_{O}``) |
| 58 | + |
| 59 | +All are set to the same default value of ``10^{-7}``. Although each |
| 60 | +can be set to different values by the user, if the user wishes to |
| 61 | +solve LPs to a general lower or higher tolerance, the value of the |
| 62 | +[KKT tolerance](@ref option-kkt-tolerance) can be changed from this |
| 63 | +default value. |
| 64 | + |
| 65 | +### When HiGHS yields an optimal solution |
| 66 | + |
| 67 | +When HiGHS returns a model status of optimal, the solution will |
| 68 | +satisfy feasibility and optimality tolerances absolutely or relatively |
| 69 | +according to whether the solver yields a basic solution. |
| 70 | + |
| 71 | +### Solutions with a corresponding basis |
| 72 | + |
| 73 | +The HiGHS simplex solvers and the interior point solver after |
| 74 | +crossover yield an optimal basic solution of the LP, consisting of |
| 75 | +``m`` basic variables and ``n-m`` nonbasic variables. At any basis, |
| 76 | +the nonbasic variables are zero, and values for the basic variables |
| 77 | +are given by solving a linear system of equations. Values for the row |
| 78 | +dual values (``y``) can be computed by solving a linear system of equations, |
| 79 | +and the column dual values are then given by ``s=c-A^Ty``. With exact |
| 80 | +arithmetic, the basic dual values are zero by construction. |
| 81 | + |
| 82 | +When primal and dual values are computed using floating-point |
| 83 | +arithmetic, the basic dual values are set to zero so the optimality |
| 84 | +condition holds by construction. However, the primal and dual |
| 85 | +equations may not be satisfied exactly, and so have nonzero |
| 86 | +residuals. Fortunately, when solving a linear system of equations |
| 87 | +using a stable technique, any residuals are small relative to the RHS |
| 88 | +of the equations, whatever the condition of the matrix of |
| 89 | +coefficients. Hence HiGHS does not assess the primal residuals, or the |
| 90 | +dual residuals for basic variables. Thus optimality for a basic |
| 91 | +solution is assessed by HiGHS according to whether the following |
| 92 | +conditions hold: |
| 93 | + |
| 94 | +```math |
| 95 | +\begin{aligned} |
| 96 | +x_i\ge-\epsilon_P&\qquad\forall i=1,\ldots,n\\ |
| 97 | +s_i\ge-\epsilon_D&\qquad\forall i=1,\ldots,n. |
| 98 | +\end{aligned} |
| 99 | +``` |
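
The absolute tests above can be sketched in a few lines of Python. The function name `is_optimal_basic` is hypothetical and is not a HiGHS API; the default tolerances simply mirror the ``10^{-7}`` default quoted earlier.

```python
def is_optimal_basic(x, s, eps_p=1e-7, eps_d=1e-7):
    """Return True if x_i >= -eps_p and s_i >= -eps_d for every i.

    Hypothetical helper for illustration only; the tests are absolute,
    so no norm of the problem data enters the comparison.
    """
    primal_ok = all(xi >= -eps_p for xi in x)
    dual_ok = all(si >= -eps_d for si in s)
    return primal_ok and dual_ok


# A primal value of -5e-8 lies within the 1e-7 tolerance; -1e-6 does not.
print(is_optimal_basic([1.0, -5e-8], [0.0, 1.0]))
print(is_optimal_basic([1.0, -1e-6], [0.0, 1.0]))
```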
| 100 | + |
| 101 | +The HiGHS active set QP solver has an objective function ``(1/2)x^TQx + c^Tx``, |
| 102 | +and maintains the QP equivalent of a basis in which a subset |
| 103 | +of (up to ``n``) variables are zero. However, there are variables |
| 104 | +that are off their bounds whose reduced costs are not zero by |
| 105 | +construction. At an optimal solution they will only be less than a |
| 106 | +dual feasibility tolerance in magnitude, so the optimality condition |
| 107 | +will not be satisfied by construction. The primal and dual equations |
| 108 | +(where the latter is ``A^Ty+s=Qx+c``) will be satisfied with small |
| 109 | +residuals. Optimality is assessed by HiGHS according to whether primal |
| 110 | +and dual feasibility are satisfied to within the corresponding |
| 111 | +tolerances. |
| 112 | + |
| 113 | + |
| 114 | +### Solutions without a corresponding basis |
| 115 | + |
| 116 | +The HiGHS PDLP solver and the interior point solver without crossover |
| 117 | +(IPX) yield "optimal" primal and dual values that satisfy internal |
| 118 | +conditions for termination of the underlying algorithm. These |
| 119 | +conditions are discussed below, and are used for good reason. However, |
| 120 | +they can lead to a misleading claim of optimality. |
| 121 | + |
| 122 | +#### Interior point solutions |
| 123 | + |
| 124 | +The interior point algorithm uses a single feasibility tolerance |
| 125 | +``\epsilon=\min(\epsilon_P, \epsilon_D)``, and an independent |
| 126 | +[optimality tolerance](@ref option-ipm-optimality-tolerance) |
| 127 | +(``\epsilon_{IPM}``) that, by default, is (currently) ten times |
| 128 | +smaller than the other feasibility and optimality tolerances used by |
| 129 | +HiGHS. It terminates when |
| 130 | + |
| 131 | +```math |
| 132 | +\begin{aligned} |
| 133 | +\|Ax-b\|_\infty&\le(1+\|b\|_\infty)\epsilon_R\\ |
| 134 | +\|c-A^Ty-s\|_\infty&\le(1+\|c\|_\infty)\epsilon_C\\ |
| 135 | +-x_i&\le\epsilon\qquad\forall i=1,\ldots,n\\ |
| 136 | +-s_i&\le \epsilon\qquad\forall i=1,\ldots,n\\ |
| 137 | +|c^Tx-b^Ty|&\le(1+|c^Tx+b^Ty|/2)\epsilon_{IPM}. |
| 138 | +\end{aligned} |
| 139 | +``` |
| 140 | + |
| 141 | +#### PDLP solutions |
| 142 | + |
| 143 | +The PDLP algorithm uses an independent [optimality tolerance](@ref |
| 144 | +option-pdlp-optimality-tolerance) (``\epsilon_{PDLP}``) that is equal |
| 145 | +to the other feasibility and optimality tolerances used by HiGHS. It |
| 146 | +determines values of ``x\ge0`` and ``y``, and chooses ``s`` to be the |
| 147 | +non-negative values of ``c-A^Ty``. Hence it guarantees primal and dual |
| 148 | +feasibility by construction. It terminates when |
| 149 | + |
| 150 | +```math |
| 151 | +\begin{aligned} |
| 152 | +\|Ax-b\|_2&\le (1+\|b\|_2)\epsilon_P\\ |
| 153 | +\|c-A^Ty-s\|_2&\le (1+\|c\|_2)\epsilon_D\\ |
| 154 | +|c^Tx-b^Ty|&\le (1+|c^Tx|+|b^Ty|)\epsilon_{PDLP}. |
| 155 | +\end{aligned} |
| 156 | +``` |
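
The way ``s`` is chosen as the non-negative part of ``c-A^Ty`` can be sketched as follows. The function `pdlp_reduced_costs` is a hypothetical helper written for this illustration; it is not the PDLP implementation or a HiGHS API.

```python
def pdlp_reduced_costs(c, A, y):
    """Return (s, residual), with s the non-negative part of c - A^T y.

    residual = c - A^T y - s is the remaining (non-positive) part, which is
    what the dual residual condition measures. Illustrative sketch only.
    """
    n = len(c)
    at_y = [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(n)]
    r = [cj - aj for cj, aj in zip(c, at_y)]      # c - A^T y
    s = [max(rj, 0.0) for rj in r]                # s >= 0 by construction
    residual = [rj - sj for rj, sj in zip(r, s)]  # zero wherever r_j >= 0
    return s, residual


# Illustrative data: one equation, two variables, and a trial dual value
s, residual = pdlp_reduced_costs([1.0, 2.0], [[1.0, 1.0]], [1.5])
print(s, residual)
```

With the trial value ``y=1.5``, the first component of ``c-A^Ty`` is negative, so it appears in the dual residual rather than as a negative reduced cost.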
| 157 | + |
| 158 | +#### HiGHS solutions |
| 159 | + |
| 160 | +The relative measures used by PDLP and IPX assume that all components |
| 161 | +of the cost and RHS vectors are relevant. When an LP problem is in |
| 162 | +standard form this is true for ``b``, but not necessarily for the cost |
| 163 | +vector ``c``. Consider a large component of ``c`` for which the |
| 164 | +corresponding reduced cost value in ``s`` is also large, in which case |
| 165 | +the LP solution is insensitive to the cost. This component will |
| 166 | +contribute significantly to ``\|c\|`` and, hence, the RHS of the dual |
| 167 | +residual condition, allowing large values of ``\|c-A^Ty-s\|`` to be |
| 168 | +accepted. However, this can lead to unacceptably large absolute |
| 169 | +residual errors and non-optimal solutions being deemed "optimal". When |
| 170 | +equations in ``Ax=b`` correspond to inequality constraints with large |
| 171 | +RHS values and a slack variable (so the constraint is redundant) the |
| 172 | +same issue occurs in the case of primal residual errors. The solution |
| 173 | +of the LP is not sensitive to this large RHS value, but its |
| 174 | +contribution to ``\|b\|`` can allow large absolute primal residual |
| 175 | +errors to be overlooked. |
| 176 | + |
| 177 | +To make an informed assessment of whether an "optimal" solution |
| 178 | +obtained by IPX or PDLP is acceptable, HiGHS computes infinity norm |
| 179 | +measures of ``b`` and ``c`` corresponding to the components that |
| 180 | +define the optimal solution. For ``c`` these are the components |
| 181 | +corresponding to positive values of ``x`` and reduced costs that are |
| 182 | +close to zero. For ``b``, these are the components corresponding to |
| 183 | +constraints that are (close to being) satisfied exactly. The resulting |
| 184 | +measures are no larger than ``\|b\|`` or ``\|c\|``, and may lead to |
| 185 | +relative measures of primal/dual residual errors or infeasibilities |
| 186 | +not being satisfied, so the status of the solver's "optimal" solution |
| 187 | +may be reduced to "unknown". When this happens - and possibly if |
| 188 | +tolerances on relative measures _have_ been satisfied - users can |
| 189 | +consult the absolute and relative measures available via |
| 190 | +[HighsInfo](@ref info-num-primal-infeasibilities). |
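
To illustrate how a large, irrelevant cost can mask a dual residual, the sketch below compares the relative test using the norm of all of ``c`` with one using only the components that define the solution. The data, the selection rule and its threshold `near_zero` are assumptions made for this example and are not the exact rules used inside HiGHS.

```python
def inf_norm(v):
    """Infinity norm of a list, taken as 0 for an empty list."""
    return max((abs(vi) for vi in v), default=0.0)


eps_c = 1e-7      # dual residual tolerance (the documented default value)
near_zero = 1e-7  # assumed threshold for a reduced cost being "close to zero"

# Illustrative data: the second cost is large but its reduced cost is also
# large, so the LP solution is insensitive to it.
c = [1.0, 1.0e6]
x = [1.0, 0.0]
s = [0.0, 1.0e6]
dual_residual = [0.05, 0.0]  # a large absolute error in c - A^T y - s

# Relative test using every component of c
accepted_full = inf_norm(dual_residual) <= (1 + inf_norm(c)) * eps_c

# Relative test using only costs with positive x and near-zero reduced cost
relevant_c = [cj for cj, xj, sj in zip(c, x, s) if xj > 0 and abs(sj) <= near_zero]
accepted_restricted = inf_norm(dual_residual) <= (1 + inf_norm(relevant_c)) * eps_c

print(accepted_full, accepted_restricted)
```

Here the full norm gives a threshold of about ``0.1``, so the residual of ``0.05`` is accepted, whereas the restricted norm gives a threshold of ``2\times10^{-7}`` and the same residual is rejected.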
| 191 | + |
| 192 | +### Discrete optimization problems |
| 193 | + |
| 194 | +Discrete optimization problems, such as the mixed-integer programming |
| 195 | +(MIP) problems solved by HiGHS, have no local optimality |
| 196 | +conditions. Variables required to take integer values will do so to |
| 197 | +within the `mip_feasibility_tolerance`. Since MIP sub-problems are |
| 198 | +solved with the simplex solver, the values of the variables and |
| 199 | +constraints will satisfy absolute feasibility tolerances. Within the |
| 200 | +MIP solver, the value of `mip_feasibility_tolerance` is used for |
| 201 | +`primal_feasibility_tolerance` when solving LP sub-problems, and one |
| 202 | +tenth of this value is used for `dual_feasibility_tolerance`. Hence |
| 203 | +any value of `primal_feasibility_tolerance` (or |
| 204 | +`dual_feasibility_tolerance`) set by the user has no effect on the MIP |
| 205 | +solver. |
| 206 | + |