# A regularized optimization problem

In this tutorial, we will show how to model and solve the nonconvex nonsmooth optimization problem
```math
  \min_{x \in \mathbb{R}^2} x_1^2 + 100(x_2 - x_1^2 - 2x_1)^2 + |x_1| + |x_2|.
```

## Modelling the problem
We first formulate the objective function as the sum of a smooth function $f$ and a nonsmooth regularizer $h$:
```math
  x_1^2 + 100(x_2 - x_1^2 - 2x_1)^2 + |x_1| + |x_2| = f(x_1, x_2) + h(x_1, x_2),
```
where
```math
\begin{align*}
f(x_1, x_2) &:= x_1^2 + 100(x_2 - x_1^2 - 2x_1)^2,\\
h(x_1, x_2) &:= \|x\|_1.
\end{align*}
```
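The algorithms presented below work with $h$ through its proximal mapping rather than through derivatives; ProximalOperators.jl provides this mapping for its regularizers. For the $\ell_1$ norm with weight $\lambda$, it is the componentwise soft-thresholding operator
```math
\operatorname{prox}_{\lambda \|\cdot\|_1}(v)_i = \operatorname{sign}(v_i) \max(|v_i| - \lambda, 0).
```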
To model $f$, we are going to use [ADNLPModels.jl](https://github.com/JuliaSmoothOptimizers/ADNLPModels.jl).
For the nonsmooth regularizer, we observe that $h$ is readily available in [ProximalOperators.jl](https://github.com/JuliaFirstOrder/ProximalOperators.jl); you can refer to [this section](@ref regularizers) for a list of readily available regularizers.
We then wrap the smooth function and the regularizer in a `RegularizedNLPModel`.

```@example
using ADNLPModels
using ProximalOperators
using RegularizedProblems

# Model the smooth function f
f_fun = x -> x[1]^2 + 100*(x[2] - x[1]^2 - 2*x[1])^2

# Choose a starting point for the optimization process
x0 = [-1.0, 2.0]

# Get an NLPModel corresponding to the smooth function f
f_model = ADNLPModel(f_fun, x0, name = "AD model of f")

# Get the regularizer from ProximalOperators
h = NormL1(1.0)

# Wrap into a RegularizedNLPModel
regularized_pb = RegularizedNLPModel(f_model, h)
```
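
As a quick sanity check, the two terms of the objective can be evaluated separately at the starting point. The following is a minimal sketch; it assumes that NLPModels.jl is loaded to provide `obj`, and uses the fact that regularizers from ProximalOperators.jl can be evaluated by calling them directly.
```julia
using ADNLPModels, NLPModels, ProximalOperators

f_fun = x -> x[1]^2 + 100*(x[2] - x[1]^2 - 2*x[1])^2
x0 = [-1.0, 2.0]
f_model = ADNLPModel(f_fun, x0, name = "AD model of f")
h = NormL1(1.0)

# Smooth part, evaluated through the NLPModels API
println("f(x0) = ", obj(f_model, x0))

# Nonsmooth part: λ‖x0‖₁ with λ = 1
println("h(x0) = ", h(x0))
```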

## Solving the problem
We can now choose one of the algorithms presented [here](@ref algorithms) to solve the problem defined above.
Please refer to the other sections of this documentation to make the wisest choice for your particular problem.
Depending on the problem structure and on the user's requirements, some solvers are more appropriate than others.
The following gives a quick overview of the choices one can make.
```@example
using ADNLPModels
using ProximalOperators
using RegularizedProblems

f_fun = x -> x[1]^2 + 100*(x[2] - x[1]^2 - 2*x[1])^2
x0 = [-1.0, 2.0]

f_model = ADNLPModel(f_fun, x0, name = "AD model of f")
h = NormL1(1.0)
regularized_pb = RegularizedNLPModel(f_model, h)

using RegularizedOptimization

# Suppose, for example, that we do not want to use a quasi-Newton approach and/or that we do not have access to the Hessian of f.
# In this case, the most appropriate solver is R2.
# For this example, we also set the absolute tolerance with the keyword argument atol, which we pass to all solvers below.
out = R2(regularized_pb, verbose = 10, atol = 1e-3)
println("R2 converged after $(out.iter) iterations to the solution x = $(out.solution)")
println("--------------------------------------------------------------------------------------")

# On this example, we can actually use second-order information on f.
# To do so, we use TR, a trust-region algorithm that can exploit second-order information.
out = TR(regularized_pb, verbose = 10, atol = 1e-3)
println("TR converged after $(out.iter) iterations to the solution x = $(out.solution)")
println("--------------------------------------------------------------------------------------")

# Suppose now that, for some reason, we cannot compute the Hessian.
# In this case, we can switch to a quasi-Newton approximation; this can be done with NLPModelsModifiers.jl.
# We could use TR again, but for the sake of this tutorial we run R2N instead.

using NLPModelsModifiers

# Replace the model of the smooth function with an LSR1 quasi-Newton approximation
f_model_lsr1 = LSR1Model(f_model)
regularized_pb_lsr1 = RegularizedNLPModel(f_model_lsr1, h)

# Solve with R2N
out = R2N(regularized_pb_lsr1, verbose = 10, atol = 1e-3)
println("R2N converged after $(out.iter) iterations to the solution x = $(out.solution)")
println("--------------------------------------------------------------------------------------")

# Finally, when the quasi-Newton approximation is diagonal, TRDH and R2DH are solvers specialized to this case
# and should be used in favour of TR and R2N, respectively.
f_model_sg = SpectralGradientModel(f_model)
regularized_pb_sg = RegularizedNLPModel(f_model_sg, h)

# Solve with R2DH
out = R2DH(regularized_pb_sg, verbose = 10, atol = 1e-3)
println("R2DH converged after $(out.iter) iterations to the solution x = $(out.solution)")
println("--------------------------------------------------------------------------------------")
```
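
Each solver call above returns an execution-statistics object; besides the `iter` and `solution` fields used in the printouts, it carries other useful information. The snippet below is a minimal sketch assuming the returned object is a `GenericExecutionStats` from SolverCore.jl, which the field names used above suggest.
```julia
# Minimal sketch: inspect the last solver output.
# Assumes `out` is a SolverCore.GenericExecutionStats, consistent with the fields used above.
println("status       = ", out.status)        # e.g. :first_order when the tolerance is met
println("objective    = ", out.objective)     # final objective value reported by the solver
println("elapsed time = ", out.elapsed_time)  # in seconds
```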