Add Documentation #238
Changes from all commits
```diff
@@ -1,5 +1,16 @@
 [deps]
+ADNLPModels = "54578032-b7ea-4c30-94aa-7cbd1cce6c9a"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
+LLSModels = "39f5bc3e-5160-4bf8-ac48-504fd2534d24"
+LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
+NLPModelsModifiers = "e01155f1-5c6f-4375-a9d8-616dd036575f"
+Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
+ProximalOperators = "a725b495-10eb-56fe-b38b-717eba820537"
+Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
+RegularizedProblems = "ea076b23-609f-44d2-bb12-a4ae45328278"
+ShiftedProximalOperators = "d4fd37fa-580c-4e43-9b30-361c21aae263"

 [compat]
-Documenter = "~0.25"
+Documenter = "1"
+DocumenterCitations = "1.2"
```
New file (@@ -0,0 +1,20 @@):

```css
.mi, .mo, .mn {
  color: #317293;
}

a {
  color: #3091d1;
}

a:visited {
  color: #3091d1;
}

a:hover {
  color: #ff5722;
}

nav.toc .logo {
  max-width: 256px;
  max-height: 256px;
}
```
```diff
@@ -1,16 +1,30 @@
-using Documenter, RegularizedOptimization
+using Documenter, DocumenterCitations
+
+using RegularizedOptimization
+
+bib = CitationBibliography(joinpath(@__DIR__, "references.bib"))

 makedocs(
   modules = [RegularizedOptimization],
   doctest = true,
   # linkcheck = true,
-  strict = true,
+  warnonly = false,
```
Suggested change:

```diff
-warnonly = false,
+strict = [:missing_docs],
```
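For context (our note, not part of the PR): Documenter 1.0 replaced the `strict` keyword with `warnonly`, which is why the diff swaps one for the other. A minimal Documenter-1-style build script might look as follows; the `sitename` value is assumed, and passing the bibliography via `plugins` assumes the DocumenterCitations 1.x API.

```julia
# Sketch of a Documenter 1.x build script (names taken from the diff above)
using Documenter, DocumenterCitations
using RegularizedOptimization

bib = CitationBibliography(joinpath(@__DIR__, "references.bib"))

makedocs(
  sitename = "RegularizedOptimization.jl",  # required by Documenter 1; name assumed
  modules = [RegularizedOptimization],
  doctest = true,
  warnonly = false,   # replaces `strict = true` from Documenter 0.x
  plugins = [bib],    # assumed DocumenterCitations 1.x API
)
```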
New file (@@ -0,0 +1,56 @@):

```bibtex
@Article{aravkin-baraldi-orban-2022,
  author  = {Aravkin, Aleksandr Y. and Baraldi, Robert and Orban, Dominique},
  title   = {A Proximal Quasi-Newton Trust-Region Method for Nonsmooth Regularized Optimization},
  journal = {SIAM Journal on Optimization},
  volume  = {32},
  number  = {2},
  pages   = {900-929},
  year    = {2022},
  doi     = {10.1137/21M1409536}
}

@Article{aravkin-baraldi-orban-2024,
  author  = {Aravkin, Aleksandr Y. and Baraldi, Robert and Orban, Dominique},
  title   = {A Levenberg–Marquardt Method for Nonsmooth Regularized Least Squares},
  journal = {SIAM Journal on Scientific Computing},
  volume  = {46},
  number  = {4},
  pages   = {A2557-A2581},
  year    = {2024},
  doi     = {10.1137/22M1538971}
}

@Article{leconte-orban-2025,
  author  = {Leconte, Geoffroy and Orban, Dominique},
  title   = {The indefinite proximal gradient method},
  journal = {Computational Optimization and Applications},
  volume  = {91},
  number  = {2},
  pages   = {861-903},
  year    = {2025},
  doi     = {10.1007/s10589-024-00604-5}
}

@TechReport{diouane-gollier-orban-2024,
  author      = {Diouane, Youssef and Gollier, Maxence and Orban, Dominique},
  title       = {A nonsmooth exact penalty method for equality-constrained optimization: complexity and implementation},
  institution = {Groupe d’études et de recherche en analyse des décisions},
  year        = {2024},
  type        = {Les Cahiers du GERAD},
  number      = {G-2024-65},
  address     = {Montreal, Canada},
  doi         = {10.48550/arxiv.2103.15993},
  url         = {https://www.gerad.ca/fr/papers/G-2024-65}
}

@TechReport{diouane-habiboullah-orban-2024,
  author      = {Diouane, Youssef and Laghdaf Habiboullah, Mohamed and Orban, Dominique},
  title       = {A proximal modified quasi-Newton method for nonsmooth regularized optimization},
  institution = {Groupe d’études et de recherche en analyse des décisions},
  year        = {2024},
  type        = {Les Cahiers du GERAD},
  number      = {G-2024-64},
  address     = {Montreal, Canada},
  doi         = {10.48550/arxiv.2409.19428},
  url         = {https://www.gerad.ca/fr/papers/G-2024-64}
}
```
New file (@@ -0,0 +1,66 @@):

# [Solvers](@id algorithms)

## General case

The solvers in this package are based on the approach of [aravkin-baraldi-orban-2022](@cite).
Suppose we are given the general regularized problem
```math
\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad f(x) + h(x),
```
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and $h : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is lower semi-continuous.
Solving this problem directly is often impossible; instead, we repeatedly solve simplified versions of it until we reach a stationary point.
Given an iterate $x_0 \in \mathbb{R}^n$, we wish to compute a step $s_0 \in \mathbb{R}^n$ and improve the iterate via $x_1 := x_0 + s_0$.
To that end, we approximate $f$ and $h$ around $x_0$ by simpler functions (models), denoted $\varphi(\cdot; x_0)$ and $\psi(\cdot; x_0)$ respectively, so that
```math
\varphi(s; x_0) \approx f(x_0 + s) \quad \text{and} \quad \psi(s; x_0) \approx h(x_0 + s).
```
We then compute the step as
```math
s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0).
```
To ensure convergence and to handle the potential nonconvexity of the objective, we add either a trust-region constraint,
```math
s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0) \quad \text{subject to} \ \|s\| \leq \Delta,
```
or a quadratic regularization,
```math
s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0) + \tfrac{1}{2} \sigma \|s\|^2_2.
```
The trust-region solvers are [`TR`](@ref TR) and [`TRDH`](@ref TRDH); the quadratic-regularization solvers are [`R2`](@ref R2), [`R2N`](@ref R2N) and [`R2DH`](@ref R2DH).
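To make the scheme concrete, here is a self-contained sketch (ours, not the package's implementation) of the quadratic-regularization variant with $H = 0$: when $h = \lambda\|\cdot\|_1$, the model minimizer reduces to the proximal-gradient step $s = \operatorname{prox}_{h/\sigma}(x - \nabla f(x)/\sigma) - x$.

```julia
# Sketch of the σ-regularized scheme with H = 0 (an R2-like iteration);
# illustrative only, not RegularizedOptimization's actual code.
soft(v, τ) = sign.(v) .* max.(abs.(v) .- τ, 0)  # prox of τ‖·‖₁

function r2_sketch(f, ∇f, x; σ = 1.0, λ = 1.0, maxiter = 100)
  for _ in 1:maxiter
    # step: argmin of ∇f(x)ᵀs + (σ/2)‖s‖² + λ‖x + s‖₁
    s = soft(x .- ∇f(x) ./ σ, λ / σ) .- x
    if f(x .+ s) + λ * sum(abs, x .+ s) < f(x) + λ * sum(abs, x)
      x = x .+ s            # success: accept the step
      σ = max(σ / 2, 1e-8)  # and loosen the regularization
    else
      σ *= 2                # failure: tighten the regularization
    end
  end
  return x
end

# Example: the ℓ₁-regularized Rosenbrock function
f  = x -> (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
∇f = x -> [-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2); 200 * (x[2] - x[1]^2)]
r2_sketch(f, ∇f, [-1.0, 2.0]; λ = 1.0, maxiter = 10_000)
```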
The models of the smooth part $f$ in this package are always quadratic models of the form
```math
\varphi(s; x_0) = f(x_0) + \nabla f(x_0)^T s + \tfrac{1}{2} s^T H(x_0) s,
```
where $H(x_0)$ is a symmetric matrix that can be $0$, the Hessian of $f$ (if it exists), or a quasi-Newton approximation.
Some solvers require a specific structure for $H$; refer to the table below for an overview.
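As an illustration (not from this page's source), the choice of $H$ is typically made through the type of the smooth model; the wrapper names below come from NLPModelsModifiers.jl and match the table that follows.

```julia
# Selecting the quadratic term H via the model type (sketch)
using ADNLPModels, NLPModelsModifiers

f_fun = x -> (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
f_exact = ADNLPModel(f_fun, [-1.0, 2.0])  # H = exact Hessian via AD
f_lsr1  = LSR1Model(f_exact)              # H = L-SR1 quasi-Newton approximation
f_diag  = SpectralGradientModel(f_exact)  # H = diagonal model (for R2DH/TRDH)
```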
The following table gives an overview of the solvers available in the general case.

| Solver | Quadratic regularization | Trust region | Quadratic term $H$ of $\varphi$ | Reference |
|--------|--------------------------|--------------|---------------------------------|-----------|
| [`R2`](@ref R2) | Yes | No | $H = 0$ | [aravkin-baraldi-orban-2022; Algorithm 6.1](@cite) |
| [`R2N`](@ref R2N) | Yes | No | Any symmetric | [diouane-habiboullah-orban-2024; Algorithm 1](@cite) |
| [`R2DH`](@ref R2DH) | Yes | No | Any diagonal | [diouane-habiboullah-orban-2024; Algorithm 1](@cite) |
| [`TR`](@ref TR) | No | Yes | Any symmetric | [aravkin-baraldi-orban-2022; Algorithm 3.1](@cite) |
| [`TRDH`](@ref TRDH) | No | Yes | Any diagonal | [leconte-orban-2025; Algorithm 5.1](@cite) |
## Nonlinear least-squares

This package provides two solvers, [`LM`](@ref LM) and [`LMTR`](@ref LMTR), specialized for regularized nonlinear least-squares problems, i.e., problems of the form
```math
\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \tfrac{1}{2}\|F(x)\|_2^2 + h(x),
```
where $F : \mathbb{R}^n \to \mathbb{R}^m$ is continuously differentiable and $h : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is lower semi-continuous.
In this case, the model $\varphi$ is defined as
```math
\varphi(s; x) = \tfrac{1}{2}\|F(x) + J(x)s\|_2^2,
```
where $J(x)$ is the Jacobian of $F$ at $x$.
As with the solvers of the previous section, we add either a quadratic regularization ([`LM`](@ref LM)) or a trust-region constraint ([`LMTR`](@ref LMTR)) to the model.
These solvers are described in [aravkin-baraldi-orban-2024](@cite).
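As a usage sketch (not from this page's source): it assumes `ADNLSModel` from ADNLPModels.jl, `RegularizedNLSModel` from RegularizedProblems.jl, and that `LM` accepts the wrapped model like the solvers above.

```julia
# Hedged sketch: a regularized nonlinear least-squares problem solved with LM
using ADNLPModels, ProximalOperators, RegularizedProblems, RegularizedOptimization

# Residuals of the Rosenbrock function: ½‖F(x)‖² recovers the smooth part
F = x -> [10 * (x[2] - x[1]^2); 1 - x[1]]
nls = ADNLSModel(F, [-1.2; 1.0], 2, name = "Rosenbrock NLS")

h = NormL1(1.0)                        # ℓ₁ regularizer
reg_nls = RegularizedNLSModel(nls, h)  # assumed analogue of RegularizedNLPModel
out = LM(reg_nls)                      # call signature assumed
```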
## Constrained optimization

For constrained regularized optimization,
```math
\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad f(x) + h(x) \quad \text{subject to} \ l \leq x \leq u \ \text{and} \ c(x) = 0,
```
an augmented Lagrangian method, [`AL`](@ref AL), is provided.
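As an illustration only (the constrained `ADNLPModel` constructor is standard; the call signature of `AL` on a regularized model is assumed):

```julia
# Hedged sketch: an equality-constrained regularized problem solved with AL
using ADNLPModels, ProximalOperators, RegularizedProblems, RegularizedOptimization

f_fun = x -> (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
c = x -> [x[1] + x[2] - 1]  # hypothetical equality constraint c(x) = 0
f_con = ADNLPModel(f_fun, [-1.0, 2.0], c, [0.0], [0.0])  # lcon = ucon = 0
reg_con = RegularizedNLPModel(f_con, NormL1(1.0))
out = AL(reg_con)           # call signature assumed
```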
New file (@@ -0,0 +1,4 @@):

# Bibliography

```@bibliography
```
Member: The bibliography looks funny. What are those newlines?

Author (Collaborator): I did not find a fix, sorry. Will try it in a future PR.
New file (@@ -0,0 +1,103 @@):

# A regularized optimization problem

In this tutorial, we show how to model and solve the nonsmooth nonconvex optimization problem
```math
\min_{x \in \mathbb{R}^2} (1 - x_1)^2 + 100(x_2 - x_1^2)^2 + |x_1| + |x_2|,
```
which can be seen as an $\ell_1$ regularization of the Rosenbrock function.
It can be shown that the solution of the problem is
```math
x^* = \begin{pmatrix}
0.25\\
0.0575
\end{pmatrix}.
```
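As a quick check (ours, not part of the original tutorial): for $x_1 > 0$ and $x_2 > 0$ the objective is differentiable, and setting its gradient to zero gives
```math
\begin{align*}
200(x_2 - x_1^2) + 1 &= 0 \implies x_2 = x_1^2 - \tfrac{1}{200},\\
-2(1 - x_1) - 400 x_1 (x_2 - x_1^2) + 1 &= 0 \implies 4x_1 - 1 = 0,
\end{align*}
```
so that $x_1 = 0.25$ and $x_2 = 0.0625 - 0.005 = 0.0575$.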
## Modelling the problem

We first formulate the objective function as the sum of a smooth function $f$ and a nonsmooth regularizer $h$:
```math
(1 - x_1)^2 + 100(x_2 - x_1^2)^2 + |x_1| + |x_2| = f(x_1, x_2) + h(x_1, x_2),
```
where
```math
\begin{align*}
f(x_1, x_2) &:= (1 - x_1)^2 + 100(x_2 - x_1^2)^2,\\
h(x_1, x_2) &:= |x_1| + |x_2| = \|x\|_1.
\end{align*}
```
To model $f$, we use [ADNLPModels.jl](https://github.com/JuliaSmoothOptimizers/ADNLPModels.jl).
For the nonsmooth regularizer, we use [ProximalOperators.jl](https://github.com/JuliaFirstOrder/ProximalOperators.jl).
We then wrap the smooth model and the regularizer in a `RegularizedNLPModel`:

```@example basic
using ADNLPModels
using ProximalOperators
using RegularizedProblems

# Model the smooth function
f_fun = x -> (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

# Choose a starting point for the optimization process
x0 = [-1.0, 2.0]

# Get an NLPModel corresponding to the smooth function f
f_model = ADNLPModel(f_fun, x0, name = "AD model of f")

# Get the regularizer from ProximalOperators
h = NormL1(1.0)

# Wrap both into a RegularizedNLPModel
regularized_pb = RegularizedNLPModel(f_model, h)
```
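As a quick sanity check (not in the original tutorial; `obj` is the NLPModels.jl API, and `NormL1` objects are callable):

```julia
using NLPModels

# Evaluate the smooth part and the regularizer separately at x0
println("f(x0) = ", obj(f_model, x0), ", h(x0) = ", h(x0))
```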
## Solving the problem

We can now choose one of the solvers presented [here](@ref algorithms) to solve the problem defined above.
Depending on the problem structure and on the user's requirements, some solvers are more appropriate than others; the following gives a quick overview of the choices one can make, and other sections of this documentation can help you make the wisest choice for your particular problem.

Suppose, for example, that we do not want to use a quasi-Newton approach and that we do not have access to the Hessian of $f$, or do not want to incur the cost of computing it.
In this case, the most appropriate solver is `R2`.
Throughout this example, we also set the tolerance by passing the keyword arguments `atol` and `rtol`, which all solvers accept.

```@example basic
using RegularizedOptimization

out = R2(regularized_pb, verbose = 100, atol = 1e-6, rtol = 1e-6)
println("R2 converged after $(out.iter) iterations to the solution x = $(out.solution)")
```
Now, let us actually use second-order information on $f$.
To do so, we use `TR`, a trust-region solver that can exploit second-order information.

```@example basic
out = TR(regularized_pb, verbose = 100, atol = 1e-6, rtol = 1e-6)
println("TR converged after $(out.iter) iterations to the solution x = $(out.solution)")
```
Suppose that for some reason we cannot compute the Hessian.
In this case, we can switch to a quasi-Newton approximation, which NLPModelsModifiers.jl provides.
We could use `TR` again, but for the sake of this tutorial we run `R2N`.

```@example basic
using NLPModelsModifiers

# Switch the model of the smooth function to a quasi-Newton approximation
f_model_lsr1 = LSR1Model(f_model)
regularized_pb_lsr1 = RegularizedNLPModel(f_model_lsr1, h)

# Solve with R2N
out = R2N(regularized_pb_lsr1, verbose = 100, atol = 1e-6, rtol = 1e-6)
println("R2N converged after $(out.iter) iterations to the solution x = $(out.solution)")
```
Finally, `TRDH` and `R2DH` are specialized for diagonal quasi-Newton approximations, and should be used instead of `TR` and `R2N`, respectively, in that case.

```@example basic
# Switch to a diagonal (spectral gradient) quasi-Newton model
f_model_sg = SpectralGradientModel(f_model)
regularized_pb_sg = RegularizedNLPModel(f_model_sg, h)

# Solve with R2DH
out = R2DH(regularized_pb_sg, verbose = 100, atol = 1e-6, rtol = 1e-6)
println("R2DH converged after $(out.iter) iterations to the solution x = $(out.solution)")
```
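For completeness, the trust-region counterpart should accept the same diagonal model (a sketch; the keyword set is assumed to match the solvers above):

```julia
# Solve with TRDH using the same spectral-gradient model
out = TRDH(regularized_pb_sg, verbose = 100, atol = 1e-6, rtol = 1e-6)
println("TRDH converged after $(out.iter) iterations to the solution x = $(out.solution)")
```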
Reviewer (nitpick): The version constraint for Documenter changed from `~0.25` to `1`, which is a major version jump. Consider using `1.0` to be more explicit about the minimum required version, and ensure compatibility with the current codebase.
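For reference (our note, not the reviewer's): in Julia's Pkg compat syntax both entries select the same versions; `1.0` merely spells the minimum out.

```toml
[compat]
Documenter = "1.0"  # equivalent to "1": allows ≥ 1.0.0, < 2.0.0
```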