docs/advanced.rst (10 additions, 2 deletions)

@@ -104,14 +104,22 @@ Dynamically Growing Initial Set
 * :code:`growing.num_new_dirns_each_iter` - Number of new search directions to add with each iteration where we do not have a full set of search directions. Default is 0, as this approach is not recommended.

 Dykstra's Algorithm
--------------------------------
+-------------------
 * :code:`dykstra.d_tol` - Tolerance on the stopping conditions of Dykstra's algorithm. Default is :math:`10^{-10}`.
 * :code:`dykstra.max_iters` - The maximum number of iterations Dykstra's algorithm is allowed to take before stopping. Default is :math:`100`.

 Checking Matrix Rank
--------------------------------
+--------------------
 * :code:`matrix_rank.r_tol` - Tolerance on the smallest possible diagonal entry value in the QR factorization before it is considered zero. Default is :math:`10^{-18}`.

+Handling regularizer
+--------------------
+* :code:`func_tol.criticality_measure` - scale factor (of the current trust-region radius) used to determine the accuracy of the calculated criticality/stationarity measure (smaller means more accurate). Default is :math:`10^{-3}`.
+* :code:`func_tol.tr_step` - scale factor used to determine the accuracy of the trust-region step (smaller is less accurate). Default is :math:`0.9`.
+* :code:`func_tol.max_iters` - maximum number of subproblem (S-FISTA) iterations. Default is 500.
+* :code:`sfista.max_iters_scaling` - factor by which to increase the minimum number of subproblem (S-FISTA) iterations. Must be at least 1. Default is 2.
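To illustrate how the options documented in this file are used, here is a minimal sketch (not part of the diff): DFO-LS reads these values from the ``user_params`` dictionary passed to ``dfols.solve``. The residual function and starting point are illustrative placeholders, and the ``func_tol.*`` key assumes the version of DFO-LS introduced by this change.

```python
# Minimal sketch: overriding the advanced options documented above via user_params.
# The residual function and starting point are illustrative placeholders.
import numpy as np
import dfols

def objfun(x):
    # Toy residual vector r(x) (Rosenbrock-style); replace with your own residuals.
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

x0 = np.array([-1.2, 1.0])

params = {
    "dykstra.d_tol": 1e-10,       # stopping tolerance for Dykstra's algorithm
    "dykstra.max_iters": 100,     # iteration cap for Dykstra's algorithm
    "matrix_rank.r_tol": 1e-18,   # QR diagonal entries below this are treated as zero
    "func_tol.max_iters": 500,    # S-FISTA subproblem iteration cap (regularized problems only)
}

soln = dfols.solve(objfun, x0, user_params=params)
print(soln)
```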
docs/build/html/_sources/advanced.rst.txt (10 additions, 2 deletions)

The changes to this file (the Sphinx-built copy of ``docs/advanced.rst``) are identical to those shown above.

-DFO-LS is a flexible package for finding local solutions to nonlinear least-squares minimization problems (with optional constraints), without requiring any derivatives of the objective. DFO-LS stands for Derivative-Free Optimizer for Least-Squares.
+DFO-LS is a flexible package for finding local solutions to nonlinear least-squares minimization problems (with optional regularizer and constraints), without requiring any derivatives of the objective. DFO-LS stands for Derivative-Free Optimizer for Least-Squares.

 &\quad x \in C := C_1\cap\cdots\cap C_n, \quad \text{all $C_i$ convex}\\

-The constraint set :math:`C` is the intersection of multiple convex sets provided as input by the user. All constraints are non-relaxable (i.e. DFO-LS will never ask to evaluate a point that is not feasible).
+The optional regularizer :math:`h(x)` is a Lipschitz continuous and convex, but possibly non-differentiable, function that is typically used to avoid overfitting.
+A common choice is :math:`h(x)=\lambda\|x\|_1` (called L1 regularization or LASSO) for :math:`\lambda>0`.
+Note that in the case of Tikhonov regularization/ridge regression, :math:`h(x)=\lambda\|x\|_2^2` is not Lipschitz continuous, so it should instead be incorporated by adding an extra term to the least-squares sum, :math:`r_{m+1}(x)=\sqrt{\lambda} \|x\|_2`.
+The (optional) constraint set :math:`C` is the intersection of multiple convex sets provided as input by the user. All constraints are non-relaxable (i.e. DFO-LS will never ask to evaluate a point that is not feasible), although the general constraints :math:`x\in C` may be slightly violated due to rounding errors.

 Full details of the DFO-LS algorithm are given in our papers:

-* C. Cartis, J. Fiala, B. Marteau and L. Roberts, `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41 [`preprint <https://arxiv.org/abs/1804.00154>`_].
-* Hough, M. and Roberts, L., `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 21:4 (2022), pp. 2552-2579 [`preprint <https://arxiv.org/abs/2111.05443>`_].
+1. C. Cartis, J. Fiala, B. Marteau and L. Roberts, `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41 [`preprint <https://arxiv.org/abs/1804.00154>`_].
+2. M. Hough and L. Roberts, `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 32:4 (2022), pp. 2552-2579 [`preprint <https://arxiv.org/abs/2111.05443>`_].
+3. Y. Liu, K. H. Lam and L. Roberts, `Black-box Optimization Algorithms for Regularized Least-squares Problems <http://arxiv.org/abs/2407.14915>`_, *arXiv preprint arXiv:2407.14915*, 2024.

 DFO-LS is a more flexible version of `DFO-GN <https://github.com/numericalalgorithmsgroup/dfogn>`_.
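The Tikhonov remark in the added text above can be made concrete with a short sketch (not part of the diff). Since :math:`h(x)=\lambda\|x\|_2^2` cannot be supplied as the regularizer, the penalty is folded into the least-squares sum as one extra residual :math:`r_{m+1}(x)=\sqrt{\lambda}\|x\|_2`. Only the basic ``dfols.solve(objfun, x0)`` call is assumed; the linear model and data are placeholders.

```python
# Minimal sketch: ridge/Tikhonov regularization handled by appending the extra
# residual r_{m+1}(x) = sqrt(lambda) * ||x||_2 to the least-squares sum, as
# described in the text above. The model and data below are placeholders.
import numpy as np
import dfols

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # toy linear model
b = np.array([1.0, 2.0, 3.0])
lam = 0.1  # Tikhonov regularization weight

def objfun(x):
    residuals = A @ x - b                     # r_1(x), ..., r_m(x)
    extra = np.sqrt(lam) * np.linalg.norm(x)  # r_{m+1}(x) = sqrt(lam) * ||x||_2
    return np.append(residuals, extra)

x0 = np.zeros(2)
soln = dfols.solve(objfun, x0)  # minimizes ||Ax - b||^2 + lam * ||x||^2
print(soln)
```

Because the extra residual is squared inside the sum, this reproduces exactly the :math:`\lambda\|x\|_2^2` penalty.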
@@ -48,5 +52,4 @@ DFO-LS is released under the GNU General Public License. Please `contact NAG <ht
 Acknowledgements
 ----------------
-This software was initially developed under the supervision of `Coralia Cartis <https://www.maths.ox.ac.uk/people/coralia.cartis>`_, and was supported by the EPSRC Centre For Doctoral Training in `Industrially Focused Mathematical Modelling <https://www.maths.ox.ac.uk/study-here/postgraduate-study/industrially-focused-mathematical-modelling-epsrc-cdt>`_ (EP/L015803/1) in collaboration with the `Numerical Algorithms Group <http://www.nag.com/>`_.
+This software was initially developed under the supervision of `Coralia Cartis <https://www.maths.ox.ac.uk/people/coralia.cartis>`_, and was supported by the EPSRC Centre For Doctoral Training in `Industrially Focused Mathematical Modelling <https://www.maths.ox.ac.uk/study-here/postgraduate-study/industrially-focused-mathematical-modelling-epsrc-cdt>`_ (EP/L015803/1) in collaboration with the `Numerical Algorithms Group <http://www.nag.com/>`_. Development of DFO-LS has also been supported by the Australian Research Council (DE240100006).
 &\quad x \in C := C_1\cap\cdots\cap C_n, \quad \text{all $C_i$ convex}

-We call :math:`f(x)` the objective function and :math:`r_i(x)` the residual functions (or simply residuals).
+We call :math:`f(x)` the objective function, :math:`r_i(x)` the residual functions (or simply residuals), and :math:`h(x)` the regularizer.
 :math:`C` is the intersection of multiple convex sets given as input by the user.

 DFO-LS is a *derivative-free* optimization algorithm, which means it does not require the user to provide the derivatives of :math:`f(x)` or :math:`r_i(x)`, nor does it attempt to estimate them internally (by using finite differencing, for instance).
@@ -86,7 +86,7 @@ At each step, we compute a trial step :math:`s_k` designed to make our approxima
 In DFO-LS, we construct our approximation :math:`m_k(s)` by interpolating a linear approximation for each residual :math:`r_i(x)` at several points close to :math:`x_k`. To make sure our interpolated model is accurate, we need to regularly check that the points are well-spaced, and move them if they aren't (i.e. improve the geometry of our interpolation points).

-A complete description of the DFO-LS algorithm is given in our papers [CFMR2018]_and [HR2022]_.
+A complete description of the DFO-LS algorithm is given in our papers [CFMR2018]_, [HR2022]_ and [LLR2024]_.

 References
 ----------
@@ -95,4 +95,7 @@ References
    Coralia Cartis, Jan Fiala, Benjamin Marteau and Lindon Roberts, `Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers <https://doi.org/10.1145/3338517>`_, *ACM Transactions on Mathematical Software*, 45:3 (2019), pp. 32:1-32:41 [`preprint <https://arxiv.org/abs/1804.00154>`_]

 .. [HR2022]
-   Hough, M. and Roberts, L., `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 21:4 (2022), pp. 2552-2579 [`preprint <https://arxiv.org/abs/2111.05443>`_].
+   Matthew Hough and Lindon Roberts, `Model-Based Derivative-Free Methods for Convex-Constrained Optimization <https://doi.org/10.1137/21M1460971>`_, *SIAM Journal on Optimization*, 32:4 (2022), pp. 2552-2579 [`preprint <https://arxiv.org/abs/2111.05443>`_].
+
+.. [LLR2024]
+   Yanjun Liu, Kevin H. Lam and Lindon Roberts, `Black-box Optimization Algorithms for Regularized Least-squares Problems <http://arxiv.org/abs/2407.14915>`_, *arXiv preprint arXiv:2407.14915* (2024).
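For readers of this diff without the surrounding page, the quantities named in this file (objective, residuals, regularizer, constraint set) fit together as sketched below. This is a reconstruction from the terms used above, not a quotation of the page's own display equation.

.. math::
   \min_{x\in\mathbb{R}^n} &\quad f(x) := \sum_{i=1}^{m} r_i(x)^2 + h(x)\\
   \text{s.t.} &\quad x \in C := C_1\cap\cdots\cap C_n, \quad \text{all $C_i$ convex}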