
Commit bdddf6a

documentation: apply suggestions from code review

Authored by MaxenceGollier and dpo
Co-authored-by: Dominique <[email protected]>
1 parent eaa3a1c · commit bdddf6a

File tree: 3 files changed (+13 −13 lines changed)


docs/src/algorithms.md

Lines changed: 7 additions & 8 deletions
@@ -7,7 +7,7 @@ Suppose we are given the general regularized problem
 \underset{x \in \mathbb{R}^n}{\text{minimize}} \quad f(x) + h(x),
 ```
 where $f : \mathbb{R}^n \mapsto \mathbb{R}$ is continuously differentiable and $h : \mathbb{R}^n \mapsto \mathbb{R} \cup \{\infty\}$ is lower semi-continuous.
-Instead of solving the above directly, which is often impossible, we will solve a simplified version of it repeatedly until we reach a minimizer of the problem above.
+Instead of solving the above directly, which is often impossible, we will solve a simplified version of it repeatedly until we reach a stationary point of the problem above.
 To do so, suppose we are given an iterate $x_0 \in \mathbb{R}^n$, we wish to compute a step, $s_0 \in \mathbb{R}^n$ and improve our iterate with $x_1 := x_0 + s_0$.
 Now, we are going to approximate the functions $f$ and $h$ around $x_0$ with simpler functions (models), which we denote respectively $\varphi(\cdot; x_0)$ and $\psi(\cdot; x_0)$ so that
 ```math
@@ -17,19 +17,19 @@ We then wish to compute the step as
 ```math
 s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0).
 ```
-In order to ensure convergence and to handle the potential nonconvexity of the objective function, we either add a trust-region,
+In order to ensure convergence and to handle the potential nonconvexity of the objective function, we either add a trust-region constraint,
 ```math
 s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0) \quad \text{subject to} \ \|s\| \leq \Delta,
 ```
 or a quadratic regularization
 ```math
-s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0) + \sigma \|s\|^2_2.
+s_0 \in \underset{s \in \mathbb{R}^n}{\argmin} \ \varphi(s; x_0) + \psi(s; x_0) + \tfrac{1}{2} \sigma \|s\|^2_2.
 ```
 Algorithms that work with a trust-region are [`TR`](@ref TR) and [`TRDH`](@ref TRDH) and the ones working with a quadratic regularization are [`R2`](@ref R2), [`R2N`](@ref R2N) and [`R2DH`](@ref R2DH)

 The models for the smooth part `f` in this package are always quadratic models of the form
 ```math
-\varphi(s; x_0) = f(x_0) + \nabla f(x_0)^T s + \frac{1}{2} s^T H(x_0) s,
+\varphi(s; x_0) = f(x_0) + \nabla f(x_0)^T s + \tfrac{1}{2} s^T H(x_0) s,
 ```
 where $H(x_0)$ is a symmetric matrix that can be either $0$, the Hessian of $f$ (if it exists) or a quasi-Newton approximation.
 Some algorithms require a specific structure for $H$, for an overview, refer to the table below.
@@ -45,15 +45,14 @@ Algorithm | Quadratic Regularization | Trust Region | Quadratic term for $\varph
 [`TRDH`](@ref TRDH) | No | Yes | Any Diagonal | [leconte-orban-2025; Algorithm 5.1](@cite)

 ## Nonlinear least-squares
-This package provides two algorithms, [`LM`](@ref LM) and [`LMTR`](@ref LMTR), specialized for regularized, nonlinear least-squares.
-That is, problems of the form
+This package provides two algorithms, [`LM`](@ref LM) and [`LMTR`](@ref LMTR), specialized for regularized, nonlinear least-squares, i.e., problems of the form
 ```math
-\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \frac{1}{2}\|F(x)\|_2^2 + h(x),
+\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad \tfrac{1}{2}\|F(x)\|_2^2 + h(x),
 ```
 where $F : \mathbb{R}^n \mapsto \mathbb{R}^m$ is continuously differentiable and $h : \mathbb{R}^n \mapsto \mathbb{R} \cup \{\infty\}$ is lower semi-continuous.
 In that case, the model $\varphi$ is defined as
 ```math
-\varphi(s; x) = \frac{1}{2}\|F(x) + J(x)s\|_2^2,
+\varphi(s; x) = \tfrac{1}{2}\|F(x) + J(x)s\|_2^2,
 ```
 where $J(x)$ is the Jacobian of $F$ at $x$.
 Similar to the algorithms in the previous section, we either add a quadratic regularization to the model ([`LM`](@ref LM)) or a trust-region ([`LMTR`](@ref LMTR)).
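The quadratically regularized step documented above has a well-known closed form in the simplest setting: with $H = 0$ and $h = \lambda\|\cdot\|_1$, minimizing $\nabla f(x)^T s + h(x+s) + \tfrac{1}{2}\sigma\|s\|_2^2$ reduces to soft-thresholding, which is the essence of an `R2`-style first-order iteration. A minimal Python sketch under those assumptions (helper names are illustrative only, not the package's Julia API):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (closed form for the l1 regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def r2_like_sketch(grad_f, x0, lam, sigma=1.0, atol=1e-8, max_iter=1000):
    """Repeatedly solve the regularized model: x+ = prox_{h/sigma}(x - grad_f(x)/sigma)."""
    x = x0.copy()
    for _ in range(max_iter):
        x_new = soft_threshold(x - grad_f(x) / sigma, lam / sigma)
        if np.linalg.norm(x_new - x) <= atol:
            return x_new
        x = x_new
    return x

# f(x) = 0.5 * ||x - c||^2 with h(x) = lam * ||x||_1;
# the stationary point is soft_threshold(c, lam).
c = np.array([2.0, -0.3, 0.05])
sol = r2_like_sketch(lambda x: x - c, c.copy(), lam=0.5, sigma=1.0)
```

This is only the $H = 0$ special case; the package's solvers handle general models $\varphi$ and regularizers $h$ through their proximal operators.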

docs/src/examples/basic.md

Lines changed: 5 additions & 4 deletions
@@ -61,7 +61,8 @@ regularized_pb = RegularizedNLPModel(f_model, h)

 using RegularizedOptimization

-# Suppose for example that we don't want to use a quasi-Newton approach and/or that we don't have access to the Hessian of f.
+# Suppose for example that we don't want to use a quasi-Newton approach
+# and that we don't have access to the Hessian of f, or that we don't want to incur the cost of computing it
 # In this case, the most appropriate solver would be R2.
 # For this example, we also choose a relatively small tolerance by specifying the keyword argument atol across all solvers.
 out = R2(regularized_pb, verbose = 10, atol = 1e-3)
@@ -76,7 +77,7 @@ println("-----------------------------------------------------------------------

 # Suppose for some reason we can not compute the Hessian.
 # In this case, we can try to switch to a quasi-Newton approximation, this can be done with NLPModelsModifiers.jl
-# We could choose to TR again but for the sake of this tutorial we are going to try to run it with R2N
+# We could choose to use TR again but for the sake of this tutorial we run it with R2N

 using NLPModelsModifiers

@@ -89,8 +90,8 @@ out = R2N(regularized_pb_lsr1, verbose = 10, atol = 1e-3)
 println("R2N converged after $(out.iter) iterations to the solution x = $(out.solution)")
 println("--------------------------------------------------------------------------------------")

-# Finally, in the case where the quasi-Newton approximation is diagonal, TRDH and R2DH are specialized solvers to this case
-# and should be used in favour of TR and R2N respectively.
+# Finally, TRDH and R2DH are specialized for diagonal quasi-Newton approximations,
+# and should be used instead of TR and R2N, respectively.
 f_model_sg = SpectralGradientModel(f_model)
 regularized_pb_sg = RegularizedNLPModel(f_model_sg, h)

docs/src/index.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ A [`RegularizedNLPModel`](https://jso.dev/RegularizedProblems.jl/stable/referenc
 - a smooth component `f` represented as an [`AbstractNLPModel`](https://github.com/JuliaSmoothOptimizers/NLPModels.jl),
 - a nonsmooth regularizer `h`.

-For the smooth component `f`, we refer to [jso.dev](https://jso.dev) for tutorials on the `NLPModel` API. This framework allows the usage of models from
+For the smooth component `f`, we refer to [jso.dev](https://jso.dev) for tutorials on the `NLPModels` API. This framework allows the usage of models from
 - AMPL ([AmplNLReader.jl](https://github.com/JuliaSmoothOptimizers/AmplNLReader.jl)),
 - CUTEst ([CUTEst.jl](https://github.com/JuliaSmoothOptimizers/CUTEst.jl)),
 - JuMP ([NLPModelsJuMP.jl](https://github.com/JuliaSmoothOptimizers/NLPModelsJuMP.jl)),
