7 changes: 7 additions & 0 deletions ChangeLog
@@ -1,3 +1,10 @@
2025-09-30 James Balamuta <[email protected]>

* DESCRIPTION (Version): Release 3.10.0
* NEWS.md: Update for Ensmallen release 3.10.0
* inst/include/ensmallen_bits: Upgraded to Ensmallen 3.10.0
* inst/include/ensmallen.hpp: ditto

2025-09-09 James Balamuta <[email protected]>

* DESCRIPTION: Updated requirements for RcppArmadillo
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: RcppEnsmallen
Title: Header-Only C++ Mathematical Optimization Library for 'Armadillo'
Version: 0.2.22.1.2
Version: 0.3.10.0.1
Authors@R: c(
person("James Joseph", "Balamuta", email = "[email protected]",
role = c("aut", "cre", "cph"),
61 changes: 61 additions & 0 deletions NEWS.md
@@ -1,3 +1,64 @@
# RcppEnsmallen 0.3.10.0.1

- Upgraded to ensmallen 3.10.0: "Unexpected Rain" (2025-09-30)
- SGD-like optimizers now all divide the step size by the batch size, so that
step sizes no longer need to be tuned in addition to batch sizes. If you
require the old behavior from ensmallen 2, define the
`ENS_OLD_SEPARABLE_STEP_BEHAVIOR` macro before including `ensmallen.hpp`
([#431](https://github.com/mlpack/ensmallen/pull/431)).
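A minimal opt-in sketch (only the macro name and its placement come from this
release note; the include shown is illustrative):
```c++
// Restore the ensmallen 2 step-size behavior for SGD-like optimizers.
// The macro must be defined before ensmallen.hpp is included.
#define ENS_OLD_SEPARABLE_STEP_BEHAVIOR
#include <ensmallen.hpp>
```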
- Remove deprecated `ParetoFront()` and `ParetoSet()` from multi-objective
optimizers ([#435](https://github.com/mlpack/ensmallen/pull/435)). Instead,
pass objects to the `Optimize()` function; see the documentation for each
multi-objective optimizer for more details. A typical transition will change
code like:
```c++
optimizer.Optimize(objectives, coordinates);
arma::cube paretoFront = optimizer.ParetoFront();
arma::cube paretoSet = optimizer.ParetoSet();
```
to instead gather the Pareto front and set in the call:
```c++
arma::cube paretoFront, paretoSet;
optimizer.Optimize(objectives, coordinates, paretoFront, paretoSet);
```
- Remove deprecated constructor for Active CMA-ES that takes `lowerBound` and
`upperBound` ([#435](https://github.com/mlpack/ensmallen/pull/435)).
Instead, pass an instantiated `BoundaryBoxConstraint` to the constructor. A
typical transition will change code like:
```c++
ActiveCMAES<FullSelection, BoundaryBoxConstraint> opt(lambda,
lowerBound, upperBound, ...);
```
into
```c++
ActiveCMAES<FullSelection, BoundaryBoxConstraint> opt(lambda,
BoundaryBoxConstraint(lowerBound, upperBound), ...);
```
- Add proximal gradient optimizers for L1-constrained and other related
problems: `FBS`, `FISTA`, and `FASTA`
([#427](https://github.com/mlpack/ensmallen/pull/427)). See the
documentation for more details.
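A hedged usage sketch, assuming a default-constructible optimizer and the
standard ensmallen `Optimize(function, coordinates)` interface; the real
constructor parameters and objective-type requirements are described in the
documentation:
```c++
// Illustrative only: FISTA's constructor arguments (and template arguments,
// if any) are omitted, and `f` stands for an objective that satisfies the
// optimizer's function-type requirements.
ens::FISTA optimizer;
arma::mat coordinates = f.GetInitialPoint(); // starting point (assumed API)
optimizer.Optimize(f, coordinates);          // result left in `coordinates`
```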
- The `Lambda()` and `Sigma()` functions of the `AugLagrangian` optimizer,
which could be used to retrieve the Lagrange multipliers and penalty
parameter after optimization, are now deprecated
([#439](https://github.com/mlpack/ensmallen/pull/439)). Instead, pass a
vector and a double to the `Optimize()` function directly:
```c++
augLag.Optimize(function, coordinates, lambda, sigma);
```
and these will be filled with the final Lagrange multiplier estimates and
penalty parameters.
- Fix include statement in `tests/de_test.cpp`
([#419](https://github.com/mlpack/ensmallen/pull/419)).
- Fix `exactObjective` output for SGD-like optimizers when the number of
iterations is a whole number of epochs
([#417](https://github.com/mlpack/ensmallen/pull/417)).
- Increase tolerance in `demon_sgd_test.cpp`
([#420](https://github.com/mlpack/ensmallen/pull/420)).
- Set the CMake version range to 3.5...4.0
([#422](https://github.com/mlpack/ensmallen/pull/422)).


# RcppEnsmallen 0.2.22.1.2

- `-DARMA_USE_CURRENT` added to `PKG_CXXFLAGS` to use Armadillo 15.0.2 or higher
20 changes: 17 additions & 3 deletions inst/include/ensmallen.hpp
@@ -34,7 +34,16 @@

#include <armadillo>

#if ((ARMA_VERSION_MAJOR < 10) || ((ARMA_VERSION_MAJOR == 10) && (ARMA_VERSION_MINOR < 8)))
#if defined(COOT_VERSION_MAJOR) && \
((COOT_VERSION_MAJOR >= 2) || \
(COOT_VERSION_MAJOR == 2 && COOT_VERSION_MINOR >= 1))
// The version of Bandicoot is new enough that we can use it.
#undef ENS_HAVE_COOT
#define ENS_HAVE_COOT
#endif

#if ((ARMA_VERSION_MAJOR < 10) || \
((ARMA_VERSION_MAJOR == 10) && (ARMA_VERSION_MINOR < 8)))
#error "need Armadillo version 10.8 or newer"
#endif

@@ -69,7 +78,10 @@
#include "ensmallen_bits/log.hpp" // TODO: should move to another place

#include "ensmallen_bits/utility/any.hpp"
#include "ensmallen_bits/utility/arma_traits.hpp"
#include "ensmallen_bits/utility/proxies.hpp"
#include "ensmallen_bits/utility/function_traits.hpp"
#include "ensmallen_bits/utility/using.hpp"
#include "ensmallen_bits/utility/detect_callbacks.hpp"
#include "ensmallen_bits/utility/indicators/epsilon.hpp"
#include "ensmallen_bits/utility/indicators/igd.hpp"
#include "ensmallen_bits/utility/indicators/igd_plus.hpp"
@@ -109,8 +121,10 @@
#include "ensmallen_bits/cne/cne.hpp"
#include "ensmallen_bits/de/de.hpp"
#include "ensmallen_bits/eve/eve.hpp"
#include "ensmallen_bits/fasta/fasta.hpp"
#include "ensmallen_bits/fbs/fbs.hpp"
#include "ensmallen_bits/fista/fista.hpp"
#include "ensmallen_bits/ftml/ftml.hpp"

#include "ensmallen_bits/fw/frank_wolfe.hpp"
#include "ensmallen_bits/gradient_descent/gradient_descent.hpp"
#include "ensmallen_bits/grid_search/grid_search.hpp"
2 changes: 1 addition & 1 deletion inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
@@ -97,7 +97,7 @@ class AdaBelief
typename MatType,
typename GradType,
typename... CallbackTypes>
typename std::enable_if<IsArmaType<GradType>::value,
typename std::enable_if<IsMatrixType<GradType>::value,
typename MatType::elem_type>::type
Optimize(SeparableFunctionType& function,
MatType& iterate,
29 changes: 21 additions & 8 deletions inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
@@ -79,6 +79,8 @@ class AdaBeliefUpdate
class Policy
{
public:
typedef typename MatType::elem_type ElemType;

/**
* This constructor is called by the SGD Optimize() method before the start
* of the iteration update process.
@@ -89,10 +91,16 @@
*/
Policy(AdaBeliefUpdate& parent, const size_t rows, const size_t cols) :
parent(parent),
beta1(ElemType(parent.beta1)),
beta2(ElemType(parent.beta2)),
epsilon(ElemType(parent.epsilon)),
iteration(0)
{
m.zeros(rows, cols);
s.zeros(rows, cols);
// Prevent underflow.
if (epsilon == ElemType(0) && parent.epsilon != 0.0)
epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
}

/**
@@ -109,18 +117,18 @@
// Increment the iteration counter variable.
++iteration;

m *= parent.beta1;
m += (1 - parent.beta1) * gradient;
m *= beta1;
m += (1 - beta1) * gradient;

s *= parent.beta2;
s += (1 - parent.beta2) * arma::pow(gradient - m, 2.0) + parent.epsilon;
s *= beta2;
s += (1 - beta2) * pow(gradient - m, 2) + epsilon;

const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));

// And update the iterate.
iterate -= ((m / biasCorrection1) * stepSize) / (arma::sqrt(s /
biasCorrection2) + parent.epsilon);
iterate -= ((m / biasCorrection1) * ElemType(stepSize)) /
(sqrt(s / biasCorrection2) + epsilon);
}

private:
@@ -133,6 +141,11 @@
// The exponential moving average of squared gradient values.
GradType s;

// Parent parameters converted to the element type of the matrix.
ElemType beta1;
ElemType beta2;
ElemType epsilon;

// The number of iterations.
size_t iteration;
};
2 changes: 1 addition & 1 deletion inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
@@ -107,7 +107,7 @@ class AdaBoundType
typename MatType,
typename GradType,
typename... CallbackTypes>
typename std::enable_if<IsArmaType<GradType>::value,
typename std::enable_if<IsMatrixType<GradType>::value,
typename MatType::elem_type>::type
Optimize(DecomposableFunctionType& function,
MatType& iterate,
55 changes: 39 additions & 16 deletions inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
@@ -96,6 +96,8 @@ class AdaBoundUpdate
class Policy
{
public:
typedef typename MatType::elem_type ElemType;

/**
* This constructor is called by the SGD Optimize() method before the start
* of the iteration update process.
@@ -105,10 +107,24 @@
* @param cols Number of columns in the gradient matrix.
*/
Policy(AdaBoundUpdate& parent, const size_t rows, const size_t cols) :
parent(parent), first(true), initialStepSize(0), iteration(0)
parent(parent),
finalLr(ElemType(parent.finalLr)),
gamma(ElemType(parent.gamma)),
epsilon(ElemType(parent.epsilon)),
beta1(ElemType(parent.beta1)),
beta2(ElemType(parent.beta2)),
first(true),
initialStepSize(0),
iteration(0)
{
m.zeros(rows, cols);
v.zeros(rows, cols);

// Check for underflows in conversions.
if (gamma == ElemType(0) && parent.gamma != 0.0)
gamma = 10 * std::numeric_limits<ElemType>::epsilon();
if (epsilon == ElemType(0) && parent.epsilon != 0.0)
epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
}

/**
@@ -129,30 +145,30 @@
if (first)
{
first = false;
initialStepSize = stepSize;
initialStepSize = ElemType(stepSize);
}

// Increment the iteration counter variable.
++iteration;

// Decay the first and second moment running average coefficient.
m *= parent.beta1;
m += (1 - parent.beta1) * gradient;
m *= beta1;
m += (1 - beta1) * gradient;

v *= parent.beta2;
v += (1 - parent.beta2) * (gradient % gradient);
v *= beta2;
v += (1 - beta2) * (gradient % gradient);

const ElemType biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
const ElemType biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));

const ElemType fl = parent.finalLr * stepSize / initialStepSize;
const ElemType lower = fl * (1.0 - 1.0 / (parent.gamma * iteration + 1));
const ElemType upper = fl * (1.0 + 1.0 / (parent.gamma * iteration));
const ElemType fl = finalLr * ElemType(stepSize) / initialStepSize;
const ElemType lower = fl * (1 - 1 / (gamma * iteration + 1));
const ElemType upper = fl * (1 + 1 / (gamma * iteration));

// Applies bounds on actual learning rate.
iterate -= arma::clamp((stepSize *
std::sqrt(biasCorrection2) / biasCorrection1) / (arma::sqrt(v) +
parent.epsilon), lower, upper) % m;
// Applies bounds on actual learning rate.
iterate -= clamp((ElemType(stepSize) *
std::sqrt(biasCorrection2) / biasCorrection1) / (sqrt(v) + epsilon),
lower, upper) % m;
}

private:
@@ -165,11 +181,18 @@
// The exponential moving average of squared gradient values.
GradType v;

// Parameters of the parent, casted to the element type of the problem.
ElemType finalLr;
ElemType gamma;
ElemType epsilon;
ElemType beta1;
ElemType beta2;

// Whether this is the first call of the Update method.
bool first;

// The initial (Adam) learning rate.
double initialStepSize;
ElemType initialStepSize;

// The number of iterations.
size_t iteration;