diff --git a/ChangeLog b/ChangeLog
index f1fb586..742666d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2025-09-30  James Balamuta
+
+        * DESCRIPTION (Version): Release 0.3.10.0.1
+        * NEWS.md: Update for Ensmallen release 3.10.0
+        * inst/include/ensmallen_bits: Upgraded to Ensmallen 3.10.0
+        * inst/include/ensmallen.hpp: ditto
+
 2025-09-09  James Balamuta
 
         * DESCRIPTION: Updated requirements for RcppArmadillo
diff --git a/DESCRIPTION b/DESCRIPTION
index e7430f2..b982fe6 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: RcppEnsmallen
 Title: Header-Only C++ Mathematical Optimization Library for 'Armadillo'
-Version: 0.2.22.1.2
+Version: 0.3.10.0.1
 Authors@R: c(
     person("James Joseph", "Balamuta", email = "balamut2@illinois.edu",
            role = c("aut", "cre", "cph"),
diff --git a/NEWS.md b/NEWS.md
index 8a44fdc..213fb90 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,64 @@
+# RcppEnsmallen 0.3.10.0.1
+
+- Upgraded to ensmallen 3.10.0: "Unexpected Rain" (2025-09-30)
+  - SGD-like optimizers now all divide the step size by the batch size so that
+    step sizes don't need to be tuned in addition to batch sizes. If you require
+    behavior from ensmallen 2, define the `ENS_OLD_SEPARABLE_STEP_BEHAVIOR` macro
+    before including `ensmallen.hpp`
+    ([#431](https://github.com/mlpack/ensmallen/pull/431)).
+  - Remove deprecated `ParetoFront()` and `ParetoSet()` from multi-objective
+    optimizers ([#435](https://github.com/mlpack/ensmallen/pull/435)). Instead,
+    pass objects to the `Optimize()` function; see the documentation for each
+    multi-objective optimizer for more details. A typical transition will change
+    code like:
+    ```c++
+    optimizer.Optimize(objectives, coordinates);
+    arma::cube paretoFront = optimizer.ParetoFront();
+    arma::cube paretoSet = optimizer.ParetoSet();
+    ```
+    to instead gather the Pareto front and set in the call:
+    ```c++
+    arma::cube paretoFront, paretoSet;
+    optimizer.Optimize(objectives, coordinates, paretoFront, paretoSet);
+    ```
+  - Remove deprecated constructor for Active CMA-ES that takes `lowerBound` and
+    `upperBound` ([#435](https://github.com/mlpack/ensmallen/pull/435)).
+    Instead, pass an instantiated `BoundaryBoxConstraint` to the constructor. A
+    typical transition will change code like:
+    ```c++
+    ActiveCMAES opt(lambda,
+        lowerBound, upperBound, ...);
+    ```
+    into
+    ```c++
+    ActiveCMAES opt(lambda,
+        BoundaryBoxConstraint(lowerBound, upperBound), ...);
+    ```
+  - Add proximal gradient optimizers for L1-constrained and other related
+    problems: `FBS`, `FISTA`, and `FASTA`
+    ([#427](https://github.com/mlpack/ensmallen/pull/427)). See the
+    documentation for more details.
+  - The `Lambda()` and `Sigma()` functions of the `AugLagrangian` optimizer,
+    which could be used to retrieve the Lagrange multipliers and penalty
+    parameter after optimization, are now deprecated
+    ([#439](https://github.com/mlpack/ensmallen/pull/439)). Instead, pass a
+    vector and a double to the `Optimize()` function directly:
+    ```c++
+    augLag.Optimize(function, coordinates, lambda, sigma)
+    ```
+    and these will be filled with the final Lagrange multiplier estimates and
+    penalty parameters.
+  - Fix include statement in `tests/de_test.cpp`
+    ([#419](https://github.com/mlpack/ensmallen/pull/419)).
+  - Fix `exactObjective` output for SGD-like optimizers when the number of
+    iterations is an even number of epochs
+    ([#417](https://github.com/mlpack/ensmallen/pull/417)).
+  - Increase tolerance in `demon_sgd_test.cpp`
+    ([#420](https://github.com/mlpack/ensmallen/pull/420)).
+  - Set cmake version range to 3.5...4.0
+    ([#422](https://github.com/mlpack/ensmallen/pull/422)).
+
 # RcppEnsmallen 0.2.22.1.2
 
 - `-DARMA_USE_CURRENT` added to `PKG_CXXFLAGS` to use Armadillo 15.0.2 or
   higher
diff --git a/inst/include/ensmallen.hpp b/inst/include/ensmallen.hpp
index b7b08c6..5b32ca3 100644
--- a/inst/include/ensmallen.hpp
+++ b/inst/include/ensmallen.hpp
@@ -34,7 +34,16 @@
 #include
 
-#if ((ARMA_VERSION_MAJOR < 10) || ((ARMA_VERSION_MAJOR == 10) && (ARMA_VERSION_MINOR < 8)))
+#if defined(COOT_VERSION_MAJOR) && \
+    ((COOT_VERSION_MAJOR >= 2) || \
+     (COOT_VERSION_MAJOR == 2 && COOT_VERSION_MINOR >= 1))
+  // The version of Bandicoot is new enough that we can use it.
+  #undef ENS_HAVE_COOT
+  #define ENS_HAVE_COOT
+#endif
+
+#if ((ARMA_VERSION_MAJOR < 10) || \
+     ((ARMA_VERSION_MAJOR == 10) && (ARMA_VERSION_MINOR < 8)))
 #error "need Armadillo version 10.8 or newer"
 #endif
@@ -69,7 +78,10 @@
 #include "ensmallen_bits/log.hpp" // TODO: should move to another place
 #include "ensmallen_bits/utility/any.hpp"
-#include "ensmallen_bits/utility/arma_traits.hpp"
+#include "ensmallen_bits/utility/proxies.hpp"
+#include "ensmallen_bits/utility/function_traits.hpp"
+#include "ensmallen_bits/utility/using.hpp"
+#include "ensmallen_bits/utility/detect_callbacks.hpp"
 #include "ensmallen_bits/utility/indicators/epsilon.hpp"
 #include "ensmallen_bits/utility/indicators/igd.hpp"
 #include "ensmallen_bits/utility/indicators/igd_plus.hpp"
@@ -109,8 +121,10 @@
 #include "ensmallen_bits/cne/cne.hpp"
 #include "ensmallen_bits/de/de.hpp"
 #include "ensmallen_bits/eve/eve.hpp"
+#include "ensmallen_bits/fasta/fasta.hpp"
+#include "ensmallen_bits/fbs/fbs.hpp"
+#include "ensmallen_bits/fista/fista.hpp"
 #include "ensmallen_bits/ftml/ftml.hpp"
-
 #include "ensmallen_bits/fw/frank_wolfe.hpp"
 #include "ensmallen_bits/gradient_descent/gradient_descent.hpp"
 #include "ensmallen_bits/grid_search/grid_search.hpp"
diff --git a/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp b/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
index 1a4b13c..d346c54 100644
--- a/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
+++ b/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
@@ -97,7 +97,7 @@ class AdaBelief
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp b/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
index f768987..2cddb76 100644
--- a/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
+++ b/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
@@ -79,6 +79,8 @@ class AdaBeliefUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -89,10 +91,16 @@
      */
     Policy(AdaBeliefUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
+        epsilon(ElemType(parent.epsilon)),
         iteration(0)
     {
       m.zeros(rows, cols);
       s.zeros(rows, cols);
+      // Prevent underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -109,18 +117,18 @@
       // Increment the iteration counter variable.
       ++iteration;
 
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      s *= parent.beta2;
-      s += (1 - parent.beta2) * arma::pow(gradient - m, 2.0) + parent.epsilon;
+      s *= beta2;
+      s += (1 - beta2) * pow(gradient - m, 2) + epsilon;
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
       // And update the iterate.
-      iterate -= ((m / biasCorrection1) * stepSize) / (arma::sqrt(s /
-          biasCorrection2) + parent.epsilon);
+      iterate -= ((m / biasCorrection1) * ElemType(stepSize)) /
+          (sqrt(s / biasCorrection2) + epsilon);
     }
 
    private:
@@ -133,6 +141,11 @@
     // The exponential moving average of squared gradient values.
     GradType s;
 
+    // Parent parameters converted to the element type of the matrix.
+    ElemType beta1;
+    ElemType beta2;
+    ElemType epsilon;
+
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp b/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
index 94283c3..35bdc01 100644
--- a/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
+++ b/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
@@ -107,7 +107,7 @@ class AdaBoundType
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(DecomposableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp b/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
index 3a84d87..3221d10 100644
--- a/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
+++ b/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
@@ -96,6 +96,8 @@ class AdaBoundUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -105,10 +107,24 @@
      * @param cols Number of columns in the gradient matrix.
      */
     Policy(AdaBoundUpdate& parent, const size_t rows, const size_t cols) :
-        parent(parent), first(true), initialStepSize(0), iteration(0)
+        parent(parent),
+        finalLr(ElemType(parent.finalLr)),
+        gamma(ElemType(parent.gamma)),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
+        first(true),
+        initialStepSize(0),
+        iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
+
+      // Check for underflows in conversions.
+      if (gamma == ElemType(0) && parent.gamma != 0.0)
+        gamma = 10 * std::numeric_limits<ElemType>::epsilon();
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -129,30 +145,30 @@
       if (first)
       {
         first = false;
-        initialStepSize = stepSize;
+        initialStepSize = ElemType(stepSize);
       }
 
       // Increment the iteration counter variable.
       ++iteration;
 
       // Decay the first and second moment running average coefficient.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * (gradient % gradient);
 
-      const ElemType biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const ElemType biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
-      const ElemType fl = parent.finalLr * stepSize / initialStepSize;
-      const ElemType lower = fl * (1.0 - 1.0 / (parent.gamma * iteration + 1));
-      const ElemType upper = fl * (1.0 + 1.0 / (parent.gamma * iteration));
+      const ElemType fl = finalLr * ElemType(stepSize) / initialStepSize;
+      const ElemType lower = fl * (1 - 1 / (gamma * iteration + 1));
+      const ElemType upper = fl * (1 + 1 / (gamma * iteration));
 
-      // Applies bounds on actual learning rate.
-      iterate -= arma::clamp((stepSize *
-          std::sqrt(biasCorrection2) / biasCorrection1) / (arma::sqrt(v) +
-          parent.epsilon), lower, upper) % m;
+      // Applies bounds on actual learning rate.
+      iterate -= clamp((ElemType(stepSize) *
+          std::sqrt(biasCorrection2) / biasCorrection1) / (sqrt(v) + epsilon),
+          lower, upper) % m;
     }
 
    private:
@@ -165,11 +181,18 @@
     // The exponential moving average of squared gradient values.
     GradType v;
 
+    // Parameters of the parent, casted to the element type of the problem.
+    ElemType finalLr;
+    ElemType gamma;
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // Whether this is the first call of the Update method.
     bool first;
 
     // The initial (Adam) learning rate.
-    double initialStepSize;
+    ElemType initialStepSize;
 
     // The number of iterations.
     size_t iteration;
diff --git a/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp b/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp
index 270f8eb..26bad48 100644
--- a/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp
+++ b/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp
@@ -96,6 +96,8 @@ class AMSBoundUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -105,11 +107,25 @@
      * @param cols Number of columns in the gradient matrix.
      */
     Policy(AMSBoundUpdate& parent, const size_t rows, const size_t cols) :
-        parent(parent), first(true), initialStepSize(0), iteration(0)
+        parent(parent),
+        finalLr(ElemType(parent.finalLr)),
+        gamma(ElemType(parent.gamma)),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
+        first(true),
+        initialStepSize(0),
+        iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
       vImproved.zeros(rows, cols);
+
+      // Check for underflows in conversions.
+      if (gamma == ElemType(0) && parent.gamma != 0.0)
+        gamma = 10 * std::numeric_limits<ElemType>::epsilon();
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -123,40 +139,36 @@
                 const double stepSize,
                 const GradType& gradient)
     {
-      // Convenience typedefs.
-      typedef typename MatType::elem_type ElemType;
-
       // Save the initial step size.
       if (first)
       {
         first = false;
-        initialStepSize = stepSize;
+        initialStepSize = ElemType(stepSize);
       }
 
       // Increment the iteration counter variable.
       ++iteration;
 
       // Decay the first and second moment running average coefficient.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * (gradient % gradient);
 
-      const ElemType biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const ElemType biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
-      const ElemType fl = parent.finalLr * stepSize / initialStepSize;
-      const ElemType lower = fl * (1.0 - 1.0 / (parent.gamma * iteration + 1));
-      const ElemType upper = fl * (1.0 + 1.0 / (parent.gamma * iteration));
+      const ElemType fl = finalLr * ElemType(stepSize) / initialStepSize;
+      const ElemType lower = fl * (1 - 1 / (gamma * iteration + 1));
+      const ElemType upper = fl * (1 + 1 / (gamma * iteration));
 
       // Element wise maximum of past and present squared gradients.
-      vImproved = arma::max(vImproved, v);
+      vImproved = max(vImproved, v);
 
       // Applies bounds on actual learning rate.
-      iterate -= arma::clamp((stepSize *
-          std::sqrt(biasCorrection2) / biasCorrection1) /
-          (arma::sqrt(vImproved) + parent.epsilon), lower, upper) % m;
+      iterate -= clamp((ElemType(stepSize) * std::sqrt(biasCorrection2) /
+          biasCorrection1) / (sqrt(vImproved) + epsilon), lower, upper) % m;
     }
 
    private:
@@ -169,11 +181,18 @@
     // The exponential moving average of squared gradient values.
     GradType v;
 
+    // Parameters of the parent, casted to the element type of the problem.
+    ElemType finalLr;
+    ElemType gamma;
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // Whether this is the first call of the Update method.
     bool first;
 
     // The initial (Adam) learning rate.
-    double initialStepSize;
+    ElemType initialStepSize;
 
     // The optimal squared gradient value.
     GradType vImproved;
diff --git a/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp b/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp
index d958ee2..5c8348b 100644
--- a/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp
+++ b/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp
@@ -98,7 +98,7 @@ class AdaDelta
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp b/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp
index 26c6dd7..dca4a3f 100644
--- a/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp
+++ b/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp
@@ -71,6 +71,8 @@ class AdaDeltaUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD optimizer method before the start
      * of the iteration update process. In AdaDelta update policy, the mean
@@ -82,10 +84,16 @@
      * @param cols Number of columns in the gradient matrix.
      */
     Policy(AdaDeltaUpdate& parent, const size_t rows, const size_t cols) :
-        parent(parent)
+        parent(parent),
+        rho(ElemType(parent.rho)),
+        epsilon(ElemType(parent.epsilon))
     {
       meanSquaredGradient.zeros(rows, cols);
       meanSquaredGradientDx.zeros(rows, cols);
+
+      // Check for underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -102,17 +110,17 @@
                 const GradType& gradient)
     {
       // Accumulate gradient.
-      meanSquaredGradient *= parent.rho;
-      meanSquaredGradient += (1 - parent.rho) * (gradient % gradient);
-      GradType dx = arma::sqrt((meanSquaredGradientDx + parent.epsilon) /
-          (meanSquaredGradient + parent.epsilon)) % gradient;
+      meanSquaredGradient *= rho;
+      meanSquaredGradient += (1 - rho) * (gradient % gradient);
+      GradType dx = sqrt((meanSquaredGradientDx + epsilon) /
+          (meanSquaredGradient + epsilon)) % gradient;
 
       // Accumulate updates.
-      meanSquaredGradientDx *= parent.rho;
-      meanSquaredGradientDx += (1 - parent.rho) * (dx % dx);
+      meanSquaredGradientDx *= rho;
+      meanSquaredGradientDx += (1 - rho) * (dx % dx);
 
       // Apply update.
-      iterate -= (stepSize * dx);
+      iterate -= (ElemType(stepSize) * dx);
     }
 
    private:
@@ -124,6 +132,10 @@
 
     // The delta mean squared gradient matrix.
     GradType meanSquaredGradientDx;
+
+    // Parameters of the update, converted to the matrix element type.
+    ElemType rho;
+    ElemType epsilon;
   };
 
  private:
diff --git a/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp b/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp
index 677d300..7522668 100644
--- a/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp
+++ b/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp
@@ -94,7 +94,7 @@ class AdaGrad
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp b/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp
index b096dd4..4fe8e9d 100644
--- a/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp
+++ b/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp
@@ -64,6 +64,8 @@ class AdaGradUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD optimizer before the start of the
      * iteration update process. In AdaGrad update policy, squared gradient
@@ -76,10 +78,14 @@
      */
     Policy(AdaGradUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
-        squaredGradient(rows, cols)
+        squaredGradient(rows, cols),
+        epsilon(ElemType(parent.epsilon))
     {
       // Initialize an empty matrix for sum of squares of parameter gradient.
       squaredGradient.zeros();
+      // Detect underflow for epsilon and try to address it.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -96,8 +102,8 @@
                 const GradType& gradient)
     {
       squaredGradient += (gradient % gradient);
-      iterate -= (stepSize * gradient) / (arma::sqrt(squaredGradient) +
-          parent.epsilon);
+      iterate -= (ElemType(stepSize) * gradient) / (sqrt(squaredGradient) +
+          epsilon);
     }
 
    private:
@@ -105,6 +111,8 @@
     AdaGradUpdate& parent;
     // The squared gradient matrix.
     GradType squaredGradient;
+    // The epsilon value, converted to the element type of the matrix.
+    ElemType epsilon;
   };
 
  private:
diff --git a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp
index 7f1788c..ebdf212 100644
--- a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp
+++ b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp
@@ -89,7 +89,7 @@ class AdaSqrt
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp
index feae24c..4bdb001 100644
--- a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp
+++ b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp
@@ -59,6 +59,8 @@ class AdaSqrtUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD optimizer before the start of the
      * iteration update process. In AdaSqrt update policy, squared gradient
@@ -72,10 +74,14 @@
      */
     Policy(AdaSqrtUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
         squaredGradient(rows, cols),
+        epsilon(ElemType(parent.epsilon)),
         iteration(0)
     {
       // Initialize an empty matrix for sum of squares of parameter gradient.
       squaredGradient.zeros();
+      // Check for underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -93,10 +99,10 @@
     {
       ++iteration;
 
-      squaredGradient += arma::square(gradient);
+      squaredGradient += square(gradient);
 
-      iterate -= stepSize * std::sqrt(iteration) * gradient /
-          (squaredGradient + parent.epsilon);
+      iterate -= ElemType(stepSize) * std::sqrt(ElemType(iteration)) *
+          gradient / (squaredGradient + epsilon);
     }
 
    private:
@@ -104,6 +110,8 @@
     AdaSqrtUpdate& parent;
     // The squared gradient matrix.
     GradType squaredGradient;
+    // Epsilon converted to the element type of the optimization.
+    ElemType epsilon;
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/adam.hpp b/inst/include/ensmallen_bits/adam/adam.hpp
index 13c2f96..4595cf4 100644
--- a/inst/include/ensmallen_bits/adam/adam.hpp
+++ b/inst/include/ensmallen_bits/adam/adam.hpp
@@ -120,7 +120,7 @@ class AdamType
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/adam/adam_update.hpp b/inst/include/ensmallen_bits/adam/adam_update.hpp
index de7f61e..dde10a7 100644
--- a/inst/include/ensmallen_bits/adam/adam_update.hpp
+++ b/inst/include/ensmallen_bits/adam/adam_update.hpp
@@ -82,6 +82,8 @@ class AdamUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -92,10 +94,17 @@
      */
     Policy(AdamUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -113,22 +122,23 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * square(gradient);
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
       /**
        * It should be noted that the term, m / (arma::sqrt(v) + eps), in the
        * following expression is an approximation of the following actual term;
        * m / (arma::sqrt(v) + (arma::sqrt(biasCorrection2) * eps).
        */
-      iterate -= (stepSize * std::sqrt(biasCorrection2) / biasCorrection1) *
-          m / (arma::sqrt(v) + parent.epsilon);
+      iterate -= (ElemType(stepSize) *
+          std::sqrt(biasCorrection2) / biasCorrection1) *
+          m / (sqrt(v) + epsilon);
     }
 
    private:
@@ -141,6 +151,11 @@
     // The exponential moving average of squared gradient values.
     GradType v;
 
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/adamax_update.hpp b/inst/include/ensmallen_bits/adam/adamax_update.hpp
index a6c9f2f..13e8bae 100644
--- a/inst/include/ensmallen_bits/adam/adamax_update.hpp
+++ b/inst/include/ensmallen_bits/adam/adamax_update.hpp
@@ -30,11 +30,11 @@ namespace ens {
  *
  * @code
  * @article{Kingma2014,
- *   author  = {Diederik P. Kingma and Jimmy Ba},
- *   title   = {Adam: {A} Method for Stochastic Optimization},
- *   journal = {CoRR},
- *   year    = {2014},
- *   url     = {http://arxiv.org/abs/1412.6980}
+ *   author  = {Diederik P. Kingma and Jimmy Ba},
+ *   title   = {Adam: {A} Method for Stochastic Optimization},
+ *   journal = {CoRR},
+ *   year    = {2014},
+ *   url     = {http://arxiv.org/abs/1412.6980}
  * }
  * @endcode
  */
@@ -84,6 +84,8 @@ class AdaMaxUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -94,10 +96,16 @@
      */
     Policy(AdaMaxUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       u.zeros(rows, cols);
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -115,17 +123,17 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
       // Update the exponentially weighted infinity norm.
-      u *= parent.beta2;
-      u = arma::max(u, arma::abs(gradient));
+      u *= beta2;
+      u = max(u, abs(gradient));
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
 
       if (biasCorrection1 != 0)
-        iterate -= (stepSize / biasCorrection1 * m / (u + parent.epsilon));
+        iterate -= (ElemType(stepSize) / biasCorrection1 * m / (u + epsilon));
     }
 
    private:
@@ -135,6 +143,10 @@
     GradType m;
     // The exponentially weighted infinity norm.
     GradType u;
+    // Tuning parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/amsgrad_update.hpp b/inst/include/ensmallen_bits/adam/amsgrad_update.hpp
index f1f420e..a5d4562 100644
--- a/inst/include/ensmallen_bits/adam/amsgrad_update.hpp
+++ b/inst/include/ensmallen_bits/adam/amsgrad_update.hpp
@@ -2,7 +2,7 @@
 * @file amsgrad_update.hpp
 * @author Haritha Nair
 *
- * Implementation of AMSGrad optimizer. AMSGrad is an exponential moving average
+ * Implementation of AMSGrad optimizer. AMSGrad is an exponential moving average
 * optimizer that dynamically adapts over time with guaranteed convergence.
 *
 * ensmallen is free software; you may redistribute it and/or modify it under
@@ -25,9 +25,9 @@ namespace ens {
 *
 * @code
 * @article{
- *   title = {On the convergence of Adam and beyond},
- *   url = {https://openreview.net/pdf?id=ryQu7f-RZ}
- *   year = {2018}
+ *   title = {On the convergence of Adam and beyond},
+ *   url = {https://openreview.net/pdf?id=ryQu7f-RZ}
+ *   year = {2018}
 * }
 * @endcode
 */
@@ -77,6 +77,8 @@ class AMSGradUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -87,11 +89,18 @@
      */
     Policy(AMSGradUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
       vImproved.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -109,20 +118,21 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * (gradient % gradient);
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
       // Element wise maximum of past and present squared gradients.
-      vImproved = arma::max(vImproved, v);
+      vImproved = max(vImproved, v);
 
-      iterate -= (stepSize * std::sqrt(biasCorrection2) / biasCorrection1) *
-          m / (arma::sqrt(vImproved) + parent.epsilon);
+      iterate -= (ElemType(stepSize) *
+          std::sqrt(biasCorrection2) / biasCorrection1) *
+          m / (sqrt(vImproved) + epsilon);
     }
 
    private:
@@ -138,6 +148,11 @@
     // The optimal squared gradient value.
     GradType vImproved;
 
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/nadam_update.hpp b/inst/include/ensmallen_bits/adam/nadam_update.hpp
index 24f105c..9014095 100644
--- a/inst/include/ensmallen_bits/adam/nadam_update.hpp
+++ b/inst/include/ensmallen_bits/adam/nadam_update.hpp
@@ -85,6 +85,8 @@ class NadamUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the optimizer before the start of the
      * iteration update process.
@@ -96,10 +98,17 @@
     Policy(NadamUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
         cumBeta1(1),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -117,30 +126,31 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * gradient % gradient;
+      v *= beta2;
+      v += (1 - beta2) * gradient % gradient;
 
-      double beta1T = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, iteration * parent.scheduleDecay)));
+      ElemType beta1T = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, iteration * parent.scheduleDecay)));
 
-      double beta1T1 = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
+      ElemType beta1T1 = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
 
       cumBeta1 *= beta1T;
 
-      const double biasCorrection1 = 1.0 - cumBeta1;
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
-      const double biasCorrection3 = 1.0 - (cumBeta1 * beta1T1);
+      const ElemType biasCorrection1 = 1 - cumBeta1;
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
+      const ElemType biasCorrection3 = 1 - (cumBeta1 * beta1T1);
 
       /* Note :- arma::sqrt(v) + epsilon * sqrt(biasCorrection2) is approximated
       * as arma::sqrt(v) + epsilon */
-      iterate -= (stepSize * (((1 - beta1T) / biasCorrection1) * gradient
-          + (beta1T1 / biasCorrection3) * m) * sqrt(biasCorrection2))
-          / (arma::sqrt(v) + parent.epsilon);
+      iterate -= (ElemType(stepSize) *
+          (((1 - beta1T) / biasCorrection1) * gradient +
+          (beta1T1 / biasCorrection3) * m) * std::sqrt(biasCorrection2)) /
+          (sqrt(v) + epsilon);
     }
 
    private:
@@ -154,7 +164,12 @@
     GradType v;
 
     // The cumulative product of decay coefficients.
-    double cumBeta1;
+    ElemType cumBeta1;
+
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
 
     // The number of iterations.
     size_t iteration;
diff --git a/inst/include/ensmallen_bits/adam/nadamax_update.hpp b/inst/include/ensmallen_bits/adam/nadamax_update.hpp
index f0d9b0c..570fb7b 100644
--- a/inst/include/ensmallen_bits/adam/nadamax_update.hpp
+++ b/inst/include/ensmallen_bits/adam/nadamax_update.hpp
@@ -85,6 +85,8 @@ class NadaMaxUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor method is called by the optimizer before the start of
      * the iteration update process.
@@ -96,10 +98,17 @@
     Policy(NadaMaxUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
         cumBeta1(1),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       u.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -117,27 +126,27 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      u = arma::max(u * parent.beta2, arma::abs(gradient));
+      u = max(u * beta2, abs(gradient));
 
-      double beta1T = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, iteration * parent.scheduleDecay)));
+      ElemType beta1T = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, iteration * parent.scheduleDecay)));
 
-      double beta1T1 = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
+      ElemType beta1T1 = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
 
       cumBeta1 *= beta1T;
 
-      const double biasCorrection1 = 1.0 - cumBeta1;
-
-      const double biasCorrection2 = 1.0 - (cumBeta1 * beta1T1);
+      const ElemType biasCorrection1 = 1 - cumBeta1;
+      const ElemType biasCorrection2 = 1 - (cumBeta1 * beta1T1);
 
       if ((biasCorrection1 != 0) && (biasCorrection2 != 0))
       {
-        iterate -= (stepSize * (((1 - beta1T) / biasCorrection1) * gradient
-            + (beta1T1 / biasCorrection2) * m)) / (u + parent.epsilon);
+        iterate -= (ElemType(stepSize) *
+            (((1 - beta1T) / biasCorrection1) * gradient +
+            (beta1T1 / biasCorrection2) * m)) / (u + epsilon);
       }
     }
 
@@ -152,7 +161,12 @@
     GradType u;
 
     // The cumulative product of decay coefficients.
-    double cumBeta1;
+    ElemType cumBeta1;
+
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
 
     // The number of iterations.
size_t iteration; diff --git a/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp b/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp index 426a5bb..9e381ed 100644 --- a/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp +++ b/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp @@ -27,11 +27,11 @@ namespace ens { * * @code * @article{ - * author = {Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, - * Haoyang Zeng}, - * title = {Training GANs with Optimism}, - * year = {2017}, - * url = {https://arxiv.org/abs/1711.00141} + * author = {Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, + * Haoyang Zeng}, + * title = {Training GANs with Optimism}, + * year = {2017}, + * url = {https://arxiv.org/abs/1711.00141} * } * @endcode */ @@ -81,6 +81,8 @@ class OptimisticAdamUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -91,11 +93,18 @@ class OptimisticAdamUpdate */ Policy(OptimisticAdamUpdate& parent, const size_t rows, const size_t cols) : parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), iteration(0) { m.zeros(rows, cols); v.zeros(rows, cols); g.zeros(rows, cols); + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -113,18 +122,18 @@ class OptimisticAdamUpdate ++iteration; // And update the iterate. 
- m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * arma::square(gradient); + v *= beta2; + v += (1 - beta2) * square(gradient); - GradType mCorrected = m / (1.0 - std::pow(parent.beta1, iteration)); - GradType vCorrected = v / (1.0 - std::pow(parent.beta2, iteration)); + GradType mCorrected = m / (1 - std::pow(beta1, ElemType(iteration))); + GradType vCorrected = v / (1 - std::pow(beta2, ElemType(iteration))); - GradType update = mCorrected / (arma::sqrt(vCorrected) + parent.epsilon); + GradType update = mCorrected / (sqrt(vCorrected) + epsilon); - iterate -= (2 * stepSize * update - stepSize * g); + iterate -= (2 * ElemType(stepSize) * update - ElemType(stepSize) * g); g = std::move(update); } @@ -142,6 +151,11 @@ class OptimisticAdamUpdate // The previous update. GradType g; + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + // The number of iterations. size_t iteration; }; diff --git a/inst/include/ensmallen_bits/agemoea/agemoea.hpp b/inst/include/ensmallen_bits/agemoea/agemoea.hpp index 9f914da..452bc98 100644 --- a/inst/include/ensmallen_bits/agemoea/agemoea.hpp +++ b/inst/include/ensmallen_bits/agemoea/agemoea.hpp @@ -126,6 +126,33 @@ class AGEMOEA MatType& iterate, CallbackTypes&&... callbacks); + /** + * Optimize a set of objectives. The initial population is generated using the + * starting point. The output is the best generated front. + * + * @tparam ArbitraryFunctionType std::tuple of multiple objectives. + * @tparam MatType Type of matrix to optimize. + * @tparam CubeType The type of cube used to store the front and Pareto set. + * @tparam CallbackTypes Types of callback functions. + * @param objectives Vector of objective functions to optimize for. + * @param iterate Starting point. + * @param front The generated front. + * @param paretoSet The generated Pareto set. 
+ * @param callbacks Callback functions. + * @return MatType::elem_type The minimum of the accumulated sum over the + * objective values in the best front. + */ + template + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CubeType& front, + CubeType& paretoSet, + CallbackTypes&&... callbacks); + //! Get the population size. size_t PopulationSize() const { return populationSize; } //! Modify the population size. @@ -166,34 +193,6 @@ class AGEMOEA //! Modify value of upperBound. arma::vec& UpperBound() { return upperBound; } - //! Retrieve the Pareto optimal points in variable space. This returns an empty cube - //! until `Optimize()` has been called. - const arma::cube& ParetoSet() const { return paretoSet; } - - //! Retrieve the best front (the Pareto frontier). This returns an empty cube until - //! `Optimize()` has been called. - const arma::cube& ParetoFront() const { return paretoFront; } - - /** - * Retrieve the best front (the Pareto frontier). This returns an empty - * vector until `Optimize()` has been called. Note that this function is - * deprecated and will be removed in ensmallen 3.x! Use `ParetoFront()` - * instead. - */ - const std::vector& Front() - { - if (rcFront.size() == 0) - { - // Match the old return format. - for (size_t i = 0; i < paretoFront.n_slices; ++i) - { - rcFront.push_back(arma::mat(paretoFront.slice(i))); - } - } - - return rcFront; - } - private: /** * Evaluate objectives for the elite population. @@ -205,21 +204,22 @@ class AGEMOEA * @param calculatedObjectives Vector to store calculated objectives. 
*/ template typename std::enable_if::type - EvaluateObjectives(std::vector&, + EvaluateObjectives(std::vector&, std::tuple&, - std::vector >&); + std::vector&); template typename std::enable_if::type - EvaluateObjectives(std::vector& population, + EvaluateObjectives(std::vector& population, std::tuple& objectives, - std::vector >& - calculatedObjectives); + std::vector& calculatedObjectives); /** * Reproduce candidates from the elite population to generate a new * @@ -283,7 +283,8 @@ class AGEMOEA void FastNonDominatedSort( std::vector >& fronts, std::vector& ranks, - std::vector >& calculatedObjectives); + std::vector >& + calculatedObjectives); /** * Operator to check if one candidate Pareto-dominates the other. @@ -304,17 +305,18 @@ class AGEMOEA size_t candidateP, size_t candidateQ); - /** - * Assigns Survival Score metric for sorting. - * - * @param front The previously generated Pareto fronts. - * @param idealPoint The ideal point of teh first front. - * @param calculatedObjectives The previously calculated objectives. - * @param survivalScore The Survival Score vector to be updated for each individual in the population. - * @param normalize The normlization vector of the fronts. - * @param dimension The dimension of the first front. - * @param fNum teh current front index. - */ + /** + * Assigns Survival Score metric for sorting. + * + * @param front The previously generated Pareto fronts. + * @param idealPoint The ideal point of the first front. + * @param calculatedObjectives The previously calculated objectives. + * @param survivalScore The Survival Score vector to be updated for each + * individual in the population. + * @param normalize The normalization vector of the fronts. + * @param dimension The dimension of the first front. + * @param fNum The current front index. 
+ */ template void SurvivalScoreAssignment( const std::vector& front, @@ -322,7 +324,7 @@ class AGEMOEA std::vector>& calculatedObjectives, std::vector& survivalScore, arma::Col& normalize, - double& dimension, + typename MatType::elem_type& dimension, size_t fNum); /** @@ -338,7 +340,7 @@ class AGEMOEA * being sorted. * @param ranks The previously calculated ranks. * @param survivalScore The Survival score for each individual in - * the population. + * the population. * @return true if the first candidate is preferred, otherwise, false. */ template @@ -347,37 +349,39 @@ class AGEMOEA size_t idxQ, const std::vector& ranks, const std::vector& survivalScore); - - /** - * Normalizes the front given the extreme points in the current front. - * - * @tparam The type of population datapoints. - * @param calculatedObjectives The current population evaluated objectives. - * @param normalization The normalizing vector. - * @param front The previously generated Pareto front. - * @param extreme The indexes of the extreme points in the front. - */ - template - void NormalizeFront( - std::vector>& calculatedObjectives, - arma::Col& normalization, - const std::vector& front, - const arma::Row& extreme); - - /** - * Get the geometry information p of Lp norm (p > 0). - * - * @param calculatedObjectives The current population evaluated objectives. - * @param front The previously generated Pareto fronts. - * @param extreme The indexes of the extreme points in the front. - * @return The variable p in the Lp norm that best fits the geometry of the current front. - */ - template - double GetGeometry( - std::vector >& calculatedObjectives, + + /** + * Normalizes the front given the extreme points in the current front. + * + * @tparam The type of population datapoints. + * @param calculatedObjectives The current population evaluated objectives. + * @param normalization The normalizing vector. + * @param front The previously generated Pareto front. 
+ * @param extreme The indexes of the extreme points in the front. + */ + template + void NormalizeFront( + std::vector>& calculatedObjectives, + arma::Col& normalization, + const std::vector& front, + const arma::Row& extreme); + + /** + * Get the geometry information p of Lp norm (p > 0). + * + * @param calculatedObjectives The current population evaluated objectives. + * @param front The previously generated Pareto fronts. + * @param extreme The indexes of the extreme points in the front. + * @return The variable p in the Lp norm that best fits the geometry of the + * current front. + */ + template + typename MatType::elem_type GetGeometry( + std::vector >& + calculatedObjectives, const std::vector& front, const arma::Row& extreme); - + /** * Finds the pairwise Lp distance between all the points in the front. * @@ -389,13 +393,14 @@ class AGEMOEA template void PairwiseDistance( MatType& final, - std::vector >& calculatedObjectives, + std::vector >& + calculatedObjectives, const std::vector& front, - double dimension); + const typename MatType::elem_type dimension); /** * Finding the indexes of the extreme points in the front. - * + * * @param indexes vector containing the slected indexes. * @param calculatedObjectives The current population objectives. * @param front The front of the current generation. @@ -405,32 +410,37 @@ class AGEMOEA arma::Row& indexes, std::vector >& calculatedObjectives, const std::vector& front); - + /** * Finding the distance of each point in the front from the line formed * by pointA and pointB. - * - * @param distance The vector containing the distances of the points in the fron from the line. - * @param calculatedObjectives Reference to the current population evaluated Objectives. + * + * @param distance The vector containing the distances of the points in the + * front from the line. + * @param calculatedObjectives Reference to the current population evaluated + * objectives. 
* @param front The front of the current generation(indices of population). * @param pointA The first point on the line. * @param pointB The second point on the line. - */ + */ template void PointToLineDistance( arma::Row& distances, - std::vector >& calculatedObjectives, + std::vector >& + calculatedObjectives, const std::vector& front, const arma::Col& pointA, const arma::Col& pointB); - + /** - * Find the Diversity score corresponding the solution S using the selected set. - * + * Find the Diversity score corresponding to the solution S using the selected + * set. + * * @param selected The current selected set. * @param pairwiseDistance The current pairwise distance for the whole front. * @param S The relative index of S being considered within the front. - * @return The diversity score for S which the sum of the two smallest elements. + * @return The diversity score for S, which is the sum of the two smallest + * elements. */ template typename MatType::elem_type DiversityScore(std::set& selected, 
- std::vector rcFront; }; } // namespace ens diff --git a/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp b/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp index c226095..0f7815d 100644 --- a/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp +++ b/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp @@ -67,6 +67,24 @@ typename MatType::elem_type AGEMOEA::Optimize( std::tuple& objectives, MatType& iterateIn, CallbackTypes&&... callbacks) +{ + typedef typename ForwardType::bcube CubeType; + CubeType paretoFront, paretoSet; + return Optimize(objectives, iterateIn, paretoFront, paretoSet, + std::forward(callbacks)...); +} + +//! Optimize the function. +template +typename MatType::elem_type AGEMOEA::Optimize( + std::tuple& objectives, + MatType& iterateIn, + CubeType& paretoFrontIn, + CubeType& paretoSetIn, + CallbackTypes&&... callbacks) { // Make sure for evolution to work at least four candidates are present. if (populationSize < 4 && populationSize % 4 != 0) @@ -78,6 +96,8 @@ typename MatType::elem_type AGEMOEA::Optimize( // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::bmat CubeBaseMatType; BaseMatType& iterate = (BaseMatType&) iterateIn; @@ -104,7 +124,7 @@ typename MatType::elem_type AGEMOEA::Optimize( numVariables = iterate.n_rows; // Cache calculated objectives. - std::vector > calculatedObjectives(populationSize); + std::vector calculatedObjectives(populationSize); // Population size reserved to 2 * populationSize + 1 to accommodate // for the size of intermediate candidate population. @@ -120,8 +140,8 @@ typename MatType::elem_type AGEMOEA::Optimize( std::vector ranks; //! Useful temporaries for float-like comparisons. 
- const BaseMatType castedLowerBound = arma::conv_to::from(lowerBound); - const BaseMatType castedUpperBound = arma::conv_to::from(upperBound); + const BaseMatType castedLowerBound = conv_to::from(lowerBound); + const BaseMatType castedUpperBound = conv_to::from(upperBound); // Controls early termination of the optimization process. bool terminate = false; @@ -131,10 +151,10 @@ typename MatType::elem_type AGEMOEA::Optimize( for (size_t i = 0; i < populationSize; i++) { population.push_back(arma::randu(iterate.n_rows, - iterate.n_cols) - 0.5 + iterate); + iterate.n_cols) - ElemType(0.5) + iterate); // Constrain all genes to be within bounds. - population[i] = arma::min(arma::max(population[i], castedLowerBound), + population[i] = min(max(population[i], castedLowerBound), castedUpperBound); } @@ -152,26 +172,24 @@ typename MatType::elem_type AGEMOEA::Optimize( // Evaluate the objectives for the new population. calculatedObjectives.resize(population.size()); std::fill(calculatedObjectives.begin(), calculatedObjectives.end(), - arma::Col(numObjectives, arma::fill::zeros)); + BaseColType(numObjectives, GetFillType::zeros)); EvaluateObjectives(population, objectives, calculatedObjectives); // Perform fast non dominated sort on P_t ∪ G_t. ranks.resize(population.size()); FastNonDominatedSort(fronts, ranks, calculatedObjectives); - + arma::Col idealPoint(calculatedObjectives[fronts[0][0]]); for (size_t index = 1; index < fronts[0].size(); index++) { - idealPoint = arma::min(idealPoint, - calculatedObjectives[fronts[0][index]]); + idealPoint = min(idealPoint, calculatedObjectives[fronts[0][index]]); } // Perform survival score assignment. 
survivalScore.resize(population.size()); std::fill(survivalScore.begin(), survivalScore.end(), 0.); - double dimension; - arma::Col normalize(numObjectives, - arma::fill::zeros); + ElemType dimension; + BaseColType normalize(numObjectives, GetFillType::zeros); for (size_t fNum = 0; fNum < fronts.size(); fNum++) { SurvivalScoreAssignment(fronts[fNum], idealPoint, @@ -186,16 +204,16 @@ typename MatType::elem_type AGEMOEA::Optimize( size_t idxP{}, idxQ{}; for (size_t i = 0; i < population.size(); i++) { - if (arma::approx_equal(population[i], candidateP, - "absdiff", epsilon)) + if (approx_equal(population[i], candidateP, "absdiff", + ElemType(epsilon))) idxP = i; - if (arma::approx_equal(population[i], candidateQ, - "absdiff", epsilon)) + if (approx_equal(population[i], candidateQ, "absdiff", + ElemType(epsilon))) idxQ = i; } - return SurvivalScoreOperator(idxP, idxQ, ranks, + return SurvivalScoreOperator(idxP, idxQ, ranks, survivalScore); } ); @@ -209,29 +227,24 @@ typename MatType::elem_type AGEMOEA::Optimize( } EvaluateObjectives(population, objectives, calculatedObjectives); // Set the candidates from the Pareto Set as the output. - paretoSet.set_size(population[0].n_rows, population[0].n_cols, + paretoSetIn.set_size(population[0].n_rows, population[0].n_cols, population.size()); // The Pareto Set is stored, can be obtained via ParetoSet() getter. for (size_t solutionIdx = 0; solutionIdx < population.size(); ++solutionIdx) { - paretoSet.slice(solutionIdx) = - arma::conv_to::from(population[solutionIdx]); + paretoSetIn.slice(solutionIdx) = + conv_to::from(population[solutionIdx]); } // Set the candidates from the Pareto Front as the output. - paretoFront.set_size(calculatedObjectives[0].n_rows, + paretoFrontIn.set_size(calculatedObjectives[0].n_rows, calculatedObjectives[0].n_cols, population.size()); - // The Pareto Front is stored, can be obtained via ParetoFront() getter. 
for (size_t solutionIdx = 0; solutionIdx < population.size(); ++solutionIdx) { - paretoFront.slice(solutionIdx) = - arma::conv_to::from(calculatedObjectives[solutionIdx]); + paretoFrontIn.slice(solutionIdx) = + conv_to::from(calculatedObjectives[solutionIdx]); } - // Clear rcFront, in case it is later requested by the user for reverse - // compatibility reasons. - rcFront.clear(); - // Assign iterate to first element of the Pareto Set. iterate = population[fronts[0][0]]; @@ -239,57 +252,62 @@ typename MatType::elem_type AGEMOEA::Optimize( ElemType performance = std::numeric_limits::max(); - for (const arma::Col& objective: calculatedObjectives) - if (arma::accu(objective) < performance) - performance = arma::accu(objective); + for (const BaseColType& objective: calculatedObjectives) + if (accu(objective) < performance) + performance = accu(objective); return performance; } //! No objectives to evaluate. template typename std::enable_if::type AGEMOEA::EvaluateObjectives( - std::vector&, + std::vector&, std::tuple&, - std::vector >&) + std::vector&) { // Nothing to do here. } //! Evaluate the objectives for the entire population. template typename std::enable_if::type AGEMOEA::EvaluateObjectives( - std::vector& population, + std::vector& population, std::tuple& objectives, - std::vector >& calculatedObjectives) + std::vector& calculatedObjectives) { for (size_t i = 0; i < population.size(); i++) { calculatedObjectives[i](I) = std::get(objectives).Evaluate(population[i]); - EvaluateObjectives(population, objectives, + EvaluateObjectives(population, objectives, calculatedObjectives); } } //! Reproduce and generate new candidates. 
-template -inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, - const MatType& lowerBound, - const MatType& upperBound) +template +inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - std::vector children; + std::vector children; while (children.size() < population.size()) { // Choose two random parents for reproduction from the elite population. - size_t indexA = arma::randi(arma::distr_param(0, populationSize - 1)); - size_t indexB = arma::randi(arma::distr_param(0, populationSize - 1)); + size_t indexA = arma::randi( + arma::distr_param(0, populationSize - 1)); + size_t indexB = arma::randi( + arma::distr_param(0, populationSize - 1)); // Make sure that the parents differ. if (indexA == indexB) @@ -301,10 +319,10 @@ inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, } // Initialize the children to the respective parents. - MatType childA = population[indexA], childB = population[indexB]; + InputMatType childA = population[indexA], childB = population[indexB]; if (arma::randu() <= crossoverProb) - Crossover(childA, childB, population[indexA], population[indexB], + Crossover(childA, childB, population[indexA], population[indexB], lowerBound, upperBound); Mutate(childA, 1.0 / static_cast(numVariables), @@ -318,68 +336,74 @@ inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, } // Add the candidates to the elite population. - population.insert(std::end(population), std::begin(children), std::end(children)); + population.insert(std::end(population), std::begin(children), + std::end(children)); } //! Perform simulated binary crossover (SBX) of genes for the children. 
-template -inline void AGEMOEA::Crossover(MatType& childA, - MatType& childB, - const MatType& parentA, - const MatType& parentB, - const MatType& lowerBound, - const MatType& upperBound) +template +inline void AGEMOEA::Crossover(InputMatType& childA, + InputMatType& childB, + const InputMatType& parentA, + const InputMatType& parentB, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - //! Generates a child from two parent individuals - // according to the polynomial probability distribution. - arma::Cube parents(parentA.n_rows, - parentA.n_cols, 2); - parents.slice(0) = parentA; - parents.slice(1) = parentB; - MatType current_min = arma::min(parents, 2); - MatType current_max = arma::max(parents, 2); - - if (arma::accu(parentA - parentB < 1e-14)) - { - childA = parentA; - childB = parentB; - return; - } - MatType current_diff = current_max - current_min; - current_diff.transform( [](typename MatType::elem_type val) - { return (val < 1e-10 ? 1e-10:val); } ); - - // Calculating beta used for the final crossover. - MatType beta1 = 1 + 2.0 * (current_min - lowerBound) / current_diff; - MatType beta2 = 1 + 2.0 * (upperBound - current_max) / current_diff; - MatType alpha1 = 2 - arma::pow(beta1, -(eta + 1)); - MatType alpha2 = 2 - arma::pow(beta2, -(eta + 1)); - - MatType us(arma::size(alpha1), arma::fill::randu); - arma::umat mask1 = us > (1.0 / alpha1); - MatType betaq1 = arma::pow(us % alpha1, 1. / (eta + 1)); - betaq1 = betaq1 % (mask1 != 1.0) + arma::pow((1.0 / (2.0 - us % alpha1)), - 1.0 / (eta + 1)) % mask1; - arma::umat mask2 = us > (1.0 / alpha2); - MatType betaq2 = arma::pow(us % alpha2, 1 / (eta + 1)); - betaq2 = betaq2 % (mask1 != 1.0) + arma::pow((1.0 / (2.0 - us % alpha2)), - 1.0 / (eta + 1)) % mask2; - - // Variables after the cross over for all of them. 
- MatType c1 = 0.5 * ((current_min + current_max) - betaq1 % current_diff); - MatType c2 = 0.5 * ((current_min + current_max) + betaq2 % current_diff); - c1 = arma::min(arma::max(c1, lowerBound), upperBound); - c2 = arma::min(arma::max(c2, lowerBound), upperBound); - - // Decision for the crossover between the two parents for each variable. - us.randu(); - childA = parentA % (us <= 0.5); - childB = parentB % (us <= 0.5); - us.randu(); - childA = childA + c1 % ((us <= 0.5) % (childA == 0)); - childA = childA + c2 % ((us > 0.5) % (childA == 0)); - childB = childB + c2 % ((us <= 0.5) % (childB == 0)); - childB = childB + c1 % ((us > 0.5) % (childB == 0)); + typedef typename InputMatType::elem_type ElemType; + typedef typename ForwardType::bcube BaseCubeType; + typedef typename ForwardType::umat UMatType; + + // Generates a child from two parent individuals + // according to the polynomial probability distribution. + BaseCubeType parents(parentA.n_rows, + parentA.n_cols, 2); + parents.slice(0) = parentA; + parents.slice(1) = parentB; + InputMatType current_min = min(parents, 2); + InputMatType current_max = max(parents, 2); + + if (accu(parentA - parentB < ElemType(1e-14))) + { + childA = parentA; + childB = parentB; + return; + } + InputMatType current_diff = current_max - current_min; + current_diff.transform( [](ElemType val) + { return (val < ElemType(1e-10) ? ElemType(1e-10) : val); } ); + + // Calculating beta used for the final crossover. + InputMatType beta1 = 1 + 2 * (current_min - lowerBound) / current_diff; + InputMatType beta2 = 1 + 2 * (upperBound - current_max) / current_diff; + InputMatType alpha1 = 2 - pow(beta1, -(eta + 1)); + InputMatType alpha2 = 2 - pow(beta2, -(eta + 1)); + + InputMatType us(size(alpha1), GetFillType::randu); + + UMatType mask1 = us > (1 / alpha1); + InputMatType betaq1 = pow(us % alpha1, 1. 
/ (eta + 1)); + betaq1 = betaq1 % (mask1 != 1) + pow((1 / (2 - us % alpha1)), + 1 / (eta + 1)) % mask1; + UMatType mask2 = us > (1 / alpha2); + InputMatType betaq2 = pow(us % alpha2, 1 / (eta + 1)); + betaq2 = betaq2 % (mask1 != 1) + pow((1 / (2 - us % alpha2)), + 1 / (eta + 1)) % mask2; + + // Variables after the cross over for all of them. + InputMatType c1 = ((current_min + current_max) - betaq1 % current_diff) / 2; + InputMatType c2 = ((current_min + current_max) + betaq2 % current_diff) / 2; + c1 = min(max(c1, lowerBound), upperBound); + c2 = min(max(c2, lowerBound), upperBound); + + // Decision for the crossover between the two parents for each variable. + us.randu(); + childA = parentA % (us <= ElemType(0.5)); + childB = parentB % (us <= ElemType(0.5)); + us.randu(); + childA = childA + c1 % ((us <= ElemType(0.5)) % (childA == 0)); + childA = childA + c2 % ((us > ElemType(0.5)) % (childA == 0)); + childB = childB + c2 % ((us <= ElemType(0.5)) % (childB == 0)); + childB = childB + c1 % ((us > ElemType(0.5)) % (childB == 0)); } //! Perform Polynomial mutation of the candidate. @@ -389,39 +413,40 @@ inline void AGEMOEA::Mutate(MatType& candidate, const MatType& lowerBound, const MatType& upperBound) { - const size_t numVariables = candidate.n_rows; - for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) + const size_t numVariables = candidate.n_rows; + for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) + { + // Should this gene be mutated? + if (arma::randu() > mutationRate) + continue; + + const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); + // Normalised distance from the bounds. + const double lowerDelta = (candidate(geneIdx) + - lowerBound(geneIdx)) / geneRange; + const double upperDelta = (upperBound(geneIdx) + - candidate(geneIdx)) / geneRange; + const double mutationPower = 1. 
/ (distributionIndex + 1.0); + const double rand = arma::randu(); + double value, perturbationFactor; + if (rand < 0.5) { - // Should this gene be mutated? - if (arma::randu() > mutationRate) - continue; - - const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); - // Normalised distance from the bounds. - const double lowerDelta = (candidate(geneIdx) - - lowerBound(geneIdx)) / geneRange; - const double upperDelta = (upperBound(geneIdx) - - candidate(geneIdx)) / geneRange; - const double mutationPower = 1. / (distributionIndex + 1.0); - const double rand = arma::randu(); - double value, perturbationFactor; - if (rand < 0.5) - { - value = 2.0 * rand + (1.0 - 2.0 * rand) * - std::pow(upperDelta, distributionIndex + 1.0); - perturbationFactor = std::pow(value, mutationPower) - 1.0; - } - else - { - value = 2.0 * (1.0 - rand) + 2.0 *(rand - 0.5) * - std::pow(lowerDelta, distributionIndex + 1.0); - perturbationFactor = 1.0 - std::pow(value, mutationPower); - } - - candidate(geneIdx) += perturbationFactor * geneRange; + value = 2.0 * rand + (1.0 - 2.0 * rand) * + std::pow(upperDelta, distributionIndex + 1.0); + perturbationFactor = std::pow(value, mutationPower) - 1.0; } - //! Enforce bounds. - candidate = arma::min(arma::max(candidate, lowerBound), upperBound); + else + { + value = 2.0 * (1.0 - rand) + 2.0 *(rand - 0.5) * + std::pow(lowerDelta, distributionIndex + 1.0); + perturbationFactor = 1.0 - std::pow(value, mutationPower); + } + + candidate(geneIdx) += + typename MatType::elem_type(perturbationFactor * geneRange); + } + //! Enforce bounds. 
+ candidate = min(max(candidate, lowerBound), upperBound); } template @@ -431,9 +456,9 @@ inline void AGEMOEA::NormalizeFront( const std::vector& front, const arma::Row& extreme) { - arma::Mat vectorizedObjectives(numObjectives, + arma::Mat vectorizedObjectives(numObjectives, front.size()); - arma::Mat vectorizedExtremes(numObjectives, + arma::Mat vectorizedExtremes(numObjectives, extreme.n_elem); for (size_t i = 0; i < front.size(); i++) { @@ -441,7 +466,7 @@ inline void AGEMOEA::NormalizeFront( } for (size_t i = 0; i < extreme.n_elem; i++) { - vectorizedExtremes.col(i) = calculatedObjectives[front[extreme[i]]]; + vectorizedExtremes.col(i) = calculatedObjectives[front[extreme[i]]]; } if (front.size() < numObjectives) @@ -474,9 +499,9 @@ inline void AGEMOEA::NormalizeFront( } else { - normalization = 1. / hyperplane; + normalization = 1. / hyperplane; if (normalization.has_inf() || normalization.has_nan()) - { + { normalization = arma::max(vectorizedObjectives, 1); } } @@ -484,26 +509,29 @@ inline void AGEMOEA::NormalizeFront( } template -inline double AGEMOEA::GetGeometry( +inline typename MatType::elem_type AGEMOEA::GetGeometry( std::vector >& calculatedObjectives, const std::vector& front, const arma::Row& extreme) { - arma::Row d; - arma::Col zero(numObjectives, arma::fill::zeros); - arma::Col one(numObjectives, arma::fill::ones); + typedef typename MatType::elem_type ElemType; - PointToLineDistance (d, calculatedObjectives, front, zero, one); + arma::Row d; + arma::Col zero(numObjectives, arma::fill::zeros); + arma::Col one(numObjectives, arma::fill::ones); + + PointToLineDistance(d, calculatedObjectives, front, zero, one); for (size_t i = 0; i < extreme.size(); i++) { - d[extreme[i]] = arma::datum::inf; + d[extreme[i]] = arma::Datum::inf; } + size_t index = arma::index_min(d); - double avg = arma::accu(calculatedObjectives[front[index]]) / static_cast (numObjectives); - double p = std::log(numObjectives) / std::log(1.0 / avg); - if (p <= 0.1 || std::isnan(p)) 
- p = 1.0; + ElemType avg = accu(calculatedObjectives[front[index]]) / numObjectives; + ElemType p = std::log(ElemType(numObjectives)) / std::log(1 / avg); + if (p <= ElemType(0.1) || std::isnan(p)) + p = 1; return p; } @@ -514,13 +542,15 @@ inline void AGEMOEA::PairwiseDistance( MatType& f, std::vector >& calculatedObjectives, const std::vector& front, - double dimension) -{ + const typename MatType::elem_type dimension) +{ for (size_t i = 0; i < front.size(); i++) { for (size_t j = i + 1; j < front.size(); j++) { - f(i, j) = std::pow(arma::accu(arma::pow(arma::abs(calculatedObjectives[front[i]] - calculatedObjectives[front[j]]), dimension)), 1.0 / dimension); + f(i, j) = std::pow(accu(pow(abs( + calculatedObjectives[front[i]] - calculatedObjectives[front[j]]), + dimension)), 1 / dimension); f(j, i) = f(i, j); } } @@ -529,12 +559,12 @@ inline void AGEMOEA::PairwiseDistance( //! Find the index of the of the extreme points in the given front. template void AGEMOEA::FindExtremePoints( - arma::Row& indexes, + arma::Row& indexes, std::vector >& calculatedObjectives, const std::vector& front) { typedef typename MatType::elem_type ElemType; - + if (numObjectives >= front.size()) { indexes = arma::linspace>(0, front.size() - 1, front.size()); @@ -567,13 +597,13 @@ void AGEMOEA::PointToLineDistance( { typedef typename MatType::elem_type ElemType; arma::Row distancesTemp(front.size()); - arma::Col ba = pointB - pointA; + arma::Col ba = pointB - pointA; arma::Col pa; for (size_t i = 0; i < front.size(); i++) { size_t ind = front[i]; - + pa = (calculatedObjectives[ind] - pointA); double t = arma::dot(pa, ba) / arma::dot(ba, ba); distancesTemp[i] = arma::accu(arma::pow((pa - t * ba), 2)); @@ -660,7 +690,7 @@ inline bool AGEMOEA::Dominates( allBetterOrEqual = false; // P is better than Q for the i-th objective function. 
- else if (calculatedObjectives[candidateP](i) < + else if (calculatedObjectives[candidateP](i) < calculatedObjectives[candidateQ](i)) atleastOneBetter = true; } @@ -674,7 +704,7 @@ inline typename MatType::elem_type AGEMOEA::DiversityScore( std::set& selected, const MatType& pairwiseDistance, size_t S) -{ +{ typedef typename MatType::elem_type ElemType; ElemType m = arma::datum::inf; ElemType m1 = arma::datum::inf; @@ -682,7 +712,7 @@ inline typename MatType::elem_type AGEMOEA::DiversityScore( for (it = selected.begin(); it != selected.end(); it++) { if (*it == S){ continue; } - if (pairwiseDistance(S, *it) < m) + if (pairwiseDistance(S, *it) < m) { m1 = m; m = pairwiseDistance(S, *it); @@ -705,7 +735,7 @@ inline void AGEMOEA::SurvivalScoreAssignment( std::vector>& calculatedObjectives, std::vector& survivalScore, arma::Col& normalize, - double& dimension, + typename MatType::elem_type& dimension, size_t fNum) { typedef typename MatType::elem_type ElemType; @@ -718,12 +748,12 @@ inline void AGEMOEA::SurvivalScoreAssignment( dimension = 1; arma::Row extreme(numObjectives, arma::fill::zeros); NormalizeFront(calculatedObjectives, normalize, front, extreme); - return; + return; } for (size_t index = 0; index < front.size(); index++) { - calculatedObjectives[front[index]] = calculatedObjectives[front[index]] + calculatedObjectives[front[index]] = calculatedObjectives[front[index]] - idealPoint; } @@ -733,22 +763,21 @@ inline void AGEMOEA::SurvivalScoreAssignment( for (size_t index = 0; index < front.size(); index++) { - calculatedObjectives[front[index]] = calculatedObjectives[front[index]] + calculatedObjectives[front[index]] = calculatedObjectives[front[index]] / normalize; } std::set selected; std::set remaining; - + // Create the selected and remaining sets. 
for (size_t index: extreme) - { + { selected.insert(index); - survivalScore[front[index]] = arma::datum::inf; + survivalScore[front[index]] = arma::Datum::inf; } - dimension = GetGeometry(calculatedObjectives, front, - extreme); + dimension = GetGeometry(calculatedObjectives, front, extreme); for (size_t i = 0; i < front.size(); i++) { if (selected.count(i) == 0) @@ -758,17 +787,17 @@ inline void AGEMOEA::SurvivalScoreAssignment( } arma::Mat pairwise(front.size(), front.size(), arma::fill::zeros); - PairwiseDistance(pairwise,calculatedObjectives,front,dimension); - arma::Row value(front.size(), + PairwiseDistance(pairwise, calculatedObjectives, front, dimension); + arma::Row value(front.size(), arma::fill::zeros); - + // Calculate the diversity and proximity score. for (size_t i = 0; i < front.size(); i++) { - pairwise.col(i) = pairwise.col(i) / std::pow(arma::accu(arma::pow( - arma::abs(calculatedObjectives[front[i]]), dimension)), 1.0 / dimension); + pairwise.col(i) = pairwise.col(i) / std::pow(accu(pow( + arma::abs(calculatedObjectives[front[i]]), dimension)), 1 / dimension); } - + while (remaining.size() > 0) { std::set::iterator it; @@ -789,12 +818,12 @@ inline void AGEMOEA::SurvivalScoreAssignment( { for (size_t i = 0; i < front.size(); i++) { - calculatedObjectives[front[i]] = (calculatedObjectives[front[i]]) / normalize; - survivalScore[front[i]] = 1.0 / std::pow(arma::accu(arma::pow(arma::abs( - calculatedObjectives[front[i]] - idealPoint), dimension)), - 1.0 / dimension); + calculatedObjectives[front[i]] = + (calculatedObjectives[front[i]]) / normalize; + survivalScore[front[i]] = 1 / std::pow(accu(pow(abs( + calculatedObjectives[front[i]] - idealPoint), dimension)), + 1 / dimension); } - } } diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp index 2411e87..f247081 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp +++ 
b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp @@ -30,7 +30,8 @@ namespace ens { * documentation on function types included with this distribution or on the * ensmallen website. */ -class AugLagrangian +template // TODO: remove for ensmallen 4.x +class AugLagrangianType { public: /** @@ -43,13 +44,13 @@ class AugLagrangian * @param maxIterations Maximum number of iterations of the Augmented * Lagrangian algorithm. 0 indicates no maximum. */ - AugLagrangian(const size_t maxIterations = 1000, - const double penaltyThresholdFactor = 0.25, - const double sigmaUpdateFactor = 10.0, - const L_BFGS& lbfgs = L_BFGS()); + AugLagrangianType(const size_t maxIterations = 1000, + const double penaltyThresholdFactor = 0.25, + const double sigmaUpdateFactor = 10.0, + const L_BFGS& lbfgs = L_BFGS()); /** - * Optimize the function. The value '1' is used for the initial value of each + * Optimize the function. The value '0' is used for the initial value of each * Lagrange multiplier. To set the Lagrange multipliers yourself, use the * other overload of Optimize(). * @@ -66,7 +67,8 @@ class AugLagrangian typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, bool>::type + typename std::enable_if::value && + IsAllNonMatrix::value, bool>::type Optimize(LagrangianFunctionType& function, MatType& coordinates, CallbackTypes&&... callbacks); @@ -75,9 +77,10 @@ class AugLagrangian template - bool Optimize(LagrangianFunctionType& function, - MatType& coordinates, - CallbackTypes&&... callbacks) + typename std::enable_if::value, bool>::type + Optimize(LagrangianFunctionType& function, + MatType& coordinates, + CallbackTypes&&... callbacks) { return Optimize(function, coordinates, @@ -96,29 +99,53 @@ class AugLagrangian * @tparam CallbackTypes Types of callback functions. * @param function The function to optimize. * @param coordinates Output matrix to store the optimized coordinates in. 
-   * @param initLambda Vector of initial Lagrange multipliers. Should have
-   *     length equal to the number of constraints.
-   * @param initSigma Initial penalty parameter.
+   * @param lambda Vector containing initial Lagrange multipliers. Should have
+   *     length equal to the number of constraints. This will be overwritten
+   *     with the Lagrange multipliers that are found during optimization.
+   * @param sigma Initial penalty parameter. This will be overwritten with the
+   *     final penalty value used during optimization.
    * @param callbacks Callback functions.
    */
   template<typename LagrangianFunctionType,
            typename MatType,
            typename GradType,
+           typename InVecType,
            typename... CallbackTypes>
-  typename std::enable_if<IsArmaType<GradType>::value, bool>::type
+  [[deprecated("use Optimize() with non-const lambda/sigma instead")]]
+  typename std::enable_if<IsArmaType<GradType>::value, bool>::type
   Optimize(LagrangianFunctionType& function,
            MatType& coordinates,
-           const arma::vec& initLambda,
+           const InVecType& initLambda,
            const double initSigma,
+           CallbackTypes&&... callbacks)
+  {
+    deprecatedLambda = initLambda;
+    deprecatedSigma = initSigma;
+    return Optimize(function, coordinates, this->deprecatedLambda,
+        this->deprecatedSigma, std::forward<CallbackTypes>(callbacks)...);
+  }
+
+  template<typename LagrangianFunctionType,
+           typename MatType,
+           typename GradType,
+           typename InVecType,
+           typename... CallbackTypes>
+  typename std::enable_if<IsArmaType<GradType>::value, bool>::type
+  Optimize(LagrangianFunctionType& function,
+           MatType& coordinates,
+           InVecType& lambda,
+           double& sigma,
+           CallbackTypes&&... callbacks);

   //! Forward the MatType as GradType.
   template<typename LagrangianFunctionType,
            typename MatType,
+           typename VecType,
            typename... CallbackTypes>
+  [[deprecated("use Optimize() with non-const lambda/sigma instead")]]
   bool Optimize(LagrangianFunctionType& function,
                 MatType& coordinates,
-                const arma::vec& initLambda,
+                const VecType& initLambda,
                 const double initSigma,
                 CallbackTypes&&... callbacks)
   {
@@ -127,20 +154,39 @@ class AugLagrangian
         std::forward<CallbackTypes>(callbacks)...);
   }

+  template<typename LagrangianFunctionType,
+           typename MatType,
+           typename InVecType,
+           typename... CallbackTypes>
+  bool Optimize(LagrangianFunctionType& function,
+                MatType& coordinates,
+                InVecType& lambda,
+                double& sigma,
+                CallbackTypes&&... callbacks)
+  {
+    return Optimize(function, coordinates, lambda, sigma,
+        std::forward<CallbackTypes>(callbacks)...);
+  }
+
   //!
Get the L-BFGS object used for the actual optimization. const L_BFGS& LBFGS() const { return lbfgs; } //! Modify the L-BFGS object used for the actual optimization. L_BFGS& LBFGS() { return lbfgs; } //! Get the Lagrange multipliers. - const arma::vec& Lambda() const { return lambda; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + const VecType& Lambda() const { return deprecatedLambda; } //! Modify the Lagrange multipliers (i.e. set them before optimization). - arma::vec& Lambda() { return lambda; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + VecType& Lambda() { return deprecatedLambda; } //! Get the penalty parameter. - double Sigma() const { return sigma; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + double Sigma() const { return deprecatedSigma; } //! Modify the penalty parameter. - double& Sigma() { return sigma; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + double& Sigma() { return deprecatedSigma; } //! Get the maximum iterations size_t MaxIterations() const { return maxIterations; } @@ -173,11 +219,11 @@ class AugLagrangian //! Controls early termination of the optimization process. bool terminate; + // NOTE: these will be removed in ensmallen 4.x! //! Lagrange multipliers. - arma::vec lambda; - + VecType deprecatedLambda; //! Penalty parameter. - double sigma; + double deprecatedSigma; /** * Internal optimization function: given an initialized AugLagrangianFunction, @@ -185,27 +231,32 @@ class AugLagrangian */ template - typename std::enable_if::value, bool>::type - Optimize(AugLagrangianFunction& augfunc, + typename std::enable_if::value, bool>::type + Optimize(AugLagrangianFunction& augfunc, MatType& coordinates, CallbackTypes&&... callbacks); //! Forward the MatType as GradType. template - bool Optimize(AugLagrangianFunction& function, - MatType& coordinates, - CallbackTypes&&... 
callbacks) + bool Optimize( + AugLagrangianFunction& function, + MatType& coordinates, + CallbackTypes&&... callbacks) { - return Optimize(function, coordinates, std::forward(callbacks)...); } }; +using AugLagrangian = AugLagrangianType; + } // namespace ens #include "aug_lagrangian_impl.hpp" diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp index d9310d9..ae3e1a2 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp +++ b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp @@ -31,19 +31,10 @@ namespace ens { * * @tparam LagrangianFunction Lagrangian function to be used. */ -template +template class AugLagrangianFunction { public: - /** - * Initialize the AugLagrangianFunction, but don't set the Lagrange - * multipliers or penalty parameters yet. Make sure you set the Lagrange - * multipliers before you use this... - * - * @param function Lagrangian function. - */ - AugLagrangianFunction(LagrangianFunction& function); - /** * Initialize the AugLagrangianFunction with the given LagrangianFunction, * Lagrange multipliers, and initial penalty parameter. @@ -53,8 +44,8 @@ class AugLagrangianFunction * @param sigma Initial penalty parameter. */ AugLagrangianFunction(LagrangianFunction& function, - const arma::vec& lambda, - const double sigma); + VecType& lambda, + double& sigma); /** * Evaluate the objective function of the Augmented Lagrangian function, which * is the standard Lagrangian function evaluation plus a penalty term, which @@ -81,17 +72,12 @@ class AugLagrangianFunction * * @return Initial point. */ - template + template const MatType& GetInitialPoint() const; - //! Get the Lagrange multipliers. - const arma::vec& Lambda() const { return lambda; } - //! Modify the Lagrange multipliers. - arma::vec& Lambda() { return lambda; } - - //! Get sigma (the penalty parameter). 
- double Sigma() const { return sigma; } - //! Modify sigma (the penalty parameter). + // Get the Lagrange multipliers. + VecType& Lambda() { return lambda; } + // Get the penalty parameter. double& Sigma() { return sigma; } //! Get the Lagrangian function. @@ -104,9 +90,9 @@ class AugLagrangianFunction LagrangianFunction& function; //! The Lagrange multipliers. - arma::vec lambda; + VecType& lambda; //! The penalty parameter. - double sigma; + double& sigma; }; } // namespace ens diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp index 092a7c2..ed7675b 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp +++ b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp @@ -20,23 +20,11 @@ namespace ens { // Initialize the AugLagrangianFunction. -template -AugLagrangianFunction::AugLagrangianFunction( - LagrangianFunction& function) : - function(function), - lambda(function.NumConstraints()), - sigma(10) -{ - // Initialize lambda vector to all zeroes. - lambda.zeros(); -} - -// Initialize the AugLagrangianFunction. -template -AugLagrangianFunction::AugLagrangianFunction( +template +AugLagrangianFunction::AugLagrangianFunction( LagrangianFunction& function, - const arma::vec& lambda, - const double sigma) : + VecType& lambda, + double& sigma) : function(function), lambda(lambda), sigma(sigma) @@ -45,9 +33,10 @@ AugLagrangianFunction::AugLagrangianFunction( } // Evaluate the AugLagrangianFunction at the given coordinates. 
-template +template template -typename MatType::elem_type AugLagrangianFunction::Evaluate( +typename MatType::elem_type +AugLagrangianFunction::Evaluate( const MatType& coordinates) const { // The augmented Lagrangian is evaluated as @@ -63,20 +52,22 @@ typename MatType::elem_type AugLagrangianFunction::Evaluate( { ElemType constraint = function.EvaluateConstraint(i, coordinates); - objective += (-lambda[i] * constraint) + - sigma * std::pow(constraint, 2) / 2; + objective += (-ElemType(lambda[i]) * constraint) + + ElemType(sigma) * std::pow(constraint, ElemType(2)) / 2; } return objective; } // Evaluate the gradient of the AugLagrangianFunction at the given coordinates. -template +template template -void AugLagrangianFunction::Gradient( +void AugLagrangianFunction::Gradient( const MatType& coordinates, GradType& gradient) const { + typedef typename MatType::elem_type ElemType; + // The augmented Lagrangian's gradient is evaluted as // f'(x) + {(-lambda_i + sigma * c_i(x)) * c'_i(x)} for all constraints gradient.zeros(); @@ -89,16 +80,17 @@ void AugLagrangianFunction::Gradient( // Now calculate scaling factor and add to existing gradient. GradType tmpGradient; - tmpGradient = (-lambda[i] + sigma * + tmpGradient = (ElemType(-lambda[i]) + ElemType(sigma) * function.EvaluateConstraint(i, coordinates)) * constraintGradient; gradient += tmpGradient; } } // Get the initial point. 
-template +template template -const MatType& AugLagrangianFunction::GetInitialPoint() +const MatType& +AugLagrangianFunction::GetInitialPoint() const { return function.template GetInitialPoint(); diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp index f972eae..75cf58c 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp +++ b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp @@ -19,70 +19,90 @@ namespace ens { -inline AugLagrangian::AugLagrangian(const size_t maxIterations, - const double penaltyThresholdFactor, - const double sigmaUpdateFactor, - const L_BFGS& lbfgs) : +template +inline AugLagrangianType::AugLagrangianType( + const size_t maxIterations, + const double penaltyThresholdFactor, + const double sigmaUpdateFactor, + const L_BFGS& lbfgs) : maxIterations(maxIterations), penaltyThresholdFactor(penaltyThresholdFactor), sigmaUpdateFactor(sigmaUpdateFactor), lbfgs(lbfgs), terminate(false), - sigma(0.0) + deprecatedSigma(0.0) { } +template template -typename std::enable_if::value, bool>::type -AugLagrangian::Optimize(LagrangianFunctionType& function, - MatType& coordinates, - const arma::vec& initLambda, - const double initSigma, - CallbackTypes&&... callbacks) +typename std::enable_if::value, bool>::type +AugLagrangianType::Optimize( + LagrangianFunctionType& function, + MatType& coordinates, + InVecType& lambda, + double& sigma, + CallbackTypes&&... callbacks) { - lambda = initLambda; - sigma = initSigma; - - AugLagrangianFunction augfunc(function, - lambda, sigma); + AugLagrangianFunction augfunc( + function, lambda, sigma); return Optimize(augfunc, coordinates, callbacks...); } +template template -typename std::enable_if::value, bool>::type -AugLagrangian::Optimize(LagrangianFunctionType& function, - MatType& coordinates, - CallbackTypes&&... 
callbacks) +typename std::enable_if::value && + IsAllNonMatrix::value, bool>::type +AugLagrangianType::Optimize(LagrangianFunctionType& function, + MatType& coordinates, + CallbackTypes&&... callbacks) { + typedef typename ForwardType::bvec InVecType; + // If the user did not specify the right size for sigma and lambda, we will // use defaults. - if (!lambda.is_empty()) + // TODO: remove this when ensmallen 4.x is released! + if (!deprecatedLambda.is_empty()) { - AugLagrangianFunction augfunc(function, lambda, - sigma); - return Optimize(augfunc, coordinates, callbacks...); + InVecType lambda(conv_to::from(deprecatedLambda)); + + AugLagrangianFunction augfunc(function, + lambda, deprecatedSigma); + const bool result = Optimize(augfunc, coordinates, callbacks...); + deprecatedLambda = conv_to::from(lambda); + + return result; } else { - AugLagrangianFunction augfunc(function); + // Use default values. + InVecType lambda(function.NumConstraints()); + lambda.zeros(); + double sigma = 10; + + AugLagrangianFunction augfunc( + function, lambda, sigma); return Optimize(augfunc, coordinates, callbacks...); } } +template template -typename std::enable_if::value, bool>::type -AugLagrangian::Optimize( - AugLagrangianFunction& augfunc, +typename std::enable_if::value, bool>::type +AugLagrangianType::Optimize( + AugLagrangianFunction& augfunc, MatType& coordinatesIn, CallbackTypes&&... callbacks) { @@ -110,13 +130,14 @@ AugLagrangian::Optimize( // Convergence tolerance---depends on the epsilon of the type we are using for // optimization. - ElemType tolerance = 1e3 * std::numeric_limits::epsilon(); + ElemType tolerance = 1000 * std::numeric_limits::epsilon(); // Then, calculate the current penalty. 
ElemType penalty = 0; for (size_t i = 0; i < function.NumConstraints(); i++) { - const ElemType p = std::pow(function.EvaluateConstraint(i, coordinates), 2); + const ElemType p = std::pow(function.EvaluateConstraint(i, coordinates), + ElemType(2)); terminate |= Callback::EvaluateConstraint(*this, function, coordinates, i, p, callbacks...); @@ -149,9 +170,6 @@ AugLagrangian::Optimize( if (std::abs(lastObjective - objective) < tolerance && augfunc.Sigma() > 500000) { - lambda = std::move(augfunc.Lambda()); - sigma = augfunc.Sigma(); - Callback::EndOptimization(*this, function, coordinates, callbacks...); return true; } @@ -167,7 +185,7 @@ AugLagrangian::Optimize( for (size_t i = 0; i < function.NumConstraints(); i++) { const ElemType p = std::pow(function.EvaluateConstraint(i, coordinates), - 2); + ElemType(2)); terminate |= Callback::EvaluateConstraint(*this, function, coordinates, i, p, callbacks...); @@ -190,12 +208,12 @@ AugLagrangian::Optimize( terminate |= Callback::EvaluateConstraint(*this, function, coordinates, i, p, callbacks...); - augfunc.Lambda()[i] -= augfunc.Sigma() * p; + augfunc.Lambda()[i] -= ElemType(augfunc.Sigma()) * p; } // We also update the penalty threshold to be a factor of the current // penalty. - penaltyThreshold = penaltyThresholdFactor * penalty; + penaltyThreshold = ElemType(penaltyThresholdFactor) * penalty; Info << "Lagrange multiplier estimates updated." << std::endl; } else @@ -208,7 +226,7 @@ AugLagrangian::Optimize( Warn << "AugLagrangian::Optimize(): sigma too large for element type; " << "terminating." 
<< std::endl; Callback::EndOptimization(*this, function, coordinates, callbacks...); - return false; + return true; } } diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp index a7e2816..816a7a0 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp @@ -69,6 +69,8 @@ class AdaptiveStepsize class Policy { public: + typedef typename MatType::elem_type ElemType; + // Create the instantiated object. Policy(AdaptiveStepsize& parent) : parent(parent) { } @@ -104,7 +106,7 @@ class AdaptiveStepsize backtrackingBatchSize); // Update the iterate. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; // Update Gradient & calculate curvature of quadratic approximation. GradType functionGradient(iterate.n_rows, iterate.n_cols); @@ -132,8 +134,8 @@ class AdaptiveStepsize delta0 = delta1 + (functionGradient - delta1) / k; // Compute sample variance. - vB += arma::norm(functionGradient - delta1, 2.0) * - arma::norm(functionGradient - delta0, 2.0); + vB += norm(functionGradient - delta1, 2.0) * + norm(functionGradient - delta0, 2.0); delta1 = delta0; gradient += functionGradient; @@ -145,13 +147,13 @@ class AdaptiveStepsize // Update sample variance & norm of the gradient. sampleVariance = vB; - gradientNorm = std::pow(arma::norm(gradient / backtrackingBatchSize, 2), + gradientNorm = std::pow(norm(gradient / backtrackingBatchSize, 2), 2.0); // Compute curvature. - double v = arma::trace(arma::trans(iterate - iteratePrev) * + double v = trace(trans(iterate - iteratePrev) * (gradient - gradPrevIterate)) / - std::pow(arma::norm(iterate - iteratePrev, 2), 2.0); + std::pow(norm(iterate - iteratePrev, 2), 2.0); // Update previous iterate. 
iteratePrev = iterate; @@ -205,12 +207,10 @@ class AdaptiveStepsize const size_t offset, const size_t backtrackingBatchSize) { - typedef typename MatType::elem_type ElemType; - ElemType overallObjective = function.Evaluate(iterate, offset, backtrackingBatchSize); - MatType iterateUpdate = iterate - (stepSize * gradient); + MatType iterateUpdate = iterate - (ElemType(stepSize) * gradient); ElemType overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); @@ -220,7 +220,7 @@ class AdaptiveStepsize { stepSize *= parent.backtrackStepSize; - iterateUpdate = iterate - (stepSize * gradient); + iterateUpdate = iterate - (ElemType(stepSize) * gradient); overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); } diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp index 8f271b8..4739019 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp @@ -60,6 +60,8 @@ class BacktrackingLineSearch class Policy { public: + typedef typename MatType::elem_type ElemType; + // Instantiate the policy with the given parent. 
Policy(BacktrackingLineSearch& parent) : parent(parent) { } @@ -94,12 +96,10 @@ class BacktrackingLineSearch if (reset) stepSize *= 2; - typedef typename MatType::elem_type ElemType; - ElemType overallObjective = function.Evaluate(iterate, offset, backtrackingBatchSize); - MatType iterateUpdate = iterate - (stepSize * gradient); + MatType iterateUpdate = iterate - (ElemType(stepSize) * gradient); ElemType overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); @@ -109,7 +109,7 @@ class BacktrackingLineSearch { stepSize /= 2; - iterateUpdate = iterate - (stepSize * gradient); + iterateUpdate = iterate - (ElemType(stepSize) * gradient); overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); } diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp index 4d670f2..ee194be 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp @@ -125,7 +125,7 @@ class BigBatchSGD typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp index cd88660..39b479f 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp @@ -50,8 +50,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type BigBatchSGD::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -137,13 +137,13 @@ BigBatchSGD::Optimize( delta0 = delta1 + (functionGradient - delta1) / k; // Compute sample variance. 
- vB += arma::norm(functionGradient - delta1, 2.0) * - arma::norm(functionGradient - delta0, 2.0); + vB += norm(functionGradient - delta1, 2.0) * + norm(functionGradient - delta0, 2.0); delta1 = delta0; gradient += functionGradient; } - double gB = std::pow(arma::norm(gradient / effectiveBatchSize, 2), 2.0); + double gB = std::pow(norm(gradient / effectiveBatchSize, 2), 2.0); // Reset the batch size update process counter. reset = false; @@ -174,13 +174,13 @@ BigBatchSGD::Optimize( delta0 = delta1 + (functionGradient - delta1) / (k + 1); // Compute sample variance. - vB += arma::norm(functionGradient - delta1, 2.0) * - arma::norm(functionGradient - delta0, 2.0); + vB += norm(functionGradient - delta1, 2.0) * + norm(functionGradient - delta0, 2.0); delta1 = delta0; gradient += functionGradient; } - gB = std::pow(arma::norm(gradient / (batchSize + batchOffset), 2), 2.0); + gB = std::pow(norm(gradient / (batchSize + batchOffset), 2), 2.0); // Update the batchSize. batchSize += batchOffset; @@ -199,7 +199,7 @@ BigBatchSGD::Optimize( reset); // Update the iterate. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); const ElemType objective = f.Evaluate(iterate, currentFunction, @@ -244,10 +244,13 @@ BigBatchSGD::Optimize( terminate |= Callback::BeginEpoch(*this, f, iterate, epoch, overallObjective, callbacks...); - // Reset the counter variables. - lastObjective = overallObjective; - overallObjective = 0; - currentFunction = 0; + // Reset the counter variables if we will continue. + if (i != actualMaxIterations) + { + lastObjective = overallObjective; + overallObjective = 0; + currentFunction = 0; + } if (shuffle) // Determine order of visitation. 
      f.Shuffle();
diff --git a/inst/include/ensmallen_bits/callbacks/timer_stop.hpp b/inst/include/ensmallen_bits/callbacks/timer_stop.hpp
index 7b7f80b..542a7a1 100644
--- a/inst/include/ensmallen_bits/callbacks/timer_stop.hpp
+++ b/inst/include/ensmallen_bits/callbacks/timer_stop.hpp
@@ -45,6 +45,28 @@ class TimerStop
     timer.tic();
   }

+  /**
+   * Callback function called when a step is taken.
+   *
+   * @param optimizer The optimizer used to update the function.
+   * @param function Function to optimize.
+   * @param coordinates Current point of the optimization.
+   */
+  template<typename OptimizerType, typename FunctionType, typename MatType>
+  bool StepTaken(OptimizerType& /* optimizer */,
+                 FunctionType& /* function */,
+                 const MatType& /* coordinates */)
+  {
+    if (timer.toc() > duration)
+    {
+      Info << "Timer timeout (" << duration << "s) reached; terminating "
+          << "optimization." << std::endl;
+      return true;
+    }
+
+    return false;
+  }
+
   /**
    * Callback function called at the end of a pass over the data.
    *
@@ -63,7 +87,8 @@
   {
     if (timer.toc() > duration)
     {
-      Info << "Timer timeout reached; terminate optimization." << std::endl;
+      Info << "Timer timeout (" << duration << "s) reached; terminating "
+          << "optimization." << std::endl;
       return true;
     }
diff --git a/inst/include/ensmallen_bits/cd/cd.hpp b/inst/include/ensmallen_bits/cd/cd.hpp
index 062a210..34510a7 100644
--- a/inst/include/ensmallen_bits/cd/cd.hpp
+++ b/inst/include/ensmallen_bits/cd/cd.hpp
@@ -94,7 +94,7 @@ class CD
            typename MatType,
            typename GradType,
            typename...
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(ResolvableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/cd/cd_impl.hpp b/inst/include/ensmallen_bits/cd/cd_impl.hpp index 8f57e27..3d1c7af 100644 --- a/inst/include/ensmallen_bits/cd/cd_impl.hpp +++ b/inst/include/ensmallen_bits/cd/cd_impl.hpp @@ -39,8 +39,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type CD::Optimize( ResolvableFunctionType& function, MatType& iterateIn, @@ -84,7 +84,7 @@ CD::Optimize( break; // Update the decision variable with the partial gradient. - iterate.col(featureIdx) -= stepSize * gradient.col(featureIdx); + iterate.col(featureIdx) -= ElemType(stepSize) * gradient.col(featureIdx); terminate |= Callback::StepTaken(*this, function, iterate, callbacks...); // Check for convergence. diff --git a/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp b/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp index e5eaeb1..e5c57d8 100644 --- a/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp +++ b/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp @@ -52,8 +52,11 @@ class RandomDescent const MatType& /* iterate */, const ResolvableFunctionType& function) { + // return randi( + // arma::distr_param(0, function.NumFeatures() - 1)); + return arma::as_scalar(arma::randi( - 1, arma::distr_param(0, function.NumFeatures() - 1))); + 1, arma::distr_param(0, function.NumFeatures() - 1))); } }; diff --git a/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp b/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp index 1edec07..f56f50f 100644 --- a/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp +++ b/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp @@ -3,8 +3,8 @@ * @author Marcus Edel * @author Suvarsha Chennareddy * - * Definition of 
the Active Covariance Matrix Adaptation Evolution Strategy - * as proposed by G.A Jastrebski and D.V Arnold in "Improving Evolution + * Definition of the Active Covariance Matrix Adaptation Evolution Strategy + * as proposed by G.A Jastrebski and D.V Arnold in "Improving Evolution * Strategies through Active Covariance Matrix Adaptation". * * ensmallen is free software; you may redistribute it and/or modify it under @@ -26,25 +26,25 @@ namespace ens { * Active CMA-ES is a variant of the stochastic search algorithm * CMA-ES - Covariance Matrix Adaptation Evolution Strategy. * Active CMA-ES actively reduces the uncertainty in unfavourable directions by - * exploiting the information about bad mutations in the covariance matrix - * update step. This isn't for the purpose of accelerating progress, but - * instead for speeding up the adaptation of the covariance matrix (which, in + * exploiting the information about bad mutations in the covariance matrix + * update step. This isn't for the purpose of accelerating progress, but + * instead for speeding up the adaptation of the covariance matrix (which, in * turn, will lead to faster progress). * * For more information, please refer to: * * @code * @INPROCEEDINGS{1688662, - * author={Jastrebski, G.A. and Arnold, D.V.}, - * booktitle={2006 IEEE International Conference on Evolutionary - Computation}, - * title={Improving Evolution Strategies through Active Covariance - Matrix Adaptation}, - * year={2006}, - * volume={}, - * number={}, - * pages={2814-2821}, - * doi={10.1109/CEC.2006.1688662}} + * author = {Jastrebski, G.A. and Arnold, D.V.}, + * booktitle = {2006 IEEE International Conference on Evolutionary + * Computation}, + * title = {Improving Evolution Strategies through Active Covariance + * Matrix Adaptation}, + * year = {2006}, + * volume = {}, + * number = {}, + * pages = {2814-2821}, + * doi = {10.1109/CEC.2006.1688662}} * @endcode * * Active CMA-ES can optimize separable functions. 
For more details, see the @@ -52,7 +52,7 @@ namespace ens { * ensmallen website. * * @tparam SelectionPolicy The selection strategy used for the evaluation step. - * @tparam TransformationPolicy The transformation strategy used to + * @tparam TransformationPolicy The transformation strategy used to * map decision variables to the desired domain during fitness evaluation * and termination. Use EmptyTransformation if the domain isn't bounded. */ @@ -62,15 +62,15 @@ class ActiveCMAES { public: /** - * Construct the Active CMA-ES optimizer with the given function and parameters. The - * defaults here are not necessarily good for the given problem, so it is - * suggested that the values used be tailored to the task at hand. The - * maximum number of iterations refers to the maximum number of points that - * are processed (i.e., one iteration equals one point; one iteration does not - * equal one pass over the dataset). + * Construct the Active CMA-ES optimizer with the given function and + * parameters. The defaults here are not necessarily good for the given + * problem, so it is suggested that the values used be tailored to the task at + * hand. The maximum number of iterations refers to the maximum number of + * points that are processed (i.e., one iteration equals one point; one + * iteration does not equal one pass over the dataset). * * @param lambda The population size (0 use the default size). - * @param transformationPolicy Instantiated transformation policy used to + * @param transformationPolicy Instantiated transformation policy used to * map the coordinates to the desired domain. * @param batchSize Batch size to use for the objective calculation. 
* @param maxIterations Maximum number of iterations allowed (0 means no @@ -82,7 +82,7 @@ class ActiveCMAES */ ActiveCMAES( const size_t lambda = 0, - const TransformationPolicyType& + const TransformationPolicyType& transformationPolicy = TransformationPolicyType(), const size_t batchSize = 32, const size_t maxIterations = 1000, @@ -91,38 +91,9 @@ class ActiveCMAES double stepSize = 0); /** - * Construct the Active CMA-ES optimizer with the given function and parameters - * (including lower and upper bounds). The defaults here are not necessarily - * good for the given problem, so it is suggested that the values used be - * tailored to the task at hand. The maximum number of iterations refers to - * the maximum number of points that are processed (i.e., one iteration - * equals one point; one iteration does not equal one pass over the dataset). - * - * @param lambda The population size(0 use the default size). - * @param lowerBound Lower bound of decision variables. - * @param upperBound Upper bound of decision variables. - * @param batchSize Batch size to use for the objective calculation. - * @param maxIterations Maximum number of iterations allowed(0 means no - limit). - * @param tolerance Maximum absolute tolerance to terminate algorithm. - * @param selectionPolicy Instantiated selection policy used to calculate the - * objective. - * @param stepSize Starting sigma/step size (will be modified). - */ - ActiveCMAES( - const size_t lambda = 0, - const double lowerBound = -10, - const double upperBound = 10, - const size_t batchSize = 32, - const size_t maxIterations = 1000, - const double tolerance = 1e-5, - const SelectionPolicyType& selectionPolicy = SelectionPolicyType(), - double stepSize = 0); - - /** - * Optimize the given function using Active CMA-ES. The given starting point will be - * modified to store the finishing point of the algorithm, and the final - * objective value is returned. + * Optimize the given function using Active CMA-ES. 
The given starting point + * will be modified to store the finishing point of the algorithm, and the + * final objective value is returned. * * @tparam SeparableFunctionType Type of the function to be optimized. * @tparam MatType Type of matrix to optimize. @@ -169,7 +140,7 @@ class ActiveCMAES const TransformationPolicyType& TransformationPolicy() const { return transformationPolicy; } //! Modify the transformation policy. - TransformationPolicyType& TransformationPolicy() + TransformationPolicyType& TransformationPolicy() { return transformationPolicy; } //! Get the step size. @@ -196,7 +167,7 @@ class ActiveCMAES SelectionPolicyType selectionPolicy; //! The transformationPolicy used to map coordinates to the suitable domain - //! while evaluating fitness. This mapping is also done after optimization + //! while evaluating fitness. This mapping is also done after optimization //! has completed. TransformationPolicyType transformationPolicy; diff --git a/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp b/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp index 047de85..cd7d9ed 100644 --- a/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp +++ b/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp @@ -18,7 +18,6 @@ // In case it hasn't been included yet. #include "active_cmaes.hpp" -#include "not_empty_transformation.hpp" #include namespace ens { @@ -42,29 +41,6 @@ ActiveCMAES::ActiveCMAES( stepSize(stepSizeIn) { /* Nothing to do. 
*/ } -template -ActiveCMAES::ActiveCMAES( - const size_t lambda, - const double lowerBound, - const double upperBound, - const size_t batchSize, - const size_t maxIterations, - const double tolerance, - const SelectionPolicyType& selectionPolicy, - double stepSizeIn) : - lambda(lambda), - batchSize(batchSize), - maxIterations(maxIterations), - tolerance(tolerance), - selectionPolicy(selectionPolicy), - stepSize(stepSizeIn) -{ - Warn << "This is a deprecated constructor and will be removed in a " - "future version of ensmallen" << std::endl; - NotEmptyTransformation> d; - d.Assign(transformationPolicy, lowerBound, upperBound); -} - //! Optimize the function (minimize). template template::BaseMatType BaseMatType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::uvec UVecType; + // Make sure that we have the methods that we need. Long name... traits::CheckArbitrarySeparableFunctionTypeAPI< SeparableFunctionType, BaseMatType>(); @@ -105,21 +84,23 @@ typename MatType::elem_type ActiveCMAES mPosition(2, BaseMatType(iterate.n_rows, iterate.n_cols)); @@ -163,13 +144,13 @@ typename MatType::elem_type ActiveCMAES eigval; + BaseColType eigval; BaseMatType eigvec; BaseMatType eigvalZero(iterate.n_elem, 1); // eigvalZero is vector-shaped. eigvalZero.zeros(); // The current visitation order (sorted by population objectives). - arma::uvec idx = arma::linspace(0, lambda - 1, lambda); + UVecType idx = linspace(0, lambda - 1, lambda); // Now iterate! 
Callback::BeginOptimization(*this, function, transformedIterate, @@ -191,21 +172,22 @@ typename MatType::elem_type ActiveCMAES::epsilon(); - arma::eig_sym(eigval, eigvec, C[idx0]); + eig_sym(eigval, eigvec, C[idx0]); for (size_t j = 0; j < lambda; ++j) { if (iterate.n_rows > iterate.n_cols) { pStep[idx(j)] = covLower * - arma::randn(iterate.n_rows, iterate.n_cols); + randn(iterate.n_rows, iterate.n_cols); } else { - pStep[idx(j)] = arma::randn(iterate.n_rows, iterate.n_cols) + pStep[idx(j)] = randn(iterate.n_rows, iterate.n_cols) * covLower.t(); } @@ -218,7 +200,7 @@ typename MatType::elem_type ActiveCMAES 1e14) @@ -308,8 +290,8 @@ typename MatType::elem_type ActiveCMAES namespace ens { template CMAES::CMAES(const size_t lambda, - const TransformationPolicyType& + const TransformationPolicyType& transformationPolicy, const size_t batchSize, const size_t maxIterations, @@ -41,35 +40,12 @@ CMAES::CMAES(const size_t lambda, stepSize(stepSizeIn) { /* Nothing to do. */ } -template -CMAES::CMAES(const size_t lambda, - const double lowerBound, - const double upperBound, - const size_t batchSize, - const size_t maxIterations, - const double tolerance, - const SelectionPolicyType& selectionPolicy, - double stepSizeIn) : - lambda(lambda), - batchSize(batchSize), - maxIterations(maxIterations), - tolerance(tolerance), - selectionPolicy(selectionPolicy), - stepSize(stepSizeIn) -{ - Warn << "This is a deprecated constructor and will be removed in a " - "future version of ensmallen" << std::endl; - NotEmptyTransformation> d; - d.Assign(transformationPolicy, lowerBound, upperBound); -} - - //! Optimize the function (minimize). 
template template -typename MatType::elem_type CMAES::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -77,7 +53,10 @@ typename MatType::elem_type CMAES::BaseMatType BaseMatType; + + typedef typename ForwardType::bcol bcol; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::bmat BaseMatType; // Make sure that we have the methods that we need. Long name... traits::CheckArbitrarySeparableFunctionTypeAPI< @@ -95,18 +74,18 @@ typename MatType::elem_type CMAES(0, mu - 1, mu) + 1.0); - w /= arma::accu(w); + BaseMatType w = std::log(mu + 0.5) - log( + linspace(0, mu - 1, mu) + 1.0); + w /= accu(w); // Number of effective solutions. - const double muEffective = 1 / arma::accu(arma::pow(w, 2)); + const double muEffective = 1 / accu(pow(w, 2)); // Step size control parameters. BaseMatType sigma(2, 1); // sigma is vector-shaped. - if (stepSize == 0) + if (stepSize == 0) sigma(0) = transformationPolicy.InitialStepSize(); - else + else sigma(0) = stepSize; const double cs = (muEffective + 2) / (iterate.n_elem + muEffective + 5); @@ -151,7 +130,6 @@ typename MatType::elem_type CMAES::max(); @@ -170,13 +148,13 @@ typename MatType::elem_type CMAES eigval; // TODO: might need a more general type. + bcol eigval; // TODO: might need a more general type. BaseMatType eigvec; BaseMatType eigvalZero(iterate.n_elem, 1); // eigvalZero is vector-shaped. eigvalZero.zeros(); // The current visitation order (sorted by population objectives). - arma::uvec idx = arma::linspace(0, lambda - 1, lambda); + UVecType idx = linspace(0, lambda - 1, lambda); // Now iterate! 
Callback::BeginOptimization(*this, function, transformedIterate, @@ -196,22 +174,24 @@ typename MatType::elem_type CMAES::epsilon(); - arma::eig_sym(eigval, eigvec, C[idx0]); + eig_sym(eigval, eigvec, C[idx0]); for (size_t j = 0; j < lambda; ++j) { if (iterate.n_rows > iterate.n_cols) { - pStep[idx(j)] = covLower * - arma::randn(iterate.n_rows, iterate.n_cols); + pStep[idx(j)] = covLower * BaseMatType( + iterate.n_rows, iterate.n_cols, GetFillType::randn); } else { - pStep[idx(j)] = arma::randn(iterate.n_rows, iterate.n_cols) - * covLower.t(); + pStep[idx(j)] = BaseMatType( + iterate.n_rows, iterate.n_cols, GetFillType::randn) * + covLower.t(); } pPosition[idx(j)] = mPosition[idx0] + sigma(idx0) * pStep[idx(j)]; @@ -223,7 +203,7 @@ typename MatType::elem_type CMAES iterate.n_cols) { ps[idx1] = (1 - cs) * ps[idx0] + std::sqrt( - cs * (2 - cs) * muEffective) * - eigvec * diagmat(1 / eigval) * eigvec.t() * step; + cs * (2 - cs) * muEffective) * eigvec * + diagmat(1 / eigval) * eigvec.t() * step; } else { ps[idx1] = (1 - cs) * ps[idx0] + std::sqrt( - cs * (2 - cs) * muEffective) * step * - eigvec * diagmat(1 / eigval) * eigvec.t(); + cs * (2 - cs) * muEffective) * step * eigvec * + diagmat(1 / eigval) * eigvec.t(); } - const ElemType psNorm = arma::norm(ps[idx1]); + const ElemType psNorm = norm(ps[idx1]); sigma(idx1) = sigma(idx0) * std::exp(cs / ds * (psNorm / enn - 1)); if (std::isnan(sigma(idx1)) || sigma(idx1) > 1e14) { Warn << "The step size diverged to " << sigma(idx1) << "; " - << "terminating with failure. Try a smaller step size?" << std::endl; + << "terminating with failure. Try a smaller step size?" 
<< std::endl; iterate = transformationPolicy.Transform(iterate); @@ -278,20 +256,20 @@ typename MatType::elem_type CMAES iterate.n_cols) { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * - (pc[idx1] * pc[idx1].t()); + (pc[idx1] * pc[idx1].t()); } else { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * - (pc[idx1].t() * pc[idx1]); + (pc[idx1].t() * pc[idx1]); } } else @@ -301,12 +279,12 @@ typename MatType::elem_type CMAES iterate.n_cols) { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * (pc[idx1] * - pc[idx1].t() + (cc * (2 - cc)) * C[idx0]); + pc[idx1].t() + (cc * (2 - cc)) * C[idx0]); } else { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * - (pc[idx1].t() * pc[idx1] + (cc * (2 - cc)) * C[idx0]); + (pc[idx1].t() * pc[idx1] + (cc * (2 - cc)) * C[idx0]); } } @@ -314,21 +292,19 @@ typename MatType::elem_type CMAES patience) { Info << "CMA-ES: minimized within tolerance " << tolerance << "; " - << "terminating optimization." << std::endl; + << "terminating optimization." << std::endl; iterate = transformationPolicy.Transform(iterate); Callback::EndOptimization(*this, function, iterate, callbacks...); diff --git a/inst/include/ensmallen_bits/cmaes/not_empty_transformation.hpp b/inst/include/ensmallen_bits/cmaes/not_empty_transformation.hpp deleted file mode 100644 index 5252a42..0000000 --- a/inst/include/ensmallen_bits/cmaes/not_empty_transformation.hpp +++ /dev/null @@ -1,42 +0,0 @@ -/** - * @file not_empty_transformation.hpp - * @author Suvarsha Chennareddy - * - * Check whether TransformationPolicyType is EmptyTransformation. - * - * ensmallen is free software; you may redistribute it and/or modify it under - * the terms of the 3-clause BSD license. You should have received a copy of - * the 3-clause BSD license along with ensmallen. If not, see - * http://www.opensource.org/licenses/BSD-3-Clause for more information. 
- */ -#ifndef ENSMALLEN_CMAES_NOT_EMPTY_TRANSFORMATION -#define ENSMALLEN_CMAES_NOT_EMPTY_TRANSFORMATION - -/** - * This partial specialization is used to throw an exception when the - * TransformationPolicyType is EmptyTransformation and call a constructor with - * parameters 'lowerBound' and 'upperBound' otherwise. This shall be removed - * when the deprecated constructor is removed in the next major version of - * ensmallen. - */ -template -struct NotEmptyTransformation : std::true_type -{ - void Assign(T1& obj, double lowerBound, double upperBound) - { - obj = T1(lowerBound, upperBound); - } -}; - -template class T, typename... A, typename... B> -struct NotEmptyTransformation, T> : std::false_type -{ - void Assign(T& /* obj */, - double /* lowerBound */, - double /* upperBound */) - { - throw std::logic_error("TransformationPolicyType is EmptyTransformation"); - } -}; - -#endif diff --git a/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp b/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp index 7614305..17e1bcc 100644 --- a/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp +++ b/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp @@ -6,7 +6,7 @@ * Definition of the IPOP Covariance Matrix Adaptation Evolution Strategy * as proposed by A. Auger and N. Hansen in "A Restart CMA Evolution * Strategy With Increasing Population Size" and BIPOP Covariance Matrix - * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking + * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking * a BI-population CMA-ES on the BBOB-2009 function testbed". * * ensmallen is free software; you may redistribute it and/or modify it under @@ -24,55 +24,59 @@ namespace ens { /** * Population-based CMA-ES (POP-CMA-ES) that can operate as either IPOP-CMA-ES * or BIPOP-CMA-ES based on a flag. - * + * * IPOP CMA-ES is a variant of the stochastic search algorithm * CMA-ES - Covariance Matrix Adaptation Evolution Strategy. 
- * IPOP CMA-ES, also known as CMAES with increasing population size, + * IPOP CMA-ES, also known as CMAES with increasing population size, * incorporates a restart strategy that involves gradually increasing - * the population size. This approach is specifically designed to + * the population size. This approach is specifically designed to * enhance the performance of CMA-ES on multi-modal functions. * * For more information, please refer to: * * @code * @INPROCEEDINGS{1554902, - * author={Auger, A. and Hansen, N.}, - * booktitle={2005 IEEE Congress on Evolutionary Computation}, - * title={A restart CMA evolution strategy with increasing population size}, - * year={2005}, - * volume={2}, - * number={}, - * pages={1769-1776 Vol. 2}, - * doi={10.1109/CEC.2005.1554902}} + * author = {Auger, A. and Hansen, N.}, + * booktitle = {2005 IEEE Congress on Evolutionary Computation}, + * title = {A restart CMA evolution strategy with increasing population + * size}, + * year = {2005}, + * volume = {2}, + * number = {}, + * pages = {1769-1776 Vol. 2}, + * doi = {10.1109/CEC.2005.1554902}} * @endcode - * + * * IPOP CMA-ES can optimize separable functions. For more details, see the * documentation on function types included with this distribution or on the * ensmallen website. - * + * * BI-Population CMA-ES is a variant of the stochastic search algorithm * CMA-ES - Covariance Matrix Adaptation Evolution Strategy. - * It implements a dual restart strategy with varying population sizes: one + * It implements a dual restart strategy with varying population sizes: one * increasing and one with smaller, varied sizes. This BI-population approach - * is designed to optimize performance on multi-modal function testbeds by + * is designed to optimize performance on multi-modal function testbeds by * leveraging different exploration and exploitation dynamics. 
* * For more information, please refer to: * * @code * @inproceedings{hansen2009benchmarking, - * title={Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed}, - * author={Hansen, Nikolaus}, - * booktitle={Proceedings of the 11th annual conference companion on genetic and evolutionary computation conference: late breaking papers}, - * pages={2389--2396}, - * year={2009}} + * title = {Benchmarking a BI-population CMA-ES on the BBOB-2009 function + * testbed}, + * author = {Hansen, Nikolaus}, + * booktitle = {Proceedings of the 11th annual conference companion on genetic + * and evolutionary computation conference: late breaking + * papers}, + * pages = {2389--2396}, + * year = {2009}} * @endcode * * BI-Population CMA-ES can efficiently handle separable, multimodal, and weak - * structure functions across various dimensions, as demonstrated in the + * structure functions across various dimensions, as demonstrated in the * comprehensive results of the BBOB-2009 function testbed. The optimizer - * utilizes an interlaced multistart strategy to balance between broad - * exploration and intensive exploitation, adjusting population sizes and + * utilizes an interlaced multistart strategy to balance between broad + * exploration and intensive exploitation, adjusting population sizes and * step-sizes dynamically. */ template public: /** * Construct the POP-CMA-ES optimizer with the given parameters. - * Other than the same CMA-ES parameters, it also adds the maximum number of - * restarts, the increase in population factor, the maximum number of + * Other than the same CMA-ES parameters, it also adds the maximum number of + * restarts, the increase in population factor, the maximum number of * evaluations, as well as a flag indicating to use BIPOP or not. * The suggested values are not necessarily good for the given problem, so it * is suggested that the values used be tailored to the task at hand. 
The * maximum number of iterations refers to the maximum number of points that * are processed (i.e., one iteration equals one point; one iteration does not * equal one pass over the dataset). - * + * * @param lambda The initial population size (0 use the default size). * @param transformationPolicy Instantiated transformation policy used to * map the coordinates to the desired domain. @@ -107,7 +111,7 @@ class POP_CMAES : public CMAES * @param maxFunctionEvaluations Maximum number of function evaluations. */ POP_CMAES(const size_t lambda = 0, - const TransformationPolicyType& transformationPolicy = + const TransformationPolicyType& transformationPolicy = TransformationPolicyType(), const size_t batchSize = 32, const size_t maxIterations = 1000, @@ -161,11 +165,13 @@ class POP_CMAES : public CMAES // Define IPOP_CMAES and BIPOP_CMAES using the POP_CMAES template template> -using IPOP_CMAES = POP_CMAES; +using IPOP_CMAES = POP_CMAES< + SelectionPolicyType, TransformationPolicyType, false>; template> -using BIPOP_CMAES = POP_CMAES; +using BIPOP_CMAES = POP_CMAES< + SelectionPolicyType, TransformationPolicyType, true>; } // namespace ens diff --git a/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp b/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp index 38b8011..2fa26c1 100644 --- a/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp +++ b/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp @@ -6,7 +6,7 @@ * Implementation of the IPOP Covariance Matrix Adaptation Evolution Strategy * as proposed by A. Auger and N. Hansen in "A Restart CMA Evolution * Strategy With Increasing Population Size" and BIPOP Covariance Matrix - * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking + * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking * a BI-population CMA-ES on the BBOB-2009 function testbed". 
* * ensmallen is free software; you may redistribute it and/or modify it under @@ -48,7 +48,7 @@ template template -typename MatType::elem_type POP_CMAES::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -65,9 +65,9 @@ typename MatType::elem_type POP_CMAES::Optimize(function, iterate, sbc, - callbacks...); + ElemType overallObjective = CMAES::Optimize(function, iterate, sbc, + callbacks...); overallSBC = sbc; ElemType objective; @@ -85,7 +85,7 @@ typename MatType::elem_type POP_CMAESPopulationSize() << "." << std::endl; - + iterate = iterateIn; // Optimize using the CMAES object. - objective = CMAES::Optimize(function, iterate, sbc, + objective = CMAES::Optimize(function, iterate, sbc, callbacks...); evaluations = this->FunctionEvaluations(); @@ -110,10 +110,10 @@ typename MatType::elem_type POP_CMAES(); - size_t smallLambda = static_cast(defaultLambda * std::pow(0.5 * + size_t smallLambda = static_cast(defaultLambda * std::pow(0.5 * currentLargeLambda / defaultLambda, u * u)); double stepSizeSmall = 2 * std::pow(10, -2 * arma::randu()); - + this->PopulationSize() = smallLambda; this->StepSize() = stepSizeSmall; @@ -121,10 +121,10 @@ typename MatType::elem_type POP_CMAESPopulationSize() << "." << std::endl; iterate = iterateIn; - + // Optimize using the CMAES object. - objective = CMAES::Optimize(function, iterate, sbc, + objective = CMAES::Optimize(function, iterate, sbc, callbacks...); evaluations = this->FunctionEvaluations(); @@ -160,4 +160,4 @@ typename MatType::elem_type POP_CMAES + template void Reproduce(std::vector& population, const MatType& fitnessValues, - arma::uvec& index); + IndexType& index); //! Modify weights with some noise for the evolution of next generation. - template - void Mutate(std::vector& population, arma::uvec& index); + template + void Mutate(std::vector& population, IndexType& index); /** * Crossover parents and create new childs. Two parents create two new childs. 
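The `cne.hpp` hunk above generalizes `Reproduce()` and `Mutate()` from a hard-coded `arma::uvec` index to a template `IndexType` parameter, so the sort permutation can live in whatever index container the matrix backend supplies. A minimal stand-alone sketch of the same pattern, using `std::vector<std::size_t>` as a hypothetical index type (a stand-in for illustration, not ensmallen's actual API):

```c++
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Stand-in for arma::sort_index(): fill `index` with the permutation that
// orders `fitnessValues` ascending (smaller fitness means a better candidate,
// as in CNE::Reproduce()). Written against any index container offering
// resize(), begin()/end(), and operator[], mirroring the IndexType template
// parameter introduced in the hunk above.
template<typename ValueType, typename IndexType>
void SortIndex(const std::vector<ValueType>& fitnessValues, IndexType& index)
{
  index.resize(fitnessValues.size());
  std::iota(index.begin(), index.end(), std::size_t{0});
  std::sort(index.begin(), index.end(),
      [&fitnessValues](std::size_t a, std::size_t b)
      { return fitnessValues[a] < fitnessValues[b]; });
}
```

With `IndexType = std::vector<std::size_t>` this compiles against the standard library alone; the same call sites keep working when a backend supplies a different index vector type.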
diff --git a/inst/include/ensmallen_bits/cne/cne_impl.hpp b/inst/include/ensmallen_bits/cne/cne_impl.hpp index 24d1812..6e74a75 100644 --- a/inst/include/ensmallen_bits/cne/cne_impl.hpp +++ b/inst/include/ensmallen_bits/cne/cne_impl.hpp @@ -47,6 +47,7 @@ typename MatType::elem_type CNE::Optimize(ArbitraryFunctionType& function, // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::uvec UVecType; // Make sure that we have the methods that we need. Long name... traits::CheckArbitraryFunctionTypeAPI population; for (size_t i = 0 ; i < populationSize; ++i) { - population.push_back(arma::randn(iterate.n_rows, - iterate.n_cols) + iterate); + population.push_back(BaseMatType(iterate.n_rows, iterate.n_cols, + GetFillType::randn) + iterate); } // Store the number of elements in the objective matrix. @@ -164,13 +165,13 @@ typename MatType::elem_type CNE::Optimize(ArbitraryFunctionType& function, } //! Reproduce candidates to create the next generation. -template +template inline void CNE::Reproduce(std::vector& population, const MatType& fitnessValues, - arma::uvec& index) + IndexType& index) { // Sort fitness values. Smaller fitness value means better performance. - index = arma::sort_index(fitnessValues); + index = sort_index(fitnessValues); // First parent. size_t mom; @@ -241,17 +242,20 @@ inline void CNE::Crossover(std::vector& population, } //! Modify weights with some noise for the evolution of next generation. -template -inline void CNE::Mutate(std::vector& population, arma::uvec& index) +template +inline void CNE::Mutate(std::vector& population, IndexType& index) { + typedef typename MatType::elem_type ElemType; + // Mutate the whole matrix with the given rate and probability. // The best candidate is not altered. 
for (size_t i = 1; i < populationSize; i++) { - population[index(i)] += (arma::randu(population[index(i)].n_rows, - population[index(i)].n_cols) < mutationProb) % - (mutationSize * arma::randn(population[index(i)].n_rows, - population[index(i)].n_cols)); + population[index(i)] += conv_to::from( + randu(population[index(i)].n_rows, + population[index(i)].n_cols) < ElemType(mutationProb)) % + (ElemType(mutationSize) * MatType(population[index(i)].n_rows, + population[index(i)].n_cols, GetFillType::randn)); } } diff --git a/inst/include/ensmallen_bits/de/de.hpp b/inst/include/ensmallen_bits/de/de.hpp index 93c41aa..75449a6 100644 --- a/inst/include/ensmallen_bits/de/de.hpp +++ b/inst/include/ensmallen_bits/de/de.hpp @@ -45,10 +45,10 @@ namespace ens { * * @code * @techreport{storn1995, - * title = {Differential Evolution—a simple and efficient adaptive scheme - * for global optimization over continuous spaces}, - * author = {Storn, Rainer and Price, Kenneth}, - * year = 1995 + * title = {Differential Evolution—a simple and efficient adaptive scheme + * for global optimization over continuous spaces}, + * author = {Storn, Rainer and Price, Kenneth}, + * year = 1995 * } * @endcode * diff --git a/inst/include/ensmallen_bits/de/de_impl.hpp b/inst/include/ensmallen_bits/de/de_impl.hpp index 09e55a0..c44cef6 100644 --- a/inst/include/ensmallen_bits/de/de_impl.hpp +++ b/inst/include/ensmallen_bits/de/de_impl.hpp @@ -40,14 +40,16 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::vec ColType; BaseMatType& iterate = (BaseMatType&) iterateIn; // Population matrix. Each column is a candidate. std::vector population; population.resize(populationSize); + // Vector of fitness values corresponding to each candidate. 
- arma::Col fitnessValues; + ColType fitnessValues; // Make sure that we have the methods that we need. Long name... traits::CheckArbitraryFunctionTypeAPI< @@ -57,13 +59,13 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, // Population Size must be at least 3 for DE to work. if (populationSize < 3) { - throw std::logic_error("CNE::Optimize(): population size should be at least" + throw std::logic_error("DE::Optimize(): population size should be at least" " 3!"); } // Initialize helper variables. fitnessValues.set_size(populationSize); - ElemType lastBestFitness = DBL_MAX; + ElemType lastBestFitness = std::numeric_limits::max(); BaseMatType bestElement; // Controls early termination of the optimization process. @@ -82,7 +84,7 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, if (fitnessValues[i] < lastBestFitness) { - lastBestFitness = fitnessValues[i]; + lastBestFitness = ElemType(fitnessValues[i]); bestElement = population[i]; } } @@ -111,16 +113,17 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, while (m == member && m == l); // Generate new "mutant" from two randomly chosen members. - BaseMatType mutant = bestElement + differentialWeight * + BaseMatType mutant = bestElement + ElemType(differentialWeight) * (population[l] - population[m]); // Perform crossover. - const BaseMatType cr = arma::randu(iterate.n_rows); + BaseMatType cr; + cr.randu(iterate.n_rows, 1); for (size_t it = 0; it < iterate.n_rows; it++) { - if (cr[it] >= crossoverRate) + if (cr[it] >= ElemType(crossoverRate)) { - mutant[it] = iterate[it]; + mutant(it) = ElemType(iterate(it)); } } @@ -158,7 +161,7 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, } // Update helper variables. 
- lastBestFitness = fitnessValues.min(); + lastBestFitness = ElemType(fitnessValues.min()); for (size_t it = 0; it < populationSize; it++) { if (fitnessValues[it] == lastBestFitness) diff --git a/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp b/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp index e524531..e37b24e 100644 --- a/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp +++ b/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp @@ -31,11 +31,11 @@ namespace ens { * * @code * @misc{ - * title = {Decaying momentum helps neural network training}, - * author = {John Chen and Cameron Wolfe and Zhao Li - * and Anastasios Kyrillidis}, - * url = {https://arxiv.org/abs/1910.04952} - * year = {2019} + * title = {Decaying momentum helps neural network training}, + * author = {John Chen and Cameron Wolfe and Zhao Li + * and Anastasios Kyrillidis}, + * url = {https://arxiv.org/abs/1910.04952} + * year = {2019} * } * * DemonAdam can optimize differentiable separable functions. For more details, diff --git a/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp b/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp index 47f6b36..b7581da 100644 --- a/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp +++ b/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp @@ -90,6 +90,7 @@ class DemonAdamUpdate // Convenient typedef. 
typedef typename UpdateRule::template Policy InstUpdateRuleType; + typedef typename MatType::elem_type ElemType; /** * This constructor is called by the SGD Optimize() method before the start @@ -103,7 +104,8 @@ class DemonAdamUpdate const size_t rows, const size_t cols) : parent(parent), - adamUpdate(new InstUpdateRuleType(parent.adamUpdateInst, rows, cols)) + adamUpdate(new InstUpdateRuleType(parent.adamUpdateInst, rows, cols)), + betaInit(ElemType(parent.betaInit)) { /* Nothing to do here */ } /** @@ -125,12 +127,12 @@ class DemonAdamUpdate const double stepSize, const GradType& gradient) { - double decayRate = 1; + ElemType decayRate = 1; if (parent.t > 0) - decayRate = 1.0 - (double) parent.t / (double) parent.T; + decayRate = 1 - ElemType((double) parent.t / (double) parent.T); - const double betaDecay = parent.betaInit * decayRate; - const double beta = betaDecay / ((1.0 - parent.betaInit) + betaDecay); + const ElemType betaDecay = betaInit * decayRate; + const ElemType beta = betaDecay / ((1 - betaInit) + betaDecay); // Perform the update. iterate *= beta; @@ -143,11 +145,14 @@ class DemonAdamUpdate } private: - //! Instantiated parent object. + // Instantiated parent object. DemonAdamUpdate& parent; - //! The update policy. + // The update policy. InstUpdateRuleType* adamUpdate; + + // Optimizer parameter converted to the element type of the optimization. 
+    ElemType betaInit;
   };

  private:
diff --git a/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp b/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp
index ddbf1d2..4c8d514 100644
--- a/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp
+++ b/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp
@@ -25,11 +25,11 @@ namespace ens {
 *
 * @code
 * @misc{
- *   title  = {Decaying momentum helps neural network training},
- *   author = {John Chen and Cameron Wolfe and Zhao Li
- *             and Anastasios Kyrillidis},
- *   url    = {https://arxiv.org/abs/1910.04952}
- *   year   = {2019}
+ *   title  = {Decaying momentum helps neural network training},
+ *   author = {John Chen and Cameron Wolfe and Zhao Li
+ *             and Anastasios Kyrillidis},
+ *   url    = {https://arxiv.org/abs/1910.04952},
+ *   year   = {2019}
 * }
 *
 * DemonSGD can optimize differentiable separable functions. For more details,
diff --git a/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp b/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp
index dc8b7c5..41638a3 100644
--- a/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp
+++ b/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp
@@ -78,6 +78,8 @@ class DemonSGDUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -89,7 +91,8 @@ class DemonSGDUpdate Policy(DemonSGDUpdate& parent, const size_t /* rows */, const size_t /* cols */) : - parent(parent) + parent(parent), + betaInit(ElemType(parent.betaInit)) { /* Nothing to do here */ } /** @@ -103,34 +106,37 @@ class DemonSGDUpdate const double stepSize, const GradType& gradient) { - double decayRate = 1; + ElemType decayRate = 1; if (parent.t > 0) - decayRate = 1.0 - (double) parent.t / (double) parent.T; + decayRate = 1 - ElemType((double) parent.t / (double) parent.T); - const double betaDecay = parent.betaInit * decayRate; - const double beta = betaDecay / ((1.0 - parent.betaInit) + betaDecay); + const ElemType betaDecay = betaInit * decayRate; + const ElemType beta = betaDecay / ((1 - betaInit) + betaDecay); // Perform the update. iterate *= beta; - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; // Increment the iteration counter variable. ++parent.t; } private: - //! Instantiated parent object. + // Instantiated parent object. DemonSGDUpdate& parent; + + // Optimizer parameter converted to the element type of the optimization. + ElemType betaInit; }; private: - //! The number of momentum iterations. + // The number of momentum iterations. size_t T; - //! Initial momentum coefficient. + // Initial momentum coefficient. double betaInit; - //! The number of iterations. + // The number of iterations. size_t t; }; diff --git a/inst/include/ensmallen_bits/ens_version.hpp b/inst/include/ensmallen_bits/ens_version.hpp index 86c29ba..d6bf013 100644 --- a/inst/include/ensmallen_bits/ens_version.hpp +++ b/inst/include/ensmallen_bits/ens_version.hpp @@ -12,20 +12,20 @@ // This follows the Semantic Versioning pattern defined in https://semver.org/. -#define ENS_VERSION_MAJOR 2 +#define ENS_VERSION_MAJOR 3 // The minor version is two digits so regular numerical comparisons of versions // work right. The first minor version of a release is always 10. 
-#define ENS_VERSION_MINOR 22 -#define ENS_VERSION_PATCH 1 +#define ENS_VERSION_MINOR 10 +#define ENS_VERSION_PATCH 0 // If this is a release candidate, it will be reflected in the version name // (i.e. the version name will be "RC1", "RC2", etc.). Otherwise the version // name will typically be a seemingly arbitrary set of words that does not // contain the capitalized string "RC". -#define ENS_VERSION_NAME "E-Bike Excitement" +#define ENS_VERSION_NAME "Unexpected Rain" // Incorporate the date the version was released. -#define ENS_VERSION_YEAR "2024" -#define ENS_VERSION_MONTH "12" -#define ENS_VERSION_DAY "02" +#define ENS_VERSION_YEAR "2025" +#define ENS_VERSION_MONTH "09" +#define ENS_VERSION_DAY "25" namespace ens { diff --git a/inst/include/ensmallen_bits/eve/eve.hpp b/inst/include/ensmallen_bits/eve/eve.hpp index cc38591..c240c6d 100644 --- a/inst/include/ensmallen_bits/eve/eve.hpp +++ b/inst/include/ensmallen_bits/eve/eve.hpp @@ -106,7 +106,7 @@ class Eve typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/eve/eve_impl.hpp b/inst/include/ensmallen_bits/eve/eve_impl.hpp index 3237a4e..7cf4c7d 100644 --- a/inst/include/ensmallen_bits/eve/eve_impl.hpp +++ b/inst/include/ensmallen_bits/eve/eve_impl.hpp @@ -49,8 +49,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type Eve::Optimize(SeparableFunctionType& function, MatType& iterateIn, CallbackTypes&&... 
callbacks) @@ -126,29 +126,37 @@ Eve::Optimize(SeparableFunctionType& function, if (terminate) break; - m *= beta1; - m += (1 - beta1) * gradient; + m *= ElemType(beta1); + m += (1 - ElemType(beta1)) * gradient; - v *= beta2; - v += (1 - beta2) * (gradient % gradient); + v *= ElemType(beta2); + v += (1 - ElemType(beta2)) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(beta1, (double) (i + 1)); - const double biasCorrection2 = 1.0 - std::pow(beta2, (double) (i + 1)); + const ElemType biasCorrection1 = + 1 - std::pow(ElemType(beta1), ElemType(i + 1)); + const ElemType biasCorrection2 = + 1 - std::pow(ElemType(beta2), ElemType(i + 1)); if (i > 0) { const ElemType d = std::abs(objective - lastObjective) / - (std::min(objective, lastObjective) + epsilon); + (std::min(objective, lastObjective) + ElemType(epsilon)); - dt *= beta3; - dt += (1 - beta3) * std::min(std::max(d, ElemType(1.0 / clip)), + dt *= ElemType(beta3); + dt += (1 - ElemType(beta3)) * std::min(std::max(d, ElemType(1.0 / clip)), ElemType(clip)); } lastObjective = objective; - iterate -= stepSize / dt * (m / biasCorrection1) / - (arma::sqrt(v / biasCorrection2) + epsilon); + // TODO: remove in ensmallen 4.0.0. + #if defined(ENS_OLD_SEPARABLE_STEP_BEHAVIOR) + iterate -= ElemType(stepSize) / dt * (m / biasCorrection1) / + (sqrt(v / biasCorrection2) + ElemType(epsilon)); + #else + iterate -= (ElemType(stepSize) / (dt * effectiveBatchSize)) * + (m / biasCorrection1) / (sqrt(v / biasCorrection2) + ElemType(epsilon)); + #endif terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); diff --git a/inst/include/ensmallen_bits/fasta/fasta.hpp b/inst/include/ensmallen_bits/fasta/fasta.hpp new file mode 100644 index 0000000..3f41815 --- /dev/null +++ b/inst/include/ensmallen_bits/fasta/fasta.hpp @@ -0,0 +1,220 @@ +/** + * @file fasta.hpp + * @author Ryan Curtin + * + * An implementation of FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm). 
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FASTA_FASTA_HPP
+#define ENSMALLEN_FASTA_FASTA_HPP
+
+#include "../fbs/l1_penalty.hpp"
+#include "../fbs/l1_constraint.hpp"
+
+namespace ens {
+
+/**
+ * FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm) is a proximal
+ * gradient optimization technique for optimizing a function of the form
+ *
+ *   h(x) = f(x) + g(x)
+ *
+ * where f(x) is a differentiable function and g(x) is an arbitrary
+ * non-differentiable function. In such a situation, standard gradient descent
+ * techniques cannot work because of the non-differentiability of g(x). To work
+ * around this, FASTA takes a _forward step_ that is just a gradient descent
+ * step on f(x), and then a _backward step_ that is the _proximal operator_
+ * corresponding to g(x). This continues until convergence.
+ *
+ * This implementation of FASTA allows specification of the backward step (or
+ * proximal operator) via the `BackwardStepType` template parameter. When using
+ * FASTA, the differentiable `FunctionType` given to `Optimize()` should be
+ * f(x), *not* the combined function h(x). g(x) should be specified by the
+ * choice of `BackwardStepType` (e.g. `L1Penalty` or `L1Maximum`). The
+ * `Optimize()` function will then return optimized coordinates for h(x), not
+ * f(x).
+ *
+ * For more information, see the following paper:
+ *
+ * ```
+ * @article{goldstein2014field,
+ *   title={A field guide to forward-backward splitting with a FASTA
+ *          implementation},
+ *   author={Goldstein, Tom and Studer, Christoph and Baraniuk, Richard},
+ *   journal={arXiv preprint arXiv:1411.3406},
+ *   year={2014}
+ * }
+ * ```
+ */
+template<typename BackwardStepType>
+class FASTA
+{
+ public:
+  /**
+   * Construct the FASTA optimizer with the given options, using a
+   * default-constructed BackwardStepType.
+   */
+  FASTA(const size_t maxIterations = 10000,
+        const double tolerance = 1e-7,
+        const size_t maxLineSearchSteps = 50,
+        const double stepSizeAdjustment = 2.0,
+        const size_t lineSearchLookback = 10,
+        const bool estimateStepSize = true,
+        const size_t estimateTrials = 10,
+        const double maxStepSize = 0.001);
+
+  /**
+   * Construct the FASTA optimizer with the given options.
+   */
+  FASTA(BackwardStepType backwardStepType,
+        const size_t maxIterations = 10000,
+        const double tolerance = 1e-7,
+        const size_t maxLineSearchSteps = 50,
+        const double stepSizeAdjustment = 2.0,
+        const size_t lineSearchLookback = 10,
+        const bool estimateStepSize = true,
+        const size_t estimateTrials = 10,
+        const double maxStepSize = 0.001);
+
+  /**
+   * Optimize the given function using FASTA. The given starting point will be
+   * modified to store the finishing point of the algorithm, and the final
+   * objective value is returned.
+   *
+   * The FunctionType template class must provide the following functions:
+   *
+   *   double Evaluate(const arma::mat& coordinates);
+   *   void Gradient(const arma::mat& coordinates,
+   *                 arma::mat& gradient);
+   *
+   * @tparam FunctionType Type of function to be optimized.
+   * @tparam MatType Type of objective matrix.
+   * @tparam GradType Type of gradient matrix (default is MatType).
+   * @tparam CallbackTypes Types of callback functions.
+   * @param function Function to be optimized.
+   * @param iterate Input with starting point, and will be modified to save
+   *     the output optimal solution coordinates.
+   * @param callbacks Callback functions.
+   * @return Objective value at the final solution.
+   */
+  template<typename FunctionType,
+           typename MatType,
+           typename GradType,
+           typename... CallbackTypes>
+  typename std::enable_if<IsArmaType<GradType>::value,
+      typename MatType::elem_type>::type
+  Optimize(FunctionType& function,
+           MatType& iterate,
+           CallbackTypes&&... callbacks);
+
+  //! Forward the MatType as GradType.
+  template<typename FunctionType, typename MatType, typename... CallbackTypes>
+  typename MatType::elem_type Optimize(FunctionType& function,
+                                       MatType& iterate,
+                                       CallbackTypes&&... callbacks)
+  {
+    return Optimize<FunctionType, MatType, MatType, CallbackTypes...>(
+        function, iterate, std::forward<CallbackTypes>(callbacks)...);
+  }
+
+  //! Get the backward step object.
+  const BackwardStepType& BackwardStep() const { return backwardStep; }
+  //! Modify the backward step object.
+  BackwardStepType& BackwardStep() { return backwardStep; }
+
+  //! Get the maximum number of iterations (0 indicates no limit).
+  size_t MaxIterations() const { return maxIterations; }
+  //! Modify the maximum number of iterations (0 indicates no limit).
+  size_t& MaxIterations() { return maxIterations; }
+
+  //! Get the tolerance on the gradient norm for termination.
+  double Tolerance() const { return tolerance; }
+  //! Modify the tolerance on the gradient norm for termination.
+  double& Tolerance() { return tolerance; }
+
+  //! Get the maximum number of line search steps.
+  size_t MaxLineSearchSteps() const { return maxLineSearchSteps; }
+  //! Modify the maximum number of line search steps.
+  size_t& MaxLineSearchSteps() { return maxLineSearchSteps; }
+
+  //! Get the step size adjustment parameter.
+  double StepSizeAdjustment() const { return stepSizeAdjustment; }
+  //! Modify the step size adjustment parameter.
+  double& StepSizeAdjustment() { return stepSizeAdjustment; }
+
+  //! Get the maximum number of iterations to look back during a line search.
+  size_t LineSearchLookback() const { return lineSearchLookback; }
+  //! Modify the maximum number of iterations to look back during a line search.
+  size_t& LineSearchLookback() { return lineSearchLookback; }
+
+  //! Get whether or not to estimate the initial step size.
+  bool EstimateStepSize() const { return estimateStepSize; }
+  //! Modify whether or not to estimate the initial step size.
+  bool& EstimateStepSize() { return estimateStepSize; }
+
+  //! Get the number of trials to use for Lipschitz constant estimation.
+  size_t EstimateTrials() const { return estimateTrials; }
+  //! Modify the number of trials to use for Lipschitz constant estimation.
+  size_t& EstimateTrials() { return estimateTrials; }
+
+  //! Get the maximum step size. If Optimize() has been called, this will
+  //! contain the estimated maximum step size value.
+  double MaxStepSize() const { return maxStepSize; }
+  //! Modify the maximum step size (ignored if EstimateStepSize() is true).
+  double& MaxStepSize() { return maxStepSize; }
+
+ private:
+  //! Utility function: fill with random values.
+  template<typename MatType>
+  static void RandomFill(MatType& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const typename MatType::elem_type maxVal);
+
+  template<typename eT>
+  static void RandomFill(arma::SpMat<eT>& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const eT maxVal);
+
+  template<typename FunctionType, typename MatType>
+  void EstimateLipschitzStepSize(FunctionType& f, const MatType& x);
+
+  //! The instantiated backward step object.
+  BackwardStepType backwardStep;
+
+  //! The maximum number of allowed iterations.
+  size_t maxIterations;
+
+  //! The tolerance for termination.
+  double tolerance;
+
+  //! The maximum number of line search trials.
+  size_t maxLineSearchSteps;
+
+  //! The step size adjustment parameter for the line search.
+  double stepSizeAdjustment;
+
+  //! The maximum number of iterations to look back during a line search.
+  size_t lineSearchLookback;
+
+  //! Whether or not to try and estimate the initial step size.
+  bool estimateStepSize;
+
+  //! Number of trials to use for initial step size estimation.
+  size_t estimateTrials;
+
+  //! The maximum step size to use (estimated if estimateStepSize is true).
+  double maxStepSize;
+};
+
+} // namespace ens
+
+// Include implementation.
+#include "fasta_impl.hpp"
+
+#endif
diff --git a/inst/include/ensmallen_bits/fasta/fasta_impl.hpp b/inst/include/ensmallen_bits/fasta/fasta_impl.hpp
new file mode 100644
index 0000000..ee2331a
--- /dev/null
+++ b/inst/include/ensmallen_bits/fasta/fasta_impl.hpp
@@ -0,0 +1,546 @@
+/**
+ * @file fasta_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FASTA_FASTA_IMPL_HPP
+#define ENSMALLEN_FASTA_FASTA_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "fasta.hpp"
+
+#include
+
+namespace ens {
+
+//! Constructor of the FASTA class.
+template<typename BackwardStepType>
+FASTA<BackwardStepType>::FASTA(const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const size_t lineSearchLookback,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    lineSearchLookback(lineSearchLookback),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check estimateSteps parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+
+  if (lineSearchLookback == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): lineSearchLookback cannot be "
+        "0!");
+  }
+}
+
+template<typename BackwardStepType>
+FASTA<BackwardStepType>::FASTA(BackwardStepType backwardStep,
+                               const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const size_t lineSearchLookback,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    backwardStep(std::move(backwardStep)),
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    lineSearchLookback(lineSearchLookback),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check estimateSteps parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+
+  if (lineSearchLookback == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): lineSearchLookback cannot be "
+        "0!");
+  }
+}
+
+//! Optimize the function (minimize).
+template<typename BackwardStepType>
+template<typename FunctionType,
+         typename MatType,
+         typename GradType,
+         typename... CallbackTypes>
+typename std::enable_if<IsArmaType<GradType>::value,
+    typename MatType::elem_type>::type
+FASTA<BackwardStepType>::Optimize(FunctionType& function,
+                                  MatType& iterateIn,
+                                  CallbackTypes&&... callbacks)
+{
+  // Convenience typedefs.
+  typedef typename MatType::elem_type ElemType;
+  typedef typename MatTypeTraits<MatType>::BaseMatType BaseMatType;
+  typedef typename MatTypeTraits<GradType>::BaseMatType BaseGradType;
+
+  typedef Function<FunctionType, BaseMatType, BaseGradType> FullFunctionType;
+  FullFunctionType& f = static_cast<FullFunctionType&>(function);
+
+  // Make sure we have all necessary functions.
+  traits::CheckFunctionTypeAPI<FullFunctionType, BaseMatType, BaseGradType>();
+  RequireFloatingPointType<BaseMatType>();
+  RequireFloatingPointType<BaseGradType>();
+  RequireSameInternalTypes<BaseMatType, BaseGradType>();
+
+  // Sanity check: make sure lineSearchLookback is valid.
+  if (lineSearchLookback == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): lineSearchLookback cannot be "
+        "0!");
+  }
+
+  // Here we make a copy because we will use std::move() internally, and if
+  // iterateIn is an alias, this is unsafe. We will copy the final result back
+  // to iterateIn at the end.
+  BaseMatType x(iterateIn);
+
+  // To keep track of the function value.
+  ElemType currentFObj = f.Evaluate(x);
+  ElemType currentGObj = backwardStep.Evaluate(x);
+  ElemType currentObj = currentFObj + currentGObj;
+
+  // This will be the denominator of the normalized residual termination
+  // condition.
+  ElemType firstResidual = ElemType(0);
+
+  // This will be used in the non-monotone line search, to track the last
+  // several function values.
+  arma::Col<ElemType> lastFObjs(lineSearchLookback);
+  lastFObjs.fill(std::numeric_limits<ElemType>::min());
+  size_t currentObjPos = 0;
+
+  BaseGradType g(x.n_rows, x.n_cols);
+  BaseMatType lastXHat; // Used for residual checks.
+  BaseMatType lastX;    // Used for residual and alpha reset checks.
+  BaseMatType xHat;     // Used for residual checks.
+  BaseMatType lpaX = x; // Used for alpha reset check.
+  ElemType alpha = ElemType(1); // Initialize alpha^1 = 1.
+  ElemType lastAlpha = alpha;
+
+  // Controls early termination of the optimization process.
+  bool terminate = false;
+
+  // First, estimate the Lipschitz constant to set the initial/maximum step
+  // size, if the user asked us to.
+  if (estimateStepSize)
+    EstimateLipschitzStepSize(f, x);
+
+  // Keep track of the last step size we used.
+  ElemType currentStepSize = (ElemType) maxStepSize;
+  ElemType lastStepSize = (ElemType) maxStepSize;
+
+  Callback::BeginOptimization(*this, f, x, callbacks...);
+  for (size_t i = 1; i != maxIterations && !terminate; ++i)
+  {
+    // During this optimization, we want to optimize h(x) = f(x) + g(x).
+    // f(x) is `f`, but g(x) is specified by `BackwardStepType`.
+
+    // The first step is to compute a step size via a non-monotone line search.
+ // To do this, we need to compute the gradient f'(y) as required by the line + // search condition in Eq. (38). Note that our code does a little sleight + // of hand, and so `x` stores what the paper calls `y^k` here. (See the + // code for the adaptive step below.) + currentFObj = f.EvaluateWithGradient(x, g); + terminate |= Callback::EvaluateWithGradient(*this, f, x, currentFObj, g, + callbacks...); + + // Use backtracking non-monotone line search to find the best step size. + // This is the version from the FASTA paper, but with a minor modification: + // we start our search at the last step size, and allow the search to + // increase the step size up to the maximum step size if it can. This is a + // more effective heuristic than simply starting at the largest allowable + // step size and shrinking from there, especially in regions where the + // gradient norm is small. It is also more effective than simply starting + // at the last step size and shrinking from there, as it prevents getting + // "stuck" with a very small step size. + bool lsDone = false; + size_t lsTrial = 0; + bool increasing = false; // Will be set during the first iteration. + ElemType lastFObj = ElemType(0); + BaseMatType lsLastX; // Only used in increasing mode. + BaseMatType lsLastXHat; // Only used in increasing mode. + BaseMatType xDiff; + + lastX = std::move(x); + lastStepSize = currentStepSize; + currentStepSize = std::min(currentStepSize, (ElemType) maxStepSize); + + // Ensure that the last `lineSearchLookback` objective values are recorded + // properly. 
+ lastFObjs[currentObjPos] = currentFObj; + currentObjPos = (currentObjPos + 1) % lineSearchLookback; + const ElemType strictMaxFObj = currentFObj; + const ElemType maxFObj = lastFObjs.max(); + + while (!lsDone && !terminate) + { + if (lsTrial == maxLineSearchSteps) + { + if (increasing) + { + Warn << "FASTA::Optimize(): line search reached maximum number of " + << "steps (" << maxLineSearchSteps << "); using step size " + << currentStepSize << "." << std::endl; + break; // The step size is still valid. + } + else + { + Warn << "FASTA::Optimize(): could not find valid step size in range " + << "(0, " << maxStepSize << "]! Terminating optimization." + << std::endl; + terminate = true; + break; + } + } + + // If the step size has converged to zero, we are done. + if (currentStepSize == ElemType(0)) + { + Warn << "FASTA::Optimize(): computed zero step size; terminating " + << "optimization." << std::endl; + terminate = true; + break; + } + + // Perform forward update into x. + xHat = lastX - currentStepSize * g; + // (We must store xHat separately for the residual, so this copy is + // necessary.) + x = xHat; + backwardStep.ProximalStep(x, currentStepSize); + + // Compute objective of new point. + const ElemType fObj = f.Evaluate(x); + terminate |= Callback::Evaluate(*this, f, x, fObj, callbacks...); + + // Compute the quadratic approximation of the objective (the condition in + // Eq. (38)). + xDiff = (x - lastX); + + // Note: since we allow the step size to increase, we have to modify the + // non-monotone line search a little bit to keep things from diverging. + // Specifically, if we are increasing the step size, then we force a + // monotone line search (by looking only at the previous function value). + // It is only when we are decreasing the step size that we allow + // relaxation. 
+ const ElemType relaxedCond = maxFObj + dot(xDiff, g) + + (1 / (2 * currentStepSize)) * dot(xDiff, xDiff); + const ElemType strictCond = strictMaxFObj + dot(xDiff, g) + + (1 / (2 * currentStepSize)) * dot(xDiff, xDiff); + + // If we're on the first iteration, we don't know if we should be + // searching for a step size by increasing or decreasing the step size. + // (Remember that our valid ranges of step sizes are [0, maxStepSize], and + // we are starting at lastStepSize.) + // + // Thus, if the condition is satisfied, let's try increasing the step size + // until it's no longer satisfied. Otherwise, we will have to decrease + // the step size. + if (lsTrial == 0) + { + increasing = ((fObj <= strictCond) && (std::isfinite(fObj))); + } + + if (increasing) + { + // If we are in "increasing" mode, then termination occurs on the first + // iteration when the strict condition is *not* satisfied (and we use + // the last step size). + if ((fObj > strictCond) || (!std::isfinite(fObj))) + { + lsDone = true; + x = std::move(lsLastX); + xHat = std::move(lsLastXHat); + currentFObj = lastFObj; + currentStepSize = lastStepSize; // Take one step backwards. + } + else if (currentStepSize == (ElemType) maxStepSize) + { + // The condition is still satisfied, but the step size will be too big + // if we take another step. Go back to the maximum step size. + lsDone = true; + currentFObj = fObj; + } + else + { + // The condition is still satisfied; increase the step size. + lastStepSize = currentStepSize; + currentStepSize *= ElemType(stepSizeAdjustment); + lsLastX = std::move(x); + lsLastXHat = std::move(xHat); + lastFObj = fObj; + ++lsTrial; + } + } + else + { + // If we are in "decreasing" mode, then termination occurs on the first + // iteration when the relaxed condition is satisfied. + if ((fObj <= relaxedCond) && (std::isfinite(fObj))) + { + lsDone = true; + currentFObj = fObj; + } + else + { + // The condition is not yet satisfied; decrease the step size. 
+ currentStepSize /= ElemType(stepSizeAdjustment); + ++lsTrial; + } + } + } + + if (!lsDone) + { + // The line search failed, so terminate. + Warn << "FASTA::Optimize(): non-monotone line search failed after " + << maxLineSearchSteps << " steps; terminating optimization." + << std::endl; + x = std::move(lastX); + terminate = true; + } + + // If we terminated during the line search, we are done. + if (terminate) + break; + + // Now that we have taken a step, compute the full objective by computing + // g(x). + currentGObj = backwardStep.Evaluate(x); + currentObj = currentFObj + currentGObj; + + // Output current objective function. + Info << "FASTA::Optimize(): iteration " << i << ", combined objective " + << currentObj << " (f(x) = " << currentFObj << ", g(x) = " + << currentGObj << "), step size " << currentStepSize << "." + << std::endl; + + // Sanity check for divergence. + if ((i > 1) && !std::isfinite(currentObj)) + { + Warn << "FASTA::Optimize(): objective diverged to " + << currentObj << "; terminating optimization." << std::endl; + terminate = true; + break; + } + + // Now, check for convergence. The FASTA convergence check uses both the + // normalized residual and the relative residual, stopping when either + // becomes sufficiently small. The check depends on x before and after the + // proximal step. + + // Compute residual. This is Eq. (40) in the paper. + const ElemType residual = norm(g + (xHat - x) / currentStepSize, 2); + + // If this is the first iteration, store the residual as the first residual. + if (i == 1) + firstResidual = residual; + + // First, check the normalized residual for convergence. This is Eq. (43) + // in the paper. 
+    const ElemType eps = 20 * std::numeric_limits<ElemType>::epsilon();
+    const ElemType normalizedResidual = residual / (firstResidual + eps);
+
+    if ((i < 10) && (normalizedResidual < ElemType(1e-5)))
+    {
+      // Heuristic: sometimes the optimization starts in such an awful place
+      // that we are able to make huge amounts of progress in the first few
+      // iterations. In this case, reset the firstResidual to the slightly
+      // better point we get to by the tenth iterate.
+      firstResidual = residual;
+    }
+    else if ((i > 10) && (normalizedResidual < tolerance))
+    {
+      Info << "FASTA::Optimize(): normalized residual minimized within "
+          << "tolerance " << tolerance << "; terminating optimization."
+          << std::endl;
+      break;
+    }
+
+    // Next, check the relative residual for convergence. This is Eq. (42) in
+    // the paper.
+    const ElemType gNorm = norm(g, 2);
+    const ElemType proxStepNorm = norm((xHat - x) / currentStepSize, 2);
+
+    const ElemType relativeResidual = residual /
+        (std::max(gNorm, proxStepNorm) + 20 * eps);
+
+    if (relativeResidual < tolerance)
+    {
+      Info << "FASTA::Optimize(): relative residual minimized within "
+          << "tolerance " << tolerance << "; terminating optimization."
+          << std::endl;
+      break;
+    }
+
+    // Compute updated prediction parameter alpha.
+    lastAlpha = alpha;
+    alpha = (1 + std::sqrt(1 + 4 * std::pow(alpha, ElemType(2)))) / 2;
+
+    // Take a predictive step.
+    BaseMatType y = x + ((lastAlpha - 1) / alpha) * (x - lpaX);
+
+    // Sometimes alpha can get to be too large; this restart scheme is taken
+    // originally from O'Donoghue and Candes, "Adaptive restart for accelerated
+    // gradient schemes", 2012.
+    //
+    // The notation is confusing here when compared with Eq. (37) in the paper.
+    // This is because the paper is poorly notated, although it's not clear much
+    // has been done here to improve things. To translate:
+    //
+    //   Paper    Code    Explanation
+    //
+    //   y^k      lastX   This is the result of the predictive step on the
+    //                    previous iteration.
+    //                    In our code, we apply the predictive step to x,
+    //                    which next iteration becomes lastX.
+    //
+    //   x^k      x       This is the iterate before the predictive step, this
+    //                    iteration.
+    //
+    //   x^k-1    lpaX    "Last Pre-Accelerated X"---we have to take a specific
+    //                    step to store this.
+    //
+    const ElemType restartCheck = dot(lastX - x, x - lpaX);
+    if (restartCheck > 0)
+    {
+      Info << "FASTA::Optimize(): alpha too large (" << alpha << "); reset to "
+          << "1." << std::endl;
+      alpha = ElemType(1);
+      lastAlpha = ElemType(1);
+    }
+
+    lpaX = std::move(x);
+    x = std::move(y);
+
+    terminate |= Callback::StepTaken(*this, f, x, callbacks...);
+  }
+
+  if (!terminate)
+  {
+    Info << "FASTA::Optimize(): maximum iterations (" << maxIterations
+        << ") reached; terminating optimization." << std::endl;
+  }
+
+  Callback::EndOptimization(*this, f, x, callbacks...);
+
+  ((BaseMatType&) iterateIn) = x;
+  return currentObj;
+} // Optimize()
+
+template<typename BackwardStepType>
+template<typename MatType>
+void FASTA<BackwardStepType>::RandomFill(
+    MatType& x,
+    const size_t rows,
+    const size_t cols,
+    const typename MatType::elem_type maxVal)
+{
+  x.randu(rows, cols);
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename eT>
+void FASTA<BackwardStepType>::RandomFill(
+    arma::SpMat<eT>& x,
+    const size_t rows,
+    const size_t cols,
+    const eT maxVal)
+{
+  // Try and keep the matrix from having too many elements. (Check the largest
+  // sizes first so the smaller thresholds do not shadow them.)
+  eT density = eT(0.1);
+  if (rows * cols > 10000000)
+    density = eT(0.0001);
+  else if (rows * cols > 1000000)
+    density = eT(0.001);
+  else if (rows * cols > 100000)
+    density = eT(0.01);
+
+  x.sprandu(rows, cols, density);
+
+  // Make sure we got at least some nonzero elements...
+  while (x.n_nonzero == 0)
+  {
+    if (x.n_elem < 10)
+      x.sprandu(rows, cols, 1.0);
+    else
+      x.sprandu(rows, cols, 0.5);
+  }
+
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType>
+void FASTA<BackwardStepType>::EstimateLipschitzStepSize(
+    FunctionType& f,
+    const MatType& x)
+{
+  typedef typename MatType::elem_type ElemType;
+
+  // Sanity check for estimateSteps parameter.
+  if (estimateTrials == 0)
+  {
+    throw std::invalid_argument("FASTA::Optimize(): estimateTrials must be "
+        "greater than 0!");
+  }
+
+  const ElemType xMax = std::max(ElemType(1), 2 * x.max());
+  ElemType sum = ElemType(0);
+  MatType x1, x2, gx1, gx2;
+
+  for (size_t t = 0; t < estimateTrials; ++t)
+  {
+    RandomFill(x1, x.n_rows, x.n_cols, xMax);
+    RandomFill(x2, x.n_rows, x.n_cols, xMax);
+
+    f.Gradient(x1, gx1);
+    f.Gradient(x2, gx2);
+
+    // Compute a Lipschitz constant estimate.
+    const ElemType lEst = norm(gx1 - gx2, 2) / norm(x1 - x2, 2);
+    sum += lEst;
+  }
+
+  sum /= estimateTrials;
+  if (sum == 0)
+    maxStepSize = std::numeric_limits<double>::max();
+  else
+    maxStepSize = (10 / sum);
+
+  Info << "FASTA::Optimize(): estimated a maximum step size of "
+      << maxStepSize << "." << std::endl;
+}
+
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/fbs/fbs.hpp b/inst/include/ensmallen_bits/fbs/fbs.hpp
new file mode 100644
index 0000000..5cb6b0d
--- /dev/null
+++ b/inst/include/ensmallen_bits/fbs/fbs.hpp
@@ -0,0 +1,153 @@
+/**
+ * @file fbs.hpp
+ * @author Ryan Curtin
+ *
+ * An implementation of Forward-Backward Splitting (FBS).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FBS_FBS_HPP
+#define ENSMALLEN_FBS_FBS_HPP
+
+#include "l1_penalty.hpp"
+#include "l1_constraint.hpp"
+
+namespace ens {
+
+/**
+ * Forward-Backward Splitting is a proximal gradient optimization technique for
+ * optimizing a function of the form
+ *
+ *   h(x) = f(x) + g(x)
+ *
+ * where f(x) is a differentiable function and g(x) is an arbitrary
+ * non-differentiable function. In such a situation, standard gradient descent
+ * techniques cannot work because of the non-differentiability of g(x).
To work
+ * around this, FBS takes a _forward step_ that is just a gradient descent step
+ * on f(x), and then a _backward step_ that is the _proximal operator_
+ * corresponding to g(x).  This continues until convergence.
+ *
+ * This implementation of FBS allows specification of the backward step (or
+ * proximal operator) via the `BackwardStepType` template parameter.  When
+ * using FBS, the differentiable `FunctionType` given to `Optimize()` should be
+ * f(x), *not* the combined function h(x).  g(x) should be specified by the
+ * choice of `BackwardStepType` (e.g. `L1Penalty` or `L1Constraint`).  The
+ * `Optimize()` function will then return optimized coordinates for h(x), not
+ * f(x).
+ *
+ * For more information, see the following paper:
+ *
+ * ```
+ * @article{goldstein2014field,
+ *   title={A field guide to forward-backward splitting with a FASTA
+ *          implementation},
+ *   author={Goldstein, Tom and Studer, Christoph and Baraniuk, Richard},
+ *   journal={arXiv preprint arXiv:1411.3406},
+ *   year={2014}
+ * }
+ * ```
+ */
+template<typename BackwardStepType = L1Penalty>
+class FBS
+{
+ public:
+  /**
+   * Construct the FBS optimizer with the given options, using a
+   * default-constructed BackwardStepType.
+   */
+  FBS(const double stepSize = 0.001,
+      const size_t maxIterations = 10000,
+      const double tolerance = 1e-10);
+
+  /**
+   * Construct the FBS optimizer with the given options.
+   */
+  FBS(BackwardStepType backwardStepType,
+      const double stepSize = 0.001,
+      const size_t maxIterations = 10000,
+      const double tolerance = 1e-10);
+
+  /**
+   * Optimize the given function using FBS.  The given starting point will be
+   * modified to store the finishing point of the algorithm, and the final
+   * objective value is returned.
+   *
+   * The FunctionType template class must provide the following functions:
+   *
+   *   double Evaluate(const arma::mat& coordinates);
+   *   void Gradient(const arma::mat& coordinates,
+   *                 arma::mat& gradient);
+   *
+   * @tparam FunctionType Type of function to be optimized.
+   * @tparam MatType Type of objective matrix.
+   * @tparam GradType Type of gradient matrix (default is MatType).
+   * @tparam CallbackTypes Types of callback functions.
+   * @param function Function to be optimized.
+   * @param iterate Input with starting point; will be modified to hold the
+   *     output optimal solution coordinates.
+   * @param callbacks Callback functions.
+   * @return Objective value at the final solution.
+   */
+  template<typename FunctionType,
+           typename MatType,
+           typename GradType,
+           typename... CallbackTypes>
+  typename std::enable_if<IsArmaType<GradType>::value,
+      typename MatType::elem_type>::type
+  Optimize(FunctionType& function,
+           MatType& iterate,
+           CallbackTypes&&... callbacks);
+
+  //! Forward the MatType as GradType.
+  template<typename FunctionType, typename MatType, typename... CallbackTypes>
+  typename MatType::elem_type Optimize(FunctionType& function,
+                                       MatType& iterate,
+                                       CallbackTypes&&... callbacks)
+  {
+    return Optimize<FunctionType, MatType, MatType, CallbackTypes...>(
+        function, iterate, std::forward<CallbackTypes>(callbacks)...);
+  }
+
+  //! Get the backward step object.
+  const BackwardStepType& BackwardStep() const { return backwardStep; }
+  //! Modify the backward step object.
+  BackwardStepType& BackwardStep() { return backwardStep; }
+
+  //! Get the step size.
+  double StepSize() const { return stepSize; }
+  //! Modify the step size.
+  double& StepSize() { return stepSize; }
+
+  //! Get the maximum number of iterations (0 indicates no limit).
+  size_t MaxIterations() const { return maxIterations; }
+  //! Modify the maximum number of iterations (0 indicates no limit).
+  size_t& MaxIterations() { return maxIterations; }
+
+  //! Get the tolerance for termination.
+  double Tolerance() const { return tolerance; }
+  //! Modify the tolerance for termination.
+  double& Tolerance() { return tolerance; }
+
+ private:
+  //! The instantiated backward step object.
+  BackwardStepType backwardStep;
+
+  //! The step size for FBS steps.
+  double stepSize;
+
+  //! The maximum number of allowed iterations.
+  size_t maxIterations;
+
+  //! The tolerance for termination.
+  double tolerance;
+};
+
+} // namespace ens
+
+// Include implementation.
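As a quick illustration of the forward/backward split described above, here is a self-contained scalar sketch minimizing h(x) = 0.5 (x - a)^2 + lambda |x|, whose exact minimizer is the soft-thresholding of a. This is our own illustrative code (`softThreshold` and `fbsScalar` are hypothetical names, not part of the ensmallen API), using only the C++ standard library:

```c++
// Illustrative sketch only -- softThreshold() and fbsScalar() are our names,
// not part of the ensmallen API.
#include <cassert>
#include <cmath>
#include <cstddef>

// The proximal operator of g(x) = t * |x| (soft-thresholding).
double softThreshold(const double x, const double t)
{
  if (x > t) return x - t;
  if (x < -t) return x + t;
  return 0.0;
}

// Forward-backward iterations on h(x) = 0.5 * (x - a)^2 + lambda * |x|.
double fbsScalar(const double a, const double lambda,
                 const double stepSize, const size_t maxIter)
{
  double x = 0.0;
  for (size_t i = 0; i < maxIter; ++i)
  {
    x -= stepSize * (x - a);                  // Forward (gradient) step on f.
    x = softThreshold(x, lambda * stepSize);  // Backward (proximal) step on g.
  }
  return x;
}
```

For a > lambda the iterates converge to a - lambda, the closed-form solution of this one-dimensional lasso-style problem; for |a| <= lambda they collapse to zero.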
+#include "fbs_impl.hpp"
+
+#endif
diff --git a/inst/include/ensmallen_bits/fbs/fbs_impl.hpp b/inst/include/ensmallen_bits/fbs/fbs_impl.hpp
new file mode 100644
index 0000000..d28d6df
--- /dev/null
+++ b/inst/include/ensmallen_bits/fbs/fbs_impl.hpp
@@ -0,0 +1,141 @@
+/**
+ * @file fbs_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of Forward-Backward Splitting (FBS).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FBS_FBS_IMPL_HPP
+#define ENSMALLEN_FBS_FBS_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "fbs.hpp"
+
+#include <ensmallen_bits/function.hpp>
+
+namespace ens {
+
+//! Constructor of the FBS class.
+template<typename BackwardStepType>
+FBS<BackwardStepType>::FBS(const double stepSize,
+                           const size_t maxIterations,
+                           const double tolerance) :
+    stepSize(stepSize),
+    maxIterations(maxIterations),
+    tolerance(tolerance)
+{ /* Nothing to do. */ }
+
+template<typename BackwardStepType>
+FBS<BackwardStepType>::FBS(BackwardStepType backwardStep,
+                           const double stepSize,
+                           const size_t maxIterations,
+                           const double tolerance) :
+    backwardStep(std::move(backwardStep)),
+    stepSize(stepSize),
+    maxIterations(maxIterations),
+    tolerance(tolerance)
+{ /* Nothing to do. */ }
+
+//! Optimize the function (minimize).
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType, typename GradType,
+         typename... CallbackTypes>
+typename std::enable_if<IsArmaType<GradType>::value,
+    typename MatType::elem_type>::type
+FBS<BackwardStepType>::Optimize(FunctionType& function,
+                                MatType& iterateIn,
+                                CallbackTypes&&... callbacks)
+{
+  // Convenience typedefs.
+  typedef typename MatType::elem_type ElemType;
+  typedef typename MatTypeTraits<MatType>::BaseMatType BaseMatType;
+  typedef typename MatTypeTraits<GradType>::BaseMatType BaseGradType;
+
+  typedef Function<FunctionType, BaseMatType, BaseGradType> FullFunctionType;
+  FullFunctionType& f = static_cast<FullFunctionType&>(function);
+
+  // Make sure we have all necessary functions.
+  traits::CheckFunctionTypeAPI<FullFunctionType, BaseMatType, BaseGradType>();
+  RequireFloatingPointType<BaseMatType>();
+  RequireFloatingPointType<BaseGradType>();
+  RequireSameInternalTypes<BaseMatType, BaseGradType>();
+
+  BaseMatType& iterate = (BaseMatType&) iterateIn;
+
+  // To keep track of the function value.
+  ElemType currentObjective = std::numeric_limits<ElemType>::max();
+  ElemType currentFObjective = currentObjective;
+  ElemType currentGObjective = currentObjective;
+  ElemType lastObjective = currentObjective;
+
+  BaseGradType gradient(iterate.n_rows, iterate.n_cols);
+
+  // Controls early termination of the optimization process.
+  bool terminate = false;
+
+  Callback::BeginOptimization(*this, f, iterate, callbacks...);
+  for (size_t i = 1; i != maxIterations && !terminate; ++i)
+  {
+    // During this optimization, we want to optimize h(x) = f(x) + g(x).
+    // f(x) is `f`, but g(x) is specified by `BackwardStepType`.
+
+    // First compute f(x) and f'(x).
+    currentFObjective = f.EvaluateWithGradient(iterate, gradient);
+    // Now compute g(x) to get the full objective.
+    currentGObjective = backwardStep.Evaluate(iterate);
+
+    lastObjective = currentObjective;
+    currentObjective = currentFObjective + currentGObjective;
+
+    terminate |= Callback::EvaluateWithGradient(*this, f, iterate,
+        currentObjective, gradient, callbacks...);
+
+    // Output the current objective value.
+    Info << "FBS::Optimize(): iteration " << i << ", combined objective "
+        << currentObjective << " (f(x) = " << currentFObjective << ", g(x) = "
+        << currentGObjective << ")." << std::endl;
+
+    // Check for convergence.
+    if ((i > 1) && (std::abs(currentObjective - lastObjective) < tolerance))
+    {
+      Info << "FBS::Optimize(): minimized within objective tolerance "
+          << tolerance << "; terminating optimization." << std::endl;
+
+      Callback::EndOptimization(*this, f, iterate, callbacks...);
+      return currentObjective;
+    }
+
+    if ((i > 1) && !std::isfinite(currentObjective))
+    {
+      Warn << "FBS::Optimize(): objective diverged to " << currentObjective
+          << "; terminating optimization." << std::endl;
+
+      Callback::EndOptimization(*this, f, iterate, callbacks...);
+      return currentObjective;
+    }
+
+    // Perform the forward update.
+    iterate -= ElemType(stepSize) * gradient;
+    // Now perform the backward step (proximal update).
+    backwardStep.ProximalStep(iterate, stepSize);
+
+    terminate |= Callback::StepTaken(*this, f, iterate, callbacks...);
+  }
+
+  if (!terminate)
+  {
+    Info << "FBS::Optimize(): maximum iterations (" << maxIterations
+        << ") reached; terminating optimization." << std::endl;
+  }
+
+  Callback::EndOptimization(*this, f, iterate, callbacks...);
+  return currentObjective;
+} // Optimize()
+
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/fbs/l1_constraint.hpp b/inst/include/ensmallen_bits/fbs/l1_constraint.hpp
new file mode 100644
index 0000000..9513eff
--- /dev/null
+++ b/inst/include/ensmallen_bits/fbs/l1_constraint.hpp
@@ -0,0 +1,81 @@
+/**
+ * @file l1_constraint.hpp
+ * @author Ryan Curtin
+ *
+ * An implementation of the proximal operator for the L1 constraint.
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FBS_L1_CONSTRAINT_HPP
+#define ENSMALLEN_FBS_L1_CONSTRAINT_HPP
+
+namespace ens {
+
+/**
+ * The L1Constraint applies a specific constraint that the L1 norm of the
+ * parameters must be less than or equal to the given lambda value.
+ *
+ * Implementationally, this means that the proximal step is a projection onto
+ * the L1 ball of radius lambda.  If the constraint is satisfied, `Evaluate()`
+ * will return 0.  Otherwise, it will return infinity.
+ *
+ * This class is meant to be used with the FBS optimizer, and any other
+ * optimizer that uses a proximal operator/step.
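The projection onto the L1 ball described here can be sketched with the simpler sort-based method (O(n log n)); this is an illustrative stand-in for the faster pivot-based variant used in the implementation, and `projectL1` is our own hypothetical name, not part of the ensmallen API:

```c++
// Illustrative sketch only -- projectL1() is our name, not the ensmallen API.
// Projects v onto the L1 ball of radius lambda using the sort-based method.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

std::vector<double> projectL1(std::vector<double> v, const double lambda)
{
  double l1 = 0.0;
  for (const double x : v) l1 += std::fabs(x);
  if (l1 <= lambda) return v;  // Already inside the ball; nothing to do.

  // Sort the absolute values in decreasing order and find the threshold theta.
  std::vector<double> u(v.size());
  for (size_t i = 0; i < v.size(); ++i) u[i] = std::fabs(v[i]);
  std::sort(u.begin(), u.end(), std::greater<double>());

  double cumSum = 0.0, theta = 0.0;
  for (size_t j = 0; j < u.size(); ++j)
  {
    cumSum += u[j];
    const double t = (cumSum - lambda) / double(j + 1);
    if (t < u[j]) theta = t; else break;
  }

  // Soft-threshold by theta; the result has L1 norm (approximately) lambda.
  for (double& x : v)
    x = (x > 0.0 ? std::max(x - theta, 0.0) : std::min(x + theta, 0.0));
  return v;
}
```

The pivot-based scheme in the header avoids the full sort, but computes the same threshold theta and the same final shrinkage.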
+ */ +class L1Constraint +{ + public: + /** + * Construct an L1Constraint with the given maximum L1 norm for the + * coordinates (lambda). + */ + L1Constraint(const double lambda = 0.0); + + /** + * If the L1 norm of the coordinates is less than or equal to lambda, this + * returns 0. Otherwise, it returns infinity. + */ + template + typename MatType::elem_type Evaluate(const MatType& coordinates) const; + + /** + * Apply a proximal step to the given `coordinates`, assuming that the forward + * step took a step of size `stepSize`. This projects `coordinates` back onto + * the surface of the L1-ball with radius `lambda`, if the L1 norm of + * `coordinates` is greater than `lambda`. + * + * This may apply the proximal step multiple times to account for numerical + * stability issues during projection. + */ + template + void ProximalStep(MatType& coordinates, const double stepSize) const; + + //! Get the L1 constraint to use when applying the proximal step. + double Lambda() const { return lambda; } + //! Modify the L1 constraint to use when applying the proximal step. + double& Lambda() { return lambda; } + + private: + //! The L1 constraint value to use. + double lambda; + + //! Helper function: extract only nonzero elements from sparse objects, or + //! extract the entire dense object. + template + inline arma::Col ExtractNonzeros( + const MatType& coordinates) const; + + template + inline arma::Col ExtractNonzeros(const arma::SpMat& coordinates) + const; +}; + +} // namespace ens + +// Include implementation. +#include "l1_constraint_impl.hpp" + +#endif diff --git a/inst/include/ensmallen_bits/fbs/l1_constraint_impl.hpp b/inst/include/ensmallen_bits/fbs/l1_constraint_impl.hpp new file mode 100644 index 0000000..aff3ed2 --- /dev/null +++ b/inst/include/ensmallen_bits/fbs/l1_constraint_impl.hpp @@ -0,0 +1,201 @@ +/** + * @file l1_constraint_impl.hpp + * @author Ryan Curtin + * + * An implementation of the proximal operator for the L1 constraint. 
+ * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_FBS_L1_CONSTRAINT_IMPL_HPP +#define ENSMALLEN_FBS_L1_CONSTRAINT_IMPL_HPP + +// In case it hasn't been included yet. +#include "l1_constraint.hpp" + +namespace ens { + +inline L1Constraint::L1Constraint(const double lambda) : lambda(lambda) +{ + // Nothing to do. +} + +template +typename MatType::elem_type L1Constraint::Evaluate(const MatType& coordinates) + const +{ + typedef typename MatType::elem_type eT; + + // Allow some amount of tolerance for floating-point errors. + const eT l1Norm = norm(vectorise(coordinates), 1); + if (l1Norm <= lambda) + return eT(0); + else if (std::numeric_limits::has_infinity) + return std::numeric_limits::infinity(); + else + return std::numeric_limits::max(); +} + +template +void L1Constraint::ProximalStep(MatType& coordinates, + const double /* stepSize */) + const +{ + // First determine whether projection is necessary. + if (norm(vectorise(coordinates), 1) <= lambda) + { + return; + } + + // An empty vector can't be projected. + if (coordinates.n_elem == 0) + { + return; + } + + // We use the algorithm denoted in Figure 2 of the following paper: + // + // ``` + // @inproceedings{duchi2008efficient, + // title={Efficient projections onto the L1-ball for learning in high + // dimensions}, + // author={Duchi, John and Shalev-Shwartz, Shai and Singer, Yoram and + // Chandra, Tushar}, + // booktitle={Proceedings of the 25th international conference on + // Machine learning}, + // pages={272--279}, + // year={2008} + // } + // ``` + // + // This is an iterative algorithm that has a quicksort feel, where we try to + // determine the "pivot" element that tells us how much we need to shrink. 
In + // the original paper, they maintain lists indicating whether a point is above + // or below the pivot, but it is more expedient (and efficient) to simply copy + // the coordinates array and partially sort it in-place. + + typedef typename MatType::elem_type eT; + arma::Col work = ExtractNonzeros(coordinates); + size_t firstUpperElement = 0; + size_t lastUpperElement = work.n_elem; + eT rho = eT(0); // This is the quantity we aim to find to perform the projection. + eT s = eT(0); + + while (lastUpperElement > firstUpperElement) + { + const size_t k = arma::randi( + arma::distr_param((int) firstUpperElement, (int) lastUpperElement - 1)); + const eT v = work[k]; + + // Now perform a half-quicksort such that all elements greater than v are in + // the first part of the array. + size_t left = firstUpperElement; + size_t right = lastUpperElement - 1; + while (left <= right) + { + while ((left < lastUpperElement) && (work[left] >= v)) + ++left; + while ((right > firstUpperElement) && (work[right] < v)) + --right; + + if (left >= right) + break; + + // work[left] is less than v, and work[right] is not. Since we want all + // elements greater than or equal to v on the left, swap. + const eT tmp = work[left]; + work[left] = work[right]; + work[right] = tmp; + } + + // Now, work[0] through work[left - 1] are in the greater set G. + const eT sDelta = accu(work.subvec(firstUpperElement, left - 1)); + const size_t rhoDelta = (left - firstUpperElement); + + if ((s + sDelta) - ((eT) (rho + rhoDelta)) * v < eT(lambda)) + { + s += sDelta; + rho += rhoDelta; + firstUpperElement = left; + } + else + { + // v was an element that was less than rho, so, shrink the array and try + // again with larger elements. We actually want to shrink the array so + // that it does not include v, so we need to find the first element that + // is v (since there may be duplicates). 
+ size_t firstVIndex = left - 1; + while ((work[firstVIndex] == v) && (firstVIndex >= firstUpperElement)) + --firstVIndex; + lastUpperElement = firstVIndex + 1; + } + } + + const eT theta = (s - eT(lambda)) / rho; + // This is a single-line implementation of the .transform() below; we use the + // single-line implementation so it works with Bandicoot. + // + // coordinates.transform( + // [theta](eT val) + // { + // if (val > 0) + // return std::max(val - theta, eT(0)); + // else + // return std::min(val + theta, eT(0)); + // }); + coordinates = sign(coordinates) % clamp( + abs(coordinates) - theta, eT(0), std::numeric_limits::max()); + + // Sanity check: ensure we actually ended up inside the L1 ball. This might + // not happen due to floating-point inaccuracies. If so, try again. + const eT newNorm = norm(coordinates, 1); + if (newNorm > eT(lambda) && eT(lambda) > eT(0)) + { + // Shrink the L1 ball by the amount of the error. + eT newLambda = (eT(lambda) - 2 * (newNorm - eT(lambda))); + if (newLambda == eT(lambda)) + { + // Make sure we at least remove a few ULPs. + newLambda = eT(lambda) - + 5 * (eT(lambda) - eT(std::nexttoward(lambda, 0.0))); + } + + L1Constraint newConstraint(newLambda); + newConstraint.ProximalStep(coordinates, 0.0 /* ignored */); + } +} + +// Helper function: extract only nonzero elements from sparse objects, or +// extract the entire dense object. +template +inline arma::Col L1Constraint::ExtractNonzeros( + const MatType& coordinates) const +{ + typedef typename MatType::elem_type ElemType; + return conv_to>::from(vectorise(abs(coordinates))); +} + +template +inline arma::Col L1Constraint::ExtractNonzeros( + const arma::SpMat& coordinates) const +{ + arma::Col result(coordinates.n_nonzero); + typename arma::SpMat::const_iterator it = coordinates.begin(); + size_t i = 0; + while (it != coordinates.end()) + { + // Extract only nonzero values. 
Note we use the absolute value because that + // is what the algorithm requires (not the original value). + result[i] = std::abs(*it); + ++it; + ++i; + } + + return result; +} + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/fbs/l1_penalty.hpp b/inst/include/ensmallen_bits/fbs/l1_penalty.hpp new file mode 100644 index 0000000..3cd45f6 --- /dev/null +++ b/inst/include/ensmallen_bits/fbs/l1_penalty.hpp @@ -0,0 +1,63 @@ +/** + * @file l1_penalty.hpp + * @author Ryan Curtin + * + * An implementation of the proximal operator for the L1 penalty (also known as + * the shrinkage operator). + * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_FBS_L1_PENALTY_HPP +#define ENSMALLEN_FBS_L1_PENALTY_HPP + +namespace ens { + +/** + * The L1Penalty applies a non-differentiable L1-norm penalty to the coordinates + * during optimization: + * + * `lambda * || coordinates ||_1` + * + * This class is meant to be used with the FBS optimizer, and any other + * optimizer that uses a proximal operator/step. + */ +class L1Penalty +{ + public: + /** + * Construct an L1Penalty object with a given penalty `lambda`. + */ + L1Penalty(const double lambda = 0.0); + + /** + * Evaluate the L1 penalty function: `lambda * || coordinates ||_1`. + */ + template + typename MatType::elem_type Evaluate(const MatType& coordinates) const; + + /** + * After taking a forward step of size `stepSize`, apply a backwards step / + * proximal operator that applies the L1 penalty to `coordinates`. + */ + template + void ProximalStep(MatType& coordinates, const double stepSize) const; + + //! Get the L1 penalty to use when applying the proximal step. + double Lambda() const { return lambda; } + //! 
Modify the L1 penalty to use when applying the proximal step. + double& Lambda() { return lambda; } + + private: + //! The L1 penalty value to use. + double lambda; +}; + +} // namespace ens + +// Include implementation. +#include "l1_penalty_impl.hpp" + +#endif diff --git a/inst/include/ensmallen_bits/fbs/l1_penalty_impl.hpp b/inst/include/ensmallen_bits/fbs/l1_penalty_impl.hpp new file mode 100644 index 0000000..3112118 --- /dev/null +++ b/inst/include/ensmallen_bits/fbs/l1_penalty_impl.hpp @@ -0,0 +1,63 @@ +/** + * @file l1_penalty_impl.hpp + * @author Ryan Curtin + * + * An implementation of the proximal operator for the L1 penalty (also known as + * the shrinkage operator). + * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_FBS_L1_PENALTY_IMPL_HPP +#define ENSMALLEN_FBS_L1_PENALTY_IMPL_HPP + +// In case it hasn't been included yet. +#include "l1_penalty.hpp" + +namespace ens { + +inline L1Penalty::L1Penalty(const double lambda) : lambda(lambda) +{ + // Nothing to do. +} + +template +typename MatType::elem_type L1Penalty::Evaluate(const MatType& coordinates) + const +{ + // Compute the L1 penalty. + return norm(vectorise(coordinates), 1) * typename MatType::elem_type(lambda); +} + +template +void L1Penalty::ProximalStep(MatType& coordinates, + const double stepSize) const +{ + // Apply the backwards step coordinate-wise. If `MatType` is sparse, this + // only applies to nonzero elements, which is just fine. 
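To see that this elementwise shrinkage really is the proximal operator of y ↦ s·lambda·|y|, one can compare it against a brute-force minimization of the proximal objective 0.5 (y - x)^2 + s·lambda·|y|. The sketch below is our own illustration (`shrink` and `proxByGrid` are hypothetical names, not part of the ensmallen API):

```c++
// Illustrative sketch only -- shrink() and proxByGrid() are our names.
#include <algorithm>
#include <cassert>
#include <cmath>

// Elementwise shrinkage: sign(x) * max(|x| - t, 0), written branch-wise.
double shrink(const double x, const double t)
{
  return (x > 0.0 ? std::max(x - t, 0.0) : std::min(x + t, 0.0));
}

// The scalar proximal objective: 0.5 * (y - x)^2 + s * lambda * |y|.
double proxObjective(const double y, const double x,
                     const double lambda, const double s)
{
  return 0.5 * (y - x) * (y - x) + s * lambda * std::fabs(y);
}

// Brute-force the minimizer of the proximal objective on a fine grid.
double proxByGrid(const double x, const double lambda, const double s)
{
  double best = -2.0, bestObj = proxObjective(-2.0, x, lambda, s);
  for (double y = -2.0; y <= 2.0; y += 1e-4)
  {
    const double obj = proxObjective(y, x, lambda, s);
    if (obj < bestObj) { bestObj = obj; best = y; }
  }
  return best;
}
```

The grid minimizer agrees with `shrink(x, lambda * s)` to within the grid resolution, which is exactly why the single `sign(...) % clamp(...)` expression suffices as the backward step.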
+  typedef typename MatType::elem_type eT;
+
+  // This is equivalent to the following .transform() implementation (which is
+  // easier to read but will not work with Bandicoot):
+  //
+  // coordinates.transform([this, stepSize](eT val) { return (val > eT(0)) ?
+  //     (std::max(eT(0), val - eT(lambda * stepSize))) :
+  //     (std::min(eT(0), val + eT(lambda * stepSize))); });
+  coordinates = sign(coordinates) % clamp(
+      abs(coordinates) - eT(lambda * stepSize), eT(0),
+      std::numeric_limits<eT>::max());
+}
+
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/fista/fista.hpp b/inst/include/ensmallen_bits/fista/fista.hpp
new file mode 100644
index 0000000..abea032
--- /dev/null
+++ b/inst/include/ensmallen_bits/fista/fista.hpp
@@ -0,0 +1,214 @@
+/**
+ * @file fista.hpp
+ * @author Ryan Curtin
+ *
+ * An implementation of FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FISTA_FISTA_HPP
+#define ENSMALLEN_FISTA_FISTA_HPP
+
+#include "../fbs/l1_penalty.hpp"
+#include "../fbs/l1_constraint.hpp"
+
+namespace ens {
+
+/**
+ * FISTA (Fast Iterative Shrinkage-Thresholding Algorithm) is a proximal
+ * gradient optimization technique for optimizing a function of the form
+ *
+ *   h(x) = f(x) + g(x)
+ *
+ * where f(x) is a differentiable function and g(x) is an arbitrary
+ * non-differentiable function.
In such a situation, standard gradient descent + * techniques cannot work because of the non-differentiability of g(x). To work + * around this, FISTA takes a _forward step_ that is just a gradient descent + * step on f(x), and then a _backward step_ that is the _proximal operator_ + * corresponding to g(x). This continues until convergence. + * + * This implementation of FISTA allows specification of the backward step (or + * proximal operator) via the `BackwardStepType` template parameter. When using + * FBS, the differentiable `FunctionType` given to `Optimize()` should be f(x), + * *not* the combined function h(x). g(x) should be specified by the choice of + * `BackwardStepType` (e.g. `L1Penalty` or `L1Maximum`). The `Optimize()` + * function will then return optimized coordinates for h(x), not f(x). + * + * For more information, see the following paper: + * + * ``` + * @article{beck2009fast, + * title={A fast iterative shrinkage-thresholding algorithm for linear inverse + * problems}, + * author={Beck, Amir and Teboulle, Marc}, + * journal={SIAM Journal On Imaging Sciences}, + * volume={2}, + * number={1}, + * pages={183--202}, + * year={2009}, + * publisher={SIAM} + * } + * ``` + */ +template +class FISTA +{ + public: + /** + * Construct the FISTA optimizer with the given options, using a + * default-constructed BackwardStepType. + */ + FISTA(const size_t maxIterations = 10000, + const double tolerance = 1e-10, + const size_t maxLineSearchSteps = 50, + const double stepSizeAdjustment = 2.0, + const bool estimateStepSize = true, + const size_t estimateTrials = 10, + const double maxStepSize = 0.001); + + /** + * Construct the FISTA optimizer with the given options. 
+ */ + FISTA(BackwardStepType backwardStepType, + const size_t maxIterations = 10000, + const double tolerance = 1e-10, + const size_t maxLineSearchSteps = 50, + const double stepSizeAdjustment = 2.0, + const bool estimateStepSize = true, + const size_t estimateTrials = 10, + const double maxStepSize = 0.001); + + /** + * Optimize the given function using FISTA. The given starting + * point will be modified to store the finishing point of the algorithm, + * the final objective value is returned. + * + * The FunctionType template class must provide the following functions: + * + * double Evaluate(const arma::mat& coordinates); + * void Gradient(const arma::mat& coordinates, + * arma::mat& gradient); + * + * @tparam FunctionType Type of function to be optimized. + * @tparam MatType Type of objective matrix. + * @tparam GradType Type of gradient matrix (default is MatType). + * @tparam CallbackTypes Types of callback functions. + * @param function Function to be optimized. + * @param iterate Input with starting point, and will be modified to save + * the output optimial solution coordinates. + * @param callbacks Callback functions. + * @return Objective value at the final solution. + */ + template + typename std::enable_if::value, + typename MatType::elem_type>::type + Optimize(FunctionType& function, + MatType& iterate, + CallbackTypes&&... callbacks); + + //! Forward the MatType as GradType. + template + typename MatType::elem_type Optimize(FunctionType& function, + MatType& iterate, + CallbackTypes&&... callbacks) + { + return Optimize(function, iterate, + std::forward(callbacks)...); + } + + //! Get the backward step object. + const BackwardStepType& BackwardStep() const { return backwardStep; } + //! Modify the backward step object. + BackwardStepType& BackwardStep() { return backwardStep; } + + //! Get the maximum number of iterations (0 indicates no limit). + size_t MaxIterations() const { return maxIterations; } + //! 
Modify the maximum number of iterations (0 indicates no limit). + size_t& MaxIterations() { return maxIterations; } + + //! Get the tolerance on the gradient norm for termination. + double Tolerance() const { return tolerance; } + //! Modify the tolerance on the gradient norm for termination. + double& Tolerance() { return tolerance; } + + //! Get the maximum number of line search steps. + size_t MaxLineSearchSteps() const { return maxLineSearchSteps; } + //! Modify the maximum number of line search steps. + size_t& MaxLineSearchSteps() { return maxLineSearchSteps; } + + //! Get the step size adjustment parameter. + double StepSizeAdjustment() const { return stepSizeAdjustment; } + //! Modify the step size adjustment parameter. + double& StepSizeAdjustment() { return stepSizeAdjustment; } + + //! Get whether or not to estimate the initial step size. + bool EstimateStepSize() const { return estimateStepSize; } + //! Modify whether or not to estimate the initial step size. + bool& EstimateStepSize() { return estimateStepSize; } + + //! Get the number of trials to use for Lipschitz constant estimation. + size_t EstimateTrials() const { return estimateTrials; } + //! Modify the number of trials to use for Lipschitz constant estimation. + size_t& EstimateTrials() { return estimateTrials; } + + //! Get the maximum step size. If Optimize() has been called, this will + //! contain the estimated maximum step size value. + double MaxStepSize() const { return maxStepSize; } + //! Modify the step size (ignored if EstimateStepSize() is true). + double& MaxStepSize() { return maxStepSize; } + + private: + //! Utility function: fill with random values. 
+  template<typename MatType>
+  static void RandomFill(MatType& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const typename MatType::elem_type maxVal);
+
+  template<typename eT>
+  static void RandomFill(arma::SpMat<eT>& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const eT maxVal);
+
+  template<typename FunctionType, typename MatType>
+  void EstimateLipschitzStepSize(FunctionType& f, const MatType& x);
+
+  //! The instantiated backward step object.
+  BackwardStepType backwardStep;
+
+  //! The maximum number of allowed iterations.
+  size_t maxIterations;
+
+  //! The tolerance for termination.
+  double tolerance;
+
+  //! The maximum number of line search trials.
+  size_t maxLineSearchSteps;
+
+  //! The step size adjustment parameter for the line search.
+  double stepSizeAdjustment;
+
+  //! Whether or not to try and estimate the initial step size.
+  bool estimateStepSize;
+
+  //! Number of trials to use for initial step size estimation.
+  size_t estimateTrials;
+
+  //! The maximum step size to use (estimated if estimateStepSize is true).
+  double maxStepSize;
+};
+
+} // namespace ens
+
+// Include implementation.
+#include "fista_impl.hpp"
+
+#endif
diff --git a/inst/include/ensmallen_bits/fista/fista_impl.hpp b/inst/include/ensmallen_bits/fista/fista_impl.hpp
new file mode 100644
index 0000000..09cf69c
--- /dev/null
+++ b/inst/include/ensmallen_bits/fista/fista_impl.hpp
@@ -0,0 +1,445 @@
+/**
+ * @file fista_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FISTA_FISTA_IMPL_HPP
+#define ENSMALLEN_FISTA_FISTA_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "fista.hpp"
+
+#include <ensmallen_bits/function.hpp>
+
+namespace ens {
+
+//! Constructor of the FISTA class.
+template<typename BackwardStepType>
+FISTA<BackwardStepType>::FISTA(const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check the estimateTrials parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FISTA::FISTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+}
+
+template<typename BackwardStepType>
+FISTA<BackwardStepType>::FISTA(BackwardStepType backwardStep,
+                               const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    backwardStep(std::move(backwardStep)),
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check the estimateTrials parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FISTA::FISTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+}
+
+//! Optimize the function (minimize).
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType, typename GradType,
+         typename... CallbackTypes>
+typename std::enable_if<IsArmaType<GradType>::value,
+    typename MatType::elem_type>::type
+FISTA<BackwardStepType>::Optimize(FunctionType& function,
+                                  MatType& iterateIn,
+                                  CallbackTypes&&... callbacks)
+{
+  // Convenience typedefs.
+  typedef typename MatType::elem_type ElemType;
+  typedef typename MatTypeTraits<MatType>::BaseMatType BaseMatType;
+  typedef typename MatTypeTraits<GradType>::BaseMatType BaseGradType;
+
+  typedef Function<FunctionType, BaseMatType, BaseGradType> FullFunctionType;
+  FullFunctionType& f = static_cast<FullFunctionType&>(function);
+
+  // Make sure we have all necessary functions.
+  traits::CheckFunctionTypeAPI<FullFunctionType, BaseMatType, BaseGradType>();
+  RequireFloatingPointType<BaseMatType>();
+  RequireFloatingPointType<BaseGradType>();
+  RequireSameInternalTypes<BaseMatType, BaseGradType>();
+
+  // Match the notation of the paper. We force a copy here, since we use
+  // std::move() internally and this may be an alias. We copy back to
+  // `iterateIn` at the end.
+  BaseMatType x(iterateIn);
+
+  // To keep track of the function value.
+  ElemType lastObj = std::numeric_limits<ElemType>::max();
+  ElemType currentFObj = f.Evaluate(x);
+  ElemType currentGObj = backwardStep.Evaluate(x);
+  ElemType currentObj = currentFObj + currentGObj;
+
+  BaseGradType g(x.n_rows, x.n_cols); // Gradient.
+  BaseMatType y = x; // Initialize y_1 = x_0.
+  BaseMatType lastX;
+  ElemType t = 1; // Initialize t_1 = 1.
+  ElemType lastT = t;
+
+  // Controls early termination of the optimization process.
+  bool terminate = false;
+
+  // First, estimate the Lipschitz constant to set the initial/maximum step
+  // size, if the user asked us to.
+  if (estimateStepSize)
+    EstimateLipschitzStepSize(f, x); // Sets `maxStepSize`.
+
+  // Keep track of the last step size we used.
+  ElemType currentStepSize = (ElemType) maxStepSize;
+  ElemType lastStepSize = (ElemType) maxStepSize;
+
+  Callback::BeginOptimization(*this, f, x, callbacks...);
+  for (size_t i = 1; i != maxIterations && !terminate; ++i)
+  {
+    // During this optimization, we want to optimize h(x) = f(x) + g(x).
+    // f(x) is `f`, but g(x) is specified by `BackwardStepType`.
+
+    // Notation (compare with Beck and Teboulle):
+    //   `i` represents `k`, the iteration number.
+    //   `x` represents `x_k` in the paper.
+    //   `y` represents `y_k` in the paper.
+
+    // The first step is to compute a step size via a line search. To do this,
+    // we need to compute the gradient f'(y) as required by the quadratic
+    // approximation Q_L(x, y) (Eq. 2.5).
+    //
+    // We will also need the objective f(y), so we will compute that
+    // simultaneously.
+ const ElemType yObj = f.EvaluateWithGradient(y, g); + terminate |= Callback::EvaluateWithGradient(*this, f, y, yObj, g, + callbacks...); + + // Use backtracking line search to find the best step size. This is not the + // version from the FASTA paper (non-monotone line search) but instead the + // version proposed by Beck and Teboulle, with a minor modification: we + // start our search at the last step size, and allow the search to increase + // the step size up to the maximum step size if it can. This is a more + // effective heuristic than simply starting at the largest allowable step + // size and shrinking from there, especially in regions where the gradient + // norm is small. It is also more effective than simply starting at the + // last step size and shrinking from there, as it prevents getting "stuck" + // with a very small step size. + bool lsDone = false; + size_t lsTrial = 0; + bool increasing = false; // Will be set during the first iteration. + ElemType lastFObj = ElemType(0); + ElemType lastGObj = ElemType(0); + BaseMatType lsLastX; // Only used in increasing mode. + BaseMatType xDiff; + + lastX = std::move(x); + lastStepSize = currentStepSize; + currentStepSize = std::min(currentStepSize, (ElemType) maxStepSize); + + while (!lsDone && !terminate) + { + if (lsTrial == maxLineSearchSteps) + { + if (increasing) + { + Warn << "FISTA::Optimize(): line search reached maximum number of " + << "steps (" << maxLineSearchSteps << "); using step size " + << currentStepSize << "." << std::endl; + break; // The step size is still valid. + } + else + { + Warn << "FISTA::Optimize(): could not find valid step size in range " + << "(0, " << maxStepSize << "]! Terminating optimization." + << std::endl; + x = std::move(lastX); // Revert to previous coordinates. + terminate = true; + break; + } + } + + // If the step size has converged to zero, we are done. 
+ if (currentStepSize == ElemType(0)) + { + Warn << "FISTA::Optimize(): computed zero step size; terminating " + << "optimization." << std::endl; + x = std::move(lastX); // Revert to previous coordinates. + terminate = true; + break; + } + + // Perform forward update into x. + x = y - currentStepSize * g; + backwardStep.ProximalStep(x, currentStepSize); + + // Compute F(x) = f(x) + g(x). + const ElemType fObj = f.Evaluate(x); + const ElemType gObj = backwardStep.Evaluate(x); + const ElemType lsObj = fObj + gObj; + terminate |= Callback::Evaluate(*this, f, x, fObj, callbacks...); + + // Compute Q_L(x, y) (the quadratic approximation), Eq. (2.5). + xDiff = x - y; + const ElemType q = yObj + dot(xDiff, g) + + (1 / (2 * currentStepSize)) * dot(xDiff, xDiff) + gObj; + + // If we're on the first iteration, we don't know if we should be + // searching for a step size by increasing or decreasing the step size. + // (Remember that our valid ranges of step sizes are [0, maxStepSize], and + // we are starting at lastStepSize.) + // + // Thus, if the condition is satisfied, let's try increasing the step size + // until it's no longer satisfied. Otherwise, we will have to decrease + // the step size. + if (lsTrial == 0) + { + increasing = (lsObj <= q); + } + + if (increasing) + { + // If we are in "increasing" mode, then termination occurs on the first + // iteration when the condition is *not* satisfied (and we use the last + // step size). + if ((lsObj > q) || (!std::isfinite(lsObj))) + { + lsDone = true; + if (lsTrial != 0) + x = std::move(lsLastX); + currentFObj = lastFObj; + currentGObj = lastGObj; + lastObj = currentObj; + currentObj = currentFObj + currentGObj; + currentStepSize = lastStepSize; // Take one step backwards. + } + else if (currentStepSize == (ElemType) maxStepSize) + { + // The condition is still satisfied, but the step size will be too big + // if we take another step. Go back to the maximum step size. 
+ lsDone = true; + currentFObj = fObj; + currentGObj = gObj; + lastObj = currentObj; + currentObj = currentFObj + currentGObj; + } + else + { + // The condition is still satisfied; increase the step size. + lastStepSize = currentStepSize; + currentStepSize *= ElemType(stepSizeAdjustment); + lsLastX = std::move(x); + lastFObj = fObj; + lastGObj = gObj; + ++lsTrial; + } + } + else + { + // If we are in "decreasing" mode, then termination occurs on the first + // iteration when the condition is satisfied. + if ((lsObj <= q) && (std::isfinite(lsObj))) + { + lsDone = true; + currentFObj = fObj; + currentGObj = gObj; + lastObj = currentObj; + currentObj = currentFObj + currentGObj; + } + else + { + // The condition is not yet satisfied; decrease the step size. + currentStepSize /= ElemType(stepSizeAdjustment); + ++lsTrial; + } + } + } + + // If we terminated during the line search, we are done. + if (terminate) + break; + + if (!lsDone) + { + // The line search failed, so terminate. + Warn << "FISTA::Optimize(): line search failed after " + << maxLineSearchSteps << " steps; terminating optimization." + << std::endl; + x = std::move(lastX); + terminate = true; + break; + } + + // Output current objective function. + Info << "FISTA::Optimize(): iteration " << i << ", combined objective " + << currentObj << " (f(x) = " << currentFObj << ", g(x) = " + << currentGObj << "), step size " << currentStepSize << "." + << std::endl; + + if ((i > 1) && !std::isfinite(currentObj)) + { + Warn << "FISTA::Optimize(): objective diverged to " << currentObj + << "; terminating optimization." << std::endl; + terminate = true; + break; + } + + // Check for convergence. This is a simple check on the objective. + if ((i > 1) && (std::abs(currentObj - lastObj) < tolerance)) + { + Info << "FISTA::Optimize(): minimized within objective tolerance " + << tolerance << "; terminating optimization." << std::endl; + terminate = true; + } + + // Compute updated prediction parameter t. 
+    lastT = t;
+    t = (1 + std::sqrt(1 + 4 * std::pow(t, ElemType(2)))) / 2;
+
+    // Sometimes t can get to be too large; this restart scheme is taken
+    // originally from O'Donoghue and Candes, "Adaptive restart for accelerated
+    // gradient schemes", 2012.
+    const ElemType restartCheck = dot(y - x, x - lastX);
+    if (restartCheck > 0)
+    {
+      Info << "FISTA::Optimize(): t too large (" << t << "); reset to 1."
+          << std::endl;
+      t = 1;
+      lastT = 1;
+    }
+
+    // Update prediction y.
+    y = x + ((lastT - 1) / t) * (x - lastX);
+
+    terminate |= Callback::StepTaken(*this, f, y, callbacks...);
+  }
+
+  if (!terminate)
+  {
+    Info << "FISTA::Optimize(): maximum iterations (" << maxIterations
+        << ") reached; terminating optimization." << std::endl;
+  }
+
+  Callback::EndOptimization(*this, f, x, callbacks...);
+
+  ((BaseMatType&) iterateIn) = x;
+  return currentObj;
+} // Optimize()
+
+template<typename BackwardStepType>
+template<typename MatType>
+void FISTA<BackwardStepType>::RandomFill(
+    MatType& x,
+    const size_t rows,
+    const size_t cols,
+    const typename MatType::elem_type maxVal)
+{
+  x.randu(rows, cols);
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename eT>
+void FISTA<BackwardStepType>::RandomFill(
+    arma::SpMat<eT>& x,
+    const size_t rows,
+    const size_t cols,
+    const eT maxVal)
+{
+  // Try and keep the matrix from having too many elements.
+  eT density = eT(0.1);
+  if (rows * cols > 10000000)
+    density = eT(0.0001);
+  else if (rows * cols > 1000000)
+    density = eT(0.001);
+  else if (rows * cols > 100000)
+    density = eT(0.01);
+
+  x.sprandu(rows, cols, density);
+
+  // Make sure we got at least some nonzero elements...
+  while (x.n_nonzero == 0)
+  {
+    if (x.n_elem < 10)
+      x.sprandu(rows, cols, 1.0);
+    else
+      x.sprandu(rows, cols, 0.5);
+  }
+
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType>
+void FISTA<BackwardStepType>::EstimateLipschitzStepSize(
+    FunctionType& f,
+    const MatType& x)
+{
+  typedef typename MatType::elem_type ElemType;
+
+  // Sanity check for the estimateTrials parameter.
+ if (estimateTrials == 0) + { + throw std::invalid_argument("FISTA::Optimize(): estimateTrials must be " + "greater than 0!"); + } + + const ElemType xMax = std::max(ElemType(1), 2 * x.max()); + ElemType sum = ElemType(0); + MatType x1, x2, gx1, gx2; + + for (size_t t = 0; t < estimateTrials; ++t) + { + RandomFill(x1, x.n_rows, x.n_cols, xMax); + RandomFill(x2, x.n_rows, x.n_cols, xMax); + + f.Gradient(x1, gx1); + f.Gradient(x2, gx2); + + // Compute a Lipschitz constant estimate. + const ElemType lEst = norm(gx1 - gx2, 2) / norm(x1 - x2, 2); + sum += lEst; + } + + sum /= estimateTrials; + if (sum == 0) + maxStepSize = std::numeric_limits::max(); + else + maxStepSize = (10 / sum); + + Info << "FISTA::Optimize(): estimated a maximum step size of " + << maxStepSize << "." << std::endl; +} + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/ftml/ftml.hpp b/inst/include/ensmallen_bits/ftml/ftml.hpp index 26c4183..4fcb53f 100644 --- a/inst/include/ensmallen_bits/ftml/ftml.hpp +++ b/inst/include/ensmallen_bits/ftml/ftml.hpp @@ -98,7 +98,7 @@ class FTML typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/ftml/ftml_update.hpp b/inst/include/ensmallen_bits/ftml/ftml_update.hpp index 5db2b05..a135420 100644 --- a/inst/include/ensmallen_bits/ftml/ftml_update.hpp +++ b/inst/include/ensmallen_bits/ftml/ftml_update.hpp @@ -78,6 +78,8 @@ class FTMLUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -87,11 +89,18 @@ class FTMLUpdate * @param cols Number of columns in the gradient matrix. 
*/ Policy(FTMLUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)) { v.zeros(rows, cols); z.zeros(rows, cols); d.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -109,19 +118,19 @@ class FTMLUpdate ++iteration; // And update the iterate. - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); - MatType sigma = -parent.beta1 * d; - d = biasCorrection1 / stepSize * - (arma::sqrt(v / biasCorrection2) + parent.epsilon); + MatType sigma = -beta1 * d; + d = biasCorrection1 / ElemType(stepSize) * + (sqrt(v / biasCorrection2) + epsilon); sigma += d; - z *= parent.beta1; - z += (1 - parent.beta1) * gradient - sigma % iterate; + z *= beta1; + z += (1 - beta1) * gradient - sigma % iterate; iterate = -z / d; } @@ -140,6 +149,11 @@ class FTMLUpdate // The number of iterations. size_t iteration; + + // Optimization parameters converted to the type of the optimization. 
+ ElemType epsilon; + ElemType beta1; + ElemType beta2; }; private: diff --git a/inst/include/ensmallen_bits/function/arma_traits.hpp b/inst/include/ensmallen_bits/function/arma_traits.hpp index e13297b..cf9c67c 100644 --- a/inst/include/ensmallen_bits/function/arma_traits.hpp +++ b/inst/include/ensmallen_bits/function/arma_traits.hpp @@ -122,6 +122,17 @@ template<> inline void RequireDenseFloatingPointType() { } template<> inline void RequireDenseFloatingPointType() { } +#if defined(ARMA_HAVE_FP16) +template<> +inline void RequireDenseFloatingPointType() { } +#endif + +#ifdef ENS_HAVE_COOT +template<> +inline void RequireDenseFloatingPointType() { } +template<> +inline void RequireDenseFloatingPointType() { } +#endif template void RequireFloatingPointType() @@ -144,6 +155,19 @@ template<> inline void RequireFloatingPointType() { } template<> inline void RequireFloatingPointType() { } +#if defined(ARMA_HAVE_FP16) +template<> +inline void RequireFloatingPointType() { } +template<> +inline void RequireFloatingPointType() { } +#endif + +#ifdef ENS_HAVE_COOT +template<> +inline void RequireFloatingPointType() { } +template<> +inline void RequireFloatingPointType() { } +#endif /** * Require that the internal element type of the matrix type and gradient type diff --git a/inst/include/ensmallen_bits/fw/atoms.hpp b/inst/include/ensmallen_bits/fw/atoms.hpp index ffef05f..6ba9a92 100644 --- a/inst/include/ensmallen_bits/fw/atoms.hpp +++ b/inst/include/ensmallen_bits/fw/atoms.hpp @@ -96,6 +96,7 @@ class Atoms // Find possible atom to be deleted. arma::vec gap = sqTerm - currentCoeffs % trans(gradient.t() * currentAtoms); + arma::uword ind = gap.index_min(); // Try deleting the atom. 
diff --git a/inst/include/ensmallen_bits/fw/constr_lpball.hpp b/inst/include/ensmallen_bits/fw/constr_lpball.hpp index 9cddbcb..df09907 100644 --- a/inst/include/ensmallen_bits/fw/constr_lpball.hpp +++ b/inst/include/ensmallen_bits/fw/constr_lpball.hpp @@ -49,7 +49,8 @@ namespace ens { * \f] * */ -class ConstrLpBallSolver +template +class ConstrLpBallSolverType { public: /** @@ -58,7 +59,7 @@ class ConstrLpBallSolver * * @param p The constraint is unit lp ball. */ - ConstrLpBallSolver(const double p) : p(p) + ConstrLpBallSolverType(const double p) : p(p) { /* Do nothing. */ } /** @@ -68,7 +69,7 @@ class ConstrLpBallSolver * @param p The constraint is unit lp ball. * @param lambda Regularization parameter. */ - ConstrLpBallSolver(const double p, const arma::vec lambda) : + ConstrLpBallSolverType(const double p, const VecType lambda) : p(p), regFlag(true), lambda(lambda) { /* Do nothing. */ } @@ -80,52 +81,51 @@ class ConstrLpBallSolver * @param s Output optimal solution in the constrained domain (lp ball). */ template - void Optimize(const MatType& v, - MatType& s) + void Optimize(const MatType& v, MatType& s) { - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::uword UWordType; if (p == std::numeric_limits::infinity()) { // l-inf ball. - s = -arma::sign(v); + s = -sign(v); if (regFlag) { // Do element-wise division. - s /= arma::conv_to>::from(lambda); + s /= conv_to::from(lambda); } } else if (p > 1.0) { // lp ball with 1>::from(lambda); + s = v / conv_to::from(lambda); else s = v; double q = 1 / (1.0 - 1.0 / p); - s = -arma::sign(v) % arma::pow(arma::abs(s), q - 1); - s = arma::normalise(s, p); + s = -sign(v) % pow(abs(s), q - 1); + s = normalise(s, p); if (regFlag) - s = s / arma::conv_to>::from(lambda); + s = s / conv_to::from(lambda); } else if (p == 1.0) { // l1 ball, also used in OMP. 
if (regFlag) - s = arma::abs(v / arma::conv_to>::from(lambda)); + s = abs(v / conv_to::from(lambda)); else - s = arma::abs(v); + s = abs(v); // k is the linear index of the largest element. - arma::uword k = s.index_max(); + UWordType k = s.index_max(); s.zeros(); // Take the sign of v(k). s(k) = -((0.0 < v(k)) - (v(k) < 0.0)); if (regFlag) - s = s / arma::conv_to>::from(lambda); + s = s / conv_to::from(lambda); } else { @@ -146,9 +146,9 @@ class ConstrLpBallSolver bool& RegFlag() { return regFlag; } //! Get the regularization parameter. - arma::vec Lambda() const { return lambda; } + VecType Lambda() const { return lambda; } //! Modify the regularization parameter. - arma::vec& Lambda() { return lambda; } + VecType& Lambda() { return lambda; } private: //! lp norm, 1<=p<=inf; @@ -159,9 +159,11 @@ class ConstrLpBallSolver bool regFlag = false; //! Regularization parameter. - arma::vec lambda; + VecType lambda; }; +using ConstrLpBallSolver = ConstrLpBallSolverType; + } // namespace ens #endif diff --git a/inst/include/ensmallen_bits/fw/frank_wolfe.hpp b/inst/include/ensmallen_bits/fw/frank_wolfe.hpp index 2694566..561fdc0 100644 --- a/inst/include/ensmallen_bits/fw/frank_wolfe.hpp +++ b/inst/include/ensmallen_bits/fw/frank_wolfe.hpp @@ -126,7 +126,7 @@ class FrankWolfe */ template - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(FunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp b/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp index 32c2fa3..b60ccf4 100644 --- a/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp +++ b/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp @@ -41,12 +41,12 @@ template< typename UpdateRuleType> template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type FrankWolfe::Optimize( - FunctionType& function, - MatType& iterateIn, - 
CallbackTypes&&... callbacks) + FunctionType& function, + MatType& iterateIn, + CallbackTypes&&... callbacks) { // Convenience typedefs. typedef typename MatType::elem_type ElemType; @@ -95,7 +95,7 @@ FrankWolfe::Optimize( if (gap < tolerance) { Info << "FrankWolfe::Optimize(): minimized within tolerance " - << tolerance << "; " << "terminating optimization." << std::endl; + << tolerance << "; terminating optimization." << std::endl; Callback::EndOptimization(*this, f, iterate, callbacks...); return currentObjective; @@ -109,8 +109,11 @@ FrankWolfe::Optimize( terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); } - Info << "FrankWolfe::Optimize(): maximum iterations (" << maxIterations - << ") reached; " << "terminating optimization." << std::endl; + if (!terminate) + { + Info << "FrankWolfe::Optimize(): maximum iterations (" << maxIterations + << ") reached; terminating optimization." << std::endl; + } Callback::EndOptimization(*this, f, iterate, callbacks...); return currentObjective; diff --git a/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp b/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp index 50d62dd..752ee7a 100644 --- a/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp +++ b/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp @@ -106,7 +106,7 @@ typename MatType::elem_type LineSearch::Derivative(FunctionType& function, { GradType gradient(x0.n_rows, x0.n_cols); function.Gradient(x0 + gamma * deltaX, gradient); - return arma::dot(gradient, deltaX); + return dot(gradient, deltaX); } } // namespace ens diff --git a/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp b/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp index f607c71..88792f8 100644 --- a/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp +++ b/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp @@ -35,14 +35,24 @@ namespace ens { template inline void Proximal::ProjectToL1Ball(MatType& v, double 
tau) { - MatType simplexSol = arma::abs(v); + MatType simplexSol = abs(v); // Already with L1 norm <= tau. - if (arma::accu(simplexSol) <= tau) + if (accu(simplexSol) <= tau) return; - simplexSol = arma::sort(simplexSol, "descend"); - MatType simplexSum = arma::cumsum(simplexSol); + simplexSol = sort(simplexSol, "descend"); + // MatType simplexSum = arma::cumsum(simplexSol); + MatType simplexSum(simplexSol.n_rows, simplexSol.n_cols); + for (size_t col = 0; col < simplexSol.n_cols; ++col) + { + simplexSum(0, col) = simplexSol(0, col); + for (size_t row = 1; row < simplexSol.n_rows; ++row) + { + simplexSum(row, col) = simplexSum(row - 1, col) + + simplexSol(row, col); + } + } double nu = 0; size_t rho = simplexSol.n_rows - 1; @@ -72,10 +82,15 @@ inline void Proximal::ProjectToL1Ball(MatType& v, double tau) template inline void Proximal::ProjectToL0Ball(MatType& v, int tau) { - arma::uvec indices = arma::sort_index(arma::abs(v)); - arma::uword numberToKill = v.n_elem - tau; + typedef typename ForwardType::uword UWordType; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::bvec VecType; + + const VecType vTemp = conv_to::from(abs(v)); + UVecType indices = sort_index(vTemp); + UWordType numberToKill = v.n_elem - tau; - for (arma::uword i = 0; i < numberToKill; i++) + for (UWordType i = 0; i < numberToKill; i++) v(indices(i)) = 0.0; } diff --git a/inst/include/ensmallen_bits/fw/update_full_correction.hpp b/inst/include/ensmallen_bits/fw/update_full_correction.hpp index e486fed..1fdc04c 100644 --- a/inst/include/ensmallen_bits/fw/update_full_correction.hpp +++ b/inst/include/ensmallen_bits/fw/update_full_correction.hpp @@ -78,7 +78,7 @@ class UpdateFullCorrection atoms.ProjectedGradientEnhancement(function, tau, stepSize); arma::mat tmp; atoms.RecoverVector(tmp); - newCoords = arma::conv_to::from(tmp); + newCoords = conv_to::from(tmp); } private: diff --git a/inst/include/ensmallen_bits/fw/update_span.hpp 
b/inst/include/ensmallen_bits/fw/update_span.hpp index 1c113d8..7ba21dc 100644 --- a/inst/include/ensmallen_bits/fw/update_span.hpp +++ b/inst/include/ensmallen_bits/fw/update_span.hpp @@ -63,7 +63,7 @@ class UpdateSpan // to the original size. arma::mat tmp; atoms.RecoverVector(tmp); - newCoords = arma::conv_to::from(tmp); + newCoords = conv_to::from(tmp); // Prune the support. if (isPrune) @@ -72,7 +72,7 @@ class UpdateSpan double F = 0.25 * oldF + 0.75 * function.Evaluate(newCoords); atoms.PruneSupport(F, function); atoms.RecoverVector(tmp); - newCoords = arma::conv_to::from(tmp); + newCoords = conv_to::from(tmp); } } diff --git a/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp b/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp index b9c08a3..024c1bd 100644 --- a/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp +++ b/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp @@ -77,7 +77,7 @@ class GradientDescent typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(FunctionType& function, MatType& iterate, @@ -140,9 +140,9 @@ class GradientDescent const arma::Row& numCategories, CallbackTypes&&... callbacks) { - return Optimize(function, iterate, categoricalDimensions, - numCategories, std::forward(callbacks)...); + return Optimize(function, + iterate, categoricalDimensions, numCategories, + std::forward(callbacks)...); } //! Get the step size. 
diff --git a/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp b/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp index 5301002..813244c 100644 --- a/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp +++ b/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp @@ -34,8 +34,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type GradientDescent::Optimize(FunctionType& function, MatType& iterateIn, CallbackTypes&&... callbacks) @@ -101,7 +101,7 @@ GradientDescent::Optimize(FunctionType& function, lastObjective = overallObjective; // And update the iterate. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); } diff --git a/inst/include/ensmallen_bits/iqn/iqn.hpp b/inst/include/ensmallen_bits/iqn/iqn.hpp index 3bd4f63..a1dbece 100644 --- a/inst/include/ensmallen_bits/iqn/iqn.hpp +++ b/inst/include/ensmallen_bits/iqn/iqn.hpp @@ -87,11 +87,11 @@ class IQN typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, - MatType& iterate, - CallbackTypes&&... callbacks); + MatType& iterate, + CallbackTypes&&... callbacks); //! Forward the MatType as GradType. template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type IQN::Optimize(SeparableFunctionType& functionIn, MatType& iterateIn, CallbackTypes&&... 
callbacks) @@ -46,6 +46,7 @@ IQN::Optimize(SeparableFunctionType& functionIn, typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; typedef typename MatTypeTraits::BaseMatType BaseGradType; + typedef typename ForwardType::bmat ProxyMatType; typedef Function FullFunctionType; @@ -81,8 +82,8 @@ IQN::Optimize(SeparableFunctionType& functionIn, iterate.n_cols)); std::vector Q(numBatches, BaseMatType(iterate.n_elem, iterate.n_elem)); - BaseMatType initialIterate = arma::randn>(iterate.n_rows, - iterate.n_cols); + BaseMatType initialIterate = ProxyMatType(iterate.n_rows, iterate.n_cols, + GetFillType::randn); BaseGradType B(iterate.n_elem, iterate.n_elem); B.eye(); @@ -103,7 +104,7 @@ IQN::Optimize(SeparableFunctionType& functionIn, Q[f].eye(); g += y[f]; - y[f] /= (double) effectiveBatchSize; + y[f] /= (ElemType) effectiveBatchSize; i += effectiveBatchSize; } @@ -124,7 +125,7 @@ IQN::Optimize(SeparableFunctionType& functionIn, const size_t effectiveBatchSize = std::min(batchSize, numFunctions - it * batchSize); - if (arma::norm(iterate - t[it]) > 0) + if (norm(iterate - t[it]) > 0) { function.Gradient(iterate, it * batchSize, gradient, effectiveBatchSize); @@ -133,31 +134,34 @@ IQN::Optimize(SeparableFunctionType& functionIn, terminate |= Callback::Gradient(*this, function, iterate, gradient, callbacks...); - const BaseMatType s = arma::vectorise(iterate - t[it]); - const BaseGradType yy = arma::vectorise(gradient - y[it]); + const BaseMatType s = vectorise(iterate - t[it]); + const BaseGradType yy = vectorise(gradient - y[it]); const BaseGradType stochasticHessian = Q[it] + yy * yy.t() / - arma::as_scalar(yy.t() * s) - Q[it] * s * s.t() * - Q[it] / arma::as_scalar(s.t() * Q[it] * s); + as_scalar(yy.t() * s) - Q[it] * s * s.t() * + Q[it] / as_scalar(s.t() * Q[it] * s); + + const ElemType negBatches = 1 / ElemType(numBatches); // Update aggregate Hessian approximation. 
- B += (1.0 / numBatches) * (stochasticHessian - Q[it]); + B += negBatches * (stochasticHessian - Q[it]); // Update aggregate Hessian-variable product. - u += arma::reshape((1.0 / numBatches) * (stochasticHessian * - arma::vectorise(iterate) - Q[it] * arma::vectorise(t[it])), - u.n_rows, u.n_cols);; + u += reshape(negBatches * (stochasticHessian * + vectorise(iterate) - Q[it] * vectorise(t[it])), + u.n_rows, u.n_cols); // Update aggregate gradient. - g += (1.0 / numBatches) * (gradient - y[it]); + g += negBatches * (gradient - y[it]); // Update the function information tables. Q[it] = std::move(stochasticHessian); y[it] = std::move(gradient); t[it] = iterate; - iterate = arma::reshape(stepSize * B.i() * (u.t() - arma::vectorise(g)), - iterate.n_rows, iterate.n_cols) + (1 - stepSize) * iterate; + iterate = reshape(ElemType(stepSize) * pinv(B) * (u.t() - vectorise(g)), + iterate.n_rows, iterate.n_cols) + + (1 - ElemType(stepSize)) * iterate; terminate |= Callback::StepTaken(*this, function, iterate, callbacks...); diff --git a/inst/include/ensmallen_bits/katyusha/katyusha.hpp b/inst/include/ensmallen_bits/katyusha/katyusha.hpp index d416f8a..2126d92 100644 --- a/inst/include/ensmallen_bits/katyusha/katyusha.hpp +++ b/inst/include/ensmallen_bits/katyusha/katyusha.hpp @@ -93,7 +93,7 @@ class KatyushaType typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp b/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp index 53c3043..ddca745 100644 --- a/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp +++ b/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp @@ -45,8 +45,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type KatyushaType::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -80,20 +80,20 @@ KatyushaType::Optimize( if (numFunctions % batchSize != 0) ++numBatches; // Capture last few. - const double tau1 = std::min(0.5, - std::sqrt(batchSize * convexity / (3.0 * lipschitz))); - const double tau2 = 0.5; - const double alpha = 1.0 / (3.0 * tau1 * lipschitz); - const double r = 1.0 + std::min(alpha * convexity, 1.0 / - (4.0 / innerIterations)); + const ElemType tau1 = ElemType(std::min(0.5, + std::sqrt(batchSize * convexity / (3 * lipschitz)))); + const ElemType tau2 = ElemType(0.5); + const ElemType alpha = 1 / (3 * tau1 * ElemType(lipschitz)); + const ElemType r = 1 + std::min(alpha * ElemType(convexity), + ElemType(innerIterations) / 4); // sum_{j=0}^{m-1} 1 + std::min(alpha * convexity, 1 / (4 * m)^j). - double normalizer = 1; + ElemType normalizer = 1; for (size_t i = 0; i < numBatches; i++) { - normalizer = r * (normalizer + 1.0); + normalizer = r * (normalizer + 1); } - normalizer = 1.0 / normalizer; + normalizer = 1 / normalizer; // To keep track of where we are and how things are going. ElemType overallObjective = 0; @@ -168,10 +168,10 @@ KatyushaType::Optimize( f += effectiveBatchSize; } - fullGradient /= (double) numFunctions; + fullGradient /= (ElemType) numFunctions; // To keep track of where we are and how things are going. 
- double cw = 1; + ElemType cw = 1; w.zeros(); for (size_t f = 0, currentFunction = 0; (f < innerIterations) && !terminate; @@ -208,7 +208,7 @@ KatyushaType::Optimize( // By the minimality definition of z_{k + 1}, we have that: // z_{k+1} − z_k + \alpha * \sigma_{k+1} + \alpha g = 0. BaseMatType zNew = z - alpha * (fullGradient + (gradient - gradient0) / - (double) batchSize); + (ElemType) batchSize); // Proximal update, choose between Option I and Option II. Shift relative // to the Lipschitz constant or take a constant step using the given step @@ -221,7 +221,7 @@ KatyushaType::Optimize( // yk = x0 − 1 / (3L) * \delta3 - ((1 - tau) / (3L)) + tau * alpha) // * \delta2 - ((1-tau)^2 / (3L) + (1 - (1 - tau)^2) * alpha) * \delta1, // k = 3. - y = iterate + 1.0 / (3.0 * lipschitz) * w; + y = iterate + 1 / (3 * ElemType(lipschitz)) * w; } else { diff --git a/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp b/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp index 8f58eee..d26b4ac 100644 --- a/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp +++ b/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp @@ -80,7 +80,7 @@ class L_BFGS typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(FunctionType& function, MatType& iterate, @@ -177,10 +177,11 @@ class L_BFGS * @param y Differences between the gradient and the old gradient matrix. */ template - double ChooseScalingFactor(const size_t iterationNum, - const MatType& gradient, - const CubeType& s, - const CubeType& y); + typename MatType::elem_type ChooseScalingFactor( + const size_t iterationNum, + const MatType& gradient, + const CubeType& s, + const CubeType& y); /** * Perform a back-tracking line search along the search direction to @@ -208,7 +209,7 @@ class L_BFGS GradType& gradient, MatType& newIterateTmp, const GradType& searchDirection, - double& finalStepSize, + ElemType& finalStepSize, CallbackTypes&... 
callbacks); /** @@ -224,7 +225,7 @@ class L_BFGS template void SearchDirection(const MatType& gradient, const size_t iterationNum, - const double scalingFactor, + const typename MatType::elem_type scalingFactor, const CubeType& s, const CubeType& y, MatType& searchDirection); diff --git a/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp b/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp index 5d15401..b5ac443 100644 --- a/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp +++ b/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp @@ -72,34 +72,48 @@ inline L_BFGS::L_BFGS(const size_t numBasis, * @param y Differences between the gradient and the old gradient matrix. */ template -double L_BFGS::ChooseScalingFactor(const size_t iterationNum, - const MatType& gradient, - const CubeType& s, - const CubeType& y) +typename MatType::elem_type L_BFGS::ChooseScalingFactor( + const size_t iterationNum, + const MatType& gradient, + const CubeType& s, + const CubeType& y) { - typedef typename CubeType::elem_type CubeElemType; + typedef typename CubeType::elem_type ElemType; + typedef typename ForwardType::bmat BaseMatType; - constexpr const CubeElemType tol = - 100 * std::numeric_limits::epsilon(); + constexpr const ElemType tol = + 100 * std::numeric_limits::epsilon(); - double scalingFactor; + ElemType scalingFactor; if (iterationNum > 0) { int previousPos = (iterationNum - 1) % numBasis; // Get s and y matrices once instead of multiple times. - const arma::Mat& sMat = s.slice(previousPos); - const arma::Mat& yMat = y.slice(previousPos); + const BaseMatType& sMat = s.slice(previousPos); + const BaseMatType& yMat = y.slice(previousPos); - const CubeElemType tmp = arma::dot(yMat, yMat); - const CubeElemType denom = (tmp >= tol) ? tmp : CubeElemType(1); + const ElemType tmp = dot(yMat, yMat); + const ElemType denom = (tmp >= tol) ? 
tmp : ElemType(1); + if (std::isinf(tmp)) + { + Warn << "L-BFGS: squared 2-norm of gradient difference is infinite; " + << "try using a higher-precision element type or setting MaxStep() " + << "to a smaller value." << std::endl; + } - scalingFactor = arma::dot(sMat, yMat) / denom; + scalingFactor = dot(sMat, yMat) / denom; } else { - const CubeElemType tmp = arma::norm(gradient, "fro"); + const ElemType tmp = norm(gradient, "fro"); + if (std::isinf(tmp)) + { + Warn << "L-BFGS: Frobenius norm of gradient difference is infinite; " + << "try using a higher-precision element type or an initial point " + << "with a smaller gradient value." << std::endl; + } - scalingFactor = (tmp >= tol) ? (1.0 / tmp) : 1.0; + scalingFactor = (tmp >= tol) ? (1 / tmp) : 1; } return scalingFactor; @@ -118,37 +132,38 @@ double L_BFGS::ChooseScalingFactor(const size_t iterationNum, template void L_BFGS::SearchDirection(const MatType& gradient, const size_t iterationNum, - const double scalingFactor, + const typename MatType::elem_type scalingFactor, const CubeType& s, const CubeType& y, MatType& searchDirection) { + typedef typename CubeType::elem_type ElemType; + typedef typename ForwardType::bmat BaseMatType; + typedef typename ForwardType::bcol BaseColType; + // Start from this point. searchDirection = gradient; // See "A Recursive Formula to Compute H * g" in "Updating quasi-Newton // matrices with limited storage" (Nocedal, 1980). - typedef typename CubeType::elem_type CubeElemType; // Temporary variables. - arma::Col rho(numBasis); - arma::Col alpha(numBasis); + BaseColType rho(numBasis); + BaseColType alpha(numBasis); size_t limit = (numBasis > iterationNum) ? 
0 : (iterationNum - numBasis); for (size_t i = iterationNum; i != limit; i--) { int translatedPosition = (i + (numBasis - 1)) % numBasis; + const BaseMatType& sMat = s.slice(translatedPosition); + const BaseMatType& yMat = y.slice(translatedPosition); - const arma::Mat& sMat = s.slice(translatedPosition); - const arma::Mat& yMat = y.slice(translatedPosition); + const ElemType tmp = dot(yMat, sMat); - const CubeElemType tmp = arma::dot(yMat, sMat); - - rho[iterationNum - i] = (tmp != CubeElemType(0)) ? (1.0 / tmp) : - CubeElemType(1); + rho[iterationNum - i] = (tmp != ElemType(0)) ? (1 / tmp) : 1; alpha[iterationNum - i] = rho[iterationNum - i] * - arma::dot(sMat, searchDirection); + dot(sMat, searchDirection); searchDirection -= alpha[iterationNum - i] * yMat; } @@ -158,8 +173,8 @@ void L_BFGS::SearchDirection(const MatType& gradient, for (size_t i = limit; i < iterationNum; i++) { int translatedPosition = i % numBasis; - double beta = rho[iterationNum - i - 1] * - arma::dot(y.slice(translatedPosition), searchDirection); + ElemType beta = rho[iterationNum - i - 1] * + dot(y.slice(translatedPosition), searchDirection); searchDirection += (alpha[iterationNum - i - 1] - beta) * s.slice(translatedPosition); } @@ -222,23 +237,27 @@ bool L_BFGS::LineSearch(FunctionType& function, GradType& gradient, MatType& newIterateTmp, const GradType& searchDirection, - double& finalStepSize, + ElemType& finalStepSize, CallbackTypes&... callbacks) { // Default first step size of 1.0. - double stepSize = 1.0; - finalStepSize = 0.0; // Set only when we take the step. + ElemType stepSize = 1; + if (stepSize > ElemType(maxStep)) + stepSize = ElemType(maxStep); + if (stepSize < ElemType(minStep)) + stepSize = ElemType(minStep); + finalStepSize = 0; // Set only when we take the step. // The initial linear term approximation in the direction of the // search direction. 
ElemType initialSearchDirectionDotGradient = - arma::dot(gradient, searchDirection); + dot(gradient, searchDirection); // If it is not a descent direction, just report failure. - if ( (initialSearchDirectionDotGradient > 0.0) + if ( (initialSearchDirectionDotGradient > 0) || (std::isfinite(initialSearchDirectionDotGradient) == false) ) { - Warn << "L-BFGS line search direction is not a descent direction " + Warn << "L-BFGS: line search direction is not a descent direction " << "(terminating)!" << std::endl; return false; } @@ -247,17 +266,17 @@ bool L_BFGS::LineSearch(FunctionType& function, ElemType initialFunctionValue = functionValue; // Unit linear approximation to the decrease in function value. - ElemType linearApproxFunctionValueDecrease = armijoConstant * + ElemType linearApproxFunctionValueDecrease = ElemType(armijoConstant) * initialSearchDirectionDotGradient; // The number of iteration in the search. size_t numIterations = 0; // Armijo step size scaling factor for increase and decrease. - const double inc = 2.1; - const double dec = 0.5; - double width = 0; - double bestStepSize = 1.0; + const ElemType inc = ElemType(2.1); + const ElemType dec = ElemType(0.5); + ElemType width = 0; + ElemType bestStepSize = 1; ElemType bestObjective = std::numeric_limits::max(); while (true) @@ -270,7 +289,7 @@ bool L_BFGS::LineSearch(FunctionType& function, if (std::isnan(functionValue)) { - Warn << "L-BFGS objective value is NaN (terminating)!" << std::endl; + Warn << "L-BFGS: objective value is NaN (terminating)!" << std::endl; return false; } @@ -292,7 +311,7 @@ bool L_BFGS::LineSearch(FunctionType& function, else { // Check Wolfe's condition. 
- ElemType searchDirectionDotGradient = arma::dot(gradient, + ElemType searchDirectionDotGradient = dot(gradient, searchDirection); if (searchDirectionDotGradient < wolfe * @@ -346,8 +365,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type L_BFGS::Optimize(FunctionType& function, MatType& iterateIn, CallbackTypes&&... callbacks) @@ -376,8 +395,10 @@ L_BFGS::Optimize(FunctionType& function, const size_t cols = iterate.n_cols; BaseMatType newIterateTmp(rows, cols); - arma::Cube s(rows, cols, numBasis); - arma::Cube y(rows, cols, numBasis); + + typedef typename ForwardType::bcube BaseCubeType; + BaseCubeType s(rows, cols, numBasis); + BaseCubeType y(rows, cols, numBasis); // The old iterate to be saved. BaseMatType oldIterate(iterate.n_rows, iterate.n_cols); @@ -403,6 +424,7 @@ L_BFGS::Optimize(FunctionType& function, functionValue, gradient, callbacks...); ElemType prevFunctionValue; + Info << "L-BFGS: initial objective " << functionValue << "." << std::endl; // The main optimization loop. Callback::BeginOptimization(*this, f, iterate, callbacks...); @@ -417,9 +439,10 @@ L_BFGS::Optimize(FunctionType& function, // least one descent step. // TODO: to speed this up, investigate use of arma::norm2est() in Armadillo // 12.4 - if (arma::norm(gradient, 2) < minGradientNorm) + const ElemType gradNorm = norm(gradient, 2); + if (gradNorm < minGradientNorm) { - Info << "L-BFGS gradient norm too small (terminating successfully)." + Info << "L-BFGS: gradient norm too small (terminating successfully)." << std::endl; break; } @@ -427,24 +450,24 @@ L_BFGS::Optimize(FunctionType& function, // Break if the objective is not a number. if (std::isnan(functionValue)) { - Warn << "L-BFGS terminated with objective " << functionValue << "; " + Warn << "L-BFGS: terminated with objective " << functionValue << "; " << "are the objective and gradient functions implemented correctly?" 
<< std::endl; break; } // Choose the scaling factor. - double scalingFactor = ChooseScalingFactor(itNum, gradient, s, y); - if (scalingFactor == 0.0) + ElemType scalingFactor = ChooseScalingFactor(itNum, gradient, s, y); + if (scalingFactor == 0) { - Info << "L-BFGS scaling factor computed as 0 (terminating successfully)." + Info << "L-BFGS: scaling factor computed as 0 (terminating successfully)." << std::endl; break; } if (std::isfinite(scalingFactor) == false) { - Warn << "L-BFGS scaling factor is not finite. Stopping optimization." + Warn << "L-BFGS: scaling factor is not finite. Stopping optimization." << std::endl; break; } @@ -457,31 +480,34 @@ L_BFGS::Optimize(FunctionType& function, oldIterate = iterate; oldGradient = gradient; - double stepSize; // Set by LineSearch(). + ElemType stepSize; // Set by LineSearch(). if (!LineSearch(f, functionValue, iterate, gradient, newIterateTmp, searchDirection, stepSize, callbacks...)) { - Warn << "Line search failed. Stopping optimization." << std::endl; + Warn << "L-BFGS: line search failed. Stopping optimization." + << std::endl; break; // The line search failed; nothing else to try. } // It is possible that the difference between the two coordinates is zero. // In this case we terminate successfully. - if (stepSize == 0.0) + if (stepSize == 0) { - Info << "L-BFGS step size of 0 (terminating successfully)." + Info << "L-BFGS: computed step size of 0 (terminating successfully)." << std::endl; break; } + Info << "L-BFGS: iteration " << itNum << ", objective " << functionValue + << ", step size " << stepSize << "." << std::endl; + // If we can't make progress on the gradient, then we'll also accept // a stable function value. 
- const double denom = std::max( - std::max(std::abs(prevFunctionValue), std::abs(functionValue)), - (ElemType) 1.0); + const ElemType denom = std::max(ElemType(1), + std::max(std::abs(prevFunctionValue), std::abs(functionValue))); if ((prevFunctionValue - functionValue) / denom <= factr) { - Info << "L-BFGS function value stable (terminating successfully)." + Info << "L-BFGS: function value stable (terminating successfully)." << std::endl; break; } @@ -499,4 +525,3 @@ L_BFGS::Optimize(FunctionType& function, } // namespace ens #endif // ENSMALLEN_LBFGS_LBFGS_IMPL_HPP - diff --git a/inst/include/ensmallen_bits/lookahead/lookahead.hpp b/inst/include/ensmallen_bits/lookahead/lookahead.hpp index d7cd2cf..65da6c7 100644 --- a/inst/include/ensmallen_bits/lookahead/lookahead.hpp +++ b/inst/include/ensmallen_bits/lookahead/lookahead.hpp @@ -131,7 +131,7 @@ class Lookahead typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp b/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp index cdb42b6..0d776b1 100644 --- a/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp +++ b/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp @@ -66,14 +66,32 @@ inline Lookahead::~Lookahead() instDecayPolicy.Clean(); } +template +size_t GetBatchSize( + const BaseOptimizerType& baseOptimizer, + const typename std::enable_if_t::value>* = 0) +{ + return baseOptimizer.BatchSize(); +} + +template +size_t GetBatchSize( + const BaseOptimizerType& baseOptimizer, + const typename std::enable_if_t::value>* = 0) +{ + return 1; +} + //! Optimize the function (minimize). 
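The `GetBatchSize()` pair added to `lookahead_impl.hpp` above selects between two overloads with `enable_if` on a has-`BatchSize()` trait (the trait's template arguments were stripped in this copy of the diff). A self-contained sketch of the idiom, with a hypothetical `void_t`-based detector standing in for ensmallen's `traits::HasBatchSizeSignature`:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for ensmallen's HasBatchSizeSignature trait:
// detects a const BatchSize() member via the void_t detection idiom.
template<typename T, typename = void>
struct HasBatchSize : std::false_type { };

template<typename T>
struct HasBatchSize<T,
    std::void_t<decltype(std::declval<const T&>().BatchSize())>>
  : std::true_type { };

// Overload taken when the base optimizer exposes BatchSize().
template<typename BaseOptimizerType>
std::size_t GetBatchSize(
    const BaseOptimizerType& baseOptimizer,
    const std::enable_if_t<HasBatchSize<BaseOptimizerType>::value>* = 0)
{
  return baseOptimizer.BatchSize();
}

// Fallback for base optimizers without a BatchSize() method.
template<typename BaseOptimizerType>
std::size_t GetBatchSize(
    const BaseOptimizerType& /* baseOptimizer */,
    const std::enable_if_t<!HasBatchSize<BaseOptimizerType>::value>* = 0)
{
  return 1;
}

// Toy optimizers used only to exercise both overloads.
struct WithBatch { std::size_t BatchSize() const { return 32; } };
struct WithoutBatch { };
```

Exactly one overload survives substitution for any given optimizer type, which is why the hunk can delete the runtime `HasBatchSizeSignature` branch it replaces.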
template template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type Lookahead::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -111,8 +129,9 @@ Lookahead::Optimize( if (traits::HasResetPolicySignature::value && baseOptimizer.ResetPolicy()) { - Warn << "Parameters are reset before every Optimize call; set " - << "ResetPolicy() to false."; + Warn << "Lookahead: base optimizer parameters are reset before every " + << "Optimize() call; set ResetPolicy() of the base optimizer to false " + << "to fix this problem." << std::endl; baseOptimizer.ResetPolicy() = resetPolicy; } @@ -169,9 +188,12 @@ Lookahead::Optimize( return overallObjective; } - iterate += stepSize * (iterateModel - iterate); + iterate += ElemType(stepSize) * (iterateModel - iterate); terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); + Info << "Lookahead: iteration " << i << ", objective " << overallObjective + << "." << std::endl; + // Save the current objective. lastOverallObjective = overallObjective; } @@ -185,11 +207,9 @@ Lookahead::Optimize( // Find the number of functions to use. const size_t numFunctions = f.NumFunctions(); - size_t batchSize = 1; // Check if the optimizer implements the BatchSize() method and use the // parameter for the objective calculation. 
- if (traits::HasBatchSizeSignature::value) - batchSize = baseOptimizer.BatchSize(); + size_t batchSize = GetBatchSize(baseOptimizer); overallObjective = 0; for (size_t i = 0; i < numFunctions; i += batchSize) diff --git a/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp b/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp index cb4f87b..8da4197 100644 --- a/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp +++ b/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp @@ -63,11 +63,12 @@ class PenaltyBoundaryIntersection { typedef typename VecType::elem_type ElemType; //! A unit vector in the same direction as the provided weight vector. - const VecType referenceDirection = weight / arma::norm(weight); + const VecType referenceDirection = weight / norm(weight); //! Distance of F(x) from the idealPoint along the reference direction. - const ElemType d1 = arma::dot(candidateFitness - idealPoint, referenceDirection); + const ElemType d1 = dot(candidateFitness - idealPoint, referenceDirection); //! The perpendicular distance of F(x) from reference direction. 
- const ElemType d2 = arma::norm(candidateFitness - (idealPoint + d1 * referenceDirection)); + const ElemType d2 = norm(candidateFitness - (idealPoint + d1 * + referenceDirection)); return d1 + static_cast(theta) * d2; } diff --git a/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp b/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp index 0507c03..e40678c 100644 --- a/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp +++ b/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp @@ -57,7 +57,7 @@ class Tchebycheff const VecType& idealPoint, const VecType& candidateFitness) { - return arma::max(weight % arma::abs(candidateFitness - idealPoint)); + return max(weight % abs(candidateFitness - idealPoint)); } }; diff --git a/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp b/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp index 8007395..a04a830 100644 --- a/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp +++ b/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp @@ -53,7 +53,7 @@ class WeightedAverage const VecType& /* idealPoint */, const VecType& candidateFitness) { - return arma::dot(weight, candidateFitness); + return dot(weight, candidateFitness); } }; diff --git a/inst/include/ensmallen_bits/moead/moead.hpp b/inst/include/ensmallen_bits/moead/moead.hpp index f9c8e24..b271ca2 100644 --- a/inst/include/ensmallen_bits/moead/moead.hpp +++ b/inst/include/ensmallen_bits/moead/moead.hpp @@ -28,24 +28,25 @@ namespace ens { /** - * MOEA/D-DE (Multi Objective Evolutionary Algorithm based on Decompositon - - * Differential Variant) is a multiobjective optimization algorithm. This class - * implements the said optimizer. 
+ * MOEA/D-DE (Multi Objective Evolutionary Algorithm based on Decompositon - + * Differential Variant) is a multiobjective optimization algorithm. This class + * implements the said optimizer. * - * The algorithm works by generating a candidate population from a fixed starting point. - * Reference directions are generated to guide the optimization process towards the Pareto Front. - * Further, a decomposition function is defined to decompose the problem to a scalar optimization - * objective. Utilizing genetic operators, offsprings are generated with better decomposition values - * to replace the neighboring parent solutions. + * The algorithm works by generating a candidate population from a fixed starting point. + * Reference directions are generated to guide the optimization process towards the Pareto Front. + * Further, a decomposition function is defined to decompose the problem to a scalar optimization + * objective. Utilizing genetic operators, offsprings are generated with better decomposition values + * to replace the neighboring parent solutions. * * For more information, see the following: * @code * @article{li2008multiobjective, - * title={Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II}, - * author={Li, Hui and Zhang, Qingfu}, - * journal={IEEE transactions on evolutionary computation}, - * pages={284--302}, - * year={2008}, + * title = {Multiobjective optimization problems with complicated Pareto + * sets, MOEA/D and NSGA-II}, + * author = {Li, Hui and Zhang, Qingfu}, + * journal = {IEEE transactions on evolutionary computation}, + * pages = {284--302}, + * year = {2008}, * @endcode */ template - typename MatType::elem_type Optimize(std::tuple& objectives, - MatType& iterate, - CallbackTypes&&... callbacks); + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CallbackTypes&&... callbacks); + + /** + * Optimize a set of objectives. 
The initial population is generated + * using the initial point. The output is the best generated front. + * + * @tparam MatType The type of matrix used to store coordinates. + * @tparam CubeType The type of cube used to store the front and Pareto set. + * @tparam ArbitraryFunctionType The type of objective function. + * @tparam CallbackTypes Types of callback function. + * @param objectives std::tuple of the objective functions. + * @param iterate The initial reference point for generating population. + * @param front The generated front. + * @param paretoSet The generated Pareto set. + * @param callbacks The callback functions. + */ + template + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CubeType& front, + CubeType& paretoSet, + CallbackTypes&&... callbacks); //! Retrieve population size. size_t PopulationSize() const { return populationSize; } @@ -201,14 +228,6 @@ class MOEAD { //! Modify value of upperBound. arma::vec& UpperBound() { return upperBound; } - //! Retrieve the Pareto optimal points in variable space. This returns an empty cube - //! until `Optimize()` has been called. - const arma::cube& ParetoSet() const { return paretoSet; } - - //! Retrieve the best front (the Pareto frontier). This returns an empty cube until - //! `Optimize()` has been called. - const arma::cube& ParetoFront() const { return paretoFront; } - //! Get the weight initialization policy. const InitPolicyType& InitPolicy() const { return initPolicy; } //! Modify the weight initialization policy. @@ -227,8 +246,9 @@ class MOEAD { * @param neighborSize A matrix containing indices of the neighbors. * @return std::tuple The chosen pair of indices. */ + template std::tuple Mating(size_t subProblemIdx, - const arma::umat& neighborSize, + const UMatType& neighborSize, bool sampleNeighbor); /** @@ -253,27 +273,28 @@ class MOEAD { * * @tparam ArbitraryFunctionType std::tuple of multiple function types. * @tparam MatType Type of matrix to optimize. 
+ * @tparam ColType Type of column vector to store objectives. * @param population The elite population. * @param objectives The set of objectives. * @param calculatedObjectives Vector to store calculated objectives. */ template typename std::enable_if::type - EvaluateObjectives( - std::vector&, + EvaluateObjectives(std::vector&, std::tuple&, - std::vector >&); + std::vector&); template typename std::enable_if::type - EvaluateObjectives( - std::vector& population, + EvaluateObjectives(std::vector& population, std::tuple& objectives, - std::vector >& + std::vector& calculatedObjectives); //! Size of the population. diff --git a/inst/include/ensmallen_bits/moead/moead_impl.hpp b/inst/include/ensmallen_bits/moead/moead_impl.hpp index 6aab814..dab45ea 100644 --- a/inst/include/ensmallen_bits/moead/moead_impl.hpp +++ b/inst/include/ensmallen_bits/moead/moead_impl.hpp @@ -78,27 +78,52 @@ MOEAD(const size_t populationSize, decompPolicy(decompPolicy) { /* Nothing to do here. */ } + //! Optimize the function. +template +template +typename MatType::elem_type MOEAD:: +Optimize(std::tuple& objectives, + MatType& iterateIn, + CallbackTypes&&... callbacks) +{ + typedef typename ForwardType::bcube CubeType; + CubeType paretoFront, paretoSet; + return Optimize(objectives, iterateIn, paretoFront, paretoSet, + std::forward(callbacks)...); +} + //! Optimize the function. template template typename MatType::elem_type MOEAD:: Optimize(std::tuple& objectives, MatType& iterateIn, + CubeType& paretoFrontIn, + CubeType& paretoSetIn, CallbackTypes&&... callbacks) { // Population Size must be at least 3 for MOEA/D-DE to work. if (populationSize < 3) { - throw std::logic_error("MOEA/D-DE::Optimize(): population size should be at least" - " 3!"); + throw std::logic_error("MOEA/D-DE::Optimize(): population size should be " + "at least 3!"); } // Convenience typedefs. 
typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::umat UMatType; + typedef typename ForwardType::brow BaseRowType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::bmat CubeBaseMatType; + BaseMatType& iterate = (BaseMatType&) iterateIn; // Make sure that we have the methods that we need. Long name... @@ -137,69 +162,79 @@ Optimize(std::tuple& objectives, assert(upperBound.n_rows == iterate.n_rows && "The dimensions of " "upperBound are not the same as the dimensions of iterate."); + //! Useful temporaries for float-like comparisons. + const BaseMatType castedLowerBound = conv_to::from(lowerBound); + const BaseMatType castedUpperBound = conv_to::from(upperBound); + const size_t numObjectives = sizeof...(ArbitraryFunctionType); const size_t numVariables = iterate.n_rows; - //! Useful temporaries for float-like comparisons. - const BaseMatType castedLowerBound = arma::conv_to::from(lowerBound); - const BaseMatType castedUpperBound = arma::conv_to::from(upperBound); - // Controls early termination of the optimization process. bool terminate = false; - // The weight matrix. Each vector represents a decomposition subproblem (M X N). + // The weight matrix. Each vector represents a decomposition + // subproblem (M X N). const BaseMatType weights = initPolicy.template Generate( numObjectives, populationSize, epsilon); // 1.1 Storing the indices of nearest neighbors of each weight vector. - arma::umat neighborIndices(neighborSize, populationSize); + UMatType neighborIndices(neighborSize, populationSize); for (size_t i = 0; i < populationSize; ++i) { // Cache the distance between weights[i] and other weights. 
- const arma::Row distances = - arma::sqrt(arma::sum(arma::square(weights.col(i) - weights.each_col()))); - arma::uvec sortedIndices = arma::stable_sort_index(distances); + const BaseRowType distances = + conv_to::from( + sqrt(sum(square(weights.col(i) - weights.each_col())))); + UVecType sortedIndices = stable_sort_index(distances); // Ignore distance from self. - neighborIndices.col(i) = sortedIndices(arma::span(1, neighborSize)); + neighborIndices.col(i) = sortedIndices( + typename GetProxyType::span(1, neighborSize), 0); } // 1.2 Random generation of the initial population. std::vector population(populationSize); for (BaseMatType& individual : population) { - individual = arma::randu( - iterate.n_rows, iterate.n_cols) - 0.5 + iterate; + individual = randu( + iterate.n_rows, iterate.n_cols) - ElemType(0.5) + iterate; // Constrain all genes to be within bounds. - individual = arma::min(arma::max(individual, castedLowerBound), castedUpperBound); + individual = min(max(individual, castedLowerBound), castedUpperBound); } - Info << "MOEA/D-DE initialized successfully. Optimization started." << std::endl; + Info << "MOEA/D-DE initialized successfully. Optimization started." + << std::endl; - std::vector> populationFitness(populationSize); - std::fill(populationFitness.begin(), populationFitness.end(), - arma::Col(numObjectives, arma::fill::zeros)); + std::vector populationFitness(populationSize); + for (size_t i = 0; i < populationSize; ++i) + { + populationFitness[i].set_size(numObjectives); + populationFitness[i].zeros(); + } EvaluateObjectives(population, objectives, populationFitness); // 1.3 Initialize the ideal point z. 
- arma::Col idealPoint(numObjectives); + BaseColType idealPoint(numObjectives); idealPoint.fill(std::numeric_limits::max()); - for (const arma::Col& individualFitness : populationFitness) - idealPoint = arma::min(idealPoint, individualFitness); + for (const BaseColType& individualFitness : populationFitness) + idealPoint = min(idealPoint, individualFitness); Callback::BeginOptimization(*this, objectives, iterate, callbacks...); // 2 The main loop. - for (size_t generation = 1; generation <= maxGenerations && !terminate; ++generation) + for (size_t generation = 1; + generation <= maxGenerations && !terminate; ++generation) { // Shuffle indexes of subproblems. - const arma::uvec shuffle = arma::shuffle( - arma::linspace(0, populationSize - 1, populationSize)); - for (size_t subProblemIdx : shuffle) + const UVecType shuffleTemp = shuffle( + linspace(0, populationSize - 1, populationSize)); + + for (size_t i = 0; i < shuffleTemp.n_elem; ++i) { - // 2.1 Randomly select two indices in neighborIndices[subProblemIdx] and use them - // to make a child. + const size_t subProblemIdx = shuffleTemp(i); + // 2.1 Randomly select two indices in neighborIndices[subProblemIdx] + // and use them to make a child. size_t r1, r2, r3; r1 = subProblemIdx; // Randomly choose to sample from the population or the neighbors. @@ -216,19 +251,21 @@ Optimize(std::tuple& objectives, if (arma::randu() < crossoverProb) { candidate(geneIdx) = population[r1](geneIdx) + - differentialWeight * (population[r2](geneIdx) - - population[r3](geneIdx)); + ElemType(differentialWeight) * (population[r2](geneIdx) - + population[r3](geneIdx)); // Boundary conditions. 
if (candidate(geneIdx) < castedLowerBound(geneIdx)) { candidate(geneIdx) = castedLowerBound(geneIdx) + - arma::randu() * (population[r1](geneIdx) - castedLowerBound(geneIdx)); + arma::randu() * + (population[r1](geneIdx) - castedLowerBound(geneIdx)); } if (candidate(geneIdx) > castedUpperBound(geneIdx)) { candidate(geneIdx) = castedUpperBound(geneIdx) - - arma::randu() * (castedUpperBound(geneIdx) - population[r1](geneIdx)); + arma::randu() * + (castedUpperBound(geneIdx) - population[r1](geneIdx)); } } else @@ -238,10 +275,10 @@ Optimize(std::tuple& objectives, Mutate(candidate, 1.0 / static_cast(numVariables), castedLowerBound, castedUpperBound); - arma::Col candidateFitness(numObjectives); + BaseColType candidateFitness(numObjectives); //! Creating temp vectors to pass to EvaluateObjectives. std::vector candidateContainer { candidate }; - std::vector> fitnessContainer { candidateFitness }; + std::vector fitnessContainer { candidateFitness }; EvaluateObjectives(candidateContainer, objectives, fitnessContainer); candidateFitness = std::move(fitnessContainer[0]); //! Flush out the dummy containers. @@ -249,17 +286,18 @@ Optimize(std::tuple& objectives, candidateContainer.clear(); // 2.4 Update of ideal point. - idealPoint = arma::min(idealPoint, candidateFitness); + idealPoint = min(idealPoint, candidateFitness); // 2.5 Update of the population. size_t replaceCounter = 0; const size_t sampleSize = sampleNeighbor ? neighborSize : populationSize; - const arma::uvec idxShuffle = arma::shuffle( - arma::linspace(0, sampleSize - 1, sampleSize)); + const arma::uvec idxShuffle = shuffle( + linspace(0, sampleSize - 1, sampleSize)); - for (size_t idx : idxShuffle) + for (size_t i = 0; i < idxShuffle.n_elem; ++i) { + const size_t idx = idxShuffle(i); // Preserve diversity by controlling replacement of neighbors // by child solution. 
if (replaceCounter >= maxReplace) @@ -269,9 +307,11 @@ Optimize(std::tuple& objectives, neighborIndices(idx, subProblemIdx) : idx; const ElemType candidateDecomposition = decompPolicy.template - Apply>(weights.col(pick), idealPoint, candidateFitness); - const ElemType parentDecomposition = decompPolicy.template - Apply>(weights.col(pick), idealPoint, populationFitness[pick]); + Apply(conv_to::from(weights.col(pick)), + idealPoint, candidateFitness); + const ElemType parentDecomposition = decompPolicy.template + Apply(conv_to::from(weights.col(pick)), + idealPoint, populationFitness[pick]); if (candidateDecomposition < parentDecomposition) { @@ -291,24 +331,24 @@ Optimize(std::tuple& objectives, } // End of pass over all the generations. // Set the candidates from the Pareto Set as the output. - paretoSet.set_size(population[0].n_rows, population[0].n_cols, population.size()); + paretoSetIn.set_size( + population[0].n_rows, population[0].n_cols, population.size()); - // The Pareto Front is stored, can be obtained via ParetoSet() getter. for (size_t solutionIdx = 0; solutionIdx < population.size(); ++solutionIdx) { - paretoSet.slice(solutionIdx) = - arma::conv_to::from(population[solutionIdx]); + paretoSetIn.slice(solutionIdx) = + conv_to::from(population[solutionIdx]); } // Set the candidates from the Pareto Front as the output. - paretoFront.set_size(populationFitness[0].n_rows, populationFitness[0].n_cols, - populationFitness.size()); + paretoFrontIn.set_size(populationFitness[0].n_rows, + populationFitness[0].n_cols, populationFitness.size()); - // The Pareto Front is stored, can be obtained via ParetoFront() getter. 
- for (size_t solutionIdx = 0; solutionIdx < populationFitness.size(); ++solutionIdx) + for (size_t solutionIdx = 0; + solutionIdx < populationFitness.size(); ++solutionIdx) { - paretoFront.slice(solutionIdx) = - arma::conv_to::from(populationFitness[solutionIdx]); + paretoFrontIn.slice(solutionIdx) = + conv_to::from(populationFitness[solutionIdx]); } // Assign iterate to first element of the Pareto Set. @@ -320,19 +360,20 @@ Optimize(std::tuple& objectives, for (size_t geneIdx = 0; geneIdx < numObjectives; ++geneIdx) { - if (arma::accu(populationFitness[geneIdx]) < performance) - performance = arma::accu(populationFitness[geneIdx]); + if (accu(populationFitness[geneIdx]) < performance) + performance = accu(populationFitness[geneIdx]); } return performance; } //! Randomly chooses to select from parents or neighbors. -template +template +template inline std::tuple MOEAD:: Mating(size_t subProblemIdx, - const arma::umat& neighborIndices, + const UMatType& neighborIndices, bool sampleNeighbor) { //! Indexes of two points from the sample space. @@ -368,50 +409,55 @@ Mutate(MatType& candidate, const MatType& lowerBound, const MatType& upperBound) { - const size_t numVariables = candidate.n_rows; - for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) - { - // Should this gene be mutated? - if (arma::randu() > mutationRate) - continue; - - const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); - // Normalised distance from the bounds. - const double lowerDelta = (candidate(geneIdx) - lowerBound(geneIdx)) / geneRange; - const double upperDelta = (upperBound(geneIdx) - candidate(geneIdx)) / geneRange; - const double mutationPower = 1. 
/ (distributionIndex + 1.0); - const double rand = arma::randu(); - double value, perturbationFactor; - if (rand < 0.5) - { - value = 2.0 * rand + (1.0 - 2.0 * rand) * - std::pow(upperDelta, distributionIndex + 1.0); - perturbationFactor = std::pow(value, mutationPower) - 1.0; - } - else - { - value = 2.0 * (1.0 - rand) + 2.0 *(rand - 0.5) * - std::pow(lowerDelta, distributionIndex + 1.0); - perturbationFactor = 1.0 - std::pow(value, mutationPower); - } + typedef typename MatType::elem_type ElemType; - candidate(geneIdx) += perturbationFactor * geneRange; + const size_t numVariables = candidate.n_rows; + for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) + { + // Should this gene be mutated? + if (arma::randu() > mutationRate) + continue; + + const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); + // Normalised distance from the bounds. + const double lowerDelta = (candidate(geneIdx) - lowerBound(geneIdx)) / + geneRange; + const double upperDelta = (upperBound(geneIdx) - candidate(geneIdx)) / + geneRange; + const double mutationPower = 1. / (distributionIndex + 1.0); + const double rand = arma::randu(); + double value, perturbationFactor; + if (rand < 0.5) + { + value = 2.0 * rand + (1.0 - 2.0 * rand) * + std::pow(upperDelta, distributionIndex + 1.0); + perturbationFactor = std::pow(value, mutationPower) - 1.0; } - //! Enforce bounds. - candidate = arma::min(arma::max(candidate, lowerBound), upperBound); + else + { + value = 2.0 * (1.0 - rand) + 2.0 * (rand - 0.5) * + std::pow(lowerDelta, distributionIndex + 1.0); + perturbationFactor = 1.0 - std::pow(value, mutationPower); + } + + candidate(geneIdx) += ElemType(perturbationFactor * geneRange); + } + //! Enforce bounds. + candidate = min(max(candidate, lowerBound), upperBound); } //! No objectives to evaluate. 
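The `Mutate()` hunk above implements polynomial mutation, where `distributionIndex` (often written eta) controls how tightly perturbations concentrate around the current gene value. A minimal scalar sketch of the same math, with the uniform draw passed in explicitly so the behavior is deterministic (function name and parameters are illustrative, not ensmallen API):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Polynomial mutation of one gene.  `u` is a uniform draw in [0, 1);
// `eta` is the distribution index (larger -> smaller perturbations).
double PolynomialMutate(double gene, double lower, double upper,
                        double eta, double u)
{
  const double range = upper - lower;
  // Normalised distances from the bounds, as in the optimizer code.
  const double lowerDelta = (gene - lower) / range;
  const double upperDelta = (upper - gene) / range;
  const double power = 1.0 / (eta + 1.0);

  double factor;
  if (u < 0.5)
  {
    const double value = 2.0 * u +
        (1.0 - 2.0 * u) * std::pow(upperDelta, eta + 1.0);
    factor = std::pow(value, power) - 1.0;  // Negative: move toward lower.
  }
  else
  {
    const double value = 2.0 * (1.0 - u) +
        2.0 * (u - 0.5) * std::pow(lowerDelta, eta + 1.0);
    factor = 1.0 - std::pow(value, power);  // Positive: move toward upper.
  }

  // Perturb and clamp to the bounds, mirroring the final min/max above.
  return std::clamp(gene + factor * range, lower, upper);
}
```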
-template +template template typename std::enable_if::type -MOEAD:: -EvaluateObjectives( - std::vector&, +MOEAD::EvaluateObjectives( + std::vector&, std::tuple&, - std::vector >&) + std::vector&) { // Nothing to do here. } @@ -419,20 +465,21 @@ EvaluateObjectives( //! Evaluate the objectives for the entire population. template template typename std::enable_if::type MOEAD:: EvaluateObjectives( - std::vector& population, + std::vector& population, std::tuple& objectives, - std::vector >& calculatedObjectives) + std::vector& calculatedObjectives) { for (size_t i = 0; i < population.size(); i++) { calculatedObjectives[i](I) = std::get(objectives).Evaluate(population[i]); - EvaluateObjectives(population, objectives, - calculatedObjectives); + EvaluateObjectives(population, objectives, calculatedObjectives); } } diff --git a/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp b/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp index b877dd4..0fc6294 100644 --- a/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp +++ b/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp @@ -53,17 +53,17 @@ class BayesianBootstrap const size_t numPoints, const double epsilon) { - typedef typename MatType::elem_type ElemType; - typedef typename arma::Col VecType; + typedef typename ForwardType::bvec VecType; MatType weights(numObjectives, numPoints); for (size_t pointIdx = 0; pointIdx < numPoints; ++pointIdx) { - VecType referenceDirection(numObjectives + 1, arma::fill::randu); + VecType referenceDirection(numObjectives + 1, + GetFillType::randu); referenceDirection(0) = 0; referenceDirection(numObjectives) = 1; - referenceDirection = arma::sort(referenceDirection); - referenceDirection = arma::diff(referenceDirection); + referenceDirection = sort(referenceDirection); + referenceDirection = diff(referenceDirection); weights.col(pointIdx) = std::move(referenceDirection) + epsilon; } diff --git 
a/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp b/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp index 695b417..2788eb6 100644 --- a/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp +++ b/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp @@ -15,8 +15,8 @@ namespace ens { /** - * The Dirichlet method for initializing weights. Sampling a - * Dirichlet distribution with parameters set to one returns + * The Dirichlet method for initializing weights. Sampling a + * Dirichlet distribution with parameters set to one returns * point lying on unit simplex with uniform distribution. */ class Dirichlet @@ -43,10 +43,15 @@ class Dirichlet const size_t numPoints, const double epsilon) { - MatType weights = arma::randg(numObjectives, numPoints, - arma::distr_param(1.0, 1.0)) + epsilon; + // TODO: Replace with randg once Bandicoot supports it. Simulate randg using + // inverse transform sampling. + // arma::mat weights = arma::randg(numObjectives, numPoints, + // arma::distr_param(1.0, 1.0)) + epsilon; + MatType weights = -log(1.0 - randu( + numObjectives, numPoints)) + epsilon; + // Normalize each column. - return arma::normalise(weights, 1, 0); + return normalise(weights, 1, 0); } }; diff --git a/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp b/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp index 002033a..db63159 100644 --- a/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp +++ b/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp @@ -59,7 +59,8 @@ class Uniform //! The requested number of points is not matching any partition number. 
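The dirichlet_init.hpp hunk above replaces `arma::randg` with inverse transform sampling: Gamma(1, 1) is Exponential(1), whose inverse CDF gives `-log(1 - u)` for `u ~ U(0, 1)`, and normalizing such draws yields a Dirichlet(1, ..., 1) sample. A standalone sketch of the same idea using the standard library (helper name is illustrative):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Draw one Dirichlet(1, ..., 1) weight vector by normalizing Gamma(1, 1)
// draws, each simulated via inverse transform sampling as -log(1 - u).
std::vector<double> DirichletWeights(std::size_t numObjectives,
                                     std::mt19937& rng)
{
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<double> w(numObjectives);
  double sum = 0.0;
  for (double& wi : w)
  {
    wi = -std::log(1.0 - unif(rng));
    sum += wi;
  }
  for (double& wi : w)  // Normalize so the weights sum to one.
    wi /= sum;
  return w;
}
```

The normalization step corresponds to the `normalise(weights, 1, 0)` call in the hunk, which L1-normalizes each column.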
if (numPoints != validNumPoints) { - size_t nextValidNumPoints = FindNumUniformPoints(numObjectives, numPartitions + 1); + size_t nextValidNumPoints = FindNumUniformPoints( + numObjectives, numPartitions + 1); std::ostringstream oss; oss << "DasDennis::Generate(): " << "The requested numPoints " << numPoints << " cannot be generated uniformly.\n " << "Either choose numPoints as " @@ -128,8 +129,7 @@ class Uniform /** * A helper function for DasDennis */ - template + template void DasDennisHelper(AuxInfoStackType& progressStack, MatType& weights, const size_t numObjectives, @@ -138,10 +138,10 @@ class Uniform const double epsilon) { typedef typename MatType::elem_type ElemType; - typedef typename arma::Row RowType; + typedef typename ForwardType::brow RowType; size_t counter = 0; - const ElemType delta = 1.0 / (ElemType)numPartitions; + const ElemType delta = 1 / (ElemType) numPartitions; while ((counter < numPoints) && !progressStack.empty()) { @@ -154,7 +154,7 @@ class Uniform { point.insert_rows(point.n_rows, RowType(1).fill( delta * static_cast(beta))); - weights.col(counter) = point + epsilon; + weights.col(counter) = point + ElemType(epsilon); ++counter; } @@ -189,7 +189,7 @@ class Uniform //! Init the progress stack. progressStack.push_back({{}, numPartitions}); MatType weights(numObjectives, numPoints); - weights.fill(arma::datum::nan); + weights.fill(arma::Datum::nan); DasDennisHelper( progressStack, weights, diff --git a/inst/include/ensmallen_bits/nsga2/nsga2.hpp b/inst/include/ensmallen_bits/nsga2/nsga2.hpp index 06035dd..4af8612 100644 --- a/inst/include/ensmallen_bits/nsga2/nsga2.hpp +++ b/inst/include/ensmallen_bits/nsga2/nsga2.hpp @@ -40,10 +40,10 @@ namespace ens { * * @code * @article{10.1109/4235.996017, - * author = {Deb, K. and Pratap, A. and Agarwal, S. and Meyarivan, T.}, - * title = {A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II}, - * year = {2002}, - * url = {https://doi.org/10.1109/4235.996017}, + * author = {Deb, K. 
and Pratap, A. and Agarwal, S. and Meyarivan, T.}, + * title = {A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II}, + * year = {2002}, + * url = {https://doi.org/10.1109/4235.996017}, * journal = {Trans. Evol. Comp}} * @endcode * @@ -125,10 +125,37 @@ class NSGA2 template - typename MatType::elem_type Optimize( - std::tuple& objectives, - MatType& iterate, - CallbackTypes&&... callbacks); + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CallbackTypes&&... callbacks); + + /** + * Optimize a set of objectives. The initial population is generated using the + * starting point. The output is the best generated front. + * + * @tparam ArbitraryFunctionType std::tuple of multiple objectives. + * @tparam MatType Type of matrix to optimize. + * @tparam CubeType The type of cube used to store the front and Pareto set. + * @tparam CallbackTypes Types of callback functions. + * @param objectives Vector of objective functions to optimize for. + * @param iterate Starting point. + * @param front The generated front. + * @param paretoSet The generated Pareto set. + * @param callbacks Callback functions. + * @return MatType::elem_type The minimum of the accumulated sum over the + * objective values in the best front. + */ + template + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CubeType& front, + CubeType& paretoSet, + CallbackTypes&&... callbacks); //! Get the population size. size_t PopulationSize() const { return populationSize; } @@ -170,60 +197,34 @@ class NSGA2 //! Modify value of upperBound. arma::vec& UpperBound() { return upperBound; } - //! Retrieve the Pareto optimal points in variable space. This returns an empty cube - //! until `Optimize()` has been called. - const arma::cube& ParetoSet() const { return paretoSet; } - - //! Retrieve the best front (the Pareto frontier). This returns an empty cube until - //! `Optimize()` has been called. 
- const arma::cube& ParetoFront() const { return paretoFront; } - - /** - * Retrieve the best front (the Pareto frontier). This returns an empty - * vector until `Optimize()` has been called. Note that this function is - * deprecated and will be removed in ensmallen 3.x! Use `ParetoFront()` - * instead. - */ - [[deprecated("use ParetoFront() instead")]] const std::vector& Front() - { - if (rcFront.size() == 0) - { - // Match the old return format. - for (size_t i = 0; i < paretoFront.n_slices; ++i) - { - rcFront.push_back(arma::mat(paretoFront.slice(i))); - } - } - - return rcFront; - } - private: /** * Evaluate objectives for the elite population. * * @tparam ArbitraryFunctionType std::tuple of multiple function types. - * @tparam MatType Type of matrix to optimize. + * @tparam InputMatType Type of matrix to optimize. * @param population The elite population. * @param objectives The set of objectives. - * @param calculatedObjectives Vector to store calculated objectives. + * @param calculatedObjectives Matrix to store calculated objectives (numObjectives x 1 x populationSize). */ template typename std::enable_if::type EvaluateObjectives(std::vector&, std::tuple&, - std::vector >&); + ObjectiveMatType&); template typename std::enable_if::type - EvaluateObjectives(std::vector& population, - std::tuple& objectives, - std::vector >& - calculatedObjectives); + EvaluateObjectives( + std::vector& population, + std::tuple& objectives, + ObjectiveMatType& calculatedObjectives); /** * Reproduce candidates from the elite population to generate a new @@ -235,10 +236,11 @@ class NSGA2 * @param lowerBound Lower bound of the coordinates of the initial population. * @param upperBound Upper bound of the coordinates of the initial population. 
*/ - template - void BinaryTournamentSelection(std::vector& population, - const MatType& lowerBound, - const MatType& upperBound); + template + void BinaryTournamentSelection( + std::vector& population, + const InputMatType& lowerBound, + const InputMatType& upperBound); /** * Crossover two parents to create a pair of new children. @@ -249,11 +251,12 @@ class NSGA2 * @param parentA First parent from elite population. * @param parentB Second parent from elite population. */ - template - void Crossover(MatType& childA, - MatType& childB, - const MatType& parentA, - const MatType& parentB); + template + void Crossover( + InputMatType& childA, + InputMatType& childB, + const InputMatType& parentA, + const InputMatType& parentB); /** * Mutate the coordinates for a candidate. @@ -264,10 +267,11 @@ class NSGA2 * @param lowerBound Lower bound of the coordinates of the initial population. * @param upperBound Upper bound of the coordinates of the initial population. */ - template - void Mutate(MatType& child, - const MatType& lowerBound, - const MatType& upperBound); + template + void Mutate( + InputMatType& child, + const InputMatType& lowerBound, + const InputMatType& upperBound); /** * Sort the candidate population using their domination count and the set of @@ -283,7 +287,7 @@ class NSGA2 void FastNonDominatedSort( std::vector >& fronts, std::vector& ranks, - std::vector >& calculatedObjectives); + MatType& calculatedObjectives); /** * Operator to check if one candidate Pareto-dominates the other. @@ -300,7 +304,7 @@ class NSGA2 */ template bool Dominates( - std::vector >& calculatedObjectives, + MatType& calculatedObjectives, size_t candidateP, size_t candidateQ); @@ -315,7 +319,7 @@ class NSGA2 template void CrowdingDistanceAssignment( const std::vector& front, - std::vector>& calculatedObjectives, + MatType& calculatedObjectives, std::vector& crowdingDistance); /** @@ -334,16 +338,17 @@ class NSGA2 * the population. 
* @return true if the first candidate is preferred, otherwise, false. */ - template - bool CrowdingOperator(size_t idxP, - size_t idxQ, - const std::vector& ranks, - const std::vector& crowdingDistance); + template + bool CrowdingOperator( + size_t idxP, + size_t idxQ, + const std::vector& ranks, + const std::vector& crowdingDistance); //! The number of objectives being optimised for. size_t numObjectives; - //! The numbeer of variables used per objectives. + //! The number of variables used per objectives. size_t numVariables; //! The number of candidates in the population. @@ -369,19 +374,6 @@ class NSGA2 //! Upper bound of the initial swarm. arma::vec upperBound; - - //! The set of all the Pareto optimal points. - //! Stored after Optimize() is called. - arma::cube paretoSet; - - //! The set of all the Pareto optimal objective vectors. - //! Stored after Optimize() is called. - arma::cube paretoFront; - - //! A different representation of the Pareto front, for reverse compatibility - //! purposes. This can be removed when ensmallen 3.x is released! (Along - //! with `Front()`.) This is only populated when `Front()` is called. - std::vector rcFront; }; } // namespace ens diff --git a/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp b/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp index 00c63b0..e75e878 100644 --- a/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp +++ b/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp @@ -68,6 +68,24 @@ typename MatType::elem_type NSGA2::Optimize( std::tuple& objectives, MatType& iterateIn, CallbackTypes&&... callbacks) +{ + typedef typename ForwardType::bcube CubeType; + CubeType paretoFront, paretoSet; + return Optimize(objectives, iterateIn, paretoFront, paretoSet, + std::forward(callbacks)...); +} + +//! Optimize the function. +template +typename MatType::elem_type NSGA2::Optimize( + std::tuple& objectives, + MatType& iterateIn, + CubeType& paretoFrontIn, + CubeType& paretoSetIn, + CallbackTypes&&... 
callbacks) { // Make sure for evolution to work at least four candidates are present. if (populationSize < 4 && populationSize % 4 != 0) @@ -79,6 +97,7 @@ typename MatType::elem_type NSGA2::Optimize( // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::bmat CubeBaseMatType; BaseMatType& iterate = (BaseMatType&) iterateIn; @@ -104,8 +123,9 @@ typename MatType::elem_type NSGA2::Optimize( numObjectives = sizeof...(ArbitraryFunctionType); numVariables = iterate.n_rows; - // Cache calculated objectives. - std::vector > calculatedObjectives(populationSize); + // Cache calculated objectives as a matrix: (numObjectives x populationSize). + arma::Mat calculatedObjectives(numObjectives, populationSize, + arma::fill::zeros); // Population size reserved to 2 * populationSize + 1 to accommodate // for the size of intermediate candidate population. @@ -121,8 +141,10 @@ typename MatType::elem_type NSGA2::Optimize( std::vector ranks; //! Useful temporaries for float-like comparisons. - const BaseMatType castedLowerBound = arma::conv_to::from(lowerBound); - const BaseMatType castedUpperBound = arma::conv_to::from(upperBound); + const BaseMatType castedLowerBound = conv_to::from( + lowerBound); + const BaseMatType castedUpperBound = conv_to::from( + upperBound); // Controls early termination of the optimization process. bool terminate = false; @@ -131,11 +153,11 @@ typename MatType::elem_type NSGA2::Optimize( // starting point. for (size_t i = 0; i < populationSize; i++) { - population.push_back(arma::randu(iterate.n_rows, - iterate.n_cols) - 0.5 + iterate); + population.push_back(randu(iterate.n_rows, + iterate.n_cols) - ElemType(0.5) + iterate); // Constrain all genes to be within bounds. 
- population[i] = arma::min(arma::max(population[i], castedLowerBound), castedUpperBound); + population[i] = min(max(population[i], castedLowerBound), castedUpperBound); } Info << "NSGA2 initialized successfully. Optimization started." << std::endl; @@ -143,7 +165,8 @@ typename MatType::elem_type NSGA2::Optimize( // Iterate until maximum number of generations is obtained. Callback::BeginOptimization(*this, objectives, iterate, callbacks...); - for (size_t generation = 1; generation <= maxGenerations && !terminate; generation++) + for (size_t generation = 1; generation <= maxGenerations && !terminate; + generation++) { Info << "NSGA2: iteration " << generation << "." << std::endl; @@ -152,22 +175,20 @@ typename MatType::elem_type NSGA2::Optimize( BinaryTournamentSelection(population, castedLowerBound, castedUpperBound); // Evaluate the objectives for the new population. - calculatedObjectives.resize(population.size()); - std::fill(calculatedObjectives.begin(), calculatedObjectives.end(), - arma::Col(numObjectives, arma::fill::zeros)); + calculatedObjectives.zeros(numObjectives, population.size()); EvaluateObjectives(population, objectives, calculatedObjectives); // Perform fast non dominated sort on P_t ∪ G_t. ranks.resize(population.size()); - FastNonDominatedSort(fronts, ranks, calculatedObjectives); + FastNonDominatedSort(fronts, ranks, calculatedObjectives); // Perform crowding distance assignment. crowdingDistance.resize(population.size()); std::fill(crowdingDistance.begin(), crowdingDistance.end(), 0.); for (size_t fNum = 0; fNum < fronts.size(); fNum++) { - CrowdingDistanceAssignment( - fronts[fNum], calculatedObjectives, crowdingDistance); + CrowdingDistanceAssignment(fronts[fNum], calculatedObjectives, + crowdingDistance); } // Sort based on crowding distance. 
@@ -178,14 +199,17 @@ typename MatType::elem_type NSGA2::Optimize( size_t idxP{}, idxQ{}; for (size_t i = 0; i < population.size(); i++) { - if (arma::approx_equal(population[i], candidateP, "absdiff", epsilon)) + if (approx_equal(population[i], candidateP, "absdiff", + ElemType(epsilon))) idxP = i; - if (arma::approx_equal(population[i], candidateQ, "absdiff", epsilon)) + if (approx_equal(population[i], candidateQ, "absdiff", + ElemType(epsilon))) idxQ = i; } - return CrowdingOperator(idxP, idxQ, ranks, crowdingDistance); + return CrowdingOperator(idxP, idxQ, ranks, + crowdingDistance); } ); @@ -198,28 +222,23 @@ typename MatType::elem_type NSGA2::Optimize( } // Set the candidates from the Pareto Set as the output. - paretoSet.set_size(population[0].n_rows, population[0].n_cols, fronts[0].size()); + paretoSetIn.set_size(population[0].n_rows, population[0].n_cols, + fronts[0].size()); // The Pareto Set is stored, can be obtained via ParetoSet() getter. for (size_t solutionIdx = 0; solutionIdx < fronts[0].size(); ++solutionIdx) { - paretoSet.slice(solutionIdx) = - arma::conv_to::from(population[fronts[0][solutionIdx]]); + paretoSetIn.slice(solutionIdx) = conv_to::from( + population[fronts[0][solutionIdx]]); } // Set the candidates from the Pareto Front as the output. - paretoFront.set_size(calculatedObjectives[0].n_rows, calculatedObjectives[0].n_cols, - fronts[0].size()); - // The Pareto Front is stored, can be obtained via ParetoFront() getter. + paretoFrontIn.set_size(calculatedObjectives.n_rows, 1, fronts[0].size()); for (size_t solutionIdx = 0; solutionIdx < fronts[0].size(); ++solutionIdx) { - paretoFront.slice(solutionIdx) = - arma::conv_to::from(calculatedObjectives[fronts[0][solutionIdx]]); + paretoFrontIn.slice(solutionIdx) = conv_to::from( + calculatedObjectives.col(fronts[0][solutionIdx])); } - // Clear rcFront, in case it is later requested by the user for reverse - // compatibility reasons. 
- rcFront.clear(); - // Assign iterate to first element of the Pareto Set. iterate = population[fronts[0][0]]; @@ -227,9 +246,8 @@ typename MatType::elem_type NSGA2::Optimize( ElemType performance = std::numeric_limits::max(); - for (const arma::Col& objective: calculatedObjectives) - if (arma::accu(objective) < performance) - performance = arma::accu(objective); + for (size_t i = 0; i < calculatedObjectives.n_cols; ++i) + performance = std::min(performance, arma::accu(calculatedObjectives.col(i))); return performance; } @@ -237,12 +255,13 @@ typename MatType::elem_type NSGA2::Optimize( //! No objectives to evaluate. template typename std::enable_if::type NSGA2::EvaluateObjectives( std::vector&, std::tuple&, - std::vector >&) + ObjectiveMatType&) { // Nothing to do here. } @@ -250,34 +269,39 @@ NSGA2::EvaluateObjectives( //! Evaluate the objectives for the entire population. template typename std::enable_if::type NSGA2::EvaluateObjectives( std::vector& population, std::tuple& objectives, - std::vector >& calculatedObjectives) + ObjectiveMatType& calculatedObjectives) { for (size_t i = 0; i < populationSize; i++) { - calculatedObjectives[i](I) = std::get(objectives).Evaluate(population[i]); - EvaluateObjectives(population, objectives, - calculatedObjectives); + calculatedObjectives(I, i) = + std::get(objectives).Evaluate(population[i]); + EvaluateObjectives(population, objectives, calculatedObjectives); } } //! Reproduce and generate new candidates. -template -inline void NSGA2::BinaryTournamentSelection(std::vector& population, - const MatType& lowerBound, - const MatType& upperBound) +template +void NSGA2::BinaryTournamentSelection( + std::vector& population, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - std::vector children; + std::vector children; while (children.size() < population.size()) { // Choose two random parents for reproduction from the elite population. 
- size_t indexA = arma::randi(arma::distr_param(0, populationSize - 1)); - size_t indexB = arma::randi(arma::distr_param(0, populationSize - 1)); + size_t indexA = arma::randi( + arma::distr_param(0, populationSize - 1)); + size_t indexB = arma::randi( + arma::distr_param(0, populationSize - 1)); // Make sure that the parents differ. if (indexA == indexB) @@ -289,7 +313,7 @@ inline void NSGA2::BinaryTournamentSelection(std::vector& population, } // Initialize the children to the respective parents. - MatType childA = population[indexA], childB = population[indexB]; + InputMatType childA = population[indexA], childB = population[indexB]; Crossover(childA, childB, population[indexA], population[indexB]); @@ -302,18 +326,23 @@ inline void NSGA2::BinaryTournamentSelection(std::vector& population, } // Add the candidates to the elite population. - population.insert(std::end(population), std::begin(children), std::end(children)); + population.insert( + std::end(population), std::begin(children), std::end(children)); } //! Perform crossover of genes for the children. -template -inline void NSGA2::Crossover(MatType& childA, - MatType& childB, - const MatType& parentA, - const MatType& parentB) +template +void NSGA2::Crossover( + InputMatType& childA, + InputMatType& childB, + const InputMatType& parentA, + const InputMatType& parentB) { + typedef typename InputMatType::elem_type ElemType; + // Indices at which crossover is to occur. - const arma::umat idx = arma::randu(childA.n_rows, childA.n_cols) < crossoverProb; + const InputMatType idx = conv_to::from(randu( + childA.n_rows, childA.n_cols) < ElemType(crossoverProb)); // Use traits from parentA for indices where idx is 1 and parentB otherwise. childA = parentA % idx + parentB % (1 - idx); @@ -322,24 +351,30 @@ inline void NSGA2::Crossover(MatType& childA, } //! Perform mutation of the candidates weights with some noise. 
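The `Crossover()` hunk above builds a random 0/1 mask and mixes genes as `parentA % idx + parentB % (1 - idx)`. The same uniform crossover on plain `std::vector`s, with the mask logic made explicit (a sketch; names are illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Uniform crossover: each gene goes to childA from parentA where a uniform
// draw falls below crossoverProb, and from parentB otherwise; childB gets
// the mirrored assignment.
std::pair<std::vector<double>, std::vector<double>>
UniformCrossover(const std::vector<double>& parentA,
                 const std::vector<double>& parentB,
                 double crossoverProb, std::mt19937& rng)
{
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<double> childA(parentA.size()), childB(parentB.size());
  for (std::size_t i = 0; i < parentA.size(); ++i)
  {
    const bool mask = unif(rng) < crossoverProb;  // 1 entry of `idx` above.
    childA[i] = mask ? parentA[i] : parentB[i];
    childB[i] = mask ? parentB[i] : parentA[i];
  }
  return {childA, childB};
}
```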
-template -inline void NSGA2::Mutate(MatType& child, - const MatType& lowerBound, - const MatType& upperBound) +template +void NSGA2::Mutate( + InputMatType& child, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - child += (arma::randu(child.n_rows, child.n_cols) < mutationProb) % - (mutationStrength * arma::randn(child.n_rows, child.n_cols)); + typedef typename InputMatType::elem_type ElemType; + + child += conv_to::from( + InputMatType(child.n_rows, child.n_cols, + GetFillType::randu) < ElemType(mutationProb)) % + (ElemType(mutationStrength) * InputMatType(child.n_rows, child.n_cols, + GetFillType::randn)); // Constrain all genes to be between bounds. - child = arma::min(arma::max(child, lowerBound), upperBound); + child = min(max(child, lowerBound), upperBound); } //! Sort population into Pareto fronts. template -inline void NSGA2::FastNonDominatedSort( +void NSGA2::FastNonDominatedSort( std::vector >& fronts, std::vector& ranks, - std::vector >& calculatedObjectives) + MatType& calculatedObjectives) { std::map dominationCount; std::map > dominated; @@ -398,22 +433,24 @@ inline void NSGA2::FastNonDominatedSort( //! Check if a candidate Pareto dominates another candidate. template inline bool NSGA2::Dominates( - std::vector >& calculatedObjectives, + MatType& calculatedObjectives, size_t candidateP, size_t candidateQ) { bool allBetterOrEqual = true; bool atleastOneBetter = false; - size_t n_objectives = calculatedObjectives[0].n_elem; + const size_t n_objectives = calculatedObjectives.n_rows; for (size_t i = 0; i < n_objectives; i++) { // P is worse than Q for the i-th objective function. - if (calculatedObjectives[candidateP](i) > calculatedObjectives[candidateQ](i)) + if (calculatedObjectives(i, candidateP) > + calculatedObjectives(i, candidateQ)) allBetterOrEqual = false; // P is better than Q for the i-th objective function. 
- else if (calculatedObjectives[candidateP](i) < calculatedObjectives[candidateQ](i)) + else if (calculatedObjectives(i, candidateP) < + calculatedObjectives(i, candidateQ)) atleastOneBetter = true; } @@ -422,34 +459,29 @@ inline bool NSGA2::Dominates( //! Assign crowding distance to the population. template -inline void NSGA2::CrowdingDistanceAssignment( +void NSGA2::CrowdingDistanceAssignment( const std::vector& front, - std::vector>& calculatedObjectives, + MatType& calculatedObjectives, std::vector& crowdingDistance) { // Convenience typedefs. typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::bcol BaseColType; size_t fSize = front.size(); // Stores the sorted indices of the fronts. - arma::uvec sortedIdx = arma::regspace(0, 1, fSize - 1); + UVecType sortedIdx = regspace(0, 1, fSize - 1); for (size_t m = 0; m < numObjectives; m++) { // Cache fValues of individuals for current objective. - arma::Col fValues(fSize); - std::transform(front.begin(), front.end(), fValues.begin(), - [&](const size_t& individual) - { - return calculatedObjectives[individual](m); - }); + BaseColType fValues(fSize); + for (size_t k = 0; k < fSize; ++k) + fValues(k) = calculatedObjectives(m, size_t(front[k])); // Sort front indices by ascending fValues for current objective. - std::sort(sortedIdx.begin(), sortedIdx.end(), - [&](const size_t& frontIdxA, const size_t& frontIdxB) - { - return (fValues(frontIdxA) < fValues(frontIdxB)); - }); + sortedIdx = sort_index(fValues, "ascend"); crowdingDistance[front[sortedIdx(0)]] = std::numeric_limits::max(); @@ -458,7 +490,7 @@ inline void NSGA2::CrowdingDistanceAssignment( ElemType minFval = fValues(sortedIdx(0)); ElemType maxFval = fValues(sortedIdx(fSize - 1)); ElemType scale = - std::abs(maxFval - minFval) == 0. ? 1. : std::abs(maxFval - minFval); + std::abs(maxFval - minFval) == 0 ? 
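The `Dominates()` check completed above encodes standard Pareto dominance: P dominates Q if P is no worse on every objective and strictly better on at least one. A self-contained version with a vector-of-columns standing in for the `(numObjectives x populationSize)` matrix:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pareto dominance for minimization: objectives[k] holds the objective
// values of candidate k (one "column" of the objectives matrix).
bool Dominates(const std::vector<std::vector<double>>& objectives,
               std::size_t p, std::size_t q)
{
  bool allBetterOrEqual = true;
  bool atLeastOneBetter = false;
  for (std::size_t i = 0; i < objectives[p].size(); ++i)
  {
    if (objectives[p][i] > objectives[q][i])
      allBetterOrEqual = false;       // P is worse on objective i.
    else if (objectives[p][i] < objectives[q][i])
      atLeastOneBetter = true;        // P is strictly better on objective i.
  }
  return allBetterOrEqual && atLeastOneBetter;
}
```

Note that a candidate never dominates itself, since equality on every objective leaves `atLeastOneBetter` false.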
1 : std::abs(maxFval - minFval); for (size_t i = 1; i < fSize - 1; i++) { @@ -469,16 +501,22 @@ inline void NSGA2::CrowdingDistanceAssignment( } //! Comparator for crowding distance based sorting. -template -inline bool NSGA2::CrowdingOperator(size_t idxP, - size_t idxQ, - const std::vector& ranks, - const std::vector& crowdingDistance) +template +bool NSGA2::CrowdingOperator( + size_t idxP, + size_t idxQ, + const std::vector& ranks, + const std::vector& crowdingDistance) { if (ranks[idxP] < ranks[idxQ]) + { return true; - else if (ranks[idxP] == ranks[idxQ] && crowdingDistance[idxP] > crowdingDistance[idxQ]) + } + else if (ranks[idxP] == ranks[idxQ] && + crowdingDistance[idxP] > crowdingDistance[idxQ]) + { return true; + } return false; } diff --git a/inst/include/ensmallen_bits/padam/padam_update.hpp b/inst/include/ensmallen_bits/padam/padam_update.hpp index a4a6924..3881403 100644 --- a/inst/include/ensmallen_bits/padam/padam_update.hpp +++ b/inst/include/ensmallen_bits/padam/padam_update.hpp @@ -85,6 +85,8 @@ class PadamUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -95,11 +97,19 @@ class PadamUpdate */ Policy(PadamUpdate& parent, const size_t rows, const size_t cols) : parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), + partial(ElemType(parent.partial)), iteration(0) { m.zeros(rows, cols); v.zeros(rows, cols); vImproved.zeros(rows, cols); + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -117,50 +127,57 @@ class PadamUpdate ++iteration; // And update the iterate. 
- m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); // Element wise maximum of past and present squared gradients. - vImproved = arma::max(vImproved, v); + vImproved = max(vImproved, v); - iterate -= (stepSize * std::sqrt(biasCorrection2) / biasCorrection1) * - m / arma::pow(vImproved + parent.epsilon, parent.partial); + iterate -= (ElemType(stepSize) * + std::sqrt(biasCorrection2) / biasCorrection1) * + m / pow(vImproved + epsilon, partial); } private: - //! Instantiated parent object. + // Instantiated parent object. PadamUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; - //! The exponential moving average of squared gradient values. + // The exponential moving average of squared gradient values. GradType v; - //! The optimal sqaured gradient value. + // The optimal squared gradient value. GradType vImproved; - //! The number of iterations. + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + ElemType partial; + + // The number of iterations. size_t iteration; }; private: - //! The epsilon value used to initialise the squared gradient parameter. + // The epsilon value used to initialise the squared gradient parameter. double epsilon; - //! The smoothing parameter. + // The smoothing parameter. double beta1; - //! The second moment coefficient. + // The second moment coefficient. double beta2; - //! Partial adaptive parameter. 
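The Padam update above is AMSGrad with a partially adaptive exponent: the element-wise maximum of past second moments is raised to `partial` (in (0, 1/2]) rather than the fixed square root of Adam. A scalar sketch of one step, with default hyperparameters chosen only for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

struct PadamState
{
  double m = 0.0;     // First-moment (mean) estimate.
  double v = 0.0;     // Second-moment estimate.
  double vMax = 0.0;  // Running max of v (AMSGrad-style).
};

// One scalar Padam step; `iteration` starts at 1 for bias correction.
double PadamStep(double iterate, double gradient, PadamState& s,
                 std::size_t iteration, double stepSize = 0.1,
                 double beta1 = 0.9, double beta2 = 0.999,
                 double epsilon = 1e-8, double partial = 0.25)
{
  s.m = beta1 * s.m + (1.0 - beta1) * gradient;
  s.v = beta2 * s.v + (1.0 - beta2) * gradient * gradient;

  const double biasCorrection1 = 1.0 - std::pow(beta1, double(iteration));
  const double biasCorrection2 = 1.0 - std::pow(beta2, double(iteration));

  // Element-wise maximum of past and present squared gradients.
  s.vMax = std::max(s.vMax, s.v);

  return iterate - (stepSize * std::sqrt(biasCorrection2) / biasCorrection1)
      * s.m / std::pow(s.vMax + epsilon, partial);
}
```

With `partial = 0.5` this reduces to AMSGrad; smaller values interpolate toward plain SGD with momentum.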
+ // Partial adaptive parameter. double partial; }; diff --git a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp index 75610d3..711b6ee 100644 --- a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp +++ b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp @@ -157,4 +157,4 @@ class ParallelSGD // Include implementation. #include "parallel_sgd_impl.hpp" -#endif +#endif \ No newline at end of file diff --git a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp index e9e4429..867af58 100644 --- a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp +++ b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp @@ -132,7 +132,7 @@ typename MatType::elem_type>::type ParallelSGD::Optimize( return overallObjective; } - // Get the stepsize for this iteration + // Get the stepsize for this iteration. double stepSize = decayPolicy.StepSize(i); // Shuffle for uniform sampling of functions by each thread. @@ -180,7 +180,9 @@ typename MatType::elem_type>::type ParallelSGD::Optimize( // Call out to utility function to use the right type of OpenMP // lock. - UpdateLocation(iterate, row, i, stepSize * value); + // TODO: if batch size support > 1 is added, `stepSize` will need to + // be updated here. 
+ UpdateLocation(iterate, row, i, ElemType(stepSize) * value); } } terminate |= Callback::StepTaken(*this, function, iterate, diff --git a/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp b/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp index 1fdaa8f..384a017 100644 --- a/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp @@ -38,8 +38,9 @@ typename MatType::elem_type AckleyFunction::Evaluate( const ElemType x2 = coordinates(1); const ElemType objective = -20 * std::exp( - -0.2 * std::sqrt(0.5 * (x1 * x1 + x2 * x2))) - - std::exp(0.5 * (std::cos(c * x1) + std::cos(c * x2))) + std::exp(1) + 20; + -(std::sqrt((x1 * x1 + x2 * x2) / 2)) / 5) - + std::exp((std::cos(ElemType(c) * x1) + std::cos(ElemType(c) * x2)) / 2) + + std::exp(ElemType(1)) + 20; return objective; } @@ -65,14 +66,14 @@ inline void AckleyFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); // Aliases for different terms in the expression of the gradient. 
- const ElemType t0 = std::sqrt(0.5 * (x1 * x1 + x2 * x2)); - const ElemType t1 = 2.0 * std::exp(- 0.2 * t0) / (t0 + epsilon); - const ElemType t2 = 0.5 * c * - std::exp(0.5 * (std::cos(c * x1) + std::cos(c * x2))); + const ElemType t0 = std::sqrt((x1 * x1 + x2 * x2) / 2); + const ElemType t1 = 2 * std::exp(-t0 / 5) / (t0 + ElemType(epsilon)); + const ElemType t2 = ElemType(c) / 2 * + std::exp((std::cos(ElemType(c) * x1) + std::cos(ElemType(c) * x2)) / 2); gradient.set_size(2, 1); - gradient(0) = (x1 * t1) + (t2 * std::sin(c * x1)); - gradient(1) = (x2 * t1) + (t2 * std::sin(c * x2)); + gradient(0) = (x1 * t1) + (t2 * std::sin(ElemType(c) * x1)); + gradient(1) = (x2 * t1) + (t2 * std::sin(ElemType(c) * x2)); } template diff --git a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp index e368160..43058db 100644 --- a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp +++ b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp @@ -23,26 +23,28 @@ namespace test { * The minimum that satisfies the constraint is x = [1, 4], with an objective * value of 70. 
*/ +template<typename MatType> class AugLagrangianTestFunction { public: AugLagrangianTestFunction(); - AugLagrangianTestFunction(const arma::mat& initial_point); + AugLagrangianTestFunction(const MatType& initial_point); - double Evaluate(const arma::mat& coordinates); - void Gradient(const arma::mat& coordinates, arma::mat& gradient); + typename MatType::elem_type Evaluate(const MatType& coordinates); + void Gradient(const MatType& coordinates, MatType& gradient); size_t NumConstraints() const { return 1; } - double EvaluateConstraint(const size_t index, const arma::mat& coordinates); + typename MatType::elem_type EvaluateConstraint(const size_t index, + const MatType& coordinates); void GradientConstraint(const size_t index, - const arma::mat& coordinates, - arma::mat& gradient); + const MatType& coordinates, + MatType& gradient); - const arma::mat& GetInitialPoint() const { return initialPoint; } + const MatType& GetInitialPoint() const { return initialPoint; } private: - arma::mat initialPoint; + MatType initialPoint; }; /** @@ -83,7 +85,7 @@ class GockenbachFunction template<typename MatType> MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } private: diff --git a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp index 56a401b..ee6802c 100644 --- a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp +++ b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp @@ -20,29 +20,37 @@ namespace test { // // AugLagrangianTestFunction // -inline AugLagrangianTestFunction::AugLagrangianTestFunction() +template<typename MatType> +inline AugLagrangianTestFunction<MatType>::AugLagrangianTestFunction() { // Set the initial point to be (0, 0).
initialPoint.zeros(2, 1); } -inline AugLagrangianTestFunction::AugLagrangianTestFunction( - const arma::mat& initialPoint) : +template<typename MatType> +inline AugLagrangianTestFunction<MatType>::AugLagrangianTestFunction( + const MatType& initialPoint) : initialPoint(initialPoint) { // Nothing to do. } -inline double AugLagrangianTestFunction::Evaluate(const arma::mat& coordinates) +template<typename MatType> +inline typename MatType::elem_type AugLagrangianTestFunction<MatType>::Evaluate( + const MatType& coordinates) { + typedef typename MatType::elem_type ElemType; + // f(x) = 6 x_1^2 + 4 x_1 x_2 + 3 x_2^2 - return ((6 * std::pow(coordinates[0], 2)) + + return ((6 * std::pow(coordinates[0], ElemType(2))) + (4 * (coordinates[0] * coordinates[1])) + - (3 * std::pow(coordinates[1], 2))); + (3 * std::pow(coordinates[1], ElemType(2)))); } -inline void AugLagrangianTestFunction::Gradient(const arma::mat& coordinates, - arma::mat& gradient) +template<typename MatType> +inline void AugLagrangianTestFunction<MatType>::Gradient( + const MatType& coordinates, + MatType& gradient) { // f'_x1(x) = 12 x_1 + 4 x_2 // f'_x2(x) = 4 x_1 + 6 x_2 @@ -52,8 +60,11 @@ inline void AugLagrangianTestFunction::Gradient(const arma::mat& coordinates, gradient[1] = 4 * coordinates[0] + 6 * coordinates[1]; } -inline double AugLagrangianTestFunction::EvaluateConstraint(const size_t index, - const arma::mat& coordinates) +template<typename MatType> +inline typename MatType::elem_type +AugLagrangianTestFunction<MatType>::EvaluateConstraint( + const size_t index, + const MatType& coordinates) { // We return 0 if the index is wrong (not 0).
if (index != 0) @@ -63,9 +74,11 @@ inline double AugLagrangianTestFunction::EvaluateConstraint(const size_t index, return (coordinates[0] + coordinates[1] - 5); } -inline void AugLagrangianTestFunction::GradientConstraint(const size_t index, - const arma::mat& /* coordinates */, - arma::mat& gradient) +template<typename MatType> +inline void AugLagrangianTestFunction<MatType>::GradientConstraint( + const size_t index, + const MatType& /* coordinates */, + MatType& gradient) { // If the user passed an invalid index (not 0), we will return a zero // gradient. @@ -99,10 +112,12 @@ template<typename MatType> inline typename MatType::elem_type GockenbachFunction::Evaluate( const MatType& coordinates) { + typedef typename MatType::elem_type ElemType; + // f(x) = (x_1 - 1)^2 + 2 (x_2 + 2)^2 + 3(x_3 + 3)^2 - return ((std::pow(coordinates[0] - 1, 2)) + - (2 * std::pow(coordinates[1] + 2, 2)) + - (3 * std::pow(coordinates[2] + 3, 2))); + return ((std::pow(coordinates[0] - 1, ElemType(2))) + + (2 * std::pow(coordinates[1] + 2, ElemType(2))) + + (3 * std::pow(coordinates[2] + 3, ElemType(2)))); } template<typename MatType> @@ -124,20 +139,21 @@ inline typename MatType::elem_type GockenbachFunction::EvaluateConstraint( const size_t index, const MatType& coordinates) { - typename MatType::elem_type constraint = 0; + typedef typename MatType::elem_type ElemType; + + ElemType constraint = 0; switch (index) { case 0: // g(x) = (x_3 - x_2 - x_1 - 1) = 0 - constraint = (coordinates[2] - coordinates[1] - coordinates[0] - - typename MatType::elem_type(1)); + constraint = (coordinates[2] - coordinates[1] - coordinates[0] - 1); break; case 1: // h(x) = (x_3 - x_1^2) >= 0 // To deal with the inequality, the constraint will simply evaluate to 0 // when h(x) >= 0.
- constraint = std::min(typename MatType::elem_type(0), (coordinates[2] - - std::pow(coordinates[0], typename MatType::elem_type(2)))); + constraint = std::min(ElemType(0), (coordinates[2] - + std::pow(coordinates[0], ElemType(2)))); break; } @@ -322,7 +338,7 @@ inline const arma::mat& LovaszThetaSDP::GetInitialPoint() // and because m is always positive, // r = 0.5 + sqrt(0.25 + 2m) float m = NumConstraints(); - float r = 0.5 + sqrt(0.25 + 2 * m); + float r = 0.5 + std::sqrt(0.25 + 2 * m); if (ceil(r) > vertices) r = vertices; // An upper bound on the dimension. @@ -335,9 +351,10 @@ inline const arma::mat& LovaszThetaSDP::GetInitialPoint() for (size_t j = 0; j < (size_t) vertices; j++) { if (i == j) - initialPoint(i, j) = sqrt(1.0 / r) + sqrt(1.0 / (vertices * m)); + initialPoint(i, j) = std::sqrt(1.0 / r) + + std::sqrt(1.0 / (vertices * m)); else - initialPoint(i, j) = sqrt(1.0 / (vertices * m)); + initialPoint(i, j) = std::sqrt(1.0 / (vertices * m)); } } diff --git a/inst/include/ensmallen_bits/problems/beale_function_impl.hpp b/inst/include/ensmallen_bits/problems/beale_function_impl.hpp index a382005..9833303 100644 --- a/inst/include/ensmallen_bits/problems/beale_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/beale_function_impl.hpp @@ -35,9 +35,11 @@ typename MatType::elem_type BealeFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(1.5 - x1 + x1 * x2, 2) + - std::pow(2.25 - x1 + x1 * x2 * x2, 2) + - std::pow(2.625 - x1 + x1 * pow(x2, 3), 2); + const ElemType objective = + std::pow(ElemType(1.5) - x1 + x1 * x2, ElemType(2)) + + std::pow(ElemType(2.25) - x1 + x1 * x2 * x2, ElemType(2)) + + std::pow(ElemType(2.625) - x1 + x1 * std::pow(x2, ElemType(3)), + ElemType(2)); return objective; } @@ -64,15 +66,15 @@ inline void BealeFunction::Gradient(const MatType& coordinates, // Aliases for different terms in the expression of the gradient. 
const ElemType x2Sq = x2 * x2; - const ElemType x2Cub = pow(x2, 3); + const ElemType x2Cub = std::pow(x2, ElemType(3)); gradient.set_size(2, 1); - gradient(0) = ((2 * x2 - 2) * (x1 * x2 - x1 + 1.5)) + - ((2 * x2Sq - 2) * (x1 * x2Sq - x1 + 2.25)) + - ((2 * x2Cub - 2) * (x1 * x2Cub - x1 + 2.625)); - gradient(1) = (6 * x1 * x2Sq * (x1 * x2Cub - x1 + 2.625)) + - (4 * x1 * x2 * (x1 * x2Sq - x1 + 2.25)) + - (2 * x1 * (x1 * x2 - x1 + 1.5)); + gradient(0) = ((2 * x2 - 2) * (x1 * x2 - x1 + ElemType(1.5))) + + ((2 * x2Sq - 2) * (x1 * x2Sq - x1 + ElemType(2.25))) + + ((2 * x2Cub - 2) * (x1 * x2Cub - x1 + ElemType(2.625))); + gradient(1) = (6 * x1 * x2Sq * (x1 * x2Cub - x1 + ElemType(2.625))) + + (4 * x1 * x2 * (x1 * x2Sq - x1 + ElemType(2.25))) + + (2 * x1 * (x1 * x2 - x1 + ElemType(1.5))); } template diff --git a/inst/include/ensmallen_bits/problems/booth_function_impl.hpp b/inst/include/ensmallen_bits/problems/booth_function_impl.hpp index d80c999..515aed6 100644 --- a/inst/include/ensmallen_bits/problems/booth_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/booth_function_impl.hpp @@ -35,8 +35,8 @@ typename MatType::elem_type BoothFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(x1 + 2 * x2 - 7, 2) + - std::pow(2 * x1 + x2 - 5, 2); + const ElemType objective = std::pow(x1 + 2 * x2 - 7, ElemType(2)) + + std::pow(2 * x1 + x2 - 5, ElemType(2)); return objective; } diff --git a/inst/include/ensmallen_bits/problems/colville_function_impl.hpp b/inst/include/ensmallen_bits/problems/colville_function_impl.hpp index e3726d5..8e1ebfe 100644 --- a/inst/include/ensmallen_bits/problems/colville_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/colville_function_impl.hpp @@ -37,10 +37,12 @@ typename MatType::elem_type ColvilleFunction::Evaluate( const ElemType x3 = coordinates(2); const ElemType x4 = coordinates(3); - const ElemType objective = 100 * std::pow(std::pow(x1, 2) - x2, 2) 
+ - std::pow(x1 - 1, 2) + std::pow(x3 - 1, 2) + 90 * - std::pow(std::pow(x3, 2) - x4, 2) + 10.1 * (std::pow(x2 - 1, 2) + - std::pow(x4 - 1, 2)) + 19.8 * (x2 - 1) * (x4 - 1); + const ElemType objective = + 100 * std::pow(std::pow(x1, ElemType(2)) - x2, ElemType(2)) + + std::pow(x1 - 1, ElemType(2)) + std::pow(x3 - 1, ElemType(2)) + + 90 * std::pow(std::pow(x3, ElemType(2)) - x4, ElemType(2)) + + ElemType(10.1) * (std::pow(x2 - 1, ElemType(2)) + + std::pow(x4 - 1, ElemType(2))) + ElemType(19.8) * (x2 - 1) * (x4 - 1); return objective; } @@ -68,10 +70,12 @@ inline void ColvilleFunction::Gradient(const MatType& coordinates, const ElemType x4 = coordinates(3); gradient.set_size(4, 1); - gradient(0) = 2 * (200 * x1 * (std::pow(x1, 2) - x2) + x1 - 1); - gradient(1) = 19.8 * x4 - 200 * std::pow(x1, 2) + 220.2 * x2 - 40; - gradient(2) = 2 * (180 * x3 * (std::pow(x3, 2) - x4) + x3 - 1); - gradient(3) = 200.2 * x4 + 19.8 * x2 - 180 * std::pow(x3, 2) - 40; + gradient(0) = 2 * (200 * x1 * (std::pow(x1, ElemType(2)) - x2) + x1 - 1); + gradient(1) = ElemType(19.8) * x4 - 200 * std::pow(x1, ElemType(2)) + + ElemType(220.2) * x2 - 40; + gradient(2) = 2 * (180 * x3 * (std::pow(x3, ElemType(2)) - x4) + x3 - 1); + gradient(3) = ElemType(200.2) * x4 + ElemType(19.8) * x2 - + 180 * std::pow(x3, ElemType(2)) - 40; } template diff --git a/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp b/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp index e5814f0..e4bf4f0 100644 --- a/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp @@ -35,10 +35,12 @@ typename MatType::elem_type CrossInTrayFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = -0.0001 * std::pow(std::abs(std::sin(x1) * - std::sin(x2) * std::exp(std::abs(100 - (std::sqrt(std::pow(x1, 2) + - std::pow(x2, 2)) / arma::datum::pi))) + 1), 
0.1); - return objective; + // Compute objective in higher precision, then cast down. + const double objective = -0.0001 * std::pow(std::abs(std::sin(double(x1)) * + std::sin(double(x2)) * + std::exp(std::abs(100 - (std::sqrt(std::pow(double(x1), 2) + + std::pow(double(x2), 2)) / arma::datum::pi))) + 1), 0.1); + return ElemType(objective); } template diff --git a/inst/include/ensmallen_bits/problems/easom_function_impl.hpp b/inst/include/ensmallen_bits/problems/easom_function_impl.hpp index a63dd89..2a9436a 100644 --- a/inst/include/ensmallen_bits/problems/easom_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/easom_function_impl.hpp @@ -36,8 +36,8 @@ typename MatType::elem_type EasomFunction::Evaluate( const ElemType x2 = coordinates(1); const ElemType objective = -std::cos(x1) * std::cos(x2) * - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)); + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))); return objective; } @@ -63,20 +63,20 @@ inline void EasomFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = 2 * (x1 - arma::datum::pi) * - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + gradient(0) = 2 * (x1 - arma::Datum<ElemType>::pi) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::cos(x1) * std::cos(x2) + - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::sin(x1) * std::cos(x2); - gradient(1) = 2 * (x2 - arma::datum::pi) * - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + gradient(1) = 2 * (x2 - arma::Datum<ElemType>::pi) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - +
std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::cos(x1) * std::cos(x2) + - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::cos(x1) * std::sin(x2); } diff --git a/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp b/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp index 34bc95b..1de61db 100644 --- a/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp +++ b/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp @@ -46,14 +46,12 @@ class FonsecaFlemingFunction * Evaluate the objectives with the given coordinate. * * @param coords The function coordinates. - * @return arma::Col<typename MatType::elem_type> + * @return Col */ - arma::Col<typename MatType::elem_type> Evaluate(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - arma::Col<ElemType> objectives(numObjectives); + typename ForwardType<MatType>::bvec Evaluate(const MatType& coords) + { + typename ForwardType<MatType>::bvec objectives(numObjectives); objectives(0) = objectiveA.Evaluate(coords); objectives(1) = objectiveB.Evaluate(coords); @@ -64,21 +62,18 @@ class FonsecaFlemingFunction //! Get the starting point. MatType GetInitialPoint() { - // Convenience typedef.
- typedef typename MatType::elem_type ElemType; - - return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros); + return MatType(numVariables, 1, GetFillType<MatType>::zeros); } struct ObjectiveA { typename MatType::elem_type Evaluate(const MatType& coords) { - return 1.0 - exp( - -pow(static_cast<double>(coords[0]) - 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[1]) - 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[2]) - 1.0 / sqrt(3.0), 2.0) - ); + return typename MatType::elem_type(1.0 - std::exp( + -std::pow(static_cast<double>(coords[0]) - 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[1]) - 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[2]) - 1.0 / std::sqrt(3.0), 2.0) + )); } } objectiveA; @@ -86,11 +81,11 @@ { typename MatType::elem_type Evaluate(const MatType& coords) { - return 1.0 - exp( - -pow(static_cast<double>(coords[0]) + 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[1]) + 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[2]) + 1.0 / sqrt(3.0), 2.0) - ); + return typename MatType::elem_type(1.0 - std::exp( + -std::pow(static_cast<double>(coords[0]) + 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[1]) + 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[2]) + 1.0 / std::sqrt(3.0), 2.0) + )); } } objectiveB; @@ -100,6 +95,7 @@ return std::make_tuple(objectiveA, objectiveB); } }; + } // namespace test } // namespace ens diff --git a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp index 3ec963f..196d540 100644 --- a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp +++ b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp @@ -113,14 +113,14 @@ class GeneralizedRosenbrockFunction template<typename MatType> const MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } //! Get the final point.
template const MatType GetFinalPoint() const { - return arma::ones(initialPoint.n_rows, initialPoint.n_cols); + return ones(initialPoint.n_rows, initialPoint.n_cols); } //! Get the final objective. diff --git a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp index b332f13..82b0b3d 100644 --- a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp @@ -51,12 +51,15 @@ typename MatType::elem_type GeneralizedRosenbrockFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - objective += 100 * std::pow((std::pow(coordinates[p], 2) - - coordinates[p + 1]), 2) + std::pow(1 - coordinates[p], 2); + objective += 100 * std::pow((std::pow(coordinates[p], ElemType(2)) - + coordinates[p + 1]), ElemType(2)) + + std::pow(1 - coordinates[p], ElemType(2)); } return objective; @@ -66,11 +69,14 @@ template typename MatType::elem_type GeneralizedRosenbrockFunction::Evaluate( const MatType& coordinates) const { - typename MatType::elem_type fval = 0; + typedef typename MatType::elem_type ElemType; + + ElemType fval = 0; for (size_t i = 0; i < (n - 1); i++) { - fval += 100 * std::pow(std::pow(coordinates[i], 2) - - coordinates[i + 1], 2) + std::pow(1 - coordinates[i], 2); + fval += 100 * std::pow(std::pow(coordinates[i], ElemType(2)) - + coordinates[i + 1], ElemType(2)) + + std::pow(1 - coordinates[i], ElemType(2)); } return fval; @@ -83,13 +89,16 @@ inline void GeneralizedRosenbrockFunction::Gradient( GradType& gradient, const size_t batchSize) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(n); for (size_t j = begin; j < 
begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - gradient[p] = 400 * (std::pow(coordinates[p], 3) - coordinates[p] * - coordinates[p + 1]) + 2 * (coordinates[p] - 1); - gradient[p + 1] = 200 * (coordinates[p + 1] - std::pow(coordinates[p], 2)); + gradient[p] = 400 * (std::pow(coordinates[p], ElemType(3)) - + coordinates[p] * coordinates[p + 1]) + 2 * (coordinates[p] - 1); + gradient[p + 1] = + 200 * (coordinates[p + 1] - std::pow(coordinates[p], ElemType(2))); } } @@ -98,18 +107,23 @@ inline void GeneralizedRosenbrockFunction::Gradient( const MatType& coordinates, GradType& gradient) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(n); for (size_t i = 0; i < (n - 1); i++) { - gradient[i] = 400 * (std::pow(coordinates[i], 3) - coordinates[i] * - coordinates[i + 1]) + 2 * (coordinates[i] - 1); + gradient[i] = 400 * (std::pow(coordinates[i], ElemType(3)) - + coordinates[i] * coordinates[i + 1]) + 2 * (coordinates[i] - 1); if (i > 0) - gradient[i] += 200 * (coordinates[i] - std::pow(coordinates[i - 1], 2)); + { + gradient[i] += + 200 * (coordinates[i] - std::pow(coordinates[i - 1], ElemType(2))); + } } gradient[n - 1] = 200 * (coordinates[n - 1] - - std::pow(coordinates[n - 2], 2)); + std::pow(coordinates[n - 2], ElemType(2))); } } // namespace test diff --git a/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp b/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp index 29d7e55..913f58c 100644 --- a/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp @@ -36,12 +36,13 @@ typename MatType::elem_type GoldsteinPriceFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType x1Sq = std::pow(x1, 2); - const ElemType x2Sq = std::pow(x2, 2); + const ElemType x1Sq = std::pow(x1, ElemType(2)); + const ElemType x2Sq = std::pow(x2, ElemType(2)); const ElemType 
x1x2 = x1 * x2; - const ElemType objective = (1 + std::pow(x1 + x2 + 1, 2) * (19 - 14 * x1 + 3 * - x1Sq - 14 * x2 + 6 * x1x2 + 3 * x2Sq)) * (30 + std::pow(2 * x1 - 3 * x2, - 2) * (18 - 32 * x1 + 12 * x1Sq + 48 * x2 - 36 * x1x2 + 27 * x2Sq)); + const ElemType objective = (1 + std::pow(x1 + x2 + 1, ElemType(2)) * + (19 - 14 * x1 + 3 * x1Sq - 14 * x2 + 6 * x1x2 + 3 * x2Sq)) * + (30 + std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (18 - 32 * x1 + 12 * x1Sq + 48 * x2 - 36 * x1x2 + 27 * x2Sq)); return objective; } @@ -67,22 +68,26 @@ inline void GoldsteinPriceFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = (std::pow(2 * x1 - 3 * x2, 2) * (24 * x1 - 36 * x2 - 32) + (8 * - x1 - 12 * x2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + - 48 * x2 + 18)) * (std::pow(x1 + x2 + 1, 2) * (3 * x1 * x1 + 6 * x1 * x2 - - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + (std::pow(2 * x1 - 3 * x2, - 2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + - 18) + 30) * (std::pow(x1 + x2 + 1, 2) * (6 * x1 + 6 * x2 - 14) + (2 * x1 + - 2 * x2 + 2) * (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * - x2 + 19)); + gradient(0) = (std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (24 * x1 - 36 * x2 - 32) + (8 * x1 - 12 * x2) * + (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + 18)) * + (std::pow(x1 + x2 + 1, ElemType(2)) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + + (std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + 18) + + 30) * (std::pow(x1 + x2 + 1, ElemType(2)) * (6 * x1 + 6 * x2 - 14) + + (2 * x1 + 2 * x2 + 2) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19)); gradient(1) = ((- 12 * x1 + 18 * x2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * - x1 + 27 * x2 * x2 + 48 * x2 + 18) + std::pow(2 * x1 - 3 * x2, 2) * (-36 * - x1 + 54 * x2 + 48)) * (std::pow(x1 + x2 + 1, 2) * (3 * 
x1 * x1 + 6 * x1 * - x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + (std::pow(2 * x1 - 3 * - x2, 2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + - 18) + 30) * (std::pow(x1 + x2 + 1, 2) * (6 * x1 + 6 * x2 - 14) + (2 * x1 + - 2 * x2 + 2) * (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * - x2 + 19)); + x1 + 27 * x2 * x2 + 48 * x2 + 18) + + std::pow(2 * x1 - 3 * x2, ElemType(2)) * (-36 * x1 + 54 * x2 + 48)) * + (std::pow(x1 + x2 + 1, ElemType(2)) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + + (std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + 18) + + 30) * (std::pow(x1 + x2 + 1, ElemType(2)) * (6 * x1 + 6 * x2 - 14) + + (2 * x1 + 2 * x2 + 2) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19)); } template diff --git a/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp b/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp index cf38ee6..b1c447f 100644 --- a/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp @@ -21,7 +21,7 @@ template inline typename MatType::elem_type GDTestFunction::Evaluate( const MatType& coordinates) const { - MatType temp = arma::trans(coordinates) * coordinates; + MatType temp = trans(coordinates) * coordinates; return temp(0, 0); } diff --git a/inst/include/ensmallen_bits/problems/himmelblau_function.hpp b/inst/include/ensmallen_bits/problems/himmelblau_function.hpp index b34f942..26f35f9 100644 --- a/inst/include/ensmallen_bits/problems/himmelblau_function.hpp +++ b/inst/include/ensmallen_bits/problems/himmelblau_function.hpp @@ -60,6 +60,12 @@ class HimmelblauFunction template MatType GetInitialPoint() const { return MatType("5; -5"); } + //! Get the final point of the optimization. 
+ template<typename MatType> + MatType GetFinalPoint() const { return MatType("3; 2"); } + + double GetFinalObjective() const { return 0.0; } + /** * Evaluate a function for a particular batch-size. * diff --git a/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp b/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp index 52dfc1f..da1cae8 100644 --- a/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp @@ -35,8 +35,8 @@ typename MatType::elem_type HimmelblauFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(x1 * x1 + x2 - 11 , 2) + - std::pow(x1 + x2 * x2 - 7, 2); + const ElemType objective = std::pow(x1 * x1 + x2 - 11, ElemType(2)) + + std::pow(x1 + x2 * x2 - 7, ElemType(2)); return objective; } diff --git a/inst/include/ensmallen_bits/problems/levy_function_n13.hpp b/inst/include/ensmallen_bits/problems/levy_function_n13.hpp index 06fc85a..f19d338 100644 --- a/inst/include/ensmallen_bits/problems/levy_function_n13.hpp +++ b/inst/include/ensmallen_bits/problems/levy_function_n13.hpp @@ -36,7 +36,7 @@ namespace test { class LevyFunctionN13 { public: - //! Initialize the BealeFunction. + //! Initialize the LevyFunctionN13 object.
LevyFunctionN13(); /** diff --git a/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp b/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp index d554254..1c691c4 100644 --- a/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp +++ b/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp @@ -35,11 +35,12 @@ typename MatType::elem_type LevyFunctionN13::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(std::sin(3 * arma::datum::pi * x1), 2) + - (std::pow(x1 - 1, 2) * (1 + std::pow( - std::sin(3 * arma::datum::pi * x2), 2))) + - (std::pow(x2 - 1, 2) * (1 + std::pow( - std::sin(2 * arma::datum::pi * x2), 2))); + const ElemType objective = + std::pow(std::sin(3 * arma::Datum<ElemType>::pi * x1), ElemType(2)) + + (std::pow(x1 - 1, ElemType(2)) * (1 + std::pow( + std::sin(3 * arma::Datum<ElemType>::pi * x2), ElemType(2)))) + + (std::pow(x2 - 1, ElemType(2)) * (1 + std::pow( + std::sin(2 * arma::Datum<ElemType>::pi * x2), ElemType(2)))); return objective; } @@ -65,15 +66,19 @@ inline void LevyFunctionN13::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = (2 * x1 - 2) * (std::pow(std::sin(3 * arma::datum::pi * x2), - 2) + 1) + 6 * arma::datum::pi * std::sin(3 * arma::datum::pi * x1) * - std::cos(3 * arma::datum::pi * x1); - - gradient(1) = 6 * arma::datum::pi * std::pow(x1 - 1, 2) * std::sin(3 * - arma::datum::pi * x2) * std::cos(3 * arma::datum::pi * x2) + - 4 * arma::datum::pi * std::pow(x2 - 1, 2) * std::sin(2 * - arma::datum::pi * x2) * std::cos(2 * arma::datum::pi * x2) + - (2 * x2 - 2) * (std::pow(std::sin(2 * arma::datum::pi * x2), 2) + 1); + gradient(0) = (2 * x1 - 2) * + (std::pow(std::sin(3 * arma::Datum<ElemType>::pi * x2), ElemType(2)) + + 1) + 6 * arma::Datum<ElemType>::pi * + std::sin(3 * arma::Datum<ElemType>::pi * x1) * + std::cos(3 * arma::Datum<ElemType>::pi * x1); + + gradient(1) = 6 * arma::Datum<ElemType>::pi * std::pow(x1 - 1, ElemType(2)) * +
std::sin(3 * arma::Datum<ElemType>::pi * x2) * + std::cos(3 * arma::Datum<ElemType>::pi * x2) + + 4 * arma::Datum<ElemType>::pi * std::pow(x2 - 1, ElemType(2)) * + std::sin(2 * arma::Datum<ElemType>::pi * x2) * + std::cos(2 * arma::Datum<ElemType>::pi * x2) + (2 * x2 - 2) * + (std::pow(std::sin(2 * arma::Datum<ElemType>::pi * x2), ElemType(2)) + 1); } template diff --git a/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp b/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp index 53c69df..770aa0e 100644 --- a/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp +++ b/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp @@ -26,29 +26,34 @@ template<typename MatType> class LogisticRegressionFunction { public: + typedef typename MatType::elem_type ElemType; + typedef typename ForwardType<MatType>::brow BaseRowType; + + template<typename LabelsType> LogisticRegressionFunction(MatType& predictors, - arma::Row<size_t>& responses, + LabelsType& responses, const double lambda = 0); + template<typename LabelsType> LogisticRegressionFunction(MatType& predictors, - arma::Row<size_t>& responses, + LabelsType& responses, MatType& initialPoint, const double lambda = 0); - //! Return the initial point for the optimization. + // Return the initial point for the optimization. const MatType& InitialPoint() const { return initialPoint; } - //! Modify the initial point for the optimization. + // Modify the initial point for the optimization. MatType& InitialPoint() { return initialPoint; } - //! Return the regularization parameter (lambda). - const double& Lambda() const { return lambda; } - //! Modify the regularization parameter (lambda). - double& Lambda() { return lambda; } + // Return the regularization parameter (lambda). + const ElemType& Lambda() const { return lambda; } + // Modify the regularization parameter (lambda). + ElemType& Lambda() { return lambda; } - //! Return the matrix of predictors. + // Return the matrix of predictors. const MatType& Predictors() const { return predictors; } //! Return the vector of responses.
- const arma::Row& Responses() const { return responses; } + const BaseRowType& Responses() const { return responses; } /** * Shuffle the order of function visitation. This may be called by the @@ -67,7 +72,7 @@ class LogisticRegressionFunction * * @param parameters Vector of logistic regression parameters. */ - typename MatType::elem_type Evaluate(const MatType& parameters) const; + ElemType Evaluate(const MatType& parameters) const; /** * Evaluate the logistic regression log-likelihood function with the given @@ -86,9 +91,9 @@ class LogisticRegressionFunction * @param batchSize Number of points to be passed at a time to use for * objective function evaluation. */ - typename MatType::elem_type Evaluate(const MatType& parameters, - const size_t begin, - const size_t batchSize = 1) const; + ElemType Evaluate(const MatType& parameters, + const size_t begin, + const size_t batchSize = 1) const; /** * Evaluate the gradient of the logistic regression log-likelihood function @@ -130,33 +135,34 @@ class LogisticRegressionFunction * be computed. * @param gradient Sparse matrix to output gradient into. */ + template void PartialGradient(const MatType& parameters, const size_t j, - arma::sp_mat& gradient) const; + GradType& gradient) const; /** * Evaluate the objective function and gradient of the logistic regression * log-likelihood function simultaneously with the given parameters. */ template - typename MatType::elem_type EvaluateWithGradient( + ElemType EvaluateWithGradient( const MatType& parameters, GradType& gradient) const; template - typename MatType::elem_type EvaluateWithGradient( + ElemType EvaluateWithGradient( const MatType& parameters, const size_t begin, GradType& gradient, const size_t batchSize = 1) const; - //! Return the initial point for the optimization. + // Return the initial point for the optimization. const MatType& GetInitialPoint() const { return initialPoint; } - //! Return the number of separable functions (the number of predictor points). 
+ // Return the number of separable functions (the number of predictor points). size_t NumFunctions() const { return predictors.n_cols; } - //! Return the number of features(add 1 for the intercept term). + // Return the number of features(add 1 for the intercept term). size_t NumFeatures() const { return predictors.n_rows + 1; } /** @@ -174,8 +180,9 @@ class LogisticRegressionFunction * @param decisionBoundary Decision boundary (default 0.5). * @return Percentage of responses that are predicted correctly. */ + template double ComputeAccuracy(const MatType& predictors, - const arma::Row& responses, + const LabelsType& responses, const MatType& parameters, const double decisionBoundary = 0.5) const; @@ -191,22 +198,25 @@ class LogisticRegressionFunction * @param parameters Vector of logistic regression parameters. * @param decisionBoundary Decision boundary (default 0.5). */ + template void Classify(const MatType& dataset, - arma::Row& labels, + LabelsType& labels, const MatType& parameters, const double decisionBoundary = 0.5) const; private: - //! The initial point, from which to start the optimization. + // The initial point, from which to start the optimization. MatType initialPoint; - //! The matrix of data points (predictors). This is an alias until shuffling - //! is done. + // The matrix of data points (predictors). This is an alias until shuffling + // is done. MatType& predictors; - //! The vector of responses to the input data points. This is an alias until - //! shuffling is done. - arma::Row& responses; - //! The regularization parameter for L2-regularization. - double lambda; + // The vector of responses to the input data points, converted to the same + // type as the data. + BaseRowType responses; + // The regularization parameter for L2-regularization. + ElemType lambda; + // This is lambda/2, cached for convenience. + ElemType halfLambda; }; // Convenience typedefs. 
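The header diff above declares the quantity that `LogisticRegressionFunction::Evaluate()` returns: the negative log-likelihood plus an L2 penalty of `halfLambda` (i.e. lambda/2, now cached) times the squared norm of the non-intercept weights. A minimal stdlib-only sketch of that objective for dense double data follows; the function names and plain-`std::vector` interface here are illustrative only and not part of ensmallen's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative scalar version of the regularized logistic regression
// objective sketched in the class documentation above.
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

double objective(const std::vector<std::vector<double>>& x,  // one point per row
                 const std::vector<int>& y,                   // 0/1 labels
                 const std::vector<double>& w,                // non-intercept weights
                 double intercept,
                 double lambda)
{
  const double halfLambda = lambda / 2.0;  // cached, as in the patched class
  double reg = 0.0;
  for (double wi : w)
    reg += wi * wi;
  reg *= halfLambda;  // (lambda / 2) * ||w||^2, intercept excluded

  double loglik = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    double z = intercept;
    for (std::size_t j = 0; j < w.size(); ++j)
      z += w[j] * x[i][j];
    const double s = sigmoid(z);
    // Same unified form used by the patched Evaluate():
    // log(1 - y + s * (2y - 1)) equals log(s) for y = 1 and
    // log(1 - s) for y = 0, so no branching on the label is needed.
    loglik += std::log(1.0 - y[i] + s * (2.0 * y[i] - 1.0));
  }
  // Invert the log-likelihood, because the optimizer minimizes.
  return reg - loglik;
}
```

At the all-zero starting point every point contributes -log(0.5) regardless of its label, which is a convenient sanity check for implementations of this objective.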
diff --git a/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp b/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp index e9a9148..291ad29 100644 --- a/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp @@ -19,17 +19,26 @@ namespace ens { namespace test { template +template LogisticRegressionFunction::LogisticRegressionFunction( MatType& predictors, - arma::Row& responses, - const double lambda) : + LabelsType& responsesIn, + const double lambdaIn) : // We promise to be well-behaved... the elements won't be modified. predictors(predictors), - responses(responses), - lambda(lambda) + // On old Armadillo versions, we cannot do both a sparse-to-dense conversion + // and element type conversion in one shot. + #if ARMA_VERSION_MAJOR < 12 || \ + (ARMA_VERSION_MAJOR == 12 && ARMA_VERSION_MINOR < 8) + responses(conv_to::from(conv_to::bmat>::from(responsesIn))), + #else + responses(conv_to::from(responsesIn)), + #endif + lambda(ElemType(lambdaIn)), + halfLambda(ElemType(lambdaIn / 2.0)) { - initialPoint = arma::Row(predictors.n_rows + 1, - arma::fill::zeros); + initialPoint = arma::Row(predictors.n_rows + 1, arma::fill::zeros); // Sanity check. if (responses.n_elem != predictors.n_cols) @@ -44,21 +53,55 @@ LogisticRegressionFunction::LogisticRegressionFunction( } template +template LogisticRegressionFunction::LogisticRegressionFunction( MatType& predictors, - arma::Row& responses, + LabelsType& responsesIn, MatType& initialPoint, - const double lambda) : + const double lambdaIn) : initialPoint(initialPoint), predictors(predictors), - responses(responses), - lambda(lambda) + // On old Armadillo versions, we cannot do both a sparse-to-dense conversion + // and element type conversion in one shot. 
+ #if ARMA_VERSION_MAJOR < 12 || \ + (ARMA_VERSION_MAJOR == 12 && ARMA_VERSION_MINOR < 8) + responses(conv_to::from(conv_to::bmat>::from(responsesIn))), + #else + responses(conv_to::from(responsesIn)), + #endif + lambda(ElemType(lambdaIn)), + halfLambda(ElemType(lambdaIn / 2.0)) { // To check if initialPoint is compatible with predictors. if (initialPoint.n_rows != (predictors.n_rows + 1) || initialPoint.n_cols != 1) - this->initialPoint = arma::Row( - predictors.n_rows + 1, arma::fill::zeros); + { + this->initialPoint = arma::Row(predictors.n_rows + 1, + arma::fill::zeros); + } +} + +template +void ShuffleImpl(MatType& predictors, MatType& responses, + const typename std::enable_if_t::value>* = 0) +{ + MatType allData = shuffle(join_cols(predictors, responses), 1); + + predictors = allData.rows(0, allData.n_rows - 2); + responses = allData.row(allData.n_rows - 1); +} + +template +void ShuffleImpl(MatType& predictors, BaseRowType& responses, + const typename std::enable_if_t::value>* = 0) +{ + // For sparse data shuffle() is not available. + arma::uvec ordering = shuffle(linspace(0, predictors.n_cols - 1, + predictors.n_cols)); + + predictors = predictors.cols(ordering); + responses = responses.cols(ordering); } /** @@ -67,20 +110,7 @@ LogisticRegressionFunction::LogisticRegressionFunction( template void LogisticRegressionFunction::Shuffle() { - MatType newPredictors; - arma::Row newResponses; - - arma::uvec ordering = arma::shuffle(arma::linspace(0, - predictors.n_cols - 1, predictors.n_cols)); - - newPredictors.set_size(predictors.n_rows, predictors.n_cols); - for (size_t i = 0; i < predictors.n_cols; ++i) - newPredictors.col(i) = predictors.col(ordering[i]); - newResponses = responses.cols(ordering); - - // Take ownership of the new data. 
- predictors = std::move(newPredictors); - responses = std::move(newResponses); + ShuffleImpl(predictors, responses); } /** @@ -97,19 +127,18 @@ typename MatType::elem_type LogisticRegressionFunction::Evaluate( // f(w) = sum(y log(sig(w'x)) + (1 - y) log(sig(1 - w'x))). // We want to minimize this function. L2-regularization is just lambda // multiplied by the squared l2-norm of the parameters then divided by two. - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::brow BaseRowType; // For the regularization, we ignore the first term, which is the intercept // term and take every term except the last one in the decision variable. - const ElemType regularization = 0.5 * lambda * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType regularization = halfLambda * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate vectors of sigmoids. The intercept term is parameters(0, 0) and // does not need to be multiplied by any of the predictors. - const arma::Row sigmoid = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * predictors))); + const BaseRowType sigmoid = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * predictors))); // Assemble full objective function. Often the objective function and the // regularization as given are divided by the number of features, but this @@ -117,9 +146,8 @@ typename MatType::elem_type LogisticRegressionFunction::Evaluate( // terms for computational efficiency. Note that the conversion causes some // copy and slowdown, but this is so negligible compared to the rest of the // calculation it is not worth optimizing for. 
- const ElemType result = arma::accu(arma::log(1.0 - - arma::conv_to>::from(responses) + sigmoid % - (2 * arma::conv_to>::from(responses) - 1.0))); + const ElemType result = accu( + log(1 - responses + sigmoid % (2 * responses - 1))); // Invert the result, because it's a minimization. return regularization - result; @@ -135,25 +163,23 @@ typename MatType::elem_type LogisticRegressionFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::brow BaseRowType; // Calculate the regularization term. - const ElemType regularization = lambda * - (batchSize / (2.0 * predictors.n_cols)) * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType regularization = halfLambda * + (batchSize / ElemType(predictors.n_cols)) * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate the sigmoid function values. - const arma::Row sigmoid = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * - predictors.cols(begin, begin + batchSize - 1)))); + const BaseRowType sigmoid = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * + predictors.cols(begin, begin + batchSize - 1)))); // Compute the objective for the given batch size from a given point. - arma::Row respD = arma::conv_to>::from( - responses.subvec(begin, begin + batchSize - 1)); - const ElemType result = arma::accu(arma::log(1.0 - respD + sigmoid % - (2 * respD - 1.0))); + const ElemType result = accu(log( + 1 - responses.subvec(begin, begin + batchSize - 1) + + sigmoid % (2 * responses.subvec(begin, begin + batchSize - 1) - 1))); // Invert the result, because it's a minimization. 
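The batched `Evaluate()` in the hunk above weights the regularizer by `halfLambda * (batchSize / n_cols)`, so that the per-batch penalties sum to the full-dataset penalty. A small stdlib-only sketch of just that weighting (the function name is hypothetical, for illustration only):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Per-batch share of the L2 penalty: a batch of size b out of N total
// points contributes (lambda / 2) * (b / N) * ||w||^2, so summing the
// contributions of all batches reproduces (lambda / 2) * ||w||^2.
double batchRegularization(double lambda, double normSquaredW,
                           std::size_t batchSize, std::size_t numPoints)
{
  const double halfLambda = lambda / 2.0;
  return halfLambda *
      (static_cast<double>(batchSize) / static_cast<double>(numPoints)) *
      normSquaredW;
}
```

With this scaling, four disjoint batches of 25 points out of 100 contribute exactly the same total penalty as one pass over all 100 points.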
return regularization - result; @@ -166,16 +192,14 @@ void LogisticRegressionFunction::Gradient( const MatType& parameters, GradType& gradient) const { - typedef typename MatType::elem_type ElemType; // Regularization term. - MatType regularization; - regularization = lambda * parameters.tail_cols(parameters.n_elem - 1); + MatType regularization = lambda * parameters.tail_cols(parameters.n_elem - 1); - const arma::Row sigmoids = (1 / (1 + arma::exp(-parameters(0, 0) + const BaseRowType sigmoids = (1 / (1 + exp(-parameters(0, 0) - parameters.tail_cols(parameters.n_elem - 1) * predictors))); - gradient.set_size(arma::size(parameters)); - gradient[0] = -arma::accu(responses - sigmoids); + gradient.set_size(size(parameters)); + gradient[0] = -accu(responses - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - responses) * predictors.t() + regularization; } @@ -185,26 +209,24 @@ void LogisticRegressionFunction::Gradient( template template void LogisticRegressionFunction::Gradient( - const MatType& parameters, - const size_t begin, - GradType& gradient, - const size_t batchSize) const + const MatType& parameters, + const size_t begin, + GradType& gradient, + const size_t batchSize) const { - typedef typename MatType::elem_type ElemType; - // Regularization term. - MatType regularization; - regularization = lambda * parameters.tail_cols(parameters.n_elem - 1) + MatType regularization = lambda * parameters.tail_cols(parameters.n_elem - 1) / predictors.n_cols * batchSize; - const arma::Row exponents = parameters(0, 0) + + const BaseRowType exponents = parameters(0, 0) + parameters.tail_cols(parameters.n_elem - 1) * predictors.cols(begin, begin + batchSize - 1); + // Calculating the sigmoid function values. 
- const arma::Row sigmoids = 1.0 / (1.0 + arma::exp(-exponents)); + const BaseRowType sigmoids = 1 / (1 + exp(-exponents)); gradient.set_size(parameters.n_rows, parameters.n_cols); - gradient[0] = -arma::accu(responses.subvec(begin, begin + batchSize - 1) - + gradient[0] = -accu(responses.subvec(begin, begin + batchSize - 1) - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - responses.subvec(begin, begin + batchSize - 1)) * @@ -215,27 +237,26 @@ void LogisticRegressionFunction::Gradient( * Evaluate the partial gradient of the logistic regression objective * function with respect to the individual features in the parameter. */ -template +template +template void LogisticRegressionFunction::PartialGradient( const MatType& parameters, const size_t j, - arma::sp_mat& gradient) const + GradType& gradient) const { - const arma::Row diffs = responses - - (1 / (1 + arma::exp(-parameters(0, 0) - - parameters.tail_cols(parameters.n_elem - 1) * - predictors))); + const BaseRowType diffs = responses - (1 / (1 + exp(-parameters(0, 0) - + parameters.tail_cols(parameters.n_elem - 1) * predictors))); - gradient.set_size(arma::size(parameters)); + gradient.set_size(size(parameters)); if (j == 0) { - gradient[j] = -arma::accu(diffs); + gradient[j] = -accu(diffs); } else { - gradient[j] = arma::dot(-predictors.row(j - 1), diffs) + lambda * - parameters(0, j); + gradient[j] = dot(-predictors.row(j - 1), diffs) + lambda * + parameters(0, j); } } @@ -246,30 +267,24 @@ LogisticRegressionFunction::EvaluateWithGradient( const MatType& parameters, GradType& gradient) const { - typedef typename MatType::elem_type ElemType; - // Regularization term. 
- MatType regularization = lambda * - parameters.tail_cols(parameters.n_elem - 1); + MatType regularization = lambda * parameters.tail_cols(parameters.n_elem - 1); - const ElemType objectiveRegularization = lambda / 2.0 * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType objectiveRegularization = halfLambda * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate the sigmoid function values. - const arma::Row sigmoids = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * predictors))); + const BaseRowType sigmoids = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * predictors))); - gradient.set_size(arma::size(parameters)); - gradient[0] = -arma::accu(responses - sigmoids); + gradient.set_size(size(parameters)); + gradient[0] = -accu(responses - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - responses) * predictors.t() + regularization; // Now compute the objective function using the sigmoids. - ElemType result = arma::accu(arma::log(1.0 - - arma::conv_to>::from(responses) + sigmoids % - (2 * arma::conv_to>::from(responses) - 1.0))); + ElemType result = accu(log(1 - responses + sigmoids % (2 * responses - 1))); // Invert the result, because it's a minimization. return objectiveRegularization - result; @@ -284,65 +299,64 @@ LogisticRegressionFunction::EvaluateWithGradient( GradType& gradient, const size_t batchSize) const { - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::brow BaseRowType; // Regularization term. 
- MatType regularization = - lambda * parameters.tail_cols(parameters.n_elem - 1) / predictors.n_cols * + MatType regularization = lambda * + parameters.tail_cols(parameters.n_elem - 1) / predictors.n_cols * batchSize; - const ElemType objectiveRegularization = lambda * - (batchSize / (2.0 * predictors.n_cols)) * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType objectiveRegularization = halfLambda * + (batchSize / ElemType(predictors.n_cols)) * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate the sigmoid function values. - const arma::Row sigmoids = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * - predictors.cols(begin, begin + batchSize - 1)))); + const BaseRowType sigmoids = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * + predictors.cols(begin, begin + batchSize - 1)))); gradient.set_size(parameters.n_rows, parameters.n_cols); - gradient[0] = -arma::accu(responses.subvec(begin, begin + batchSize - 1) - + gradient[0] = -accu(responses.subvec(begin, begin + batchSize - 1) - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - - responses.subvec(begin, begin + batchSize - 1)) * + responses.cols(begin, begin + batchSize - 1)) * predictors.cols(begin, begin + batchSize - 1).t() + regularization; // Now compute the objective function using the sigmoids. - arma::Row respD = arma::conv_to>::from( - responses.subvec(begin, begin + batchSize - 1)); - const ElemType result = arma::accu(arma::log(1.0 - respD + sigmoids % - (2 * respD - 1.0))); + const ElemType result = accu(log( + 1 - responses.subvec(begin, begin + batchSize - 1) + + sigmoids % (2 * responses.subvec(begin, begin + batchSize - 1) - 1))); // Invert the result, because it's a minimization. 
return objectiveRegularization - result; } template +template void LogisticRegressionFunction::Classify( const MatType& dataset, - arma::Row& labels, + LabelsType& labels, const MatType& parameters, const double decisionBoundary) const { - // Calculate sigmoid function for each point. The (1.0 - decisionBoundary) + // Calculate sigmoid function for each point. The (1 - decisionBoundary) // term correctly sets an offset so that floor() returns 0 or 1 correctly. - labels = arma::conv_to>::from((1.0 / - (1.0 + arma::exp(-parameters(0) - + labels = conv_to::from((1 / (1 + exp(-parameters(0) - parameters.tail_cols(parameters.n_elem - 1) * dataset))) + - (1.0 - decisionBoundary)); + ElemType(1 - decisionBoundary)); } template +template double LogisticRegressionFunction::ComputeAccuracy( const MatType& predictors, - const arma::Row& responses, + const LabelsType& responses, const MatType& parameters, const double decisionBoundary) const { // Predict responses using the current model. - arma::Row tempResponses; + LabelsType tempResponses; Classify(predictors, tempResponses, parameters, decisionBoundary); // Count the number of responses that were correct. diff --git a/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp index 0b2f133..884df10 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp @@ -23,8 +23,8 @@ namespace test { * \f[ * x_M = [x_i, n - M + 1 <= i <= n] * g(x) = \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - * - * f_1(x) = 1 - x_1 * x_2 * ... x_M-1 * (1 + g(x_M)) + * + * f_1(x) = 1 - x_1 * x_2 * ... x_M-1 * (1 + g(x_M)) * f_2(x) = 1 - x_1 * x_2 * ... (1 - x_M-1) * (1 + g(x_M)) * . * . @@ -50,139 +50,136 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF1 +template +class MAF1 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). 
+ size_t numObjectives {3}; + size_t numVariables {12}; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + */ + MAF1() : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/* Nothing to do here */} + + // Get the private variables. + size_t GetNumObjectives() { return numObjectives; } + + size_t GetNumVariables() { return numVariables; } + + // Get the starting point. + arma::Col GetInitialPoint() + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, arma::fill::ones); + } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Row g(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Row innerSum(size(coords)[1], arma::fill::zeros); + + for (size_t i = numObjectives - 1;i < numVariables;i++) + { + innerSum += arma::pow((coords.row(i) - 0.5), 2); + } + + return innerSum; + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1]); + arma::Row G = g(coords); + arma::Row value(coords.n_cols, arma::fill::ones); + for (size_t i = 0;i < numObjectives - 1;i++) + { + objectives.row(i) = (1 - value % (1.0 - coords.row(i))) % (1. + G); + value = value % coords.row(i); + } + objectives.row(numObjectives - 1) = (1 - value) % (1. + G); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF1Objective { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). 
- size_t numObjectives {3}; - size_t numVariables {12}; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - */ - MAF1() : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/* Nothing to do here */} - - // Get the private variables. - size_t GetNumObjectives() - { return this -> numObjectives; } - - size_t GetNumVariables() - { return this -> numVariables; } - - // Get the starting point. - arma::Col GetInitialPoint() + MAF1Objective(size_t stop, MAF1& maf): maf(maf), stop(stop) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + if (stop == 0) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, arma::fill::ones); + return coords[0] * (1. + maf.g(coords)[0]); } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1;i < numVariables;i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2); - } - - return innerSum; - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Row G = g(coords); - arma::Row value(coords.n_cols, arma::fill::ones); - for (size_t i = 0;i < numObjectives - 1;i++) - { - objectives.row(i) = (1 - value % (1.0 - coords.row(i))) % (1. + G); - value = value % coords.row(i); - } - objectives.row(numObjectives - 1) = (1 - value) % (1. + G); - return objectives; + value = value * coords[i]; } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF1Objective - { - MAF1Objective(size_t stop, MAF1& maf): stop(stop), maf(maf) - {/* Nothing to do here. */} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. - if(stop == 0) - { - return coords[0] * (1. + maf.g(coords)[0]); - } - typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * coords[i]; - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * (1. - coords[stop]); - } - - value = (1.0 - value) * (1. + maf.g(coords)[0]); - return value; - } - - MAF1& maf; - size_t stop; - }; - - //! Get objective functions. - std::tuple GetObjectives() + + if(stop != maf.GetNumObjectives() - 1) { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + value = value * (1. - coords[stop]); } - MAF1Objective objectiveF1; - MAF1Objective objectiveF2; - MAF1Objective objectiveF3; + value = (1.0 - value) * (1. + maf.g(coords)[0]); + return value; + } + + MAF1& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + //! Get objective functions. 
+ std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF1Objective objectiveF1; + MAF1Objective objectiveF2; + MAF1Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp index 697b26b..9140f9a 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp @@ -9,7 +9,6 @@ * the 3-clause BSD license along with ensmallen. If not, see * http://www.opensource.org/licenses/BSD-3-Clause for more information. */ - #ifndef ENSMALLEN_PROBLEMS_MAF_TWO_FUNCTION_HPP #define ENSMALLEN_PROBLEMS_MAF_TWO_FUNCTION_HPP @@ -22,8 +21,8 @@ namespace test { * theta_M = [theta_i, n - M + 1 <= i <= n] * g_i(x) = \Sigma{i = M + (i - 1) * (n - M + 1) / N}^ * {M - 1 + (i) * (n - M + 1) / N} (x_i - 0.5)^2 * 0.25 - * - * f_1(x) = cos(theta_1) * cos(theta_2) * ... cos(theta_M-1) * (1 + g_1(theta_M)) + * + * f_1(x) = cos(theta_1) * cos(theta_2) * ... cos(theta_M-1) * (1 + g_1(theta_M)) * f_2(x) = cos(theta_1) * cos(theta_2) * ... sin(theta_M-1) * (1 + g_2(theta_M)) * . * . @@ -32,12 +31,12 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * Where theta_i = 0.5 * (1 + 2 * g(X_M) * x_i) / (1 + g(X_M)) - * - * + * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -49,156 +48,154 @@ namespace test { * publisher={Springer} * } * @endcode - * + * * @tparam MatType Type of matrix to optimize. */ - template - class MAF2 +template +class MAF2 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + size_t numParetoPoints; + + public: + /** + * Object Constructor. 
+ * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + */ + MAF2() : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - size_t numParetoPoints; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - */ - MAF2() : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables; } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Mat g(const MatType& coords) - { - size_t k = numVariables - numObjectives + 1; - size_t c = std::floor(k / numObjectives); - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat innerSum(numObjectives, size(coords)[1], - arma::fill::zeros); - - for (size_t i = 0; i < numObjectives; i++) - { - size_t j = numObjectives - 1 + (i * c); - for(; j < numVariables - 1 + (i + 1) *c && j < numObjectives; j++) - { - innerSum.row(i) += arma::pow((coords.row(i) - 0.5), 2) * 0.25; - } - } - - return innerSum; - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. 
- * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Mat g(const MatType& coords) + { + size_t k = numVariables - numObjectives + 1; + size_t c = std::floor(k / numObjectives); + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat innerSum(numObjectives, size(coords)[1], + arma::fill::zeros); + + for (size_t i = 0; i < numObjectives; i++) + { + size_t j = numObjectives - 1 + (i * c); + for(; j < numVariables - 1 + (i + 1) *c && j < numObjectives; j++) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Mat G = g(coords); - arma::Row value(size(coords)[1], arma::fill::ones); - arma::Row theta; - for (size_t i = 0; i < numObjectives - 1; i++) - { - theta = arma::datum::pi * 0.5 * ((coords.row(i) / 2) + 0.25); - objectives.row(i) = value % - arma::sin(theta) % (1.0 + G.row(numObjectives - 1 - i)); - value = value % arma::cos(theta); - } - objectives.row(numObjectives - 1) = value % - (1.0 + G.row(0)); - return objectives; + innerSum.row(i) += arma::pow((coords.row(i) - 0.5), 2) * 0.25; } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF2Objective + } + + return innerSum; + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. 
+ typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1]); + arma::Mat G = g(coords); + arma::Row value(size(coords)[1], arma::fill::ones); + arma::Row theta; + for (size_t i = 0; i < numObjectives - 1; i++) + { + theta = arma::datum::pi * 0.5 * ((coords.row(i) / 2) + 0.25); + objectives.row(i) = value % + arma::sin(theta) % (1.0 + G.row(numObjectives - 1 - i)); + value = value % arma::cos(theta); + } + objectives.row(numObjectives - 1) = value % + (1.0 + G.row(0)); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF2Objective + { + MAF2Objective(size_t stop, MAF2& maf): stop(stop), maf(maf) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + ElemType theta; + arma::Col G = maf.g(coords).col(0); + for (size_t i = 0; i < stop; i++) { - MAF2Objective(size_t stop, MAF2& maf): stop(stop), maf(maf) - {/* Nothing to do here. */} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - ElemType theta; - arma::Col G = maf.g(coords).col(0); - for (size_t i = 0; i < stop; i++) - { - theta = arma::datum::pi * 0.5 * ((coords[i] / 2) + 0.25); - value = value * std::cos(theta); - } - theta = arma::datum::pi * 0.5 * ((coords[stop] / 2) + 0.25); - if(stop != maf.numObjectives - 1) - { - value = value * std::sin(theta); - } - - value = value * (1.0 + G[maf.GetNumObjectives() - 1 - stop]); - return value; - } - - MAF2& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() + theta = arma::datum::pi * 0.5 * ((coords[i] / 2) + 0.25); + value = value * std::cos(theta); + } + + theta = arma::datum::pi * 0.5 * ((coords[stop] / 2) + 0.25); + if (stop != maf.numObjectives - 1) { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + value = value * std::sin(theta); } - MAF2Objective objectiveF1; - MAF2Objective objectiveF2; - MAF2Objective objectiveF3; + value = value * (1.0 + G[maf.GetNumObjectives() - 1 - stop]); + return value; + } + + MAF2& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + // Return back a tuple of objective functions. 
+ std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF2Objective objectiveF1; + MAF2Objective objectiveF2; + MAF2Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp index 4ea0963..af68a27 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp @@ -22,9 +22,9 @@ namespace test { * The MAF3 function, defined by: * \f[ * x_M = [x_i, n - M + 1 <= i <= n] - * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * + * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * * (x_i - 0.5))] - * + * * f_1(x) = (cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... cos(x_2 * pi * 0.5) * (1 + g(x_M)))^4 * f_2(x) = (cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... sin(x_M-1 * pi * 0.5) * (1 + g(x_M)))^4 * . @@ -34,9 +34,9 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -51,146 +51,146 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF3 +template +class MAF3 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 12, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + */ + MAF3() : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. 
of Objectives and Variables(|x| = 12, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - */ - MAF3() : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables;} - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) - { - size_t k = numVariables - numObjectives + 1; - - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1; i < numVariables; i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2) - - arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5)); - } - - return 100 * (k + innerSum); - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + /** + * Evaluate the G(x) with the given coordinate. 
+ * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Row g(const MatType& coords) + { + size_t k = numVariables - numObjectives + 1; + + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Row innerSum(size(coords)[1], arma::fill::zeros); + + for (size_t i = numObjectives - 1; i < numVariables; i++) + { + innerSum += arma::pow((coords.row(i) - 0.5), 2) - + arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5)); + } + + return 100 * (k + innerSum); + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1], + arma::fill::ones); + arma::Row G = g(coords); + arma::Row value = (1.0 + G); + for (size_t i = 0; i < numObjectives - 1; i++) + { + objectives.row(i) = arma::pow(value, i == 0 ? 2:4) % + arma::pow(arma::sin(coords.row(i) * arma::datum::pi * 0.5), + i == 0 ? 2:4); + value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5); + } + objectives.row(numObjectives - 1) = arma::pow(value, 4); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF3Objective + { + MAF3Objective(size_t stop, MAF3& maf): maf(maf), stop(stop) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1], arma::fill::ones); - arma::Row G = g(coords); - arma::Row value = (1.0 + G); - for (size_t i = 0; i < numObjectives - 1; i++) - { - objectives.row(i) = arma::pow(value, i == 0 ? 2:4) % - arma::pow(arma::sin(coords.row(i) * arma::datum::pi * 0.5), i == 0 ? 2:4); - value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5); - } - objectives.row(numObjectives - 1) = arma::pow(value, 4); - return objectives; + value = value * std::cos(coords[i] * arma::datum::pi * 0.5); } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF3Objective + + if (stop != maf.GetNumObjectives() - 1) { - MAF3Objective(size_t stop, MAF3& maf): stop(stop), maf(maf) - {/* Nothing to do here. */} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * std::cos(coords[i] * arma::datum::pi * 0.5); - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); - } - - value = value * (1. + maf.g(coords)[0]); - - if(stop == 0) { - return std::pow(value, 2); - } - return std::pow(value, 4); - } - - MAF3& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() + value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); + } + + value = value * (1. 
+ maf.g(coords)[0]); + + if (stop == 0) { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); - } + return std::pow(value, 2); + } + return std::pow(value, 4); + } - MAF3Objective objectiveF1; - MAF3Objective objectiveF2; - MAF3Objective objectiveF3; + MAF3& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + // Return back a tuple of objective functions. + std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF3Objective objectiveF1; + MAF3Objective objectiveF2; + MAF3Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp index 9258468..aa4bcd1 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp @@ -22,10 +22,10 @@ namespace test { * The MAF4 function, defined by: * \f[ * x_M = [x_i, n - M + 1 <= i <= n] - * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * + * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * * (x_i - 0.5))] - * - * f_1(x) = a * (1 - cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... cos(x_2 * pi * 0.5))* (1 + g(x_M)) + * + * f_1(x) = a * (1 - cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... cos(x_2 * pi * 0.5))* (1 + g(x_M)) * f_2(x) = a^2 * (1 - cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... sin(x_M-1 * pi * 0.5)) * (1 + g(x_M)) * . * . @@ -34,9 +34,9 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -51,161 +51,156 @@ namespace test { * * @tparam MatType Type of matrix to optimize. 
*/ - template - class MAF4 +template +class MAF4 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + double a; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + * @param a The scale factor of the objectives. + */ + MAF4(double a = 2) : + a(a), + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - double a; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - * @param a The scale factor of the objectives. - */ - MAF4(double a = 2) : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this), - a(a) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables;} - - //Get the scaling parameter a. - size_t GetA() - { return this -> a; } - - /** - * Set the scale factor of the objectives. - * - * @param a The scale factor a of the objectives. - */ - void SetA(double a) - { this -> a = a; } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) + // Convenience typedef. 
+    typedef typename MatType::elem_type ElemType;
+    return arma::Col(numVariables, 1, arma::fill::zeros);
+  }
+
+  // Get the private variables.
+
+  // Get the number of objectives.
+  size_t GetNumObjectives() { return numObjectives; }
+
+  // Get the number of variables.
+  size_t GetNumVariables() { return numVariables; }
+
+  // Get the scaling parameter a.
+  double GetA() { return a; }
+
+  /**
+   * Set the scale factor of the objectives.
+   *
+   * @param a The scale factor a of the objectives.
+   */
+  void SetA(double a) { this->a = a; }
+
+  /**
+   * Evaluate the G(x) with the given coordinate.
+   *
+   * @param coords The function coordinates.
+   * @return arma::Row
+   */
+  arma::Row g(const MatType& coords)
+  {
+    size_t k = numVariables - numObjectives + 1;
+
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Row innerSum(size(coords)[1], arma::fill::zeros);
+
+    for (size_t i = numObjectives - 1; i < numVariables; i++)
+    {
+      innerSum += arma::pow((coords.row(i) - 0.5), 2) -
+          arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5));
+    }
+
+    return 100 * (k + innerSum);
+  }
+
+  /**
+   * Evaluate the objectives with the given coordinate.
+   *
+   * @param coords The function coordinates.
+   * @return arma::Mat
+   */
+  arma::Mat Evaluate(const MatType& coords)
+  {
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Mat objectives(numObjectives, size(coords)[1]);
+    arma::Row G = g(coords);
+    arma::Row value(coords.n_cols, arma::fill::ones);
+    for (size_t i = 0; i < numObjectives - 1; i++)
+    {
+      objectives.row(i) = (1.0 - value %
+          arma::sin(coords.row(i) * arma::datum::pi * 0.5)) % (1. + G) *
+          std::pow(a, numObjectives - i);
+      value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5);
+    }
+    objectives.row(numObjectives - 1) = (1 - value) % (1. + G) *
+        std::pow(a, 1);
+    return objectives;
+  }
+
+  // Individual Objective function.
+  // Changes based on stop variable provided.
+ struct MAF4Objective + { + MAF4Objective(size_t stop, MAF4& maf): maf(maf), stop(stop) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { - size_t k = numVariables - numObjectives + 1; - - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1; i < numVariables; i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2) - - arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5)); - } - - return 100 * (k + innerSum); + value = value * std::cos(coords[i] * arma::datum::pi * 0.5); } - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + if(stop != maf.GetNumObjectives() - 1) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Row G = g(coords); - arma::Row value(coords.n_cols, arma::fill::ones); - for (size_t i = 0; i < numObjectives - 1; i++) - { - objectives.row(i) = (1.0 - value % - arma::sin(coords.row(i) * arma::datum::pi * 0.5)) % (1. + G) * - std::pow(a, numObjectives - i); - value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5); - } - objectives.row(numObjectives - 1) = (1 - value) % (1. + G) * - std::pow(a, 1); - return objectives; + value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF4Objective - { - MAF4Objective(size_t stop, MAF4& maf): stop(stop), maf(maf) - {/* Nothing to do here. 
*/} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * std::cos(coords[i] * arma::datum::pi * 0.5); - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); - } - - value = std::pow(maf.GetA(), maf.GetNumObjectives() - stop) * - (1 - value) * (1. + maf.g(coords)[0]); - - return value; - } - - MAF4& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() - { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); - } - MAF4Objective objectiveF1; - MAF4Objective objectiveF2; - MAF4Objective objectiveF3; + value = std::pow(maf.GetA(), maf.GetNumObjectives() - stop) * + (1 - value) * (1. + maf.g(coords)[0]); + + return value; + } + + MAF4& maf; + size_t stop; }; - } //namespace test - } //namespace ens + + // Return back a tuple of objective functions. + std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF4Objective objectiveF1; + MAF4Objective objectiveF2; + MAF4Objective objectiveF3; +}; + +} // namespace test +} // namespace ens #endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp index db8c16f..bb1b8c3 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp @@ -23,8 +23,8 @@ namespace test { * \f[ * x_M = [x_i, n - M + 1 <= i <= n] * g(x) = \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - * - * f_1(x) = a^M * cos(x_1^alpha * pi * 0.5) * cos(x_2^alpha * pi * 0.5) * ... 
cos(x_2^alpha * pi * 0.5) * (1 + g(x_M)) + * + * f_1(x) = a^M * cos(x_1^alpha * pi * 0.5) * cos(x_2^alpha * pi * 0.5) * ... cos(x_2^alpha * pi * 0.5) * (1 + g(x_M)) * f_2(x) = a^M-1 * cos(x_1^alpha * pi * 0.5) * cos(x_2^alpha * pi * 0.5) * ... sin(x_M-1^alpha * pi * 0.5) * (1 + g(x_M)) * . * . @@ -35,9 +35,9 @@ namespace test { * 0 <= x_i <= 1 for i = 1,...,n. * * This should be optimized to x_i = 0.5 (for all x_i in x_M), at: - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -52,175 +52,169 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF5 +template +class MAF5 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + size_t alpha; + size_t a; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param alpha The power which each variable is raised to. + * @param numParetoPoint No. of pareto points in the reference front. + * @param a The scale factor of the objectives. + */ + MAF5(size_t alpha = 100, double a = 2) : + alpha(alpha), + a(a), + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - size_t alpha; - size_t a; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param alpha The power which each variable is raised to. - * @param numParetoPoint No. of pareto points in the reference front. - * @param a The scale factor of the objectives. 
- */ - MAF5(size_t alpha = 100, double a = 2) : - alpha(alpha), - a(a), - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables; } - - // Get the scale factor a. - double GetA() - { return this -> a; } - - // Get the power alpha of each variable. - size_t GetAlpha() - { return this -> alpha; } - - /** - * Set the scale factor a. - * - * @param a The scale factor of the objectives. - */ - void SetA(double a) - { this -> a = a; } - - /** - * Set the power of each variable alpha. - * - * @param alpha The power of each variable. - */ - void SetAlpha(size_t alpha) - { this -> alpha = alpha; } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + // Get the scale factor a. + double GetA() { return a; } + + // Get the power alpha of each variable. + size_t GetAlpha() { return alpha; } + + /** + * Set the scale factor a. + * + * @param a The scale factor of the objectives. + */ + void SetA(double a) { this->a = a; } + + /** + * Set the power of each variable alpha. + * + * @param alpha The power of each variable. 
+ */ + void SetAlpha(size_t alpha) { this->alpha = alpha; } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Row g(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Row innerSum(size(coords)[1], arma::fill::zeros); + + for (size_t i = numObjectives - 1; i < numVariables; i++) + { + innerSum += arma::pow((coords.row(i) - 0.5), 2); + } + + return innerSum; + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1]); + arma::Row G = g(coords); + arma::Row value = (1.0 + G); + for (size_t i = 0; i < numObjectives - 1; i++) + { + objectives.row(i) = std::pow(a, i + 1) * arma::pow(value, 4) % + arma::pow(arma::sin(arma::pow(coords.row(i), alpha) * + arma::datum::pi * 0.5), 4); + value = value % arma::cos(arma::pow(coords.row(i), alpha) * + arma::datum::pi * 0.5); + } + objectives.row(numObjectives - 1) = arma::pow(value, 4) * std::pow(a, + numObjectives); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF5Objective + { + MAF5Objective(size_t stop, MAF5& maf): stop(stop), maf(maf) + {/* Nothing to do here.*/} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { + value = value * std::cos(std::pow(coords[i], maf.GetAlpha()) + * arma::datum::pi * 0.5); + } - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1; i < numVariables; i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2); - } - - return innerSum; - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + if (stop != maf.GetNumObjectives() - 1) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Row G = g(coords); - arma::Row value = (1.0 + G); - for (size_t i = 0; i < numObjectives - 1; i++) - { - objectives.row(i) = std::pow(a, i + 1) * arma::pow(value, 4) % - arma::pow(arma::sin(arma::pow(coords.row(i), alpha) * - arma::datum::pi * 0.5), 4); - value = value % arma::cos(arma::pow(coords.row(i), alpha) * arma::datum::pi * 0.5); - } - objectives.row(numObjectives - 1) = arma::pow(value, 4) * std::pow(a, numObjectives); - return objectives; + value = value * std::sin(std::pow(coords[stop], maf.GetAlpha()) + * arma::datum::pi * 0.5); } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF5Objective - { - MAF5Objective(size_t stop, MAF5& maf): stop(stop), maf(maf) - {/* Nothing to do here.*/} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * std::cos(std::pow(coords[i], maf.GetAlpha()) - * arma::datum::pi * 0.5); - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * std::sin(std::pow(coords[stop], maf.GetAlpha()) - * arma::datum::pi * 0.5); - } - - value = value * (1 + maf.g(coords)[0]); - value = std::pow(value, 4); - value = value * std::pow(maf.GetA(), stop + 1); - return value; - } - - MAF5& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() - { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); - } - MAF5Objective objectiveF1; - MAF5Objective objectiveF2; - MAF5Objective objectiveF3; + value = value * (1 + maf.g(coords)[0]); + value = std::pow(value, 4); + value = value * std::pow(maf.GetA(), stop + 1); + return value; + } + + MAF5& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + // Return back a tuple of objective functions. + std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF5Objective objectiveF1; + MAF5Objective objectiveF2; + MAF5Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp index 2ace864..55de247 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp @@ -21,8 +21,8 @@ namespace test { * \f[ * theta_M = [theta_i, n - M + 1 <= i <= n] * g(x) = \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - * - * f_1(x) = 0.5 * cos(theta_1 * pi * 0.5) * cos(theta_2 * pi * 0.5) * ... cos(theta_2 * pi * 0.5) * (1 + g(theta_M)) + * + * f_1(x) = 0.5 * cos(theta_1 * pi * 0.5) * cos(theta_2 * pi * 0.5) * ... 
cos(theta_2 * pi * 0.5) * (1 + g(theta_M)) * f_2(x) = 0.5 * cos(theta_1 * pi * 0.5) * cos(theta_2 * pi * 0.5) * ... sin(theta_M-1 * pi * 0.5) * (1 + g(theta_M)) * . * . @@ -31,13 +31,13 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * Where theta_i = 0.5 * (1 + 2 * g(X_M) * x_i) / (1 + g(X_M)) - * + * * This should be optimized to x_i = 0.5 (for all x_i in X_M), at: - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -52,183 +52,177 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF6 +template +class MAF6 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + size_t I; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + * @param I The manifold dimension (zero indexed). + */ + MAF6(size_t I = 2) : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this), + I(I) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + // Get the manifold dimension. + size_t GetI() { return I; } + + /** + * Set the no. of pareto points. + * + * @param I The manifold dimension (0 indexed). + */ + void SetI(size_t I) { this->I = I; } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. 
+   * @return arma::Row<typename MatType::elem_type>
+   */
+  arma::Row<typename MatType::elem_type> g(const MatType& coords)
   {
- private:
-
-  // A fixed no. of Objectives and Variables(|x| = 7, M = 3).
-  size_t numObjectives {3};
-  size_t numVariables {12};
-  size_t I;
-
- public:
-
-  /**
-   * Object Constructor.
-   * Initializes the individual objective functions.
-   *
-   * @param numParetoPoint No. of pareto points in the reference front.
-   * @param I The manifold dimension (zero indexed).
-   */
-  MAF6(size_t I = 2) :
-      objectiveF1(0, *this),
-      objectiveF2(1, *this),
-      objectiveF3(2, *this),
-      I(I)
-  {/*Nothing to do here.*/}
-
-  //! Get the starting point.
-  arma::Col<typename MatType::elem_type> GetInitialPoint()
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Row<ElemType> innerSum(size(coords)[1], arma::fill::zeros);
+
+    for (size_t i = numObjectives - 1; i < numVariables; i++)
+    {
+      innerSum += arma::pow((coords.row(i) - 0.5), 2);
+    }
+
+    return innerSum;
+  }
+
+  /**
+   * Evaluate the objectives with the given coordinate.
+   *
+   * @param coords The function coordinates.
+   * @return arma::Mat<typename MatType::elem_type>
+   */
+  arma::Mat<typename MatType::elem_type> Evaluate(const MatType& coords)
+  {
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Mat<ElemType> objectives(numObjectives, size(coords)[1]);
+    arma::Row<ElemType> G = g(coords);
+    arma::Row<ElemType> value = (1.0 + 100 * G);
+    arma::Row<ElemType> theta;
+    for (size_t i = 0; i < numObjectives - 1; i++)
+    {
+      if (i < I)
   {
-    // Convenience typedef.
-    typedef typename MatType::elem_type ElemType;
-    return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros);
+      theta = coords.row(i) * arma::datum::pi * 0.5;
   }
-
-  // Get the private variables.
-
-  // Get the number of objectives.
-  size_t GetNumObjectives()
-  { return this -> numObjectives; }
-
-  // Get the number of variables.
-  size_t GetNumVariables()
-  { return this -> numVariables; }
-
-  // Get the manifold dimension.
-  size_t GetI()
-  { return this -> I; }
-
-  /**
-   * Set the no. of pareto points.
-   *
-   * @param I The manifold dimension (0 indexed).
-   */
-  void SetI(size_t I)
-  { this -> I = I; }
-
-  /**
-   * Evaluate the G(x) with the given coordinate.
-   *
-   * @param coords The function coordinates.
-   * @return arma::Row<typename MatType::elem_type>
-   */
-  arma::Row<typename MatType::elem_type> g(const MatType& coords)
+      else
   {
-
-    // Convenience typedef.
-    typedef typename MatType::elem_type ElemType;
-
-    arma::Row<ElemType> innerSum(size(coords)[1], arma::fill::zeros);
-
-    for (size_t i = numObjectives - 1; i < numVariables; i++)
-    {
-      innerSum += arma::pow((coords.row(i) - 0.5), 2);
-    }
-
-    return innerSum;
-  }
-
-  /**
-   * Evaluate the objectives with the given coordinate.
-   *
-   * @param coords The function coordinates.
-   * @return arma::Mat<typename MatType::elem_type>
-   */
-  arma::Mat<typename MatType::elem_type> Evaluate(const MatType& coords)
+        theta = 0.25 * (1.0 + 2.0 * coords.row(i) % G) / (1.0 + G);
+      }
+      objectives.row(i) = value %
+          arma::sin(theta);
+      value = value % arma::cos(theta);
+    }
+    objectives.row(numObjectives - 1) = value;
+    return objectives;
+  }
+
+  // Individual Objective function.
+  // Changes based on stop variable provided.
+  struct MAF6Objective
+  {
+    MAF6Objective(size_t stop, MAF6& maf): stop(stop), maf(maf)
+    {/* Nothing to do here. */}
+
+    /**
+     * Evaluate one objective with the given coordinate.
+     *
+     * @param coords The function coordinates.
+     * @return arma::Col
+     */
+    typename MatType::elem_type Evaluate(const MatType& coords)
+    {
+      // Convenience typedef.
+      typedef typename MatType::elem_type ElemType;
+      ElemType value = 1.0;
+      ElemType theta;
+      ElemType G = maf.g(coords)[0];
+      for (size_t i = 0; i < stop; i++)
   {
-    // Convenience typedef.
-    typedef typename MatType::elem_type ElemType;
-
-    arma::Mat<ElemType> objectives(numObjectives, size(coords)[1]);
-    arma::Row<ElemType> G = g(coords);
-    arma::Row<ElemType> value = (1.0 + 100 * G);
-    arma::Row<ElemType> theta;
-    for (size_t i = 0; i < numObjectives - 1; i++)
+        if (i < maf.GetI())
+        {
+          theta = arma::datum::pi * coords[i] * 0.5;
+        }
+        else
     {
-      if(i < I)
-      {
-        theta = coords.row(i) * arma::datum::pi * 0.5;
-      }
-      else
-      {
-        theta = 0.25 * (1.0 + 2.0 * coords.row(i) % G) / (1.0 + G);
-      }
-      objectives.row(i) = value %
-          arma::sin(theta);
-      value = value % arma::cos(theta);
+          theta = 0.25 * (1.0 + 2.0 * coords[i] * G) / (1.0 + G);
     }
-    objectives.row(numObjectives - 1) = value;
-    return objectives;
+        value = value * std::cos(theta);
   }
-
-  // Individual Objective function.
-  // Changes based on stop variable provided.
-  struct MAF6Objective
+
+      if (stop < maf.GetI())
   {
-    MAF6Objective(size_t stop, MAF6& maf): stop(stop), maf(maf)
-    {/* Nothing to do here. */}
-
-    /**
-     * Evaluate one objective with the given coordinate.
-     *
-     * @param coords The function coordinates.
-     * @return arma::Col
-     */
-    typename MatType::elem_type Evaluate(const MatType& coords)
-    {
-      // Convenience typedef.
-      typedef typename MatType::elem_type ElemType;
-      ElemType value = 1.0;
-      ElemType theta;
-      ElemType G = maf.g(coords)[0];
-      for (size_t i = 0; i < stop; i++)
-      {
-        if(i < maf.GetI())
-        {
-          theta = arma::datum::pi * coords[i] * 0.5;
-        }
-        else
-        {
-          theta = 0.25 * (1.0 + 2.0 * coords[i] * G) / (1.0 + G);
-        }
-        value = value * std::cos(theta);
-      }
-
-      if(stop < maf.GetI())
-      {
-        theta = arma::datum::pi * coords[stop] * 0.5;
-      }
-      else
-      {
-        theta = 0.25 * (1.0 + 2.0 * coords[stop] * G) / (1.0 + G);
-      }
-
-      if (stop != maf.GetNumObjectives() - 1)
-      {
-        value = value * std::sin(theta);
-      }
-
-      value = value * (1.0 + 100 * G);
-      return value;
-    }
-
-    MAF6& maf;
-    size_t stop;
-  };
-
-  // Return back a tuple of objective functions.
-  std::tuple<MAF6Objective, MAF6Objective, MAF6Objective> GetObjectives()
+        theta = arma::datum::pi * coords[stop] * 0.5;
+      }
+      else
+      {
+        theta = 0.25 * (1.0 + 2.0 * coords[stop] * G) / (1.0 + G);
+      }
+
+      if (stop != maf.GetNumObjectives() - 1)
   {
-    return std::make_tuple(objectiveF1, objectiveF2, objectiveF3);
+        value = value * std::sin(theta);
   }
-  MAF6Objective objectiveF1;
-  MAF6Objective objectiveF2;
-  MAF6Objective objectiveF3;
+      value = value * (1.0 + 100 * G);
+      return value;
+    }
+
+    MAF6& maf;
+    size_t stop;
   };
- } //namespace test
- } //namespace ens
-#endif
\ No newline at end of file
+  // Return back a tuple of objective functions.
+  std::tuple<MAF6Objective, MAF6Objective, MAF6Objective> GetObjectives()
+  {
+    return std::make_tuple(objectiveF1, objectiveF2, objectiveF3);
+  }
+
+  MAF6Objective objectiveF1;
+  MAF6Objective objectiveF2;
+  MAF6Objective objectiveF3;
+};
+
+} // namespace test
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp b/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp
index 5fbdd29..6b1ac13 100644
--- a/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp
+++ b/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp
@@ -35,8 +35,8 @@ typename MatType::elem_type MatyasFunction::Evaluate(
   const ElemType x1 = coordinates(0);
   const ElemType x2 = coordinates(1);
 
-  const double objective = 0.26 * (pow(x1, 2) + std::pow(x2, 2)) -
-      0.48 * x1 * x2;
+  const ElemType objective = ElemType(0.26) * (std::pow(x1, ElemType(2)) +
+      std::pow(x2, ElemType(2))) - ElemType(0.48) * x1 * x2;
 
   return objective;
 }
@@ -62,8 +62,8 @@ inline void MatyasFunction::Gradient(const MatType& coordinates,
   const ElemType x2 = coordinates(1);
 
   gradient.set_size(2, 1);
-  gradient(0) = 0.52 * x1 - 48 * x2;
-  gradient(1) = 0.52 * x2 - 0.48 * x1;
+  gradient(0) = ElemType(0.52) * x1 - ElemType(0.48) * x2;
+  gradient(1) = ElemType(0.52) * x2 - ElemType(0.48) * x1;
 }
 
 template<typename MatType, typename GradType>
diff --git a/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp
b/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp index f1e47ed..e060372 100644 --- a/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp @@ -28,12 +28,16 @@ typename MatType::elem_type McCormickFunction::Evaluate( const size_t /* begin */, const size_t /* batchSize */) const { + typedef typename MatType::elem_type ElemType; + // For convenience; we assume these temporaries will be optimized out. - const typename MatType::elem_type x1 = coordinates(0); - const typename MatType::elem_type x2 = coordinates(1); + const ElemType x1 = coordinates(0); + const ElemType x2 = coordinates(1); - const typename MatType::elem_type objective = std::sin(x1 + x2) + - std::pow(x1 - x2, 2) - 1.5 * x1 + 2.5 * x2 + 1; + const ElemType objective = std::sin(x1 + x2) + + std::pow(x1 - x2, ElemType(2)) - + ElemType(1.5) * x1 + + ElemType(2.5) * x2 + 1; return objective; } @@ -51,13 +55,15 @@ inline void McCormickFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t /* batchSize */) const { + typedef typename MatType::elem_type ElemType; + // For convenience; we assume these temporaries will be optimized out. 
-  const typename MatType::elem_type x1 = coordinates(0);
-  const typename MatType::elem_type x2 = coordinates(1);
+  const ElemType x1 = coordinates(0);
+  const ElemType x2 = coordinates(1);
 
   gradient.set_size(2, 1);
-  gradient(0) = std::cos(x1 + x2) + 2 * x1 - 2 * x2 - 1.5;
-  gradient(1) = std::cos(x1 + x2) - 2 * x1 + 2 * x2 + 2.5;
+  gradient(0) = std::cos(x1 + x2) + 2 * x1 - 2 * x2 - ElemType(1.5);
+  gradient(1) = std::cos(x1 + x2) - 2 * x1 + 2 * x2 + ElemType(2.5);
 }
 
 template<typename MatType, typename GradType>
diff --git a/inst/include/ensmallen_bits/problems/problems.hpp b/inst/include/ensmallen_bits/problems/problems.hpp
index ad2a9a2..f4acdd0 100644
--- a/inst/include/ensmallen_bits/problems/problems.hpp
+++ b/inst/include/ensmallen_bits/problems/problems.hpp
@@ -28,6 +28,7 @@
 #include "logistic_regression_function.hpp"
 #include "matyas_function.hpp"
 #include "mc_cormick_function.hpp"
+#include "quadratic_function.hpp"
 #include "rastrigin_function.hpp"
 #include "rosenbrock_function.hpp"
 #include "rosenbrock_wood_function.hpp"
diff --git a/inst/include/ensmallen_bits/problems/quadratic_function.hpp b/inst/include/ensmallen_bits/problems/quadratic_function.hpp
new file mode 100644
index 0000000..982d9e5
--- /dev/null
+++ b/inst/include/ensmallen_bits/problems/quadratic_function.hpp
@@ -0,0 +1,108 @@
+/**
+ * @file quadratic_function.hpp
+ * @author Ryan Curtin
+ *
+ * Definition of QuadraticFunction, f(x) = x^2.
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_HPP
+#define ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_HPP
+
+namespace ens {
+namespace test {
+
+/**
+ * The quadratic value function in one dimension, defined by
+ *
+ * \f[
+ * f(x) = x^2
+ * \f]
+ *
+ * This should optimize to f(x) = 0, at x = [0].
+ */
+class QuadraticFunction
+{
+ public:
+  //! Initialize the QuadraticFunction.
+  QuadraticFunction();
+
+  /**
+   * Shuffle the order of function visitation. This may be called by the
+   * optimizer.
+   */
+  void Shuffle();
+
+  //! Return 1 (the number of functions).
+  size_t NumFunctions() const { return 1; }
+
+  /**
+   * Evaluate a function for a particular batch-size.
+   *
+   * @param coordinates The function coordinates.
+   * @param begin The first function.
+   * @param batchSize Number of points to process.
+   */
+  template<typename MatType>
+  typename MatType::elem_type Evaluate(const MatType& coordinates,
+                                       const size_t begin,
+                                       const size_t batchSize) const;
+
+  /**
+   * Evaluate a function with the given coordinates.
+   *
+   * @param coordinates The function coordinates.
+   */
+  template<typename MatType>
+  typename MatType::elem_type Evaluate(const MatType& coordinates) const;
+
+  /**
+   * Evaluate the gradient of a function for a particular batch-size.
+   *
+   * @param coordinates The function coordinates.
+   * @param begin The first function.
+   * @param gradient The function gradient.
+   * @param batchSize Number of points to process.
+   */
+  template<typename MatType, typename GradType>
+  void Gradient(const MatType& coordinates,
+                const size_t begin,
+                GradType& gradient,
+                const size_t batchSize) const;
+
+  /**
+   * Evaluate the gradient of a function with the given coordinates.
+   *
+   * @param coordinates The function coordinates.
+   * @param gradient The function gradient.
+   */
+  template<typename MatType, typename GradType>
+  void Gradient(const MatType& coordinates, GradType& gradient);
+
+  // Note: GetInitialPoint(), GetFinalPoint(), and GetFinalObjective() are not
+  // required for using ensmallen to optimize this function! They are
+  // specifically used as a convenience just for ensmallen's testing
+  // infrastructure.
+
+  //! Get the starting point.
+  template<typename MatType>
+  MatType GetInitialPoint() const { return MatType("20.0"); }
+
+  //! Get the final point.
+  template<typename MatType>
+  MatType GetFinalPoint() const { return MatType("0.0"); }
+
+  //! Get the final objective.
+  double GetFinalObjective() const { return 0.0; }
+};
+
+} // namespace test
+} // namespace ens
+
+// Include implementation.
+#include "quadratic_function_impl.hpp"
+
+#endif // ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_HPP
diff --git a/inst/include/ensmallen_bits/problems/quadratic_function_impl.hpp b/inst/include/ensmallen_bits/problems/quadratic_function_impl.hpp
new file mode 100644
index 0000000..410a1ec
--- /dev/null
+++ b/inst/include/ensmallen_bits/problems/quadratic_function_impl.hpp
@@ -0,0 +1,61 @@
+/**
+ * @file quadratic_function_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of QuadraticFunction, f(x) = x^2.
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_IMPL_HPP
+#define ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "quadratic_function.hpp"
+
+namespace ens {
+namespace test {
+
+inline QuadraticFunction::QuadraticFunction() { /* Nothing to do here */ }
+
+inline void QuadraticFunction::Shuffle() { /* Nothing to do here */ }
+
+template<typename MatType>
+typename MatType::elem_type QuadraticFunction::Evaluate(
+    const MatType& coordinates,
+    const size_t /* begin */,
+    const size_t /* batchSize */) const
+{
+  return coordinates[0] * coordinates[0];
+}
+
+template<typename MatType>
+typename MatType::elem_type QuadraticFunction::Evaluate(const MatType& coordinates)
+    const
+{
+  return Evaluate(coordinates, 0, NumFunctions());
+}
+
+template<typename MatType, typename GradType>
+inline void QuadraticFunction::Gradient(const MatType& coordinates,
+                                        const size_t /* begin */,
+                                        GradType& gradient,
+                                        const size_t /* batchSize */) const
+{
+  gradient.set_size(1, 1);
+  gradient(0, 0) = 2 * coordinates[0];
+}
+
+template<typename MatType, typename GradType>
+inline void QuadraticFunction::Gradient(const MatType& coordinates,
+                                        GradType& gradient)
+{
+  Gradient(coordinates, 0, gradient, 1);
+}
+
+} // namespace test
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/problems/rastrigin_function.hpp b/inst/include/ensmallen_bits/problems/rastrigin_function.hpp
index d207473..7d14bc6 100644
--- a/inst/include/ensmallen_bits/problems/rastrigin_function.hpp
+++ b/inst/include/ensmallen_bits/problems/rastrigin_function.hpp
@@ -104,17 +104,17 @@ class RastriginFunction
   // infrastructure.
 
   //! Get the starting point.
-  template
+  template
   MatType GetInitialPoint() const
   {
-    return arma::conv_to<MatType>::from(initialPoint);
+    return conv_to<MatType>::from(initialPoint);
   }
 
   //! Get the final point.
-  template
+  template
   MatType GetFinalPoint() const
   {
-    return arma::zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols);
+    return zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols);
   }
 
   //! Get the final objective.
@@ -125,7 +125,7 @@ class RastriginFunction
   size_t n;
 
   //! For shuffling.
-  arma::Row<size_t> visitationOrder;
+  arma::Col<size_t> visitationOrder;
 
   //! Initial starting point.
   arma::mat initialPoint;
diff --git a/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp b/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp
index 6824cc0..18d8767 100644
--- a/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp
+++ b/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp
@@ -18,10 +18,10 @@
 namespace ens {
 namespace test {
 
-inline RastriginFunction::RastriginFunction(const size_t n) :
+inline RastriginFunction::RastriginFunction(
+    const size_t n) :
     n(n),
-    visitationOrder(arma::linspace<arma::Row<size_t> >(0, n - 1, n))
-
+    visitationOrder(linspace<arma::Col<size_t>>(0, n - 1, n))
 {
   initialPoint.set_size(n, 1);
   initialPoint.fill(-3);
@@ -29,12 +29,12 @@ inline RastriginFunction::RastriginFunction(const size_t n) :
 
 inline void RastriginFunction::Shuffle()
 {
-  visitationOrder = arma::shuffle(
-      arma::linspace<arma::Row<size_t> >(0, n - 1, n));
+  visitationOrder = shuffle(linspace<arma::Col<size_t>>(0, n - 1, n));
 }
 
 template<typename MatType>
-typename MatType::elem_type RastriginFunction::Evaluate(
+typename MatType::elem_type
+RastriginFunction::Evaluate(
     const MatType& coordinates,
     const size_t begin,
     const size_t batchSize) const
@@ -42,44 +42,48 @@ typename MatType::elem_type RastriginFunction::Evaluate(
   // Convenience typedef.
   typedef typename MatType::elem_type ElemType;
 
-  ElemType objective = 0.0;
+  ElemType objective = 0;
   for (size_t j = begin; j < begin + batchSize; ++j)
   {
     const size_t p = visitationOrder[j];
-    objective += std::pow(coordinates(p), 2) - 10.0 *
-        std::cos(2.0 * arma::datum::pi * coordinates(p));
+    objective += std::pow(coordinates(p), ElemType(2)) - 10 *
+        std::cos(2 * arma::Datum<ElemType>::pi * coordinates(p));
   }
-  objective += 10.0 * n;
+  objective += 10 * n;
 
   return objective;
 }
 
 template<typename MatType>
-typename MatType::elem_type RastriginFunction::Evaluate(
-    const MatType& coordinates) const
+typename MatType::elem_type
+RastriginFunction::Evaluate(const MatType& coordinates) const
 {
   return Evaluate(coordinates, 0, NumFunctions());
 }
 
 template<typename MatType, typename GradType>
-inline void RastriginFunction::Gradient(const MatType& coordinates,
-                                        const size_t begin,
-                                        GradType& gradient,
-                                        const size_t batchSize) const
+void RastriginFunction::Gradient(
+    const MatType& coordinates,
+    const size_t begin,
+    GradType& gradient,
+    const size_t batchSize) const
 {
+  typedef typename MatType::elem_type ElemType;
+
   gradient.zeros(n, 1);
 
   for (size_t j = begin; j < begin + batchSize; ++j)
   {
     const size_t p = visitationOrder[j];
-    gradient(p) += (10.0 * n) * (2 * (coordinates(p) + 10.0 * arma::datum::pi *
-        std::sin(2.0 * arma::datum::pi * coordinates(p))));
+    gradient(p) += (10 * n) * (2 * (coordinates(p) +
+        10 * arma::Datum<ElemType>::pi *
+        std::sin(2 * arma::Datum<ElemType>::pi * coordinates(p))));
   }
 }
 
 template<typename MatType, typename GradType>
-inline void RastriginFunction::Gradient(const MatType& coordinates,
-                                        GradType& gradient)
+inline void RastriginFunction::Gradient(
+    const MatType& coordinates, GradType& gradient)
 {
   Gradient(coordinates, 0, gradient, NumFunctions());
 }
diff --git a/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp b/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp
index 6bc8186..b9ee036 100644
--- a/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp
+++
b/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp
@@ -37,8 +37,8 @@ typename MatType::elem_type RosenbrockFunction::Evaluate(
   const ElemType x2 = coordinates(1);
 
   const ElemType objective =
-      /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, 2), 2) +
-      /* f2(x) */ std::pow(1 - x1, 2);
+      /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, ElemType(2)), ElemType(2)) +
+      /* f2(x) */ std::pow(1 - x1, ElemType(2));
 
   return objective;
 }
@@ -64,8 +64,8 @@ void RosenbrockFunction::Gradient(const MatType& coordinates,
   const ElemType x2 = coordinates(1);
 
   gradient.set_size(2, 1);
-  gradient(0) = -2 * (1 - x1) + 400 * (std::pow(x1, 3) - x2 * x1);
-  gradient(1) = 200 * (x2 - std::pow(x1, 2));
+  gradient(0) = -2 * (1 - x1) + 400 * (std::pow(x1, ElemType(3)) - x2 * x1);
+  gradient(1) = 200 * (x2 - std::pow(x1, ElemType(2)));
 }
 
 template<typename MatType, typename GradType>
diff --git a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp
index b42422a..0e57406 100644
--- a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp
+++ b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp
@@ -91,14 +91,14 @@ class RosenbrockWoodFunction
   template<typename MatType>
   const MatType GetInitialPoint() const
   {
-    return arma::conv_to<MatType>::from(initialPoint);
+    return conv_to<MatType>::from(initialPoint);
   }
 
   //! Get the final point.
   template<typename MatType>
   MatType GetFinalPoint() const
   {
-    return arma::ones<MatType>(initialPoint.n_rows, initialPoint.n_cols);
+    return ones<MatType>(initialPoint.n_rows, initialPoint.n_cols);
   }
 
   //! Get the final objective.
diff --git a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp index 071682e..5d3f095 100644 --- a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp @@ -51,13 +51,10 @@ inline void RosenbrockWoodFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t /* batchSize */) const { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - gradient.set_size(4, 2); - arma::Col grf(4); - arma::Col gwf(4); + MatType grf(4, 1); + MatType gwf(4, 1); rf.Gradient(coordinates.col(0), grf); wf.Gradient(coordinates.col(1), gwf); diff --git a/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp b/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp index 4c31974..2f89993 100644 --- a/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp +++ b/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp @@ -37,7 +37,9 @@ class SchafferFunctionN1 size_t numVariables; public: - //! Initialize the SchafferFunctionN1 + typedef typename MatType::elem_type ElemType; + + // Initialize the SchafferFunctionN1 object. SchafferFunctionN1() : numObjectives(2), numVariables(1) {/* Nothing to do here. 
*/} @@ -54,8 +56,8 @@ class SchafferFunctionN1 arma::Col objectives(numObjectives); - objectives(0) = std::pow(coords[0], 2); - objectives(1) = std::pow(coords[0] - 2, 2); + objectives(0) = std::pow(coords[0], ElemType(2)); + objectives(1) = std::pow(coords[0] - 2, ElemType(2)); return objectives; } @@ -71,17 +73,17 @@ class SchafferFunctionN1 struct ObjectiveA { - typename MatType::elem_type Evaluate(const MatType& coords) + ElemType Evaluate(const MatType& coords) { - return std::pow(coords[0], 2); + return std::pow(coords[0], ElemType(2)); } } objectiveA; struct ObjectiveB { - typename MatType::elem_type Evaluate(const MatType& coords) + ElemType Evaluate(const MatType& coords) { - return std::pow(coords[0] - 2, 2); + return std::pow(coords[0] - 2, ElemType(2)); } } objectiveB; @@ -91,7 +93,8 @@ class SchafferFunctionN1 return std::make_tuple(objectiveA, objectiveB); } }; + } // namespace test } // namespace ens -#endif \ No newline at end of file +#endif diff --git a/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp b/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp index 1e289e6..9d3a9c5 100644 --- a/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp +++ b/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp @@ -35,9 +35,11 @@ typename MatType::elem_type SchafferFunctionN2::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = 0.5 + (std::pow(std::sin(std::pow(x1, 2) - - std::pow(x2, 2)), 2) - 0.5) / std::pow(1 + 0.001 * - (std::pow(x1, 2) + std::pow(x2, 2)), 2); + const ElemType objective = ElemType(0.5) + + (std::pow(std::sin(std::pow(x1, ElemType(2)) - + std::pow(x2, ElemType(2))), ElemType(2)) - ElemType(0.5)) / + std::pow(1 + ElemType(0.001) * (std::pow(x1, ElemType(2)) + + std::pow(x2, ElemType(2))), ElemType(2)); return objective; } @@ -67,11 +69,12 @@ inline void SchafferFunctionN2::Gradient(const MatType& coordinates, const 
ElemType x2Sq = x2 * x2; const ElemType sum1 = x1Sq - x2Sq; const ElemType sinSum1 = sin(sum1); - const ElemType sum2 = 0.001 * (x1Sq + x2Sq) + 1; + const ElemType sum2 = ElemType(0.001) * (x1Sq + x2Sq) + 1; const ElemType trigExpression = 4 * sinSum1 * cos(sum1); - const ElemType numerator1 = - 0.004 * (pow(sinSum1, 2) - 0.5); - const ElemType expr1 = numerator1 / pow(sum2, 3); - const ElemType expr2 = trigExpression / pow(sum2, 2); + const ElemType numerator1 = + ElemType(-0.004) * (pow(sinSum1, ElemType(2)) - 0.5); + const ElemType expr1 = numerator1 / pow(sum2, ElemType(3)); + const ElemType expr2 = trigExpression / pow(sum2, ElemType(2)); gradient.set_size(2, 1); gradient(0) = x1 * (expr1 + expr2); diff --git a/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp b/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp index 72ef1a6..6b935ca 100644 --- a/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp +++ b/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp @@ -35,9 +35,11 @@ typename MatType::elem_type SchafferFunctionN4::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = 0.5 + (std::pow(std::cos(std::sin(std::abs( - std::pow(x1, 2) - std::pow(x2, 2)))), 2) - 0.5) / std::pow(1 + 0.001 * - (std::pow(x1, 2) + std::pow(x2, 2)), 2); + const ElemType objective = ElemType(0.5) + + (std::pow(std::cos(std::sin(std::abs(std::pow(x1, ElemType(2)) - + std::pow(x2, ElemType(2))))), ElemType(2)) - ElemType(0.5)) / + std::pow(1 + ElemType(0.001) * (std::pow(x1, ElemType(2)) + + std::pow(x2, ElemType(2))), ElemType(2)); return objective; } diff --git a/inst/include/ensmallen_bits/problems/schwefel_function.hpp b/inst/include/ensmallen_bits/problems/schwefel_function.hpp index c973bba..4491e4e 100644 --- a/inst/include/ensmallen_bits/problems/schwefel_function.hpp +++ b/inst/include/ensmallen_bits/problems/schwefel_function.hpp @@ -107,14 +107,14 @@ class 
SchwefelFunction template MatType GetInitialPoint() const { - return arma::conv_to::from(initialPoint); + return conv_to::from(initialPoint); } //! Get the final point. template MatType GetFinalPoint() const { - MatType result(initialPoint.n_rows, initialPoint.n_cols, arma::fill::none); + MatType result(initialPoint.n_rows, initialPoint.n_cols); result.fill(420.9687); return result; } diff --git a/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp b/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp index ac5e3d8..d8a36f5 100644 --- a/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp @@ -36,7 +36,9 @@ typename MatType::elem_type SGDTestFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t i = begin; i < begin + batchSize; i++) { @@ -47,12 +49,12 @@ typename MatType::elem_type SGDTestFunction::Evaluate( break; case 1: - objective += std::pow(coordinates[1], 2); + objective += std::pow(coordinates[1], ElemType(2)); break; case 2: - objective += std::pow(coordinates[2], 4) + \ - 3 * std::pow(coordinates[2], 2); + objective += std::pow(coordinates[2], ElemType(4)) + \ + 3 * std::pow(coordinates[2], ElemType(2)); break; } } @@ -66,6 +68,8 @@ void SGDTestFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t batchSize) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(3); for (size_t i = begin; i < begin + batchSize; ++i) @@ -84,7 +88,8 @@ void SGDTestFunction::Gradient(const MatType& coordinates, break; case 2: - gradient[2] += 4 * std::pow(coordinates[2], 3) + 6 * coordinates[2]; + gradient[2] += 4 * std::pow(coordinates[2], ElemType(3)) + + 6 * coordinates[2]; break; } } diff --git a/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp 
b/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp index 0e16307..748782a 100644 --- a/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp +++ b/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp @@ -16,9 +16,12 @@ namespace ens { namespace test { +template class SoftmaxRegressionFunction { public: + typedef typename MatType::elem_type ElemType; + /** * Construct the Softmax Regression objective function with the given * parameters. @@ -30,14 +33,14 @@ class SoftmaxRegressionFunction * @param lambda L2-regularization constant. * @param fitIntercept Intercept term flag. */ - SoftmaxRegressionFunction(const arma::mat& data, + SoftmaxRegressionFunction(const MatType& data, const arma::Row& labels, const size_t numClasses, const double lambda = 0.0001, const bool fitIntercept = false); //! Initializes the parameters of the model to suitable values. - const arma::mat InitializeWeights(); + const MatType InitializeWeights(); /** * Shuffle the dataset. @@ -53,9 +56,9 @@ class SoftmaxRegressionFunction * @param fitIntercept If true, an intercept is fitted. * @return Initialized model weights. */ - const arma::mat InitializeWeights(const size_t featureSize, - const size_t numClasses, - const bool fitIntercept = false); + const MatType InitializeWeights(const size_t featureSize, + const size_t numClasses, + const bool fitIntercept = false); /** * Initialize Softmax Regression weights (trainable parameters) with the given @@ -66,7 +69,7 @@ class SoftmaxRegressionFunction * @param numClasses Number of classes for classification. * @param fitIntercept Intercept term flag. */ - void InitializeWeights(arma::mat &weights, + void InitializeWeights(MatType& weights, const size_t featureSize, const size_t numClasses, const bool fitIntercept = false); @@ -78,7 +81,7 @@ class SoftmaxRegressionFunction * @param groundTruth Pointer to arma::mat which stores the computed matrix. 
   */
  void GetGroundTruthMatrix(const arma::Row<size_t>& labels,
-                           arma::sp_mat& groundTruth);
+                           arma::SpMat<ElemType>& groundTruth);
 
  /**
   * Evaluate the probabilities matrix with the passed parameters.
@@ -91,8 +94,8 @@ class SoftmaxRegressionFunction
   * @param start Index of point to start at.
   * @param batchSize Number of points to calculate probabilities for.
   */
-  void GetProbabilitiesMatrix(const arma::mat& parameters,
-                              arma::mat& probabilities,
+  void GetProbabilitiesMatrix(const MatType& parameters,
+                              MatType& probabilities,
                               const size_t start,
                               const size_t batchSize) const;
@@ -105,7 +108,7 @@ class SoftmaxRegressionFunction
   *
   * @param parameters Current values of the model parameters.
   */
-  double Evaluate(const arma::mat& parameters) const;
+  ElemType Evaluate(const MatType& parameters) const;
 
  /**
   * Evaluate the objective function of the softmax regression model for a
@@ -118,9 +121,9 @@ class SoftmaxRegressionFunction
   * @param start First index of the data points to use.
   * @param batchSize Number of data points to evaluate objective for.
   */
-  double Evaluate(const arma::mat& parameters,
-                  const size_t start,
-                  const size_t batchSize = 1) const;
+  ElemType Evaluate(const MatType& parameters,
+                    const size_t start,
+                    const size_t batchSize = 1) const;
 
  /**
   * Evaluates the gradient values of the objective function given the current
@@ -131,7 +134,7 @@ class SoftmaxRegressionFunction
   * @param parameters Current values of the model parameters.
   * @param gradient Matrix where gradient values will be stored.
   */
-  void Gradient(const arma::mat& parameters, arma::mat& gradient) const;
+  void Gradient(const MatType& parameters, MatType& gradient) const;
 
  /**
   * Evaluate the gradient of the objective function given the current set of
@@ -144,9 +147,9 @@ class SoftmaxRegressionFunction
   * @param gradient Matrix to store gradient into.
   * @param batchSize Number of data points to evaluate gradient for.
*/ - void Gradient(const arma::mat& parameters, + void Gradient(const MatType& parameters, const size_t start, - arma::mat& gradient, + MatType& gradient, const size_t batchSize = 1) const; /** @@ -158,12 +161,12 @@ class SoftmaxRegressionFunction * gradient is to be computed. * @param gradient Out param for the gradient value. */ - void PartialGradient(const arma::mat& parameters, + void PartialGradient(const MatType& parameters, size_t j, - arma::sp_mat& gradient) const; + arma::SpMat& gradient) const; //! Return the initial point for the optimization. - const arma::mat& GetInitialPoint() const { return initialPoint; } + const MatType& GetInitialPoint() const { return initialPoint; } //! Gets the number of classes. size_t NumClasses() const { return numClasses; } @@ -184,11 +187,11 @@ class SoftmaxRegressionFunction private: //! Training data matrix. This is an alias until the data is shuffled. - arma::mat data; + MatType data; //! Label matrix for the provided data. - arma::sp_mat groundTruth; + arma::SpMat groundTruth; //! Initial parameter point. - arma::mat initialPoint; + MatType initialPoint; //! Number of classes. size_t numClasses; //! L2-regularization constant. 
diff --git a/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp b/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp index d781da1..e6860d7 100644 --- a/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp @@ -18,14 +18,15 @@ namespace ens { namespace test { -inline SoftmaxRegressionFunction::SoftmaxRegressionFunction( - const arma::mat& data, +template +inline SoftmaxRegressionFunction::SoftmaxRegressionFunction( + const MatType& data, const arma::Row& labels, const size_t numClasses, const double lambda, const bool fitIntercept) : - data(arma::mat(const_cast(data).memptr(), data.n_rows, - data.n_cols, false, false)), + data(MatType(const_cast(data).memptr(), data.n_rows, data.n_cols, + false, false)), numClasses(numClasses), lambda(lambda), fitIntercept(fitIntercept) @@ -40,14 +41,15 @@ inline SoftmaxRegressionFunction::SoftmaxRegressionFunction( /** * Shuffle the data. */ -inline void SoftmaxRegressionFunction::Shuffle() +template +inline void SoftmaxRegressionFunction::Shuffle() { // Determine new ordering. arma::uvec ordering = arma::shuffle(arma::linspace(0, data.n_cols - 1, data.n_cols)); // Re-sort data. 
- arma::mat newData = data.cols(ordering); + MatType newData = data.cols(ordering); if (data.mem_state >= 1) data.reset(); data = std::move(newData); @@ -58,8 +60,8 @@ inline void SoftmaxRegressionFunction::Shuffle() reverseOrdering[ordering[i]] = i; arma::umat newLocations(2, groundTruth.n_nonzero); - arma::vec values(groundTruth.n_nonzero); - arma::sp_mat::const_iterator it = groundTruth.begin(); + arma::Col values(groundTruth.n_nonzero); + typename arma::SpMat::const_iterator it = groundTruth.begin(); size_t loc = 0; while (it != groundTruth.end()) { @@ -71,7 +73,7 @@ inline void SoftmaxRegressionFunction::Shuffle() ++loc; } - groundTruth = arma::sp_mat(newLocations, values, groundTruth.n_rows, + groundTruth = arma::SpMat(newLocations, values, groundTruth.n_rows, groundTruth.n_cols); } @@ -80,23 +82,26 @@ inline void SoftmaxRegressionFunction::Shuffle() * normal distribution. The weights cannot be initialized to zero, as that will * lead to each class output being the same. */ -inline const arma::mat SoftmaxRegressionFunction::InitializeWeights() +template +inline const MatType SoftmaxRegressionFunction::InitializeWeights() { return InitializeWeights(data.n_rows, numClasses, fitIntercept); } -inline const arma::mat SoftmaxRegressionFunction::InitializeWeights( +template +inline const MatType SoftmaxRegressionFunction::InitializeWeights( const size_t featureSize, const size_t numClasses, const bool fitIntercept) { - arma::mat parameters; - InitializeWeights(parameters, featureSize, numClasses, fitIntercept); - return parameters; + MatType parameters; + InitializeWeights(parameters, featureSize, numClasses, fitIntercept); + return parameters; } -inline void SoftmaxRegressionFunction::InitializeWeights( - arma::mat &weights, +template +inline void SoftmaxRegressionFunction::InitializeWeights( + MatType& weights, const size_t featureSize, const size_t numClasses, const bool fitIntercept) @@ -116,8 +121,9 @@ inline void SoftmaxRegressionFunction::InitializeWeights( * 
labels. The output is in the form of a matrix, which leads to simpler * calculations in the Evaluate() and Gradient() methods. */ -inline void SoftmaxRegressionFunction::GetGroundTruthMatrix( - const arma::Row<size_t>& labels, arma::sp_mat& groundTruth) +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::GetGroundTruthMatrix( + const arma::Row<size_t>& labels, arma::SpMat<ElemType>& groundTruth) { // Calculate the ground truth matrix according to the labels passed. The // ground truth matrix is a matrix of dimensions 'numClasses * numExamples', @@ -137,25 +143,26 @@ inline void SoftmaxRegressionFunction::GetGroundTruthMatrix( } // All entries are '1'. - arma::vec values; + arma::Col<ElemType> values; values.ones(labels.n_elem); // Calculate the matrix. - groundTruth = arma::sp_mat(rowPointers, colPointers, values, numClasses, - labels.n_elem); + groundTruth = arma::SpMat<ElemType>(rowPointers, colPointers, values, + numClasses, labels.n_elem); } /** * Evaluate the probabilities matrix. If fitIntercept flag is true, * it should consider the parameters.cols(0) intercept term. */ -inline void SoftmaxRegressionFunction::GetProbabilitiesMatrix( - const arma::mat& parameters, - arma::mat& probabilities, +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::GetProbabilitiesMatrix( + const MatType& parameters, + MatType& probabilities, const size_t start, const size_t batchSize) const { - arma::mat hypothesis; + MatType hypothesis; if (fitIntercept) { @@ -183,8 +190,9 @@ inline void SoftmaxRegressionFunction::GetProbabilitiesMatrix( /** * Evaluates the objective function given the parameters. */ -inline double SoftmaxRegressionFunction::Evaluate( - const arma::mat& parameters) const +template<typename MatType> +inline typename MatType::elem_type SoftmaxRegressionFunction<MatType>::Evaluate( + const MatType& parameters) const { // The objective function is the negative log likelihood of the model // calculated over all the training examples.
Mathematically it is as follows: @@ -202,11 +210,11 @@ inline double SoftmaxRegressionFunction::Evaluate( // The sum is calculated over all the classes. // x_i is the input vector for a particular training example. // theta_j is the parameter vector associated with a particular class. - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, 0, data.n_cols); // Calculate the log likelihood and regularization terms. - double logLikelihood, weightDecay, cost; + ElemType logLikelihood, weightDecay, cost; logLikelihood = arma::accu(groundTruth % arma::log(probabilities)) / data.n_cols; @@ -222,16 +230,17 @@ /** * Evaluate the objective function for the given points given the parameters. */ -inline double SoftmaxRegressionFunction::Evaluate( - const arma::mat& parameters, +template<typename MatType> +inline typename MatType::elem_type SoftmaxRegressionFunction<MatType>::Evaluate( + const MatType& parameters, const size_t start, const size_t batchSize) const { - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, start, batchSize); // Calculate the log likelihood and regularization terms. - double logLikelihood, weightDecay; + ElemType logLikelihood, weightDecay; logLikelihood = arma::accu(groundTruth.cols(start, start + batchSize - 1) % arma::log(probabilities)) / batchSize; @@ -243,8 +252,9 @@ /** * Calculates and stores the gradient values given a set of parameters. */ -inline void SoftmaxRegressionFunction::Gradient( - const arma::mat& parameters, arma::mat& gradient) const +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::Gradient( + const MatType& parameters, MatType& gradient) const { // Calculate the class probabilities for each training example.
The // probabilities for each of the classes are given by: @@ -252,7 +262,7 @@ inline void SoftmaxRegressionFunction::Gradient( // The sum is calculated over all the classes. // x_i is the input vector for a particular training example. // theta_j is the parameter vector associated with a particular class. - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, 0, data.n_cols); // Calculate the parameter gradients. @@ -261,13 +271,13 @@ inline void SoftmaxRegressionFunction::Gradient( { // Treating the intercept term parameters.col(0) seperately to avoid // the cost of building matrix [1; data]. - arma::mat inner = probabilities - groundTruth; + MatType inner = probabilities - groundTruth; gradient.col(0) = - inner * arma::ones(data.n_cols, 1) / data.n_cols + - lambda * parameters.col(0); + inner * arma::ones(data.n_cols, 1) / data.n_cols + + lambda * parameters.col(0); gradient.cols(1, parameters.n_cols - 1) = - inner * data.t() / data.n_cols + - lambda * parameters.cols(1, parameters.n_cols - 1); + inner * data.t() / data.n_cols + + lambda * parameters.cols(1, parameters.n_cols - 1); } else { @@ -276,23 +286,24 @@ inline void SoftmaxRegressionFunction::Gradient( } } -inline void SoftmaxRegressionFunction::Gradient( - const arma::mat& parameters, +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::Gradient( + const MatType& parameters, const size_t start, - arma::mat& gradient, + MatType& gradient, const size_t batchSize) const { - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, start, batchSize); // Calculate the parameter gradients.
gradient.set_size(parameters.n_rows, parameters.n_cols); if (fitIntercept) { - arma::mat inner = probabilities - groundTruth.cols(start, start + + MatType inner = probabilities - groundTruth.cols(start, start + batchSize - 1); gradient.col(0) = - inner * arma::ones(batchSize, 1) / batchSize + + inner * arma::ones(batchSize, 1) / batchSize + lambda * parameters.col(0); gradient.cols(1, parameters.n_cols - 1) = inner * data.cols(start, start + batchSize - 1).t() / batchSize + @@ -306,24 +317,25 @@ } } -inline void SoftmaxRegressionFunction::PartialGradient( - const arma::mat& parameters, +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::PartialGradient( + const MatType& parameters, const size_t j, - arma::sp_mat& gradient) const + arma::SpMat<ElemType>& gradient) const { gradient.zeros(arma::size(parameters)); - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, 0, data.n_cols); // Calculate the required part of the gradient.
- arma::mat inner = probabilities - groundTruth; + MatType inner = probabilities - groundTruth; if (fitIntercept) { if (j == 0) { gradient.col(j) = - inner * arma::ones(data.n_cols, 1) / data.n_cols + + inner * arma::ones(data.n_cols, 1) / data.n_cols + lambda * parameters.col(0); } else diff --git a/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp b/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp index a737f6e..79bfa66 100644 --- a/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp @@ -31,11 +31,13 @@ inline typename MatType::elem_type SparseTestFunction::Evaluate( const size_t i, const size_t batchSize) const { - typename MatType::elem_type result = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType result = 0; for (size_t j = i; j < i + batchSize; ++j) { - result += coordinates[j] * coordinates[j] + bi[j] * coordinates[j] + - intercepts[j]; + result += coordinates[j] * coordinates[j] + + ElemType(bi[j]) * coordinates[j] + ElemType(intercepts[j]); } return result; @@ -46,11 +48,13 @@ template inline typename MatType::elem_type SparseTestFunction::Evaluate( const MatType& coordinates) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t i = 0; i < NumFunctions(); ++i) { - objective += coordinates[i] * coordinates[i] + bi[i] * coordinates[i] + - intercepts[i]; + objective += coordinates[i] * coordinates[i] + + ElemType(bi[i]) * coordinates[i] + ElemType(intercepts[i]); } return objective; @@ -65,7 +69,7 @@ inline void SparseTestFunction::Gradient(const MatType& coordinates, { gradient.zeros(arma::size(coordinates)); for (size_t j = i; j < i + batchSize; ++j) - gradient[j] = 2 * coordinates[j] + bi[j]; + gradient[j] = 2 * coordinates[j] + typename MatType::elem_type(bi[j]); } //! Evaluate the gradient of a feature function. 
@@ -75,7 +79,7 @@ inline void SparseTestFunction::PartialGradient(const MatType& coordinates, GradType& gradient) const { gradient.zeros(arma::size(coordinates)); - gradient[j] = 2 * coordinates[j] + bi[j]; + gradient[j] = 2 * coordinates[j] + typename MatType::elem_type(bi[j]); } } // namespace test diff --git a/inst/include/ensmallen_bits/problems/sphere_function.hpp b/inst/include/ensmallen_bits/problems/sphere_function.hpp index a5039e8..c08b548 100644 --- a/inst/include/ensmallen_bits/problems/sphere_function.hpp +++ b/inst/include/ensmallen_bits/problems/sphere_function.hpp @@ -108,14 +108,14 @@ class SphereFunction template<typename MatType> MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } //! Get the final point. template<typename MatType> MatType GetFinalPoint() const { - return arma::zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols); + return zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols); } //! Get the final objective. diff --git a/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp b/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp index c83a2b0..783c7c5 100644 --- a/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp @@ -45,11 +45,13 @@ typename MatType::elem_type SphereFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - objective += std::pow(coordinates(p), 2); + objective += std::pow(coordinates(p), ElemType(2)); } return objective; @@ -73,7 +75,7 @@ void SphereFunction::Gradient(const MatType& coordinates, for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - gradient(p) += 2.0 * coordinates[p]; + gradient(p) += 2 * coordinates[p]; } } diff --git
a/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp b/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp index 0009a3a..0c9c2db 100644 --- a/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp +++ b/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp @@ -109,7 +109,7 @@ class StyblinskiTangFunction template<typename MatType> MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } //! Get the final point. @@ -118,7 +118,7 @@ { MatType result(initialPoint.n_rows, initialPoint.n_cols); for (size_t i = 0; i < result.n_elem; ++i) - result[i] = -2.903534; + result[i] = typename MatType::elem_type(-2.903534); return result; } diff --git a/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp b/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp index 671de35..aa043c8 100644 --- a/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp @@ -24,7 +24,10 @@ inline StyblinskiTangFunction::StyblinskiTangFunction(const size_t n) : { initialPoint.set_size(n, 1); - initialPoint.fill(-5); + // Manual reimplementation of fill() that also works for sparse types (for + // testing).
+ for (size_t i = 0; i < n; ++i) + initialPoint[i] = -5; } inline void StyblinskiTangFunction::Shuffle() @@ -39,12 +42,14 @@ typename MatType::elem_type StyblinskiTangFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + typename MatType::elem_type objective = ElemType(0); for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - objective += std::pow(coordinates(p), 4) - 16 * - std::pow(coordinates(p), 2) + 5 * coordinates(p); + objective += std::pow(coordinates(p), ElemType(4)) - 16 * + std::pow(coordinates(p), ElemType(2)) + 5 * coordinates(p); } objective /= 2; @@ -64,13 +69,15 @@ void StyblinskiTangFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t batchSize) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(n, 1); for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - gradient(p) += 0.5 * (4 * std::pow(coordinates(p), 3) - - 32.0 * coordinates(p) + 5.0); + gradient(p) += (4 * std::pow(coordinates(p), ElemType(3)) - + 32 * coordinates(p) + 5) / 2; } } diff --git a/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp b/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp index 5a1cc50..3ee9eee 100644 --- a/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp @@ -36,8 +36,9 @@ typename MatType::elem_type ThreeHumpCamelFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = (2 * std::pow(x1, 2)) - (1.05 * std::pow(x1, 4)) + - (std::pow(x1, 6) / 6) + (x1 * x2) + std::pow(x2, 2); + const ElemType objective = (2 * std::pow(x1, ElemType(2))) - + (ElemType(1.05) * std::pow(x1, ElemType(4))) + + (std::pow(x1, ElemType(6)) / 6) + (x1 * 
x2) + std::pow(x2, ElemType(2)); return objective; } @@ -62,7 +63,8 @@ inline void ThreeHumpCamelFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = std::pow(x1, 5) - (4.2 * std::pow(x1, 3)) + (4 * x1) + x2; + gradient(0) = std::pow(x1, ElemType(5)) - + (ElemType(4.2) * std::pow(x1, ElemType(3))) + (4 * x1) + x2; gradient(1) = x1 + (2 * x2); } diff --git a/inst/include/ensmallen_bits/problems/wood_function_impl.hpp b/inst/include/ensmallen_bits/problems/wood_function_impl.hpp index 756b81f..7143493 100644 --- a/inst/include/ensmallen_bits/problems/wood_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/wood_function_impl.hpp @@ -39,12 +39,12 @@ typename MatType::elem_type WoodFunction::Evaluate( const ElemType x4 = coordinates(3); const ElemType objective = - /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, 2), 2) + - /* f2(x) */ std::pow(1 - x1, 2) + - /* f3(x) */ 90 * std::pow(x4 - std::pow(x3, 2), 2) + - /* f4(x) */ std::pow(1 - x3, 2) + - /* f5(x) */ 10 * std::pow(x2 + x4 - 2, 2) + - /* f6(x) */ (1.0 / 10.0) * std::pow(x2 - x4, 2); + /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, ElemType(2)), ElemType(2)) + + /* f2(x) */ std::pow(1 - x1, ElemType(2)) + + /* f3(x) */ 90 * std::pow(x4 - std::pow(x3, ElemType(2)), ElemType(2)) + + /* f4(x) */ std::pow(1 - x3, ElemType(2)) + + /* f5(x) */ 10 * std::pow(x2 + x4 - 2, ElemType(2)) + + /* f6(x) */ ElemType(1.0 / 10.0) * std::pow(x2 - x4, ElemType(2)); return objective; } @@ -72,12 +72,12 @@ inline void WoodFunction::Gradient(const MatType& coordinates, const ElemType x4 = coordinates(3); gradient.set_size(4, 1); - gradient(0) = 400 * (std::pow(x1, 3) - x2 * x1) - 2 * (1 - x1); - gradient(1) = 200 * (x2 - std::pow(x1, 2)) + 20 * (x2 + x4 - 2) + - (1.0 / 5.0) * (x2 - x4); - gradient(2) = 360 * (std::pow(x3, 3) - x4 * x3) - 2 * (1 - x3); - gradient(3) = 180 * (x4 - std::pow(x3, 2)) + 20 * (x2 + x4 - 2) - - (1.0 / 5.0) * (x2 - x4); + gradient(0) = 
400 * (std::pow(x1, ElemType(3)) - x2 * x1) - 2 * (1 - x1); + gradient(1) = 200 * (x2 - std::pow(x1, ElemType(2))) + 20 * (x2 + x4 - 2) + + ElemType(1.0 / 5.0) * (x2 - x4); + gradient(2) = 360 * (std::pow(x3, ElemType(3)) - x4 * x3) - 2 * (1 - x3); + gradient(3) = 180 * (x4 - std::pow(x3, ElemType(2))) + 20 * (x2 + x4 - 2) - + ElemType(1.0 / 5.0) * (x2 - x4); } template diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp index ef8889c..bbc4c51 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp @@ -48,110 +48,112 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template<typename MatType = arma::mat> - class ZDT1 +template<typename MatType = arma::mat> +class ZDT1 +{ + private: + size_t numParetoPoints {100}; + size_t numObjectives {2}; + size_t numVariables {30}; + + public: + //! Initialize the ZDT1 + ZDT1(size_t numParetoPoints = 100) : + numParetoPoints(numParetoPoints), + objectiveF1(*this), + objectiveF2(*this) + {/* Nothing to do here. */} + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col<typename MatType::elem_type> + */ + arma::Col<typename MatType::elem_type> Evaluate(const MatType& coords) { - private: - size_t numParetoPoints {100}; - size_t numObjectives {2}; - size_t numVariables {30}; - - public: - //! Initialize the ZDT1 - ZDT1(size_t numParetoPoints = 100) : - numParetoPoints(numParetoPoints), - objectiveF1(*this), - objectiveF2(*this) - {/* Nothing to do here. */} - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col<typename MatType::elem_type> - */ - arma::Col<typename MatType::elem_type> Evaluate(const MatType& coords) - { - // Convenience typedef.
+ typedef typename MatType::elem_type ElemType; - arma::Col<ElemType> objectives(numObjectives); - objectives(0) = coords[0]; - ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); - ElemType g = 1. + 9. * sum / (static_cast<ElemType>(numVariables) - 1.); - ElemType objectiveRatio = objectives(0) / g; - objectives(1) = g * (1. - std::sqrt(objectiveRatio)); + arma::Col<ElemType> objectives(numObjectives); + objectives(0) = coords[0]; - return objectives; - } + ElemType sum = accu(coords.submat(1, 0, numVariables - 1, 0)); + ElemType g = 1 + 9 * sum / (static_cast<ElemType>(numVariables) - 1.0); + ElemType objectiveRatio = objectives(0) / g; + objectives(1) = g * (1 - std::sqrt(objectiveRatio)); - //! Get the starting point. - MatType GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; + return objectives; + } - return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros); - } + //! Get the starting point. + MatType GetInitialPoint() + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros); + } - struct ObjectiveF1 + struct ObjectiveF1 + { + ObjectiveF1(ZDT1& zdtClass) : zdtClass(zdtClass) + {/*Nothing to do here */} + + typename MatType::elem_type Evaluate(const MatType& coords) { - ObjectiveF1(ZDT1& zdtClass) : zdtClass(zdtClass) - {/*Nothing to do here */} + return coords[0]; + } - typename MatType::elem_type Evaluate(const MatType& coords) - { - return coords[0]; - } + ZDT1& zdtClass; + }; - ZDT1& zdtClass; - }; + struct ObjectiveF2 + { + ObjectiveF2(ZDT1& zdtClass) : zdtClass(zdtClass) + {/*Nothing to do here */} - struct ObjectiveF2 + typename MatType::elem_type Evaluate(const MatType& coords) { - ObjectiveF2(ZDT1& zdtClass) : zdtClass(zdtClass) - {/*Nothing to do here */} + // Convenience typedef.
- typedef typename MatType::elem_type ElemType; + size_t numVariables = zdtClass.numVariables; + ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); + ElemType g = 1 + 9 * sum / (static_cast<ElemType>(numVariables - 1)); + ElemType objectiveRatio = zdtClass.objectiveF1.Evaluate(coords) / g; - size_t numVariables = zdtClass.numVariables; - ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); - ElemType g = 1. + 9. * sum / (static_cast<ElemType>(numVariables - 1)); - ElemType objectiveRatio = zdtClass.objectiveF1.Evaluate(coords) / g; + return g * (1 - std::sqrt(objectiveRatio)); + } - return g * (1. - std::sqrt(objectiveRatio)); - } + ZDT1& zdtClass; + }; - ZDT1& zdtClass; - }; + //! Get objective functions. + std::tuple<ObjectiveF1, ObjectiveF2> GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2); + } - //! Get objective functions. - std::tuple<ObjectiveF1, ObjectiveF2> GetObjectives() - { - return std::make_tuple(objectiveF1, objectiveF2); - } + //! Get the Reference Front. + //! Refer PR #273 Ipynb notebook to see the plot of Reference + //! Front. The implementation has been taken from pymoo. + arma::cube GetReferenceFront() + { + arma::cube front(2, 1, numParetoPoints); + arma::vec x = arma::linspace(0, 1, numParetoPoints); + arma::vec y = 1 - arma::sqrt(x); + for (size_t idx = 0; idx < numParetoPoints; ++idx) + front.slice(idx) = arma::vec{ x(idx), y(idx) }; - //! Get the Reference Front. - //! Refer PR #273 Ipynb notebook to see the plot of Reference - //! Front. The implementation has been taken from pymoo.
- arma::cube GetReferenceFront() - { - arma::cube front(2, 1, numParetoPoints); - arma::vec x = arma::linspace(0, 1, numParetoPoints); - arma::vec y = 1 - arma::sqrt(x); - for (size_t idx = 0; idx < numParetoPoints; ++idx) - front.slice(idx) = arma::vec{ x(idx), y(idx) }; + return front; + } - return front; - } + ObjectiveF1 objectiveF1; + ObjectiveF2 objectiveF2; +}; - ObjectiveF1 objectiveF1; - ObjectiveF2 objectiveF2; - }; - } //namespace test - } //namespace ens +} // namespace test +} // namespace ens #endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp index 440ce6c..a8ef492 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp @@ -152,7 +152,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file +} //namespace test +} //namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp index b62406a..be0b2b6 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp @@ -125,7 +125,7 @@ namespace test { typedef typename MatType::elem_type ElemType; size_t numVariables = zdtClass.numVariables; - ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); + ElemType sum = accu(coords.submat(1, 0, numVariables - 1, 0)); ElemType g = 1. + 9. 
* sum / (static_cast(numVariables - 1)); ElemType objectiveRatio = zdtClass.objectiveF1.Evaluate(coords) / g; @@ -182,7 +182,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp index fad2ba9..27b273a 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp @@ -155,6 +155,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens + +} //namespace test +} //namespace ens + #endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp index 68d2364..b404cff 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp @@ -157,6 +157,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + +} //namespace test +} //namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp b/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp index 519c515..7a09785 100644 --- a/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp +++ b/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp @@ -12,6 +12,7 @@ */ #ifndef ENSMALLEN_PSO_INIT_POLICIES_DEFAULT_INIT_HPP #define ENSMALLEN_PSO_INIT_POLICIES_DEFAULT_INIT_HPP + #include namespace ens { @@ -65,26 +66,30 @@ class DefaultInit { // Convenience typedef. 
typedef typename MatType::elem_type ElemType; - typedef typename CubeType::elem_type CubeElemType; + + typedef typename ForwardType::umat UMatType; + typedef typename ForwardType::bmat BaseMatType; // Randomly initialize the particle positions. particlePositions.randu(iterate.n_rows, iterate.n_cols, numParticles); // Check if lowerBound is equal to upperBound. If equal, reinitialize. - arma::umat lbEquality = (lowerBound == upperBound); + UMatType lbEquality = (lowerBound == upperBound); if (lbEquality.n_rows == 1 && lbEquality(0, 0) == 1) { lowerBound.set_size(iterate.n_rows, iterate.n_cols); - lowerBound.fill(-1.0); + lowerBound.fill(-1); upperBound.set_size(iterate.n_rows, iterate.n_cols); - upperBound.fill(1.0); + upperBound.fill(1); } // Check if lowerBound and upperBound are vectors of a single dimension. else if (lbEquality.n_rows == 1 && lbEquality(0, 0) == 0) { - lowerBound = -lowerBound(0) * arma::ones(iterate.n_rows, iterate.n_cols); - upperBound = upperBound(0) * arma::ones(iterate.n_rows, iterate.n_cols); + BoundMatType ones = BoundMatType(iterate.n_rows, iterate.n_cols); + ones.fill(1); + lowerBound = -lowerBound(0) * ones; + upperBound = upperBound(0) * ones; } // Check the dimensions of lowerBound and upperBound. @@ -97,8 +102,8 @@ class DefaultInit for (size_t i = 0; i < numParticles; i++) { particlePositions.slice(i) = particlePositions.slice(i) % - arma::conv_to >::from(upperBound - lowerBound) - + arma::conv_to >::from(lowerBound); + conv_to::from(upperBound - lowerBound) + + conv_to::from(lowerBound); } // Randomly initialize particle velocities. 
@@ -114,7 +119,6 @@ class DefaultInit particleBestFitnesses.set_size(numParticles); particleBestFitnesses.fill(std::numeric_limits::max()); } - }; } // ens diff --git a/inst/include/ensmallen_bits/pso/pso.hpp b/inst/include/ensmallen_bits/pso/pso.hpp index cda0693..8f9411c 100644 --- a/inst/include/ensmallen_bits/pso/pso.hpp +++ b/inst/include/ensmallen_bits/pso/pso.hpp @@ -88,7 +88,7 @@ class PSOType * @param initPolicy Particle initialization policy. */ PSOType(const size_t numParticles = 64, - const arma::mat& lowerBound = arma::ones(1, 1), + const arma::mat& lowerBound = arma::zeros(1, 1), const arma::mat& upperBound = arma::ones(1, 1), const size_t maxIterations = 3000, const size_t horizonSize = 350, @@ -145,8 +145,8 @@ class PSOType VelocityUpdatePolicy(), const InitPolicy& initPolicy = InitPolicy()) : numParticles(numParticles), - lowerBound(lowerBound * arma::ones(1, 1)), - upperBound(upperBound * arma::ones(1, 1)), + lowerBound({ lowerBound }), + upperBound({ upperBound }), maxIterations(maxIterations), horizonSize(horizonSize), impTolerance(impTolerance), @@ -163,7 +163,7 @@ class PSOType * returned. * * @tparam ArbitraryFunctionType Type of the function to be optimized. - * @tparam MatType Type of matrix to optimize. + * @tparam InputMatType Type of matrix to optimize. * @tparam CallbackTypes Types of callback functions. * @param function Function to be optimized. * @param iterate Initial point (will be modified). @@ -171,11 +171,11 @@ class PSOType * @return Objective value of the final point. */ template - typename MatType::elem_type Optimize(ArbitraryFunctionType& function, - MatType& iterate, - CallbackTypes&&... callbacks); + typename InputMatType::elem_type Optimize(ArbitraryFunctionType& function, + InputMatType& iterate, + CallbackTypes&&... callbacks); //! Retrieve value of numParticles. size_t NumParticles() const { return numParticles; } @@ -259,6 +259,7 @@ class PSOType //! Velocity update policy used. 
VelocityUpdatePolicy velocityUpdatePolicy; + //! Particle initialization policy used. InitPolicy initPolicy; @@ -266,7 +267,7 @@ class PSOType Any instUpdatePolicy; }; -using LBestPSO = PSOType; +using LBestPSO = PSOType; } // ens #include "pso_impl.hpp" diff --git a/inst/include/ensmallen_bits/pso/pso_impl.hpp b/inst/include/ensmallen_bits/pso/pso_impl.hpp index ed8385a..4731e4b 100644 --- a/inst/include/ensmallen_bits/pso/pso_impl.hpp +++ b/inst/include/ensmallen_bits/pso/pso_impl.hpp @@ -36,16 +36,19 @@ namespace ens { template template -typename MatType::elem_type PSOType::Optimize( +typename InputMatType::elem_type PSOType< + VelocityUpdatePolicy, InitPolicy>::Optimize( ArbitraryFunctionType& function, - MatType& iterateIn, + InputMatType& iterateIn, CallbackTypes&&... callbacks) { // Convenience typedefs. - typedef typename MatType::elem_type ElemType; - typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename InputMatType::elem_type ElemType; + typedef typename ForwardType::bmat BaseMatType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::bcube BaseCubeType; // The update policy internally use a templated class so that // we can know MatType only when Optimize() is called. @@ -79,17 +82,18 @@ typename MatType::elem_type PSOType::Optimize( } // Initialize helper variables. - arma::Cube particlePositions; - arma::Cube particleVelocities; - arma::Col particleFitnesses; - arma::Col particleBestFitnesses; - arma::Cube particleBestPositions; + BaseCubeType particlePositions, particleVelocities, particleBestPositions; + BaseColType particleFitnesses, particleBestFitnesses; + + //! Useful temporaries for float-like comparisons. + BaseMatType castedlowerBound = conv_to::from(lowerBound); + BaseMatType castedupperBound = conv_to::from(upperBound); // Initialize particles using the init policy. 
initPolicy.Initialize(iterate, numParticles, - lowerBound, - upperBound, + castedlowerBound, + castedupperBound, particlePositions, particleVelocities, particleFitnesses, @@ -125,7 +129,8 @@ typename MatType::elem_type PSOType::Optimize( // in the PSO method. // The performanceHorizon will be updated with the best particle // in a FIFO manner. - for (size_t i = 0; (i < horizonSize) && !terminate; i++) + size_t iteration = 0; + for (size_t i = 0; (i < horizonSize) && !terminate; i++, iteration++) { // Calculate fitness and evaluate personal best. for (size_t j = 0; (j < numParticles) && !terminate; j++) @@ -167,15 +172,25 @@ typename MatType::elem_type PSOType::Optimize( // Append bestFitness to performanceHorizon. performanceHorizon.push(bestFitness); + + Info << "PSO: iteration " << iteration << ": objective " << bestFitness + << "." << std::endl; } // Run the remaining iterations of PSO. - for (size_t i = 0; (i < maxIterations - horizonSize) && !terminate; i++) + for (size_t i = 0; (i < maxIterations - horizonSize) && !terminate; i++, + iteration++) { // Check if there is any improvement over the horizon. // If there is no significant improvement, terminate. if (performanceHorizon.front() - performanceHorizon.back() < impTolerance) + { + Info << "PSO: improvement over horizon (" + << (performanceHorizon.front() - performanceHorizon.back()) + << ") below convergence tolerance (" << impTolerance + << "); optimization complete." << std::endl; break; + } // Calculate fitness and evaluate personal best. for (size_t j = 0; (j < numParticles) && !terminate; j++) @@ -217,6 +232,9 @@ typename MatType::elem_type PSOType::Optimize( performanceHorizon.pop(); // Push most recent bestFitness to performanceHorizon. performanceHorizon.push(bestFitness); + + Info << "PSO: iteration " << iteration << ": objective " << bestFitness + << "." << std::endl; } // Copy results back. 
diff --git a/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp b/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp index cbb45db..49da9c0 100644 --- a/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp +++ b/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp @@ -12,6 +12,7 @@ */ #ifndef ENSMALLEN_PSO_UPDATE_POLICIES_LBEST_UPDATE_HPP #define ENSMALLEN_PSO_UPDATE_POLICIES_LBEST_UPDATE_HPP + #include namespace ens { @@ -63,118 +64,121 @@ class LBestUpdate * instantiated at the start of the optimization, and holds parameters * specific to an individual optimization. */ - template + template< + typename MatType, typename ColType = typename ForwardType::bcol> class Policy { - public: + public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. * * @param parent Instantiated parent class. */ - Policy(const LBestUpdate& /* parent */) : n(0) - { /* Do nothing. */ } - - /** - * The Initialize method is called by PSO Optimizer method before the - * start of the iteration process. It calculates the value of the - * constriction coefficent, initializes the local best indices of each - * particle to itself, and sets the shape of the r1 and r2 vectors. - * - * @param exploitationFactor Influence of personal best achieved. - * @param explorationFactor Influence of neighbouring particles. - * @param numParticles The number of particles in the swarm. - * @param iterate The user input, used for shaping intermediate vectors. - */ - void Initialize(const double exploitationFactor, - const double explorationFactor, - const size_t numParticles, - MatType& iterate) - { - // Copy values to aliases. 
- n = numParticles; - c1 = exploitationFactor; - c2 = explorationFactor; - - // Calculate the constriction factor - static double phi = c1 + c2; - assert(phi > 4.0 && "The sum of the exploitation and exploration " - "factors must be greater than 4."); - - chi = 2.0 / std::abs(2.0 - phi - std::sqrt((phi - 4.0) * phi)); - - // Initialize local best indices to self indices of particles. - localBestIndices = arma::linspace< - arma::Col >(0, n-1, n); - - // Set sizes r1 and r2. - r1.set_size(iterate.n_rows, iterate.n_cols); - r2.set_size(iterate.n_rows, iterate.n_cols); - } - - /** - * Update step for LBestPSO. Compares personal best of each particle with - * that of its neighbours, and sets the best of the 3 as the lobal best. - * This particle is then used for calculating the velocity for the update - * step. - * - * @param particlePositions The current coordinates of particles. - * @param particleVelocities The current velocities (will be modified). - * @param particleFitnesses The current fitness values or particles. - * @param particleBestPositions The personal best coordinates of particles. - * @param particleBestFitnesses The personal best fitness values of - * particles. - */ - void Update(arma::Cube& particlePositions, - arma::Cube& particleVelocities, - arma::Cube& particleBestPositions, - arma::Col& particleBestFitnesses) - { - // Velocity update logic. - for (size_t i = 0; i < n; i++) - { - localBestIndices(i) = - particleBestFitnesses(left(i)) < particleBestFitnesses(i) ? - left(i) : i; - localBestIndices(i) = - particleBestFitnesses(right(i)) < particleBestFitnesses(i) ? - right(i) : i; - } - - for (size_t i = 0; i < n; i++) - { - // Generate random numbers for current particle. 
- r1.randu(); - r2.randu(); - particleVelocities.slice(i) = chi * (particleVelocities.slice(i) + - c1 * r1 % (particleBestPositions.slice(i) - - particlePositions.slice(i)) + c2 * r2 % - (particleBestPositions.slice(localBestIndices(i)) - - particlePositions.slice(i))); - } - } - - private: - //! Number of particles. - size_t n; - - //! Exploitation factor. - typename MatType::elem_type c1; - - //! Exploration factor. - typename MatType::elem_type c2; - - //! Constriction factor chi. - typename MatType::elem_type chi; - - //! Vectors of random numbers. - MatType r1, r2; - - //! Indices of each particle's best neighbour. - arma::Col localBestIndices; - - // Helper functions for calculating neighbours. + Policy(const LBestUpdate& /* parent */) : n(0) + { /* Do nothing. */ } + + /** + * The Initialize method is called by the PSO Optimizer method before the + * start of the iteration process. It calculates the value of the + * constriction coefficient, initializes the local best indices of each + * particle to itself, and sets the shape of the r1 and r2 vectors. + * + * @param exploitationFactor Influence of personal best achieved. + * @param explorationFactor Influence of neighbouring particles. + * @param numParticles The number of particles in the swarm. + * @param iterate The user input, used for shaping intermediate vectors. + */ + void Initialize(const double exploitationFactor, + const double explorationFactor, + const size_t numParticles, + MatType& iterate) + { + // Copy values to aliases. + n = numParticles; + c1 = ElemType(exploitationFactor); + c2 = ElemType(explorationFactor); + + // Calculate the constriction factor. + const ElemType phi = c1 + c2; + assert(phi > 4 && "The sum of the exploitation and exploration " + "factors must be greater than 4."); + + chi = 2 / std::abs(2 - phi - std::sqrt((phi - 4) * phi)); + + // Initialize local best indices to self indices of particles. + localBestIndices = linspace(0, n - 1, n); + + // Set sizes r1 and r2. 
+ r1.set_size(iterate.n_rows, iterate.n_cols); + r2.set_size(iterate.n_rows, iterate.n_cols); + } + + /** + * Update step for LBestPSO. Compares personal best of each particle with + * that of its neighbours, and sets the best of the 3 as the local best. + * This particle is then used for calculating the velocity for the update + * step. + * + * @param particlePositions The current coordinates of particles. + * @param particleVelocities The current velocities (will be modified). + * @param particleFitnesses The current fitness values of particles. + * @param particleBestPositions The personal best coordinates of particles. + * @param particleBestFitnesses The personal best fitness values of + * particles. + */ + template + void Update(CubeType& particlePositions, + CubeType& particleVelocities, + CubeType& particleBestPositions, + VecType& particleBestFitnesses) + { + // Velocity update logic. + for (size_t i = 0; i < n; i++) + { + localBestIndices(i) = + particleBestFitnesses(left(i)) < particleBestFitnesses(i) ? + left(i) : i; + localBestIndices(i) = + particleBestFitnesses(right(i)) < particleBestFitnesses(i) ? + right(i) : i; + } + + for (size_t i = 0; i < n; i++) + { + // Generate random numbers for current particle. + r1.randu(); + r2.randu(); + particleVelocities.slice(i) = chi * (particleVelocities.slice(i) + + c1 * r1 % (particleBestPositions.slice(i) - + particlePositions.slice(i)) + c2 * r2 % + (particleBestPositions.slice(localBestIndices(i)) - + particlePositions.slice(i))); + } + } + + private: + // Number of particles. + size_t n; + + // Exploitation factor. + ElemType c1; + + // Exploration factor. + ElemType c2; + + // Constriction factor chi. + ElemType chi; + + // Vectors of random numbers. + MatType r1, r2; + + // Indices of each particle's best neighbour. + arma::uvec localBestIndices; + + // Helper functions for calculating neighbours. 
inline size_t left(size_t index) { return (index + n - 1) % n; } inline size_t right(size_t index) { return (index + 1) % n; } }; diff --git a/inst/include/ensmallen_bits/qhadam/qhadam.hpp b/inst/include/ensmallen_bits/qhadam/qhadam.hpp index e29d4f2..38426bd 100644 --- a/inst/include/ensmallen_bits/qhadam/qhadam.hpp +++ b/inst/include/ensmallen_bits/qhadam/qhadam.hpp @@ -27,10 +27,10 @@ namespace ens { * * @code * @inproceedings{ma2019qh, - * title={Quasi-hyperbolic momentum and Adam for deep learning}, - * author={Jerry Ma and Denis Yarats}, - * booktitle={International Conference on Learning Representations}, - * year={2019} + * title = {Quasi-hyperbolic momentum and Adam for deep learning}, + * author = {Jerry Ma and Denis Yarats}, + * booktitle = {International Conference on Learning Representations}, + * year = {2019} * } * @endcode * @@ -100,7 +100,7 @@ class QHAdam typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp b/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp index f408377..e0d0315 100644 --- a/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp +++ b/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp @@ -94,6 +94,8 @@ class QHAdamUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -104,10 +106,19 @@ class QHAdamUpdate */ Policy(QHAdamUpdate& parent, const size_t rows, const size_t cols) : parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), + v1(ElemType(parent.v1)), + v2(ElemType(parent.v2)), iteration(0) { m.zeros(rows, cols); v.zeros(rows, cols); + + // Attempt to detect underflow. 
+ if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -125,35 +136,40 @@ class QHAdamUpdate ++iteration; // And update the iterate. - m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); GradType mDash = m / biasCorrection1; GradType vDash = v / biasCorrection2; // QHAdam recovers Adam when v2 = v1 = 1. - iterate -= stepSize * - ((((1 - parent.v1) * gradient) + parent.v1 * mDash) / - (arma::sqrt(((1 - parent.v2) * (gradient % gradient)) + - parent.v2 * vDash) + parent.epsilon)); + iterate -= ElemType(stepSize) * ((((1 - v1) * gradient) + v1 * mDash) / + (sqrt(((1 - v2) * square(gradient)) + v2 * vDash) + epsilon)); } private: - //! Instantiated parent object. + // Instantiated parent object. QHAdamUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; // The exponential moving average of squared gradient values. GradType v; + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + ElemType v1; + ElemType v2; + // The number of iterations. size_t iteration; }; diff --git a/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp b/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp index 9c3607a..53bb3e2 100644 --- a/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp +++ b/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp @@ -109,7 +109,7 @@ class RMSProp typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp b/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp index c8507ba..e769c28 100644 --- a/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp +++ b/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp @@ -76,6 +76,8 @@ class RMSPropUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -85,10 +87,16 @@ class RMSPropUpdate * @param cols Number of columns in the gradient matrix. */ Policy(RMSPropUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)), + alpha(ElemType(parent.alpha)) { // Leaky sum of squares of parameter gradient. meanSquaredGradient.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -102,10 +110,10 @@ class RMSPropUpdate const double stepSize, const GradType& gradient) { - meanSquaredGradient *= parent.alpha; - meanSquaredGradient += (1 - parent.alpha) * (gradient % gradient); - iterate -= stepSize * gradient / (arma::sqrt(meanSquaredGradient) + - parent.epsilon); + meanSquaredGradient *= alpha; + meanSquaredGradient += (1 - alpha) * (gradient % gradient); + iterate -= ElemType(stepSize) * gradient / (sqrt(meanSquaredGradient) + + epsilon); } private: @@ -113,6 +121,9 @@ class RMSPropUpdate GradType meanSquaredGradient; // Reference to instantiated parent object. RMSPropUpdate& parent; + // Parameters converted to the element type of the optimization. 
+ ElemType epsilon; + ElemType alpha; }; private: diff --git a/inst/include/ensmallen_bits/sa/exponential_schedule.hpp b/inst/include/ensmallen_bits/sa/exponential_schedule.hpp index ff9acf5..25baa20 100644 --- a/inst/include/ensmallen_bits/sa/exponential_schedule.hpp +++ b/inst/include/ensmallen_bits/sa/exponential_schedule.hpp @@ -46,11 +46,10 @@ class ExponentialSchedule * @param currentEnergy Current energy of system (not used). */ template - double NextTemperature( - const double currentTemperature, - const ElemType /* currentEnergy */) + ElemType NextTemperature( + const double currentTemperature, const ElemType /* currentEnergy */) { - return (1 - lambda) * currentTemperature; + return ElemType((1 - lambda) * currentTemperature); } //! Get the cooling speed, lambda. diff --git a/inst/include/ensmallen_bits/sa/sa_impl.hpp b/inst/include/ensmallen_bits/sa/sa_impl.hpp index a969d5c..e2d70a4 100644 --- a/inst/include/ensmallen_bits/sa/sa_impl.hpp +++ b/inst/include/ensmallen_bits/sa/sa_impl.hpp @@ -76,9 +76,9 @@ typename MatType::elem_type SA::Optimize( size_t idx = 0; size_t sweepCounter = 0; - BaseMatType accept(rows, cols, arma::fill::zeros); - BaseMatType moveSize(rows, cols, arma::fill::none); - moveSize.fill(initMoveCoef); + BaseMatType accept(rows, cols); + BaseMatType moveSize(rows, cols, GetFillType::none); + moveSize.fill(ElemType(initMoveCoef)); Callback::BeginOptimization(*this, function, iterate, callbacks...); @@ -158,7 +158,7 @@ bool SA::GenerateMove( // MoveControl() is derived for the Laplace distribution. // Sample from a Laplace distribution with scale parameter moveSize(idx). - const double unif = 2.0 * arma::randu() - 1.0; + const ElemType unif = 2 * arma::randu() - 1; const ElemType move = (unif < 0) ? 
(moveSize(idx) * std::log(1 + unif)) : (-moveSize(idx) * std::log(1 - unif)); @@ -219,17 +219,15 @@ inline void SA::MoveControl(const size_t nMoves, MatType& accept, MatType& moveSize) { - MatType target; - target.copy_size(accept); - target.fill(0.44); - moveSize = arma::log(moveSize); - moveSize += gain * (accept / (double) nMoves - target); - moveSize = arma::exp(moveSize); - - // To avoid the use of element-wise arma::min(), which is only available in - // Armadillo after v3.930, we use a for loop here instead. - for (size_t i = 0; i < accept.n_elem; ++i) - moveSize(i) = (moveSize(i) > maxMoveCoef) ? maxMoveCoef : moveSize(i); + typedef typename MatType::elem_type ElemType; + + MatType target(accept.n_rows, accept.n_cols, GetFillType::none); + target.fill(ElemType(0.44)); + + moveSize = log(moveSize); + moveSize += ElemType(gain) * (accept / (ElemType) nMoves - target); + moveSize = exp(moveSize); + moveSize.clamp(ElemType(-maxMoveCoef), ElemType(maxMoveCoef)); accept.zeros(); } diff --git a/inst/include/ensmallen_bits/sarah/sarah.hpp b/inst/include/ensmallen_bits/sarah/sarah.hpp index 074f462..e92aa95 100644 --- a/inst/include/ensmallen_bits/sarah/sarah.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah.hpp @@ -97,7 +97,7 @@ class SARAHType typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sarah/sarah_impl.hpp b/inst/include/ensmallen_bits/sarah/sarah_impl.hpp index 3b001cd..d7a7b80 100644 --- a/inst/include/ensmallen_bits/sarah/sarah_impl.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah_impl.hpp @@ -45,8 +45,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SARAHType::Optimize( SeparableFunctionType& functionIn, MatType& iterateIn, @@ -145,15 +145,15 @@ SARAHType::Optimize( f += effectiveBatchSize; } - v /= (double) numFunctions; + v /= (ElemType) numFunctions; if (terminate) break; // Update iterate with full gradient (v). - iterate -= stepSize * v; + iterate -= ElemType(stepSize) * v; - const ElemType vNorm = arma::norm(v); + const ElemType vNorm = norm(v); for (size_t f = 0, currentFunction = 0; f < innerIterations; /* incrementing done manually */) @@ -228,7 +228,8 @@ SARAHType::Optimize( for (size_t i = 0; i < numFunctions; i += batchSize) { const size_t effectiveBatchSize = std::min(batchSize, numFunctions - i); - const ElemType objective = function.Evaluate(iterate, i, effectiveBatchSize); + const ElemType objective = function.Evaluate(iterate, i, + effectiveBatchSize); overallObjective += objective; // The optimization is finished, so we don't need to care about the result diff --git a/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp b/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp index 2ddf64a..12e669c 100644 --- a/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp @@ -52,10 +52,12 @@ class SARAHPlusUpdate const double stepSize, const double vNorm) { - v += (gradient - gradient0) / (double) batchSize; - iterate -= stepSize * v; + typedef 
typename MatType::elem_type ElemType; - if (arma::norm(v) <= gamma * vNorm) + v += (gradient - gradient0) / (ElemType) batchSize; + iterate -= ElemType(stepSize) * v; + + if (norm(v) <= ElemType(gamma * vNorm)) return true; return false; diff --git a/inst/include/ensmallen_bits/sarah/sarah_update.hpp b/inst/include/ensmallen_bits/sarah/sarah_update.hpp index 0c38ba4..0a63f02 100644 --- a/inst/include/ensmallen_bits/sarah/sarah_update.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah_update.hpp @@ -41,8 +41,10 @@ class SARAHUpdate const double stepSize, const double /* vNorm */) { - v += (gradient - gradient0) / (double) batchSize; - iterate -= stepSize * v; + typedef typename MatType::elem_type ElemType; + + v += (gradient - gradient0) / (ElemType) batchSize; + iterate -= ElemType(stepSize) * v; return false; } }; diff --git a/inst/include/ensmallen_bits/sdp/lin_alg.hpp b/inst/include/ensmallen_bits/sdp/lin_alg.hpp index bfd70c7..6cdf1ac 100644 --- a/inst/include/ensmallen_bits/sdp/lin_alg.hpp +++ b/inst/include/ensmallen_bits/sdp/lin_alg.hpp @@ -92,7 +92,7 @@ inline void Smat(const MatAType& input, MatBType& output) MatBType iMat(input); const size_t n = static_cast - (ceil((-1. + sqrt(1. + 8. * iMat.n_elem))/2.)); + (ceil((-1. + std::sqrt(1. + 8. * iMat.n_elem))/2.)); output.zeros(n, n); diff --git a/inst/include/ensmallen_bits/sdp/lrsdp.hpp b/inst/include/ensmallen_bits/sdp/lrsdp.hpp index c918163..de85d83 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp.hpp @@ -75,6 +75,22 @@ class LRSDP typename MatType::elem_type Optimize(MatType& coordinates, CallbackTypes&&... callbacks); + /** + * Optimize the LRSDP and return the final objective value, using the given + * starting Lagrange multipliers and penalty parameter for the augmented + * Lagrangian inner optimizer. The given coordinates will be modified to + * contain the final solution, and the given lambda/sigma will be modified to + * contain the final values. 
+ * + * @param coordinates Starting coordinates for the optimization. + * @param lambda Initial Lagrange multipliers; overwritten with the final + * estimates. + * @param sigma Initial penalty parameter; overwritten with the final value. + * @param callbacks Callback functions. + */ + template + typename MatType::elem_type Optimize(MatType& coordinates, + VecType& lambda, + double& sigma, + CallbackTypes&&... callbacks); + //! Return the SDP that will be solved. const SDPType& SDP() const { return function.SDP(); } //! Modify the SDP that will be solved. diff --git a/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp b/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp index a0ac114..da6050d 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp @@ -101,8 +101,7 @@ class LRSDPFunction template MatType GetInitialPoint() const { - MatType result = arma::conv_to::from(initialPoint); - return result; + return conv_to::from(initialPoint); } //! Return the SDP object representing the problem. @@ -143,48 +142,48 @@ template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; 
template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; diff --git a/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp b/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp index 6cf2230..f30d70a 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp @@ -109,10 +109,10 @@ void LRSDPFunction::GradientConstraint( "for arbitrary optimizers!"); } -//! Utility function for updating R*R^T matrix. -//! Note: Caching R*R^T provide significant computation optimization -//! by reducing redundant R*R^T calculations in case of functions are not used -//! updating coordinates matrix, hence leaving R*R^T unchanged. +// Utility function for updating the R*R^T matrix. +// Note: Caching R*R^T provides a significant computational optimization by +// avoiding redundant R*R^T calculations when functions do not update the +// coordinates matrix, leaving R*R^T unchanged. template void UpdateRRT(LRSDPFunction& function, MatType&& newrrt) @@ -120,15 +120,15 @@ void UpdateRRT(LRSDPFunction& function, function.template RRT() = std::move(newrrt); } -//! Utility function for calculating part of the objective when AugLagrangian is -//! used with an LRSDPFunction. +// Utility function for calculating part of the objective when AugLagrangian is +// used with an LRSDPFunction. 
template static inline void UpdateObjective(typename MatType::elem_type& objective, const MatType& rrt, const std::vector& ais, const VecType& bis, - const arma::vec& lambda, + const VecType& lambda, const size_t lambdaOffset, const double sigma) { @@ -144,15 +144,15 @@ UpdateObjective(typename MatType::elem_type& objective, } } -//! Utility function for calculating part of the gradient when AugLagrangian is -//! used with an LRSDPFunction. +// Utility function for calculating part of the gradient when AugLagrangian is +// used with an LRSDPFunction. template static inline void UpdateGradient(MatType& s, const MatType& rrt, const std::vector& ais, const VecType& bis, - const arma::vec& lambda, + const VecType& lambda, const size_t lambdaOffset, const double sigma) { @@ -167,11 +167,11 @@ UpdateGradient(MatType& s, } } -template +template static inline double EvaluateImpl(LRSDPFunction& function, const MatType& coordinates, - const arma::vec& lambda, + const VecType& lambda, const double sigma) { // We can calculate the entire objective in a smart way. 
@@ -220,11 +220,14 @@ EvaluateImpl(LRSDPFunction& function, return objective; } -template +template static inline void GradientImpl(const LRSDPFunction& function, const MatType& coordinates, - const arma::vec& lambda, + const VecType& lambda, const double sigma, GradType& gradient) { @@ -254,7 +257,7 @@ GradientImpl(const LRSDPFunction& function, template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -263,7 +266,7 @@ AugLagrangianFunction>>::Evaluate( template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -271,7 +274,7 @@ AugLagrangianFunction>>::Evaluate( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const { @@ -280,7 +283,7 @@ inline void AugLagrangianFunction>>::Gradient( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const { @@ -290,7 +293,7 @@ inline void AugLagrangianFunction>>::Gradient( template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::fvec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -299,7 +302,7 @@ AugLagrangianFunction>>::Evaluate( template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::fvec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -307,7 +310,7 @@ 
AugLagrangianFunction>>::Evaluate( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::fvec>::Gradient( const MatType& coordinates, GradType& gradient) const { @@ -316,7 +319,7 @@ inline void AugLagrangianFunction>>::Gradient( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::fvec>::Gradient( const MatType& coordinates, GradType& gradient) const { diff --git a/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp b/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp index 85c5479..1fd0094 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp @@ -35,9 +35,28 @@ typename MatType::elem_type LRSDP::Optimize( function.RRTAny().template Set( new MatType(coordinates * coordinates.t())); - augLag.Sigma() = 10; augLag.MaxIterations() = maxIterations; - augLag.Optimize(function, coordinates, callbacks...); + typename ForwardType::bvec lambda(function.NumConstraints()); + double sigma = 10; + augLag.Optimize(function, coordinates, lambda, sigma, callbacks...); + + return function.Evaluate(coordinates); +} + +template +template +typename MatType::elem_type LRSDP::Optimize( + MatType& coordinates, + VecType& lambda, + double& sigma, + CallbackTypes&&... 
callbacks) +{ + function.RRTAny().Clean(); + function.RRTAny().template Set( + new MatType(coordinates * coordinates.t())); + + augLag.MaxIterations() = maxIterations; + augLag.Optimize(function, coordinates, lambda, sigma, callbacks...); return function.Evaluate(coordinates); } diff --git a/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp b/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp index d7b65ee..38d16b0 100644 --- a/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp +++ b/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp @@ -92,7 +92,7 @@ Alpha(const MatType& a, const MatType& dA, double tau, double& alpha) * * where A, H are symmetric matrices. * - * TODO(stephentu): Note this method current uses arma's builtin arma::syl + * TODO(stephentu): Note this method currently uses arma's builtin arma::sylvester * method, which is overkill for this situation. See Lemma 7.2 of [AHO98] for * how to solve this Lyapunov equation using an eigenvalue decomposition of A. * @@ -101,7 +101,7 @@ template static inline void SolveLyapunov(MatType& x, const AType& a, const BType& h) { - arma::syl(x, a, a, -h); + arma::sylvester(x, a, a, -h); } /** @@ -163,7 +163,7 @@ SolveKKTSystem(const SparseConstraintType& aSparse, } MatType subTerm(aSparse.n_cols, 1, arma::fill::zeros); - + if (aSparse.n_rows) { dySparse = dy(arma::span(0, aSparse.n_rows - 1), 0); @@ -483,8 +483,8 @@ typename MatType::elem_type PrimalDualSolver::Optimize( const double sparsePrimalInfeas = arma::norm(sdp.SparseB() - aSparse * sx, 2); const double densePrimalInfeas = arma::norm(sdp.DenseB() - aDense * sx, 2); - const double primalInfeas = sqrt(sparsePrimalInfeas * sparsePrimalInfeas + - densePrimalInfeas * densePrimalInfeas); + const double primalInfeas = std::sqrt(sparsePrimalInfeas * + sparsePrimalInfeas + densePrimalInfeas * densePrimalInfeas); primalObj = arma::dot(sdp.C(), coordinates); diff --git a/inst/include/ensmallen_bits/sgd/sgd.hpp b/inst/include/ensmallen_bits/sgd/sgd.hpp index 
3b8a92a..c81516f 100644 --- a/inst/include/ensmallen_bits/sgd/sgd.hpp +++ b/inst/include/ensmallen_bits/sgd/sgd.hpp @@ -124,7 +124,7 @@ class SGD typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sgd/sgd_impl.hpp b/inst/include/ensmallen_bits/sgd/sgd_impl.hpp index d34115b..a6b98f5 100644 --- a/inst/include/ensmallen_bits/sgd/sgd_impl.hpp +++ b/inst/include/ensmallen_bits/sgd/sgd_impl.hpp @@ -58,8 +58,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SGD::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -150,8 +150,14 @@ SGD::Optimize( gradient, callbacks...); // Use the update policy to take a step. + // TODO: remove old behavior in ensmallen 4.0.0. + #if defined(ENS_OLD_SEPARABLE_STEP_BEHAVIOR) instUpdatePolicy.As().Update(iterate, stepSize, gradient); + #else + instUpdatePolicy.As().Update(iterate, + (stepSize / effectiveBatchSize), gradient); + #endif terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); @@ -194,9 +200,12 @@ SGD::Optimize( overallObjective, callbacks...); // Reset the counter variables. - lastObjective = overallObjective; - overallObjective = 0; - currentFunction = 0; + if (i != actualMaxIterations) + { + lastObjective = overallObjective; + overallObjective = 0; + currentFunction = 0; + } if (shuffle) // Determine order of visitation. 
f.Shuffle(); diff --git a/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp index 5c19e63..6d8555d 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp @@ -84,6 +84,8 @@ class MomentumUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -94,9 +96,10 @@ class MomentumUpdate */ Policy(const MomentumUpdate& parent, const size_t rows, const size_t cols) : parent(parent), - velocity(arma::zeros(rows, cols)) + velocity(rows, cols), + momentum(ElemType(parent.momentum)) { - // Nothing to do. + // Nothing to do here. } /** @@ -112,7 +115,7 @@ class MomentumUpdate const double stepSize, const GradType& gradient) { - velocity = parent.momentum * velocity - stepSize * gradient; + velocity = momentum * velocity - ElemType(stepSize) * gradient; iterate += velocity; } @@ -121,6 +124,8 @@ class MomentumUpdate const MomentumUpdate& parent; // The velocity matrix. MatType velocity; + // The momentum, converted to the element type of the optimization. + ElemType momentum; }; private: diff --git a/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp index 540cb55..d1ecc3c 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp @@ -58,6 +58,8 @@ class NesterovMomentumUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. 
@@ -70,9 +72,10 @@ class NesterovMomentumUpdate const size_t rows, const size_t cols) : parent(parent), - velocity(arma::zeros(rows, cols)) + velocity(rows, cols), + momentum(ElemType(parent.momentum)) { - // Nothing to do. + // Nothing to do here. } /** @@ -89,9 +92,8 @@ class NesterovMomentumUpdate const double stepSize, const GradType& gradient) { - velocity = parent.momentum * velocity - stepSize * gradient; - - iterate += parent.momentum * velocity - stepSize * gradient; + velocity = momentum * velocity - ElemType(stepSize) * gradient; + iterate += momentum * velocity - ElemType(stepSize) * gradient; } private: @@ -99,6 +101,8 @@ class NesterovMomentumUpdate const NesterovMomentumUpdate& parent; // The velocity matrix. MatType velocity; + // The momentum, converted to the element type of the optimization. + ElemType momentum; }; private: diff --git a/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp index 788cd1f..c99df52 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp @@ -42,8 +42,7 @@ class QHUpdate */ QHUpdate(const double v = 0.7, const double momentum = 0.999) : - momentum(momentum), - v(v) + momentum(momentum), v(v) { // Nothing to do. } @@ -68,6 +67,8 @@ class QHUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -77,10 +78,12 @@ class QHUpdate * @param cols Number of columns in the gradient matrix. */ Policy(QHUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + velocity(rows, cols), + momentum(ElemType(parent.momentum)), + v(ElemType(parent.v)) { - // Initialize an empty velocity matrix. - velocity.zeros(rows, cols); + // Nothing to do here. 
} /** @@ -94,18 +97,22 @@ class QHUpdate const double stepSize, const GradType& gradient) { - velocity *= parent.momentum; - velocity += (1 - parent.momentum) * gradient; + velocity *= momentum; + velocity += (1 - momentum) * gradient; - iterate -= stepSize * ((1 - parent.v) * gradient + parent.v * velocity); + iterate -= ElemType(stepSize) * ((1 - v) * gradient + v * velocity); } private: - //! Instantiated parent object. + // Instantiated parent object. QHUpdate& parent; - //! The velocity matrix. + // The velocity matrix. GradType velocity; + + // Parameters converted to the element type of the optimization. + ElemType momentum; + ElemType v; }; private: diff --git a/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp index 41d75fe..8212f3a 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp @@ -37,6 +37,8 @@ class VanillaUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. The vanilla update doesn't initialize anything. @@ -63,7 +65,7 @@ class VanillaUpdate const GradType& gradient) { // Perform the vanilla SGD update. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; } }; }; diff --git a/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp b/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp index 6591e9a..3776adb 100644 --- a/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp +++ b/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp @@ -129,7 +129,7 @@ class CyclicalDecay { // n_t = n_min^i + 0.5(n_max^i - n_min^i)(1 + cos(T_cur/T_i * pi)). 
stepSize = 0.5 * parent.constStepSize * - (1 + cos((parent.batchRestart / parent.epochBatches) + (1 + std::cos((parent.batchRestart / parent.epochBatches) * arma::datum::pi)); // Keep track of the number of batches since the last restart. diff --git a/inst/include/ensmallen_bits/sgdr/sgdr.hpp b/inst/include/ensmallen_bits/sgdr/sgdr.hpp index de2f6c2..30faaf2 100644 --- a/inst/include/ensmallen_bits/sgdr/sgdr.hpp +++ b/inst/include/ensmallen_bits/sgdr/sgdr.hpp @@ -103,7 +103,7 @@ class SGDR typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp b/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp index defbad8..13e840e 100644 --- a/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp +++ b/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp @@ -49,8 +49,8 @@ SGDR::SGDR( template template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SGDR::Optimize( SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp b/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp index a318c02..7f221b7 100644 --- a/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp +++ b/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp @@ -155,7 +155,7 @@ class SnapshotEnsembles { // n_t = n_min^i + 0.5(n_max^i - n_min^i)(1 + cos(T_cur/T_i * pi)). stepSize = 0.5 * parent.constStepSize * - (1 + cos((parent.batchRestart / parent.epochBatches) + (1 + std::cos((parent.batchRestart / parent.epochBatches) * arma::datum::pi)); // Keep track of the number of batches since the last restart. 
diff --git a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp index 0187584..4a30b67 100644 --- a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp +++ b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp @@ -120,7 +120,7 @@ class SnapshotSGDR typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, @@ -169,15 +169,37 @@ class SnapshotSGDR //! Modify whether or not the actual objective is calculated. bool& ExactObjective() { return optimizer.ExactObjective(); } - //! Get the snapshots. - std::vector Snapshots() const + // Get the snapshots. The template parameters must be the same as the last + // call to Optimize()! + template + std::vector Snapshots() const { - return optimizer.DecayPolicy().Snapshots(); + if (!optimizer.InstDecayPolicy().template Has< + SnapshotEnsembles::Policy>()) + { + throw std::runtime_error("SnapshotSGDR::Snapshots(): got unexpected type;" + " make sure to call with the same matrix type as the previous " + "optimization!"); + } + + return optimizer.InstDecayPolicy().template As< + SnapshotEnsembles::Policy>().Snapshots(); } - //! Modify the snapshots. - std::vector& Snapshots() + // Modify the snapshots. The template parameters must be the same as the last + // call to Optimize()! + template + std::vector& Snapshots() { - return optimizer.DecayPolicy().Snapshots(); + if (!optimizer.InstDecayPolicy().template Has< + SnapshotEnsembles::Policy>()) + { + throw std::runtime_error("SnapshotSGDR::Snapshots(): got unexpected type;" + " make sure to call with the same matrix type as the previous " + "optimization!"); + } + + return optimizer.InstDecayPolicy().template As< + SnapshotEnsembles::Policy>().Snapshots(); } //! Get whether or not to accumulate the snapshots. 
diff --git a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp index 38f39c1..fd13c53 100644 --- a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp +++ b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp @@ -57,8 +57,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SnapshotSGDR::Optimize( SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/smorms3/smorms3.hpp b/inst/include/ensmallen_bits/smorms3/smorms3.hpp index e45f5ed..d603c96 100644 --- a/inst/include/ensmallen_bits/smorms3/smorms3.hpp +++ b/inst/include/ensmallen_bits/smorms3/smorms3.hpp @@ -92,14 +92,12 @@ class SMORMS3 typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, CallbackTypes&&... callbacks) { - // TODO: disallow sp_mat - return optimizer.Optimize(function, iterate, std::forward(callbacks)...); diff --git a/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp b/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp index 98f49b6..eb0520c 100644 --- a/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp +++ b/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp @@ -37,15 +37,14 @@ class SMORMS3Update /** * Construct the SMORMS3 update policy with given epsilon parameter. * - * @param epsilon Value used to initialise the mean squared gradient - * parameter. + * @param epsilon Value used to avoid divisions by zero. */ SMORMS3Update(const double epsilon = 1e-16) : epsilon(epsilon) { /* Do nothing. */ } - //! Get the value used to initialise the mean squared gradient parameter. + // Get the value used to avoid divisions by zero. double Epsilon() const { return epsilon; } - //! 
Modify the value used to initialise the mean squared gradient parameter. + // Modify the value used to avoid divisions by zero. double& Epsilon() { return epsilon; } /** @@ -57,6 +56,8 @@ class SMORMS3Update class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -66,12 +67,17 @@ class SMORMS3Update * @param cols Number of columns in the gradient matrix. */ Policy(SMORMS3Update& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)) { // Initialise the parameters mem, g and g2. mem.ones(rows, cols); g.zeros(rows, cols); g2.zeros(rows, cols); + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -94,30 +100,30 @@ class SMORMS3Update g2 = (1 - r) % g2; g2 += r % (gradient % gradient); - MatType x = (g % g) / (g2 + parent.epsilon); - - x.transform( [stepSize](typename MatType::elem_type &v) - { return std::min(v, (typename MatType::elem_type) stepSize); } ); + MatType x = clamp((g % g) / (g2 + epsilon), ElemType(0), + ElemType(stepSize)); - iterate -= gradient % x / (arma::sqrt(g2) + parent.epsilon); + iterate -= gradient % x / (sqrt(g2) + epsilon); mem %= (1 - x); mem += 1; } private: - // Instantiated parent object. + //! Instantiated parent object. SMORMS3Update& parent; - // Memory parameter. + //! Memory parameter. MatType mem; - // Gradient estimate parameter. + //! Gradient estimate parameter. GradType g; - // Squared gradient estimate parameter. + //! Squared gradient estimate parameter. GradType g2; + // Epsilon value converted to the element type of the optimization. + ElemType epsilon; }; private: - //! The value used to initialise the mean squared gradient parameter. + // The value used to avoid divisions by zero. 
double epsilon; }; diff --git a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp index fad6f00..164f3c8 100644 --- a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp +++ b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp @@ -135,7 +135,7 @@ class SPALeRASGD typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp index e0cac7d..8706806 100644 --- a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp +++ b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp @@ -58,8 +58,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SPALeRASGD::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -213,9 +213,12 @@ SPALeRASGD::Optimize( } // Reset the counter variables. - lastObjective = overallObjective; - overallObjective = 0; - currentFunction = 0; + if (i != actualMaxIterations) + { + lastObjective = overallObjective; + overallObjective = 0; + currentFunction = 0; + } terminate |= Callback::BeginEpoch(*this, f, iterate, epoch, overallObjective, callbacks...); diff --git a/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp b/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp index 7b8e7ac..ea39b1e 100644 --- a/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp +++ b/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp @@ -87,6 +87,8 @@ class SPALeRAStepsize class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. 
@@ -106,12 +108,19 @@ class SPALeRAStepsize mn(0), relaxedObjective(0), phCounter(0), - eveCounter(0) + eveCounter(0), + alpha(ElemType(parent.alpha)), + epsilon(ElemType(parent.epsilon)), + adaptRate(ElemType(parent.adaptRate)) { learningRates.ones(rows, cols); relaxedSums.zeros(rows, cols); parent.lambda = lambda; + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -127,7 +136,7 @@ class SPALeRAStepsize * @return Stop or continue the learning process. */ bool Update(const double stepSize, - const typename MatType::elem_type objective, + const ElemType objective, const size_t batchSize, const size_t numFunctions, MatType& iterate, @@ -135,7 +144,7 @@ class SPALeRAStepsize { // The ratio of mini-batch size to training set size; needed for the // Page-Hinkley relaxed objective computations. - const double mbRatio = batchSize / (double) numFunctions; + const ElemType mbRatio = batchSize / (ElemType) numFunctions; // Page-Hinkley iteration, check if we have to reset the parameter and // adjust the step size. @@ -162,7 +171,7 @@ class SPALeRAStepsize mn = un; // If the condition is true we reset the parameter and update parameter. - if ((un - mn) > parent.lambda) + if ((un - mn) > ElemType(parent.lambda)) { // Backtracking, reset the parameter. iterate = previousIterate; @@ -172,7 +181,9 @@ class SPALeRAStepsize // Faster. learningRates /= 2; - if (arma::any(arma::vectorise(learningRates) <= 1e-15)) + constexpr const ElemType eps = + 10 * std::numeric_limits::epsilon(); + if (learningRates.min() <= eps) { // Stop because learning rate too low. 
return false; @@ -183,26 +194,26 @@ class SPALeRAStepsize } else { - const double paramMean = (parent.alpha / (2 - parent.alpha) * - (1 - std::pow(1 - parent.alpha, 2 * (eveCounter + 1)))) / + const ElemType paramMean = (alpha / (2 - alpha) * + (1 - std::pow(1 - alpha, ElemType(2 * (eveCounter + 1))))) / iterate.n_elem; - const double paramStd = (parent.alpha / std::sqrt(iterate.n_elem)) / - std::sqrt(iterate.n_elem); + const ElemType paramStd = + (alpha / std::sqrt(ElemType(iterate.n_elem))) / + std::sqrt(ElemType(iterate.n_elem)); - const typename MatType::elem_type normGradient = - std::sqrt(arma::accu(arma::pow(gradient, 2))); + const ElemType normGradient = std::sqrt(accu(square(gradient))); - relaxedSums *= (1 - parent.alpha); - if (normGradient > parent.epsilon) - relaxedSums += gradient * (parent.alpha / normGradient); + relaxedSums *= (1 - alpha); + if (normGradient > epsilon) + relaxedSums += gradient * (alpha / normGradient); - learningRates %= arma::exp((arma::pow(relaxedSums, 2) - paramMean) * - (parent.adaptRate / paramStd)); + learningRates %= exp((square(relaxedSums) - paramMean) * + (adaptRate / paramStd)); previousIterate = iterate; - iterate -= stepSize * (learningRates % gradient); + iterate -= ElemType(stepSize) * (learningRates % gradient); // Keep track of the the number of evaluations and Page-Hinkley steps. eveCounter++; @@ -216,25 +227,25 @@ class SPALeRAStepsize //! Instantiated parent object. SPALeRAStepsize& parent; - //! Page-Hinkley update parameter. - double mu0; + // Page-Hinkley update parameter. + ElemType mu0; - //! Page-Hinkley update parameter. - double un; + // Page-Hinkley update parameter. + ElemType un; - //! Page-Hinkley update parameter. - double mn; + // Page-Hinkley update parameter. + ElemType mn; - //! Page-Hinkley update parameter. - typename MatType::elem_type relaxedObjective; + // Page-Hinkley update parameter. + ElemType relaxedObjective; - //! Page-Hinkley step counter. + // Page-Hinkley step counter. 
size_t phCounter; - //! Evaluations step counter. + // Evaluations step counter. size_t eveCounter; - //! Locally-stored parameter wise learning rates. + // Locally-stored parameter wise learning rates. MatType learningRates; //! Locally-stored parameter wise sums. @@ -242,6 +253,11 @@ class SPALeRAStepsize //! Locally-stored previous parameter matrix (backtracking). MatType previousIterate; + + // Parameters converted to the element type of the optimization. + ElemType alpha; + ElemType epsilon; + ElemType adaptRate; }; private: diff --git a/inst/include/ensmallen_bits/spsa/spsa_impl.hpp b/inst/include/ensmallen_bits/spsa/spsa_impl.hpp index 4cb0102..9a0ebf5 100644 --- a/inst/include/ensmallen_bits/spsa/spsa_impl.hpp +++ b/inst/include/ensmallen_bits/spsa/spsa_impl.hpp @@ -45,7 +45,8 @@ typename MatType::elem_type SPSA::Optimize(ArbitraryFunctionType& function, { // Convenience typedefs. typedef typename MatType::elem_type ElemType; - typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::bmat BaseMatType; + typedef typename ForwardType::distr_param DistrParam; // Make sure that we have the methods that we need. traits::CheckArbitraryFunctionTypeAPI(); BaseMatType gradient(iterate.n_rows, iterate.n_cols); - arma::Mat spVector(iterate.n_rows, iterate.n_cols); + BaseMatType spVector(iterate.n_rows, iterate.n_cols); // To keep track of where we are and how things are going. ElemType overallObjective = 0; @@ -90,21 +91,20 @@ typename MatType::elem_type SPSA::Optimize(ArbitraryFunctionType& function, lastObjective = overallObjective; // Gain sequences. - const double akLocal = stepSize / std::pow(k + 1 + ak, alpha); - const double ck = evaluationStepSize / std::pow(k + 1, gamma); + const ElemType akLocal = ElemType(stepSize / std::pow(k + 1 + ak, alpha)); + const ElemType ck = ElemType(evaluationStepSize / std::pow(k + 1, gamma)); // Choose stochastic directions. 
- spVector = arma::conv_to>::from( - arma::randi(iterate.n_rows, iterate.n_cols, - arma::distr_param(0, 1))) * 2 - 1; + spVector = randi( + iterate.n_rows, iterate.n_cols, DistrParam(0, 1)) * 2 - 1; iterate += ck * spVector; - const double fPlus = function.Evaluate(iterate); + const ElemType fPlus = function.Evaluate(iterate); terminate |= Callback::Evaluate(*this, function, iterate, fPlus, callbacks...); iterate -= 2 * ck * spVector; - const double fMinus = function.Evaluate(iterate); + const ElemType fMinus = function.Evaluate(iterate); terminate |= Callback::Evaluate(*this, function, iterate, fMinus, callbacks...); diff --git a/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp b/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp index 49c437e..0937741 100644 --- a/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp +++ b/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp @@ -70,11 +70,20 @@ class BarzilaiBorweinDecay class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. */ - Policy(BarzilaiBorweinDecay& parent) : parent(parent) { /* Do nothing. */ } + Policy(BarzilaiBorweinDecay& parent) : + parent(parent), + epsilon(ElemType(parent.epsilon)) + { + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); + } /** * Barzilai-Borwein update step for SVRG. @@ -96,9 +105,9 @@ class BarzilaiBorweinDecay if (!fullGradient0.is_empty()) { // Step size selection based on Barzilai-Borwein (BB). 
- stepSize = std::pow(arma::norm(iterate - iterate0), 2.0) / - (arma::dot(iterate - iterate0, fullGradient - fullGradient0) + - parent.epsilon) / (double) numBatches; + stepSize = std::pow(norm(iterate - iterate0), ElemType(2)) / + (dot(iterate - iterate0, fullGradient - fullGradient0) + + epsilon) / (ElemType) numBatches; stepSize = std::min(stepSize, parent.maxStepSize); } @@ -107,11 +116,14 @@ class BarzilaiBorweinDecay } private: - //! Reference to instantiated parent object. + // Reference to instantiated parent object. BarzilaiBorweinDecay& parent; - //! Locally-stored full gradient. + // Locally-stored full gradient. GradType fullGradient0; + + // Copy of epsilon parameter casted to the optimization element type. + ElemType epsilon; }; //! The value used for numerical stability. diff --git a/inst/include/ensmallen_bits/svrg/svrg.hpp b/inst/include/ensmallen_bits/svrg/svrg.hpp index 0c121c7..d451fc2 100644 --- a/inst/include/ensmallen_bits/svrg/svrg.hpp +++ b/inst/include/ensmallen_bits/svrg/svrg.hpp @@ -143,7 +143,7 @@ class SVRGType typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/svrg/svrg_impl.hpp b/inst/include/ensmallen_bits/svrg/svrg_impl.hpp index 032bfd4..65685ff 100644 --- a/inst/include/ensmallen_bits/svrg/svrg_impl.hpp +++ b/inst/include/ensmallen_bits/svrg/svrg_impl.hpp @@ -55,8 +55,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SVRGType::Optimize( SeparableFunctionType& functionIn, MatType& iterateIn, @@ -189,7 +189,7 @@ SVRGType::Optimize( f += effectiveBatchSize; } - fullGradient /= (double) numFunctions; + fullGradient /= (ElemType) numFunctions; if (terminate) break; diff --git a/inst/include/ensmallen_bits/svrg/svrg_update.hpp b/inst/include/ensmallen_bits/svrg/svrg_update.hpp index 09dacbe..7da343d 100644 --- a/inst/include/ensmallen_bits/svrg/svrg_update.hpp +++ b/inst/include/ensmallen_bits/svrg/svrg_update.hpp @@ -62,8 +62,8 @@ class SVRGUpdate const double stepSize) { // Perform the vanilla SVRG update. - iterate -= stepSize * (fullGradient + (gradient - gradient0) / - (double) batchSize); + iterate -= typename MatType::elem_type(stepSize) * + (fullGradient + (gradient - gradient0) / batchSize); } }; }; diff --git a/inst/include/ensmallen_bits/swats/swats.hpp b/inst/include/ensmallen_bits/swats/swats.hpp index 1230d19..0fe5ba9 100644 --- a/inst/include/ensmallen_bits/swats/swats.hpp +++ b/inst/include/ensmallen_bits/swats/swats.hpp @@ -98,7 +98,7 @@ class SWATS typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/swats/swats_update.hpp b/inst/include/ensmallen_bits/swats/swats_update.hpp index 0a80a77..dbc3672 100644 --- a/inst/include/ensmallen_bits/swats/swats_update.hpp +++ b/inst/include/ensmallen_bits/swats/swats_update.hpp @@ -96,6 +96,8 @@ class SWATSUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -106,12 +108,21 @@ class SWATSUpdate */ Policy(SWATSUpdate& parent, const size_t rows, const size_t cols) : parent(parent), - iteration(0) + iteration(0), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), + sgdRate(ElemType(parent.sgdRate)), + sgdLambda(ElemType(parent.sgdLambda)) { m.zeros(rows, cols); v.zeros(rows, cols); sgdV.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -132,35 +143,37 @@ class SWATSUpdate { // Note we reuse the exponential moving average parameter here instead // of introducing a new parameter (sgdV) as done in the paper. 
- v *= parent.beta1; + v *= beta1; v += gradient; - iterate -= (1 - parent.beta1) * parent.sgdRate * v; + iterate -= (1 - beta1) * sgdRate * v; return; } - m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); - GradType delta = stepSize * m / biasCorrection1 / - (arma::sqrt(v / biasCorrection2) + parent.epsilon); + GradType delta = ElemType(stepSize) * m / biasCorrection1 / + (sqrt(v / biasCorrection2) + epsilon); iterate -= delta; - const double deltaGradient = arma::dot(delta, gradient); - if (deltaGradient != 0) + const ElemType deltaGradient = dot(delta, gradient); + if (deltaGradient != ElemType(0)) { - const double rate = arma::dot(delta, delta) / deltaGradient; - parent.sgdLambda = parent.beta2 * parent.sgdLambda + - (1 - parent.beta2) * rate; - parent.sgdRate = parent.sgdLambda / biasCorrection2; + const ElemType rate = dot(delta, delta) / deltaGradient; + sgdLambda = beta2 * sgdLambda + (1 - beta2) * rate; + sgdRate = sgdLambda / biasCorrection2; - if (std::abs(parent.sgdRate - rate) < parent.epsilon && iteration > 1) + parent.sgdLambda = (double) sgdLambda; + parent.sgdRate = (double) sgdRate; + + if (std::abs(sgdRate - rate) < epsilon && iteration > 1) { parent.phaseSGD = true; v.zeros(); @@ -169,39 +182,46 @@ class SWATSUpdate } private: - //! Reference to instantiated parent object. + // Reference to instantiated parent object. SWATSUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; - //! 
The exponential moving average of squared gradient values (Adam). + // The exponential moving average of squared gradient values (Adam). GradType v; - //! The exponential moving average of squared gradient values (SGD). + // The exponential moving average of squared gradient values (SGD). GradType sgdV; - //! The number of iterations. + // The number of iterations. size_t iteration; + + // Parameters casted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + ElemType sgdRate; + ElemType sgdLambda; }; private: - //! The epsilon value used to initialise the squared gradient parameter. + // The epsilon value used to initialise the squared gradient parameter. double epsilon; - //! The smoothing parameter. + // The smoothing parameter. double beta1; - //! The second moment coefficient. + // The second moment coefficient. double beta2; - //! Wether to use the SGD or Adam update rule. + // Whether to use the SGD or Adam update rule. bool phaseSGD; - //! SGD scaling parameter. + // SGD scaling parameter. double sgdRate; - //! SGD learning rate. + // SGD learning rate. double sgdLambda; }; diff --git a/inst/include/ensmallen_bits/utility/detect_callbacks.hpp b/inst/include/ensmallen_bits/utility/detect_callbacks.hpp new file mode 100644 index 0000000..1d87562 --- /dev/null +++ b/inst/include/ensmallen_bits/utility/detect_callbacks.hpp @@ -0,0 +1,41 @@ +/** + * @file ensmallen_bits/utility/detect_callbacks.hpp + * @author Ryan Curtin + * + * This provides the IsAllNonMatrix utility struct, meant to be used with SFINAE + * to ensure that template arguments are only non-Armadillo classes. (This does + * not actually check that callback functions are implemented!) + * + * mlpack is free software; you may redistribute it and/or modify it under the + * terms of the 3-clause BSD license. You should have received a copy of the + * 3-clause BSD license along with mlpack. 
If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENS_CORE_UTIL_DETECT_CALLBACKS_HPP +#define ENS_CORE_UTIL_DETECT_CALLBACKS_HPP + +namespace ens { + +template +struct IsAllNonMatrix; + +template +struct IsAllNonMatrix +{ + constexpr static bool tIsClass = std::is_class::type>::type>::value; + + constexpr static bool value = + tIsClass && !IsMatrixType::value && + IsAllNonMatrix::value; +}; + +template<> +struct IsAllNonMatrix<> +{ + constexpr static bool value = true; +}; + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/utility/arma_traits.hpp b/inst/include/ensmallen_bits/utility/function_traits.hpp similarity index 58% rename from inst/include/ensmallen_bits/utility/arma_traits.hpp rename to inst/include/ensmallen_bits/utility/function_traits.hpp index 0555ea5..4c94d23 100644 --- a/inst/include/ensmallen_bits/utility/arma_traits.hpp +++ b/inst/include/ensmallen_bits/utility/function_traits.hpp @@ -18,6 +18,35 @@ namespace ens { // Structs have public members by default (that's why they are chosen over // classes). +template struct IsArmaType; +template struct IsCootType; + +/** + * If value == true, then MatType is a matrix type matching the Armadillo API + * that is supported by ensmallen. + */ +template +struct IsMatrixType +{ + const static bool value = IsArmaType::value || + IsCootType::value; +}; + +/** + * If value == true, then MatType is an Armadillo sparse matrix. + */ +template +struct IsSparseMatrixType +{ + const static bool value = false; +}; + +template +struct IsSparseMatrixType> +{ + const static bool value = true; +}; + /** * If value == true, then MatType is some sort of Armadillo vector or subview. * You might use this struct like this: @@ -40,58 +69,48 @@ struct IsArmaType const static bool value = false; }; -// Commenting out the first template per case, because -// Visual Studio doesn't like this instantiaion pattern (error C2910). 
-// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { @@ -110,34 +129,104 @@ struct IsArmaType > const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; +/** + * If value == true, then MatType is some sort of Bandicoot vector or subview. + * You might use this struct like this: + * + * @code + * // Only accepts VecTypes that are actually Bandicoot vector types. + * template + * void Function(const MatType& argumentA, + * typename std::enable_if_t::value>* = 0); + * @endcode + * + * The use of the enable_if_t object allows the compiler to instantiate + * Function() only if VecType is one of the Bandicoot vector types. It has a + * default argument because it isn't meant to be used in either the function + * call or the function body. 
+ */ +template +struct IsCootType +{ + const static bool value = false; +}; + +#ifdef ENS_HAVE_COOT + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +#endif + template struct tuple_element; diff --git a/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp b/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp index d33236d..e627689 100644 --- a/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp +++ b/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp @@ -20,11 +20,11 @@ namespace ens { /** * The epsilon indicator is one of the binary quality indicators that was proposed by - * Zitzler et. al.. The indicator originally calculates a weak dominance relation - * between two approximation sets. It returns "epsilon" which is the factor by which - * the given approximation set is worse than the reference front with respect to + * Zitzler et. al.. The indicator originally calculates a weak dominance relation + * between two approximation sets. It returns "epsilon" which is the factor by which + * the given approximation set is worse than the reference front with respect to * all the objectives. - * + * * \f[ I_{\epsilon}(A,B) = \max_{z^2 \in B} \ * \min_{z^1 \in A} \ * \max_{1 \leq i \leq n} \ \frac{z^1_i}{z^2_i}\ @@ -43,49 +43,50 @@ namespace ens { * } * @endcode */ - class Epsilon - { - public: - /** - * Default constructor does nothing, but is required to satisfy the Indicator - * policy. 
- */ - Epsilon() { } +class Epsilon +{ + public: + /** + * Default constructor does nothing, but is required to satisfy the Indicator + * policy. + */ + Epsilon() { } - /** - * Find the epsilon value of the front with respect to the given reference - * front. - * - * @tparam CubeType The cube data type of front. - * @param front The given approximation front. - * @param referenceFront The given reference front. - * @return The epsilon value of the front. - */ - template<typename CubeType> - static typename CubeType::elem_type Evaluate(const CubeType& front, - const CubeType& referenceFront) + /** + * Find the epsilon value of the front with respect to the given reference + * front. + * + * @tparam CubeType The cube data type of front. + * @param front The given approximation front. + * @param referenceFront The given reference front. + * @return The epsilon value of the front. + */ + template<typename CubeType> + static typename CubeType::elem_type Evaluate(const CubeType& front, + const CubeType& referenceFront) + { + // Convenience typedefs. + typedef typename CubeType::elem_type ElemType; + ElemType eps = 0; + for (size_t i = 0; i < referenceFront.n_slices; i++) { - // Convenience typedefs. - typedef typename CubeType::elem_type ElemType; - ElemType eps = 0; - for (size_t i = 0; i < referenceFront.n_slices; i++) + ElemType epsjMin = std::numeric_limits<ElemType>::max(); + for (size_t j = 0; j < front.n_slices; j++) { - ElemType epsjMin = std::numeric_limits<ElemType>::max(); - for (size_t j = 0; j < front.n_slices; j++) - { - arma::Mat<ElemType> frontRatio = front.slice(j) / referenceFront.slice(i); - frontRatio.replace(arma::datum::inf, -1.); // Handle zero division case. - ElemType epsj = frontRatio.max(); - if (epsj < epsjMin) - epsjMin = epsj; - } - if (epsjMin > eps) - eps = epsjMin; + arma::Mat<ElemType> frontRatio = front.slice(j) / + referenceFront.slice(i); + frontRatio.replace(arma::datum::inf, -1.); // Handle zero division case.
+ ElemType epsj = frontRatio.max(); + if (epsj < epsjMin) + epsjMin = epsj; } - - return eps; + if (epsjMin > eps) + eps = epsjMin; } - }; + + return eps; + } +}; } // namespace ens diff --git a/inst/include/ensmallen_bits/utility/indicators/igd.hpp b/inst/include/ensmallen_bits/utility/indicators/igd.hpp index 3adf1d7..3c309b2 100644 --- a/inst/include/ensmallen_bits/utility/indicators/igd.hpp +++ b/inst/include/ensmallen_bits/utility/indicators/igd.hpp @@ -18,8 +18,8 @@ namespace ens { /** * The inverted generational distance( IGD) is a metric for assessing the quality * of approximations to the Pareto front obtained by multi-objective optimization - * algorithms.The IGD indicator returns the average distance from each point in - * the reference front to the nearest point to it's solution. + * algorithms. The IGD indicator returns the average distance from each point in + * the reference front to the nearest point to its solution. * * \f[ d(z,a) = \sqrt{\sum_{i = 1}^{n}(a_i - z_i)^2 \ } \ * \f] @@ -28,67 +28,73 @@ namespace ens { * * @code * @inproceedings{coello2004study, - * title={A study of the parallelization of a coevolutionary multi-objective evolutionary algorithm}, - * author={Coello Coello, Carlos A and Reyes Sierra, Margarita}, - * booktitle={MICAI 2004: Advances in Artificial Intelligence: Third Mexican International Conference on Artificial Intelligence, Mexico City, Mexico, April 26-30, 2004. Proceedings 3}, - * pages={688--697}, - * year={2004}, - * organization={Springer} + * title = {A study of the parallelization of a coevolutionary + * multi-objective evolutionary algorithm}, + * author = {Coello Coello, Carlos A and Reyes Sierra, Margarita}, + * booktitle = {MICAI 2004: Advances in Artificial Intelligence: Third + * Mexican International Conference on Artificial + * Intelligence, Mexico City, Mexico, April 26-30, + * 2004.
Proceedings 3}, + * pages = {688--697}, + * year = {2004}, + * organization = {Springer} * } * @endcode */ - class IGD +class IGD +{ + public: + /** + * Default constructor does nothing, but is required to satisfy the Indicator + * policy. + */ + IGD() { /* Nothing to do here. */ } + + /** + * Find the IGD value of the front with respect to the given reference + * front. + * + * @tparam CubeType The cube data type of front. + * @param front The given approximation front. + * @param referenceFront The given reference front. + * @param p The power constant in the distance formula. + * @return The IGD value of the front. + */ + template + static typename CubeType::elem_type Evaluate(const CubeType& front, + const CubeType& referenceFront, + double p) { - public: - /** - * Default constructor does nothing, but is required to satisfy the Indicator - * policy. - */ - IGD() { } + // Convenience typedefs. + typedef typename CubeType::elem_type ElemType; - /** - * Find the IGD value of the front with respect to the given reference - * front. - * - * @tparam CubeType The cube data type of front. - * @param front The given approximation front. - * @param referenceFront The given reference front. - * @param p The power constant in the distance formula. - * @return The IGD value of the front. - */ - template - static typename CubeType::elem_type Evaluate(const CubeType& front, - const CubeType& referenceFront, - double p) + ElemType igd = 0; + for (size_t i = 0; i < referenceFront.n_slices; i++) { - // Convenience typedefs. 
- typedef typename CubeType::elem_type ElemType; - ElemType igd = 0; - for (size_t i = 0; i < referenceFront.n_slices; i++) + ElemType min = std::numeric_limits<ElemType>::max(); + for (size_t j = 0; j < front.n_slices; j++) { - ElemType min = std::numeric_limits<ElemType>::max(); - for (size_t j = 0; j < front.n_slices; j++) + ElemType dist = 0; + for (size_t k = 0; k < front.slice(j).n_rows; k++) { - ElemType dist = 0; - for (size_t k = 0; k < front.slice(j).n_rows; k++) - { - ElemType z = referenceFront(k, 0, i); - ElemType a = front(k, 0, j); - // Assuming minimization of all objectives. - //! IGD does not clip negative differences to 0 - dist += std::pow(a - z, 2); - } - dist = std::sqrt(dist); - if (dist < min) - min = dist; + ElemType z = referenceFront(k, 0, i); + ElemType a = front(k, 0, j); + // Assuming minimization of all objectives. + // IGD does not clip negative differences to 0. + dist += std::pow(a - z, 2); } - igd += std::pow(min,p); + dist = std::sqrt(dist); + if (dist < min) + min = dist; } - igd /= referenceFront.n_slices; - igd = std::pow(igd, 1.0 / p); - return igd; + igd += std::pow(min, p); } - }; + igd /= referenceFront.n_slices; + igd = std::pow(igd, 1.0 / p); + + return igd; + } +}; } // namespace ens diff --git a/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp b/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp index d8f674e..9ec887d 100644 --- a/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp +++ b/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp @@ -39,55 +39,56 @@ namespace ens { * } * @endcode */ - class IGDPlus +class IGDPlus +{ + public: + /** + * Default constructor does nothing, but is required to satisfy the Indicator + * policy. + */ + IGDPlus() { /* Nothing to do here. */ } + + /** + * Find the IGD+ value of the front with respect to the given reference + * front. + * + * @tparam CubeType The cube data type of front. + * @param front The given approximation front.
+ * @param referenceFront The given reference front. + * @return The IGD value of the front. + */ + template + static typename CubeType::elem_type Evaluate(const CubeType& front, + const CubeType& referenceFront) { - public: - /** - * Default constructor does nothing, but is required to satisfy the Indicator - * policy. - */ - IGDPlus() { } + // Convenience typedefs. + typedef typename CubeType::elem_type ElemType; - /** - * Find the IGD+ value of the front with respect to the given reference - * front. - * - * @tparam CubeType The cube data type of front. - * @param front The given approximation front. - * @param referenceFront The given reference front. - * @return The IGD value of the front. - */ - template - static typename CubeType::elem_type Evaluate(const CubeType& front, - const CubeType& referenceFront) + ElemType igd = 0; + for (size_t i = 0; i < referenceFront.n_slices; i++) { - // Convenience typedefs. - typedef typename CubeType::elem_type ElemType; - ElemType igd = 0; - for (size_t i = 0; i < referenceFront.n_slices; i++) + ElemType min = std::numeric_limits::max(); + for (size_t j = 0; j < front.n_slices; j++) { - ElemType min = std::numeric_limits::max(); - for (size_t j = 0; j < front.n_slices; j++) + ElemType dist = 0; + for (size_t k = 0; k < front.slice(j).n_rows; k++) { - ElemType dist = 0; - for (size_t k = 0; k < front.slice(j).n_rows; k++) - { - ElemType z = referenceFront(k, 0, i); - ElemType a = front(k, 0, j); - // Assuming minimization of all objectives. - dist += std::pow(std::max(a - z, 0), 2); - } - dist = std::sqrt(dist); - if (dist < min) - min = dist; + ElemType z = referenceFront(k, 0, i); + ElemType a = front(k, 0, j); + // Assuming minimization of all objectives. 
+ dist += std::pow(std::max<ElemType>(a - z, 0), 2); } - igd += min; + dist = std::sqrt(dist); + if (dist < min) + min = dist; } - igd /= referenceFront.n_slices; - - return igd; + igd += min; } - }; + igd /= referenceFront.n_slices; + + return igd; + } +}; } // namespace ens diff --git a/inst/include/ensmallen_bits/utility/proxies.hpp b/inst/include/ensmallen_bits/utility/proxies.hpp new file mode 100644 index 0000000..418f3a6 --- /dev/null +++ b/inst/include/ensmallen_bits/utility/proxies.hpp @@ -0,0 +1,164 @@ +/** + * @file proxies.hpp + * @author Marcus Edel + * + * Simple proxies that, based on the data type, forward to `coot` or `arma`. + * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_UTILITY_PROXIES_HPP +#define ENSMALLEN_UTILITY_PROXIES_HPP + +#include "function_traits.hpp" + +namespace ens { + +template<typename ElemType, bool UseCoot> +struct ForwardTypeHelper; + +/** + * Helper struct that, based on the data type `MatType`, forwards to the + * corresponding `coot` or `arma` types. For example: + * If `MatType` is an `arma::mat`, then `ForwardType<MatType>::bmat` + * will be an `arma::Mat<double>`. + * If `MatType` is a `coot::mat`, then `ForwardType<MatType>::bmat` + * will be a `coot::Mat<double>`. + * + * This allows for writing generic code that can work with both `coot` and + * `arma` types without needing to know which library is being used at compile + * time. + */ +template<typename MatType> +struct ForwardType : public ForwardTypeHelper<typename MatType::elem_type, IsCootType<MatType>::value> { }; + +// Internal helper class that sets the typedefs to Armadillo types if Bandicoot +// is not available or not in use. +template<typename ElemType> +struct ForwardTypeHelper<ElemType, false> +{ + // `uword` is a typedef for an unsigned integer type; it is used for matrix + // indices as well as all internal counters and loops.
+ typedef arma::uword uword; + + // `vec` is a typedef for column vectors (dense matrices with one column). + typedef arma::vec vec; + + // `bvec` (base vector) is a typedef for a vector type, in comparison to + // `vec`, `bvec` uses the given element type `ElemType`. + typedef arma::Col<ElemType> bvec; + + // `bcol` (base col) is a typedef for a column vector type, in comparison to + // `col`, `bcol` uses the given element type `ElemType`. + typedef arma::Col<ElemType> bcol; + + // `brow` (base row) is a typedef for a row vector type, in comparison to + // `row`, `brow` uses the given element type `ElemType`. + typedef arma::Row<ElemType> brow; + + // `mat` is a typedef for dense matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef arma::mat mat; + + // `bmat` (base matrix) is a typedef for a matrix type, in comparison to + // `mat`, `bmat` uses the given element type `ElemType`. + typedef arma::Mat<ElemType> bmat; + + // `cube` is a typedef for 3D matrices (cubes), with elements stored in + // column-major ordering (ie. column by column, then page by page). + typedef arma::cube cube; + + // `bcube` (base cube) is a typedef for a cube type, in comparison to `cube`, + // `bcube` uses the given element type `ElemType`. + typedef arma::Cube<ElemType> bcube; + + // `umat` is a typedef for unsigned integer matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef arma::umat umat; + + // `uvec` is a typedef for unsigned integer vectors (dense matrices with one + // column). + typedef arma::uvec uvec; + + // `ucolvec` is a typedef for unsigned integer column vectors (dense matrices + // with one column). + typedef arma::ucolvec ucolvec; + + // `urowvec` is a typedef for unsigned integer row vectors (dense matrices + // with one row). + typedef arma::urowvec urowvec; + + // `distr_param` is a typedef for the distribution parameters used in + // random number generation.
+ typedef arma::distr_param distr_param; +}; + +// Internal helper class that sets the typedefs to Bandicoot types if Bandicoot +// is available and in use. +#ifdef ENS_HAVE_COOT +template<typename ElemType> +struct ForwardTypeHelper<ElemType, true> +{ + // `uword` is a typedef for an unsigned integer type; it is used for matrix + // indices as well as all internal counters and loops. + typedef coot::uword uword; + + // `vec` is a typedef for column vectors (dense matrices with one column). + typedef coot::vec vec; + + // `bvec` (base vector) is a typedef for a vector type, in comparison to + // `vec`, `bvec` uses the given element type `ElemType`. + typedef coot::Col<ElemType> bvec; + + // `bcol` (base col) is a typedef for a column vector type, in comparison to + // `col`, `bcol` uses the given element type `ElemType`. + typedef coot::Col<ElemType> bcol; + + // `brow` (base row) is a typedef for a row vector type, in comparison to + // `row`, `brow` uses the given element type `ElemType`. + typedef coot::Row<ElemType> brow; + + // `mat` is a typedef for dense matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef coot::mat mat; + + // `bmat` (base matrix) is a typedef for a matrix type, in comparison to + // `mat`, `bmat` uses the given element type `ElemType`. + typedef coot::Mat<ElemType> bmat; + + // `cube` is a typedef for 3D matrices (cubes), with elements stored in + // column-major ordering (ie. column by column, then page by page). + typedef coot::cube cube; + + // `bcube` (base cube) is a typedef for a cube type, in comparison to `cube`, + // `bcube` uses the given element type `ElemType`. + typedef coot::Cube<ElemType> bcube; + + // `umat` is a typedef for unsigned integer matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef coot::umat umat; + + // `uvec` is a typedef for unsigned integer vectors (dense matrices with one + // column).
+ typedef coot::uvec uvec; + + // `ucolvec` is a typedef for unsigned integer column vectors (dense matrices + // with one column). + typedef coot::ucolvec ucolvec; + + // `urowvec` is a typedef for unsigned integer row vectors (dense matrices + // with one row). + typedef coot::urowvec urowvec; + + // `distr_param` is a typedef for the distribution parameters used in + // random number generation. + typedef coot::distr_param distr_param; +}; +#endif + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/utility/using.hpp b/inst/include/ensmallen_bits/utility/using.hpp new file mode 100644 index 0000000..6f11f20 --- /dev/null +++ b/inst/include/ensmallen_bits/utility/using.hpp @@ -0,0 +1,174 @@ +/** + * @file ensmallen_bits/utility/using.hpp + * @author Omar Shrit + * @author Ryan Curtin + * @author Conrad Sanderson + * + * This is a set of `using` statements to mitigate any possible risks or + * conflicts with local functions. The compiler is supposed to prioritise the + * following functions during name lookup. This is to be considered a + * replacement for the ADL solution that we had deployed earlier. + * + * ensmallen is free software; you may redistribute it and/or modify it under the + * terms of the 3-clause BSD license. You should have received a copy of the + * 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */ +#ifndef ENS_CORE_UTIL_USING_HPP +#define ENS_CORE_UTIL_USING_HPP + +#include "function_traits.hpp" + +namespace ens { + +#ifdef ENS_HAVE_COOT + +/* using for bandicoot namespace*/ +using coot::abs; +using coot::accu; +using coot::chol; +using coot::clamp; +using coot::conv_to; +using coot::cos; +using coot::dot; +using coot::exp; +using coot::join_cols; +using coot::join_rows; +using coot::linspace; +using coot::log; +using coot::max; +using coot::mean; +using coot::min; +using coot::norm; +using coot::normalise; +using coot::ones; +using coot::pow; +using coot::randi; +using coot::randn; +using coot::randu; +using coot::regspace; +using coot::repmat; +using coot::shuffle; +using coot::sign; +using coot::size; +using coot::sort; +using coot::sort_index; +using coot::sqrt; +using coot::square; +using coot::sum; +using coot::trans; +using coot::vectorise; +using coot::zeros; + +#endif + +/* using for armadillo namespace */ +using arma::abs; +using arma::accu; +using arma::chol; +using arma::clamp; + +// If Bandicoot is used, using arma::conv_to is already +// part of including bandicoot. 
+#ifndef ENS_HAVE_COOT +using arma::conv_to; +#endif + +using arma::cos; +using arma::dot; +using arma::exp; +using arma::join_cols; +using arma::join_rows; +using arma::linspace; +using arma::log; +using arma::max; +using arma::mean; +using arma::min; +using arma::norm; +using arma::normalise; +using arma::ones; +using arma::pow; +using arma::randi; +using arma::randn; +using arma::randu; +using arma::regspace; +using arma::repmat; +using arma::shuffle; +using arma::sign; +using arma::size; +using arma::sort; +using arma::sort_index; +using arma::sqrt; +using arma::square; +using arma::sum; +using arma::trans; +using arma::vectorise; +using arma::zeros; + +template +struct GetFillTypeInternal +{ + // Default empty implementation +}; + +template +struct GetFillType : public GetFillTypeInternal::value, IsCootType::value> { }; + +// By default, assume that we are using an Armadillo object. +template +struct GetFillTypeInternal +{ + static constexpr const decltype(arma::fill::none)& none = arma::fill::none; + static constexpr const decltype(arma::fill::zeros)& zeros = arma::fill::zeros; + static constexpr const decltype(arma::fill::ones)& ones = arma::fill::ones; + static constexpr const decltype(arma::fill::randu)& randu = arma::fill::randu; + static constexpr const decltype(arma::fill::randn)& randn = arma::fill::randn; + static constexpr const decltype(arma::fill::eye)& eye = arma::fill::eye; +}; + +template +struct GetProxyTypeInternal +{ + // Default empty implementation +}; + +template +struct GetProxyType : public GetProxyTypeInternal::value, IsCootType::value> { }; + +// By default, assume that we are using an Armadillo object. +template +struct GetProxyTypeInternal +{ + using span = arma::span; + static constexpr const decltype(arma::span::all)& all = arma::span::all; +}; + +#ifdef ENS_HAVE_COOT +// If the matrix type is a Bandicoot type, use Bandicoot fill objects instead. 
+template< + typename MatType> +struct GetFillTypeInternal +{ + static constexpr const decltype(coot::fill::none)& none = coot::fill::none; + static constexpr const decltype(coot::fill::zeros)& zeros = coot::fill::zeros; + static constexpr const decltype(coot::fill::ones)& ones = coot::fill::ones; + static constexpr const decltype(coot::fill::randu)& randu = coot::fill::randu; + static constexpr const decltype(coot::fill::randn)& randn = coot::fill::randn; + static constexpr const decltype(coot::fill::eye)& eye = coot::fill::eye; +}; + +// If the matrix type is a Bandicoot type, use Bandicoot types instead. +template +struct GetProxyTypeInternal +{ + using span = coot::span; + static constexpr const decltype(coot::span::all)& all = coot::span::all; +}; + +#endif + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp b/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp index 9b31a3f..89f2aa9 100644 --- a/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp +++ b/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp @@ -87,7 +87,7 @@ class WNGrad typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp b/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp index 502058c..0306fe8 100644 --- a/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp +++ b/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp @@ -56,6 +56,8 @@ class WNGradUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -67,7 +69,8 @@ class WNGradUpdate Policy(WNGradUpdate& parent, const size_t /* rows */, const size_t /* cols */) : - parent(parent) + parent(parent), + b(ElemType(parent.b)) { /* Nothing to do. 
*/ } @@ -83,18 +86,21 @@ class WNGradUpdate const double stepSize, const GradType& gradient) { - parent.b += std::pow(stepSize, 2.0) / parent.b * - std::pow(arma::norm(gradient), 2); - iterate -= stepSize * gradient / parent.b; + b += std::pow(ElemType(stepSize), ElemType(2)) / b * + std::pow(norm(gradient), ElemType(2)); + parent.b = (double) b; + iterate -= ElemType(stepSize) * gradient / b; } private: - //! Reference to the instantiated parent object. + // Reference to the instantiated parent object. WNGradUpdate& parent; + // Learning rate adjustment using the element type of the optimization. + ElemType b; }; private: - //! Learning rate adjustment. + // Learning rate adjustment. double b; }; diff --git a/inst/include/ensmallen_bits/yogi/yogi.hpp b/inst/include/ensmallen_bits/yogi/yogi.hpp index 4529d24..8f7fbc9 100644 --- a/inst/include/ensmallen_bits/yogi/yogi.hpp +++ b/inst/include/ensmallen_bits/yogi/yogi.hpp @@ -1,6 +1,6 @@ /** * @file yogi.hpp - * @author Marcus Edel + * @author Marcus Edel * * Class wrapper for the Yogi update Policy. Yogi is based on Adam with more * fine grained effective learning rate control. @@ -42,7 +42,7 @@ namespace ens { * see the documentation on function types included with this distribution or * on the ensmallen website. */ -class Yogi +class Yogi { public: /** @@ -100,15 +100,15 @@ class Yogi typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, CallbackTypes&&... callbacks) { - return optimizer.Optimize(function, iterate, - std::forward(callbacks)...); + return optimizer.template Optimize< + SeparableFunctionType, MatType, GradType, CallbackTypes...>( + function, iterate, std::forward(callbacks)...); } //! Forward the MatType as GradType. @@ -176,7 +176,7 @@ class Yogi //! are reset before Optimize call. 
bool& ResetPolicy() { return optimizer.ResetPolicy(); } - private: + private: //! The Stochastic Gradient Descent object with Yogi policy. SGD<YogiUpdate> optimizer; }; diff --git a/inst/include/ensmallen_bits/yogi/yogi_update.hpp b/inst/include/ensmallen_bits/yogi/yogi_update.hpp index cdba28d..1c3693f 100644 --- a/inst/include/ensmallen_bits/yogi/yogi_update.hpp +++ b/inst/include/ensmallen_bits/yogi/yogi_update.hpp @@ -45,8 +45,6 @@ class YogiUpdate * parameter. * @param beta1 The smoothing parameter. * @param beta2 The second moment coefficient. - * @param v1 The first quasi-hyperbolic term. - * @param v1 The second quasi-hyperbolic term. */ YogiUpdate(const double epsilon = 1e-8, const double beta1 = 0.9, @@ -83,6 +81,8 @@ class YogiUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -92,10 +92,17 @@ * @param cols Number of columns in the gradient matrix. */ Policy(YogiUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)) { m.zeros(rows, cols); v.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon == 0.0) + epsilon = 10 * std::numeric_limits<ElemType>::epsilon(); } /** @@ -109,25 +116,30 @@ const double stepSize, const GradType& gradient) { - m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - const MatType gSquared = arma::square(gradient); - v -= (1 - parent.beta2) * arma::sign(v - gSquared) % gSquared; + const MatType gSquared = square(gradient); + v -= (1 - beta2) * sign(v - gSquared) % gSquared; // Now update the iterate. - iterate -= stepSize * m / (arma::sqrt(v) + parent.epsilon); + iterate -= ElemType(stepSize) * m / (sqrt(v) + epsilon); } private: - //!
Instantiated parent object. + // Instantiated parent object. YogiUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; // The exponential moving average of squared gradient values. GradType v; + + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; }; private: diff --git a/tools/HISTORYold.md b/tools/HISTORYold.md index 5b33034..c9e759b 100644 --- a/tools/HISTORYold.md +++ b/tools/HISTORYold.md @@ -1,3 +1,79 @@ +### ensmallen 3.10.0: "Unexpected Rain" +###### 2025-09-25 + * SGD-like optimizers now all divide the step size by the batch size so that + step sizes don't need to be tuned in addition to batch sizes. If you require + behavior from ensmallen 2, define the `ENS_OLD_SEPARABLE_STEP_BEHAVIOR` macro + before including `ensmallen.hpp` + ([#431](https://github.com/mlpack/ensmallen/pull/431)). + + * Remove deprecated `ParetoFront()` and `ParetoSet()` from multi-objective + optimizers ([#435](https://github.com/mlpack/ensmallen/pull/435)). Instead, + pass objects to the `Optimize()` function; see the documentation for each + multi-objective optimizer for more details. A typical transition will change + code like: + + ```c++ + optimizer.Optimize(objectives, coordinates); + arma::cube paretoFront = optimizer.ParetoFront(); + arma::cube paretoSet = optimizer.ParetoSet(); + ``` + + to instead gather the Pareto front and set in the call: + + ```c++ + arma::cube paretoFront, paretoSet; + optimizer.Optimize(objectives, coordinates, paretoFront, paretoSet); + ``` + + * Remove deprecated constructor for Active CMA-ES that takes `lowerBound` and + `upperBound` ([#435](https://github.com/mlpack/ensmallen/pull/435)). + Instead, pass an instantiated `BoundaryBoxConstraint` to the constructor. 
A + typical transition will change code like: + + ```c++ + ActiveCMAES opt(lambda, + lowerBound, upperBound, ...); + ``` + + into + + ```c++ + ActiveCMAES opt(lambda, + BoundaryBoxConstraint(lowerBound, upperBound), ...); + ``` + * Add proximal gradient optimizers for L1-constrained and other related + problems: `FBS`, `FISTA`, and `FASTA` + ([#427](https://github.com/mlpack/ensmallen/pull/427)). See the + documentation for more details. + + * The `Lambda()` and `Sigma()` functions of the `AugLagrangian` optimizer, + which could be used to retrieve the Lagrange multipliers and penalty + parameter after optimization, are now deprecated + ([#439](https://github.com/mlpack/ensmallen/pull/439)). Instead, pass a + vector and a double to the `Optimize()` function directly: + + ```c++ + augLag.Optimize(function, coordinates, lambda, sigma) + ``` + + and these will be filled with the final Lagrange multiplier estimates and + penalty parameters. + +### ensmallen 2.22.2: "E-Bike Excitement" +###### 2025-04-30 + * Fix include statement in `tests/de_test.cpp` + ([#419](https://github.com/mlpack/ensmallen/pull/419)). + + * Fix `exactObjective` output for SGD-like optimizers when the number of + iterations is an even number of epochs + ([#417](https://github.com/mlpack/ensmallen/pull/417)). + + * Increase tolerance in `demon_sgd_test.cpp` + ([#420](https://github.com/mlpack/ensmallen/pull/420)). + + * Set cmake version range to 3.5...4.0 + ([#422](https://github.com/mlpack/ensmallen/pull/422)). + ### ensmallen 2.22.1: "E-Bike Excitement" ###### 2024-12-02 * Remove unused variables to fix compiler warnings