diff --git a/ChangeLog b/ChangeLog
index f1fb586..742666d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2025-09-30  James Balamuta
+
+        * DESCRIPTION (Version): Release 0.3.10.0.1
+        * NEWS.md: Update for Ensmallen release 3.10.0
+        * inst/include/ensmallen_bits: Upgraded to Ensmallen 3.10.0
+        * inst/include/ensmallen.hpp: ditto
+
 2025-09-09  James Balamuta
 
         * DESCRIPTION: Updated requirements for RcppArmadillo
diff --git a/DESCRIPTION b/DESCRIPTION
index e7430f2..b982fe6 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: RcppEnsmallen
 Title: Header-Only C++ Mathematical Optimization Library for 'Armadillo'
-Version: 0.2.22.1.2
+Version: 0.3.10.0.1
 Authors@R: c(
     person("James Joseph", "Balamuta", email = "balamut2@illinois.edu",
            role = c("aut", "cre", "cph"),
diff --git a/NEWS.md b/NEWS.md
index 8a44fdc..213fb90 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,64 @@
+# RcppEnsmallen 0.3.10.0.1
+
+- Upgraded to ensmallen 3.10.0: "Unexpected Rain" (2025-09-30)
+  - SGD-like optimizers now all divide the step size by the batch size so that
+    step sizes don't need to be tuned in addition to batch sizes. If you require
+    behavior from ensmallen 2, define the `ENS_OLD_SEPARABLE_STEP_BEHAVIOR` macro
+    before including `ensmallen.hpp`
+    ([#431](https://github.com/mlpack/ensmallen/pull/431)).
+  - Remove deprecated `ParetoFront()` and `ParetoSet()` from multi-objective
+    optimizers ([#435](https://github.com/mlpack/ensmallen/pull/435)). Instead,
+    pass objects to the `Optimize()` function; see the documentation for each
+    multi-objective optimizer for more details. A typical transition will change
+    code like:
+    ```c++
+    optimizer.Optimize(objectives, coordinates);
+    arma::cube paretoFront = optimizer.ParetoFront();
+    arma::cube paretoSet = optimizer.ParetoSet();
+    ```
+    to instead gather the Pareto front and set in the call:
+    ```c++
+    arma::cube paretoFront, paretoSet;
+    optimizer.Optimize(objectives, coordinates, paretoFront, paretoSet);
+    ```
+  - Remove deprecated constructor for Active CMA-ES that takes `lowerBound` and
+    `upperBound` ([#435](https://github.com/mlpack/ensmallen/pull/435)).
+    Instead, pass an instantiated `BoundaryBoxConstraint` to the constructor. A
+    typical transition will change code like:
+    ```c++
+    ActiveCMAES opt(lambda,
+        lowerBound, upperBound, ...);
+    ```
+    into
+    ```c++
+    ActiveCMAES opt(lambda,
+        BoundaryBoxConstraint(lowerBound, upperBound), ...);
+    ```
+  - Add proximal gradient optimizers for L1-constrained and other related
+    problems: `FBS`, `FISTA`, and `FASTA`
+    ([#427](https://github.com/mlpack/ensmallen/pull/427)). See the
+    documentation for more details.
+  - The `Lambda()` and `Sigma()` functions of the `AugLagrangian` optimizer,
+    which could be used to retrieve the Lagrange multipliers and penalty
+    parameter after optimization, are now deprecated
+    ([#439](https://github.com/mlpack/ensmallen/pull/439)). Instead, pass a
+    vector and a double to the `Optimize()` function directly:
+    ```c++
+    augLag.Optimize(function, coordinates, lambda, sigma)
+    ```
+    and these will be filled with the final Lagrange multiplier estimates and
+    penalty parameters.
+  - Fix include statement in `tests/de_test.cpp`
+    ([#419](https://github.com/mlpack/ensmallen/pull/419)).
+  - Fix `exactObjective` output for SGD-like optimizers when the number of
+    iterations is an even number of epochs
+    ([#417](https://github.com/mlpack/ensmallen/pull/417)).
+  - Increase tolerance in `demon_sgd_test.cpp`
+    ([#420](https://github.com/mlpack/ensmallen/pull/420)).
+  - Set cmake version range to 3.5...4.0
+    ([#422](https://github.com/mlpack/ensmallen/pull/422)).
+
 # RcppEnsmallen 0.2.22.1.2
 
 - `-DARMA_USE_CURRENT` added to `PKG_CXXFLAGS` to use Armadillo 15.0.2 or
   higher
diff --git a/inst/include/ensmallen.hpp b/inst/include/ensmallen.hpp
index b7b08c6..5b32ca3 100644
--- a/inst/include/ensmallen.hpp
+++ b/inst/include/ensmallen.hpp
@@ -34,7 +34,16 @@
 #include
 
-#if ((ARMA_VERSION_MAJOR < 10) || ((ARMA_VERSION_MAJOR == 10) && (ARMA_VERSION_MINOR < 8)))
+#if defined(COOT_VERSION_MAJOR) && \
+    ((COOT_VERSION_MAJOR >= 2) || \
+     (COOT_VERSION_MAJOR == 2 && COOT_VERSION_MINOR >= 1))
+  // The version of Bandicoot is new enough that we can use it.
+  #undef ENS_HAVE_COOT
+  #define ENS_HAVE_COOT
+#endif
+
+#if ((ARMA_VERSION_MAJOR < 10) || \
+     ((ARMA_VERSION_MAJOR == 10) && (ARMA_VERSION_MINOR < 8)))
 #error "need Armadillo version 10.8 or newer"
 #endif
@@ -69,7 +78,10 @@
 #include "ensmallen_bits/log.hpp" // TODO: should move to another place
 #include "ensmallen_bits/utility/any.hpp"
-#include "ensmallen_bits/utility/arma_traits.hpp"
+#include "ensmallen_bits/utility/proxies.hpp"
+#include "ensmallen_bits/utility/function_traits.hpp"
+#include "ensmallen_bits/utility/using.hpp"
+#include "ensmallen_bits/utility/detect_callbacks.hpp"
 #include "ensmallen_bits/utility/indicators/epsilon.hpp"
 #include "ensmallen_bits/utility/indicators/igd.hpp"
 #include "ensmallen_bits/utility/indicators/igd_plus.hpp"
@@ -109,8 +121,10 @@
 #include "ensmallen_bits/cne/cne.hpp"
 #include "ensmallen_bits/de/de.hpp"
 #include "ensmallen_bits/eve/eve.hpp"
+#include "ensmallen_bits/fasta/fasta.hpp"
+#include "ensmallen_bits/fbs/fbs.hpp"
+#include "ensmallen_bits/fista/fista.hpp"
 #include "ensmallen_bits/ftml/ftml.hpp"
-
 #include "ensmallen_bits/fw/frank_wolfe.hpp"
 #include "ensmallen_bits/gradient_descent/gradient_descent.hpp"
 #include "ensmallen_bits/grid_search/grid_search.hpp"
diff --git a/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp b/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
index 1a4b13c..d346c54 100644
--- a/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
+++ b/inst/include/ensmallen_bits/ada_belief/ada_belief.hpp
@@ -97,7 +97,7 @@ class AdaBelief
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp b/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
index f768987..2cddb76 100644
--- a/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
+++ b/inst/include/ensmallen_bits/ada_belief/ada_belief_update.hpp
@@ -79,6 +79,8 @@ class AdaBeliefUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -89,10 +91,16 @@
      */
     Policy(AdaBeliefUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
+        epsilon(ElemType(parent.epsilon)),
         iteration(0)
     {
       m.zeros(rows, cols);
       s.zeros(rows, cols);
+      // Prevent underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -109,18 +117,18 @@
       // Increment the iteration counter variable.
       ++iteration;
 
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      s *= parent.beta2;
-      s += (1 - parent.beta2) * arma::pow(gradient - m, 2.0) + parent.epsilon;
+      s *= beta2;
+      s += (1 - beta2) * pow(gradient - m, 2) + epsilon;
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
       // And update the iterate.
-      iterate -= ((m / biasCorrection1) * stepSize) / (arma::sqrt(s /
-          biasCorrection2) + parent.epsilon);
+      iterate -= ((m / biasCorrection1) * ElemType(stepSize)) /
+          (sqrt(s / biasCorrection2) + epsilon);
     }
 
    private:
@@ -133,6 +141,11 @@
     // The exponential moving average of squared gradient values.
     GradType s;
 
+    // Parent parameters converted to the element type of the matrix.
+    ElemType beta1;
+    ElemType beta2;
+    ElemType epsilon;
+
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp b/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
index 94283c3..35bdc01 100644
--- a/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
+++ b/inst/include/ensmallen_bits/ada_bound/ada_bound.hpp
@@ -107,7 +107,7 @@ class AdaBoundType
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(DecomposableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp b/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
index 3a84d87..3221d10 100644
--- a/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
+++ b/inst/include/ensmallen_bits/ada_bound/ada_bound_update.hpp
@@ -96,6 +96,8 @@ class AdaBoundUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -105,10 +107,24 @@
      * @param cols Number of columns in the gradient matrix.
      */
     Policy(AdaBoundUpdate& parent, const size_t rows, const size_t cols) :
-        parent(parent), first(true), initialStepSize(0), iteration(0)
+        parent(parent),
+        finalLr(ElemType(parent.finalLr)),
+        gamma(ElemType(parent.gamma)),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
+        first(true),
+        initialStepSize(0),
+        iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
+
+      // Check for underflows in conversions.
+      if (gamma == ElemType(0) && parent.gamma != 0.0)
+        gamma = 10 * std::numeric_limits<ElemType>::epsilon();
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -129,30 +145,30 @@
       if (first)
       {
         first = false;
-        initialStepSize = stepSize;
+        initialStepSize = ElemType(stepSize);
       }
 
       // Increment the iteration counter variable.
       ++iteration;
 
       // Decay the first and second moment running average coefficient.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * (gradient % gradient);
 
-      const ElemType biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const ElemType biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
-      const ElemType fl = parent.finalLr * stepSize / initialStepSize;
-      const ElemType lower = fl * (1.0 - 1.0 / (parent.gamma * iteration + 1));
-      const ElemType upper = fl * (1.0 + 1.0 / (parent.gamma * iteration));
+      const ElemType fl = finalLr * ElemType(stepSize) / initialStepSize;
+      const ElemType lower = fl * (1 - 1 / (gamma * iteration + 1));
+      const ElemType upper = fl * (1 + 1 / (gamma * iteration));
 
-      // Applies bounds on actual learning rate.
-      iterate -= arma::clamp((stepSize *
-          std::sqrt(biasCorrection2) / biasCorrection1) / (arma::sqrt(v) +
-          parent.epsilon), lower, upper) % m;
+      // Applies bounds on actual learning rate.
+      iterate -= clamp((ElemType(stepSize) *
+          std::sqrt(biasCorrection2) / biasCorrection1) / (sqrt(v) + epsilon),
+          lower, upper) % m;
     }
 
    private:
@@ -165,11 +181,18 @@
     // The exponential moving average of squared gradient values.
     GradType v;
 
+    // Parameters of the parent, casted to the element type of the problem.
+    ElemType finalLr;
+    ElemType gamma;
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // Whether this is the first call of the Update method.
     bool first;
 
     // The initial (Adam) learning rate.
-    double initialStepSize;
+    ElemType initialStepSize;
 
     // The number of iterations.
     size_t iteration;
diff --git a/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp b/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp
index 270f8eb..26bad48 100644
--- a/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp
+++ b/inst/include/ensmallen_bits/ada_bound/ams_bound_update.hpp
@@ -96,6 +96,8 @@ class AMSBoundUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -105,11 +107,25 @@
      * @param cols Number of columns in the gradient matrix.
      */
     Policy(AMSBoundUpdate& parent, const size_t rows, const size_t cols) :
-        parent(parent), first(true), initialStepSize(0), iteration(0)
+        parent(parent),
+        finalLr(ElemType(parent.finalLr)),
+        gamma(ElemType(parent.gamma)),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
+        first(true),
+        initialStepSize(0),
+        iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
       vImproved.zeros(rows, cols);
+
+      // Check for underflows in conversions.
+      if (gamma == ElemType(0) && parent.gamma != 0.0)
+        gamma = 10 * std::numeric_limits<ElemType>::epsilon();
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -123,40 +139,36 @@
                 const double stepSize,
                 const GradType& gradient)
     {
-      // Convenience typedefs.
-      typedef typename MatType::elem_type ElemType;
-
       // Save the initial step size.
       if (first)
       {
         first = false;
-        initialStepSize = stepSize;
+        initialStepSize = ElemType(stepSize);
       }
 
       // Increment the iteration counter variable.
       ++iteration;
 
       // Decay the first and second moment running average coefficient.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * (gradient % gradient);
 
-      const ElemType biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const ElemType biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
-      const ElemType fl = parent.finalLr * stepSize / initialStepSize;
-      const ElemType lower = fl * (1.0 - 1.0 / (parent.gamma * iteration + 1));
-      const ElemType upper = fl * (1.0 + 1.0 / (parent.gamma * iteration));
+      const ElemType fl = finalLr * ElemType(stepSize) / initialStepSize;
+      const ElemType lower = fl * (1 - 1 / (gamma * iteration + 1));
+      const ElemType upper = fl * (1 + 1 / (gamma * iteration));
 
       // Element wise maximum of past and present squared gradients.
-      vImproved = arma::max(vImproved, v);
+      vImproved = max(vImproved, v);
 
       // Applies bounds on actual learning rate.
-      iterate -= arma::clamp((stepSize *
-          std::sqrt(biasCorrection2) / biasCorrection1) /
-          (arma::sqrt(vImproved) + parent.epsilon), lower, upper) % m;
+      iterate -= clamp((ElemType(stepSize) * std::sqrt(biasCorrection2) /
+          biasCorrection1) / (sqrt(vImproved) + epsilon), lower, upper) % m;
     }
 
    private:
@@ -169,11 +181,18 @@
     // The exponential moving average of squared gradient values.
     GradType v;
 
+    // Parameters of the parent, casted to the element type of the problem.
+    ElemType finalLr;
+    ElemType gamma;
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // Whether this is the first call of the Update method.
     bool first;
 
     // The initial (Adam) learning rate.
-    double initialStepSize;
+    ElemType initialStepSize;
 
     // The optimal squared gradient value.
     GradType vImproved;
diff --git a/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp b/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp
index d958ee2..5c8348b 100644
--- a/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp
+++ b/inst/include/ensmallen_bits/ada_delta/ada_delta.hpp
@@ -98,7 +98,7 @@ class AdaDelta
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp b/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp
index 26c6dd7..dca4a3f 100644
--- a/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp
+++ b/inst/include/ensmallen_bits/ada_delta/ada_delta_update.hpp
@@ -71,6 +71,8 @@ class AdaDeltaUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD optimizer method before the start
      * of the iteration update process. In AdaDelta update policy, the mean
@@ -82,10 +84,16 @@
      * @param cols Number of columns in the gradient matrix.
      */
     Policy(AdaDeltaUpdate& parent, const size_t rows, const size_t cols) :
-        parent(parent)
+        parent(parent),
+        rho(ElemType(parent.rho)),
+        epsilon(ElemType(parent.epsilon))
     {
       meanSquaredGradient.zeros(rows, cols);
       meanSquaredGradientDx.zeros(rows, cols);
+
+      // Check for underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -102,17 +110,17 @@
                 const GradType& gradient)
     {
       // Accumulate gradient.
-      meanSquaredGradient *= parent.rho;
-      meanSquaredGradient += (1 - parent.rho) * (gradient % gradient);
-      GradType dx = arma::sqrt((meanSquaredGradientDx + parent.epsilon) /
-          (meanSquaredGradient + parent.epsilon)) % gradient;
+      meanSquaredGradient *= rho;
+      meanSquaredGradient += (1 - rho) * (gradient % gradient);
+      GradType dx = sqrt((meanSquaredGradientDx + epsilon) /
+          (meanSquaredGradient + epsilon)) % gradient;
 
       // Accumulate updates.
-      meanSquaredGradientDx *= parent.rho;
-      meanSquaredGradientDx += (1 - parent.rho) * (dx % dx);
+      meanSquaredGradientDx *= rho;
+      meanSquaredGradientDx += (1 - rho) * (dx % dx);
 
       // Apply update.
-      iterate -= (stepSize * dx);
+      iterate -= (ElemType(stepSize) * dx);
     }
 
    private:
@@ -124,6 +132,10 @@
 
     // The delta mean squared gradient matrix.
     GradType meanSquaredGradientDx;
+
+    // Parameters of the update, converted to the matrix element type.
+    ElemType rho;
+    ElemType epsilon;
   };
 
  private:
diff --git a/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp b/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp
index 677d300..7522668 100644
--- a/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp
+++ b/inst/include/ensmallen_bits/ada_grad/ada_grad.hpp
@@ -94,7 +94,7 @@ class AdaGrad
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp b/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp
index b096dd4..4fe8e9d 100644
--- a/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp
+++ b/inst/include/ensmallen_bits/ada_grad/ada_grad_update.hpp
@@ -64,6 +64,8 @@ class AdaGradUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD optimizer before the start of the
      * iteration update process. In AdaGrad update policy, squared gradient
@@ -76,10 +78,14 @@
      */
     Policy(AdaGradUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
-        squaredGradient(rows, cols)
+        squaredGradient(rows, cols),
+        epsilon(ElemType(parent.epsilon))
     {
       // Initialize an empty matrix for sum of squares of parameter gradient.
       squaredGradient.zeros();
+      // Detect underflow for epsilon and try to address it.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -96,8 +102,8 @@
                 const GradType& gradient)
     {
       squaredGradient += (gradient % gradient);
-      iterate -= (stepSize * gradient) / (arma::sqrt(squaredGradient) +
-          parent.epsilon);
+      iterate -= (ElemType(stepSize) * gradient) / (sqrt(squaredGradient) +
+          epsilon);
     }
 
    private:
@@ -105,6 +111,8 @@
     AdaGradUpdate& parent;
     // The squared gradient matrix.
     GradType squaredGradient;
+    // The epsilon value, converted to the element type of the matrix.
+    ElemType epsilon;
   };
 
  private:
diff --git a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp
index 7f1788c..ebdf212 100644
--- a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp
+++ b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt.hpp
@@ -89,7 +89,7 @@ class AdaSqrt
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp
index feae24c..4bdb001 100644
--- a/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp
+++ b/inst/include/ensmallen_bits/ada_sqrt/ada_sqrt_update.hpp
@@ -59,6 +59,8 @@ class AdaSqrtUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD optimizer before the start of the
      * iteration update process. In AdaSqrt update policy, squared gradient
@@ -72,10 +74,14 @@
      */
     Policy(AdaSqrtUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
         squaredGradient(rows, cols),
+        epsilon(ElemType(parent.epsilon)),
         iteration(0)
     {
       // Initialize an empty matrix for sum of squares of parameter gradient.
       squaredGradient.zeros();
+      // Check for underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -93,10 +99,10 @@
     {
       ++iteration;
 
-      squaredGradient += arma::square(gradient);
+      squaredGradient += square(gradient);
 
-      iterate -= stepSize * std::sqrt(iteration) * gradient /
-          (squaredGradient + parent.epsilon);
+      iterate -= ElemType(stepSize) * std::sqrt(ElemType(iteration)) *
+          gradient / (squaredGradient + epsilon);
     }
 
    private:
@@ -104,6 +110,8 @@
     AdaSqrtUpdate& parent;
     // The squared gradient matrix.
     GradType squaredGradient;
+    // Epsilon converted to the element type of the optimization.
+    ElemType epsilon;
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/adam.hpp b/inst/include/ensmallen_bits/adam/adam.hpp
index 13c2f96..4595cf4 100644
--- a/inst/include/ensmallen_bits/adam/adam.hpp
+++ b/inst/include/ensmallen_bits/adam/adam.hpp
@@ -120,7 +120,7 @@ class AdamType
            typename MatType,
            typename GradType,
            typename... CallbackTypes>
-  typename std::enable_if::value,
+  typename std::enable_if::value,
       typename MatType::elem_type>::type
   Optimize(SeparableFunctionType& function,
            MatType& iterate,
diff --git a/inst/include/ensmallen_bits/adam/adam_update.hpp b/inst/include/ensmallen_bits/adam/adam_update.hpp
index de7f61e..dde10a7 100644
--- a/inst/include/ensmallen_bits/adam/adam_update.hpp
+++ b/inst/include/ensmallen_bits/adam/adam_update.hpp
@@ -82,6 +82,8 @@ class AdamUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -92,10 +94,17 @@
      */
     Policy(AdamUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -113,22 +122,23 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * square(gradient);
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
       /**
        * It should be noted that the term, m / (arma::sqrt(v) + eps), in the
        * following expression is an approximation of the following actual term;
        * m / (arma::sqrt(v) + (arma::sqrt(biasCorrection2) * eps).
        */
-      iterate -= (stepSize * std::sqrt(biasCorrection2) / biasCorrection1) *
-          m / (arma::sqrt(v) + parent.epsilon);
+      iterate -= (ElemType(stepSize) *
+          std::sqrt(biasCorrection2) / biasCorrection1) *
+          m / (sqrt(v) + epsilon);
     }
 
    private:
@@ -141,6 +151,11 @@
     // The exponential moving average of squared gradient values.
     GradType v;
 
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/adamax_update.hpp b/inst/include/ensmallen_bits/adam/adamax_update.hpp
index a6c9f2f..13e8bae 100644
--- a/inst/include/ensmallen_bits/adam/adamax_update.hpp
+++ b/inst/include/ensmallen_bits/adam/adamax_update.hpp
@@ -30,11 +30,11 @@ namespace ens {
  *
  * @code
  * @article{Kingma2014,
- *   author  = {Diederik P. Kingma and Jimmy Ba},
- *   title   = {Adam: {A} Method for Stochastic Optimization},
- *   journal = {CoRR},
- *   year    = {2014},
- *   url     = {http://arxiv.org/abs/1412.6980}
+ *   author  = {Diederik P. Kingma and Jimmy Ba},
+ *   title   = {Adam: {A} Method for Stochastic Optimization},
+ *   journal = {CoRR},
+ *   year    = {2014},
+ *   url     = {http://arxiv.org/abs/1412.6980}
  * }
  * @endcode
  */
@@ -84,6 +84,8 @@ class AdaMaxUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -94,10 +96,16 @@
      */
     Policy(AdaMaxUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       u.zeros(rows, cols);
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -115,17 +123,17 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
       // Update the exponentially weighted infinity norm.
-      u *= parent.beta2;
-      u = arma::max(u, arma::abs(gradient));
+      u *= beta2;
+      u = max(u, abs(gradient));
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
 
       if (biasCorrection1 != 0)
-        iterate -= (stepSize / biasCorrection1 * m / (u + parent.epsilon));
+        iterate -= (ElemType(stepSize) / biasCorrection1 * m / (u + epsilon));
     }
 
    private:
@@ -135,6 +143,10 @@
     GradType m;
     // The exponentially weighted infinity norm.
     GradType u;
+    // Tuning parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/amsgrad_update.hpp b/inst/include/ensmallen_bits/adam/amsgrad_update.hpp
index f1f420e..a5d4562 100644
--- a/inst/include/ensmallen_bits/adam/amsgrad_update.hpp
+++ b/inst/include/ensmallen_bits/adam/amsgrad_update.hpp
@@ -2,7 +2,7 @@
 * @file amsgrad_update.hpp
 * @author Haritha Nair
 *
- * Implementation of AMSGrad optimizer. AMSGrad is an exponential moving average
+ * Implementation of AMSGrad optimizer. AMSGrad is an exponential moving average
 * optimizer that dynamically adapts over time with guaranteed convergence.
 *
 * ensmallen is free software; you may redistribute it and/or modify it under
@@ -25,9 +25,9 @@ namespace ens {
 *
 * @code
 * @article{
- *   title = {On the convergence of Adam and beyond},
- *   url = {https://openreview.net/pdf?id=ryQu7f-RZ}
- *   year = {2018}
+ *   title = {On the convergence of Adam and beyond},
+ *   url = {https://openreview.net/pdf?id=ryQu7f-RZ}
+ *   year = {2018}
 * }
 * @endcode
 */
@@ -77,6 +77,8 @@ class AMSGradUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -87,11 +89,18 @@
      */
     Policy(AMSGradUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
       vImproved.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -109,20 +118,21 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * (gradient % gradient);
+      v *= beta2;
+      v += (1 - beta2) * (gradient % gradient);
 
-      const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration);
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
+      const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration));
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
 
       // Element wise maximum of past and present squared gradients.
-      vImproved = arma::max(vImproved, v);
+      vImproved = max(vImproved, v);
 
-      iterate -= (stepSize * std::sqrt(biasCorrection2) / biasCorrection1) *
-          m / (arma::sqrt(vImproved) + parent.epsilon);
+      iterate -= (ElemType(stepSize) *
+          std::sqrt(biasCorrection2) / biasCorrection1) *
+          m / (sqrt(vImproved) + epsilon);
     }
 
    private:
@@ -138,6 +148,11 @@
     // The optimal squared gradient value.
     GradType vImproved;
 
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
+
     // The number of iterations.
     size_t iteration;
   };
diff --git a/inst/include/ensmallen_bits/adam/nadam_update.hpp b/inst/include/ensmallen_bits/adam/nadam_update.hpp
index 24f105c..9014095 100644
--- a/inst/include/ensmallen_bits/adam/nadam_update.hpp
+++ b/inst/include/ensmallen_bits/adam/nadam_update.hpp
@@ -85,6 +85,8 @@ class NadamUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the optimizer before the start of the
      * iteration update process.
@@ -96,10 +98,17 @@
     Policy(NadamUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
         cumBeta1(1),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       v.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -117,30 +126,31 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      v *= parent.beta2;
-      v += (1 - parent.beta2) * gradient % gradient;
+      v *= beta2;
+      v += (1 - beta2) * gradient % gradient;
 
-      double beta1T = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, iteration * parent.scheduleDecay)));
+      ElemType beta1T = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, iteration * parent.scheduleDecay)));
 
-      double beta1T1 = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
+      ElemType beta1T1 = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
 
       cumBeta1 *= beta1T;
 
-      const double biasCorrection1 = 1.0 - cumBeta1;
-      const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration);
-      const double biasCorrection3 = 1.0 - (cumBeta1 * beta1T1);
+      const ElemType biasCorrection1 = 1 - cumBeta1;
+      const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration));
+      const ElemType biasCorrection3 = 1 - (cumBeta1 * beta1T1);
 
       /* Note :- arma::sqrt(v) + epsilon * sqrt(biasCorrection2) is approximated
       * as arma::sqrt(v) + epsilon */
-      iterate -= (stepSize * (((1 - beta1T) / biasCorrection1) * gradient
-          + (beta1T1 / biasCorrection3) * m) * sqrt(biasCorrection2))
-          / (arma::sqrt(v) + parent.epsilon);
+      iterate -= (ElemType(stepSize) *
+          (((1 - beta1T) / biasCorrection1) * gradient +
+          (beta1T1 / biasCorrection3) * m) * std::sqrt(biasCorrection2)) /
+          (sqrt(v) + epsilon);
     }
 
    private:
@@ -154,7 +164,12 @@
     GradType v;
 
     // The cumulative product of decay coefficients.
-    double cumBeta1;
+    ElemType cumBeta1;
+
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
 
     // The number of iterations.
     size_t iteration;
diff --git a/inst/include/ensmallen_bits/adam/nadamax_update.hpp b/inst/include/ensmallen_bits/adam/nadamax_update.hpp
index f0d9b0c..570fb7b 100644
--- a/inst/include/ensmallen_bits/adam/nadamax_update.hpp
+++ b/inst/include/ensmallen_bits/adam/nadamax_update.hpp
@@ -85,6 +85,8 @@ class NadaMaxUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor method is called by the optimizer before the start of
      * the iteration update process.
@@ -96,10 +98,17 @@
     Policy(NadaMaxUpdate& parent, const size_t rows, const size_t cols) :
         parent(parent),
         cumBeta1(1),
+        epsilon(ElemType(parent.epsilon)),
+        beta1(ElemType(parent.beta1)),
+        beta2(ElemType(parent.beta2)),
         iteration(0)
     {
       m.zeros(rows, cols);
       u.zeros(rows, cols);
+
+      // Attempt to detect underflow.
+      if (epsilon == ElemType(0) && parent.epsilon != 0.0)
+        epsilon = 10 * std::numeric_limits<ElemType>::epsilon();
     }
 
     /**
@@ -117,27 +126,27 @@
       ++iteration;
 
       // And update the iterate.
-      m *= parent.beta1;
-      m += (1 - parent.beta1) * gradient;
+      m *= beta1;
+      m += (1 - beta1) * gradient;
 
-      u = arma::max(u * parent.beta2, arma::abs(gradient));
+      u = max(u * beta2, abs(gradient));
 
-      double beta1T = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, iteration * parent.scheduleDecay)));
+      ElemType beta1T = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, iteration * parent.scheduleDecay)));
 
-      double beta1T1 = parent.beta1 * (1 - (0.5 *
-          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
+      ElemType beta1T1 = beta1 * (1 - ElemType(0.5 *
+          std::pow(0.96, (iteration + 1) * parent.scheduleDecay)));
 
       cumBeta1 *= beta1T;
 
-      const double biasCorrection1 = 1.0 - cumBeta1;
-
-      const double biasCorrection2 = 1.0 - (cumBeta1 * beta1T1);
+      const ElemType biasCorrection1 = 1 - cumBeta1;
+      const ElemType biasCorrection2 = 1 - (cumBeta1 * beta1T1);
 
       if ((biasCorrection1 != 0) && (biasCorrection2 != 0))
       {
-        iterate -= (stepSize * (((1 - beta1T) / biasCorrection1) * gradient
-            + (beta1T1 / biasCorrection2) * m)) / (u + parent.epsilon);
+        iterate -= (ElemType(stepSize) *
+            (((1 - beta1T) / biasCorrection1) * gradient +
+            (beta1T1 / biasCorrection2) * m)) / (u + epsilon);
       }
     }
 
@@ -152,7 +161,12 @@
     GradType u;
 
     // The cumulative product of decay coefficients.
-    double cumBeta1;
+    ElemType cumBeta1;
+
+    // Parameters converted to the element type of the optimization.
+    ElemType epsilon;
+    ElemType beta1;
+    ElemType beta2;
 
     // The number of iterations.
size_t iteration; diff --git a/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp b/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp index 426a5bb..9e381ed 100644 --- a/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp +++ b/inst/include/ensmallen_bits/adam/optimisticadam_update.hpp @@ -27,11 +27,11 @@ namespace ens { * * @code * @article{ - * author = {Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, - * Haoyang Zeng}, - * title = {Training GANs with Optimism}, - * year = {2017}, - * url = {https://arxiv.org/abs/1711.00141} + * author = {Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, + * Haoyang Zeng}, + * title = {Training GANs with Optimism}, + * year = {2017}, + * url = {https://arxiv.org/abs/1711.00141} * } * @endcode */ @@ -81,6 +81,8 @@ class OptimisticAdamUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -91,11 +93,18 @@ class OptimisticAdamUpdate */ Policy(OptimisticAdamUpdate& parent, const size_t rows, const size_t cols) : parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), iteration(0) { m.zeros(rows, cols); v.zeros(rows, cols); g.zeros(rows, cols); + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -113,18 +122,18 @@ class OptimisticAdamUpdate ++iteration; // And update the iterate. 
- m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * arma::square(gradient); + v *= beta2; + v += (1 - beta2) * square(gradient); - GradType mCorrected = m / (1.0 - std::pow(parent.beta1, iteration)); - GradType vCorrected = v / (1.0 - std::pow(parent.beta2, iteration)); + GradType mCorrected = m / (1 - std::pow(beta1, ElemType(iteration))); + GradType vCorrected = v / (1 - std::pow(beta2, ElemType(iteration))); - GradType update = mCorrected / (arma::sqrt(vCorrected) + parent.epsilon); + GradType update = mCorrected / (sqrt(vCorrected) + epsilon); - iterate -= (2 * stepSize * update - stepSize * g); + iterate -= (2 * ElemType(stepSize) * update - ElemType(stepSize) * g); g = std::move(update); } @@ -142,6 +151,11 @@ class OptimisticAdamUpdate // The previous update. GradType g; + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + // The number of iterations. size_t iteration; }; diff --git a/inst/include/ensmallen_bits/agemoea/agemoea.hpp b/inst/include/ensmallen_bits/agemoea/agemoea.hpp index 9f914da..452bc98 100644 --- a/inst/include/ensmallen_bits/agemoea/agemoea.hpp +++ b/inst/include/ensmallen_bits/agemoea/agemoea.hpp @@ -126,6 +126,33 @@ class AGEMOEA MatType& iterate, CallbackTypes&&... callbacks); + /** + * Optimize a set of objectives. The initial population is generated using the + * starting point. The output is the best generated front. + * + * @tparam ArbitraryFunctionType std::tuple of multiple objectives. + * @tparam MatType Type of matrix to optimize. + * @tparam CubeType The type of cube used to store the front and Pareto set. + * @tparam CallbackTypes Types of callback functions. + * @param objectives Vector of objective functions to optimize for. + * @param iterate Starting point. + * @param front The generated front. + * @param paretoSet The generated Pareto set. 
+ * @param callbacks Callback functions. + * @return MatType::elem_type The minimum of the accumulated sum over the + * objective values in the best front. + */ + template + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CubeType& front, + CubeType& paretoSet, + CallbackTypes&&... callbacks); + //! Get the population size. size_t PopulationSize() const { return populationSize; } //! Modify the population size. @@ -166,34 +193,6 @@ class AGEMOEA //! Modify value of upperBound. arma::vec& UpperBound() { return upperBound; } - //! Retrieve the Pareto optimal points in variable space. This returns an empty cube - //! until `Optimize()` has been called. - const arma::cube& ParetoSet() const { return paretoSet; } - - //! Retrieve the best front (the Pareto frontier). This returns an empty cube until - //! `Optimize()` has been called. - const arma::cube& ParetoFront() const { return paretoFront; } - - /** - * Retrieve the best front (the Pareto frontier). This returns an empty - * vector until `Optimize()` has been called. Note that this function is - * deprecated and will be removed in ensmallen 3.x! Use `ParetoFront()` - * instead. - */ - const std::vector& Front() - { - if (rcFront.size() == 0) - { - // Match the old return format. - for (size_t i = 0; i < paretoFront.n_slices; ++i) - { - rcFront.push_back(arma::mat(paretoFront.slice(i))); - } - } - - return rcFront; - } - private: /** * Evaluate objectives for the elite population. @@ -205,21 +204,22 @@ class AGEMOEA * @param calculatedObjectives Vector to store calculated objectives. 
*/ template typename std::enable_if::type - EvaluateObjectives(std::vector&, + EvaluateObjectives(std::vector&, std::tuple&, - std::vector >&); + std::vector&); template typename std::enable_if::type - EvaluateObjectives(std::vector& population, + EvaluateObjectives(std::vector& population, std::tuple& objectives, - std::vector >& - calculatedObjectives); + std::vector& calculatedObjectives); /** * Reproduce candidates from the elite population to generate a new * @@ -283,7 +283,8 @@ class AGEMOEA void FastNonDominatedSort( std::vector >& fronts, std::vector& ranks, - std::vector >& calculatedObjectives); + std::vector >& + calculatedObjectives); /** * Operator to check if one candidate Pareto-dominates the other. @@ -304,17 +305,18 @@ class AGEMOEA size_t candidateP, size_t candidateQ); - /** - * Assigns Survival Score metric for sorting. - * - * @param front The previously generated Pareto fronts. - * @param idealPoint The ideal point of teh first front. - * @param calculatedObjectives The previously calculated objectives. - * @param survivalScore The Survival Score vector to be updated for each individual in the population. - * @param normalize The normlization vector of the fronts. - * @param dimension The dimension of the first front. - * @param fNum teh current front index. - */ + /** + * Assigns Survival Score metric for sorting. + * + * @param front The previously generated Pareto fronts. + * @param idealPoint The ideal point of the first front. + * @param calculatedObjectives The previously calculated objectives. + * @param survivalScore The Survival Score vector to be updated for each + * individual in the population. + * @param normalize The normalization vector of the fronts. + * @param dimension The dimension of the first front. + * @param fNum The current front index. 
+ */ template void SurvivalScoreAssignment( const std::vector& front, @@ -322,7 +324,7 @@ class AGEMOEA std::vector>& calculatedObjectives, std::vector& survivalScore, arma::Col& normalize, - double& dimension, + typename MatType::elem_type& dimension, size_t fNum); /** @@ -338,7 +340,7 @@ class AGEMOEA * being sorted. * @param ranks The previously calculated ranks. * @param survivalScore The Survival score for each individual in - * the population. + * the population. * @return true if the first candidate is preferred, otherwise, false. */ template @@ -347,37 +349,39 @@ class AGEMOEA size_t idxQ, const std::vector& ranks, const std::vector& survivalScore); - - /** - * Normalizes the front given the extreme points in the current front. - * - * @tparam The type of population datapoints. - * @param calculatedObjectives The current population evaluated objectives. - * @param normalization The normalizing vector. - * @param front The previously generated Pareto front. - * @param extreme The indexes of the extreme points in the front. - */ - template - void NormalizeFront( - std::vector>& calculatedObjectives, - arma::Col& normalization, - const std::vector& front, - const arma::Row& extreme); - - /** - * Get the geometry information p of Lp norm (p > 0). - * - * @param calculatedObjectives The current population evaluated objectives. - * @param front The previously generated Pareto fronts. - * @param extreme The indexes of the extreme points in the front. - * @return The variable p in the Lp norm that best fits the geometry of the current front. - */ - template - double GetGeometry( - std::vector >& calculatedObjectives, + + /** + * Normalizes the front given the extreme points in the current front. + * + * @tparam The type of population datapoints. + * @param calculatedObjectives The current population evaluated objectives. + * @param normalization The normalizing vector. + * @param front The previously generated Pareto front. 
+ * @param extreme The indexes of the extreme points in the front. + */ + template + void NormalizeFront( + std::vector>& calculatedObjectives, + arma::Col& normalization, + const std::vector& front, + const arma::Row& extreme); + + /** + * Get the geometry information p of Lp norm (p > 0). + * + * @param calculatedObjectives The current population evaluated objectives. + * @param front The previously generated Pareto fronts. + * @param extreme The indexes of the extreme points in the front. + * @return The variable p in the Lp norm that best fits the geometry of the + * current front. + */ + template + typename MatType::elem_type GetGeometry( + std::vector >& + calculatedObjectives, const std::vector& front, const arma::Row& extreme); - + /** * Finds the pairwise Lp distance between all the points in the front. * @@ -389,13 +393,14 @@ class AGEMOEA template void PairwiseDistance( MatType& final, - std::vector >& calculatedObjectives, + std::vector >& + calculatedObjectives, const std::vector& front, - double dimension); + const typename MatType::elem_type dimension); /** * Finding the indexes of the extreme points in the front. - * + * * @param indexes vector containing the slected indexes. * @param calculatedObjectives The current population objectives. * @param front The front of the current generation. @@ -405,32 +410,37 @@ class AGEMOEA arma::Row& indexes, std::vector >& calculatedObjectives, const std::vector& front); - + /** * Finding the distance of each point in the front from the line formed * by pointA and pointB. - * - * @param distance The vector containing the distances of the points in the fron from the line. - * @param calculatedObjectives Reference to the current population evaluated Objectives. + * + * @param distance The vector containing the distances of the points in the + * front from the line. + * @param calculatedObjectives Reference to the current population evaluated + * objectives. 
* @param front The front of the current generation(indices of population). * @param pointA The first point on the line. * @param pointB The second point on the line. - */ + */ template void PointToLineDistance( arma::Row& distances, - std::vector >& calculatedObjectives, + std::vector >& + calculatedObjectives, const std::vector& front, const arma::Col& pointA, const arma::Col& pointB); - + /** - * Find the Diversity score corresponding the solution S using the selected set. - * + * Find the Diversity score corresponding to the solution S using the selected + * set. + * * @param selected The current selected set. * @param pairwiseDistance The current pairwise distance for the whole front. * @param S The relative index of S being considered within the front. - * @return The diversity score for S which the sum of the two smallest elements. + * @return The diversity score for S, which is the sum of the two smallest + * elements. */ template typename MatType::elem_type DiversityScore(std::set& selected, 
- std::vector rcFront; }; } // namespace ens diff --git a/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp b/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp index c226095..0f7815d 100644 --- a/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp +++ b/inst/include/ensmallen_bits/agemoea/agemoea_impl.hpp @@ -67,6 +67,24 @@ typename MatType::elem_type AGEMOEA::Optimize( std::tuple& objectives, MatType& iterateIn, CallbackTypes&&... callbacks) +{ + typedef typename ForwardType::bcube CubeType; + CubeType paretoFront, paretoSet; + return Optimize(objectives, iterateIn, paretoFront, paretoSet, + std::forward(callbacks)...); +} + +//! Optimize the function. +template +typename MatType::elem_type AGEMOEA::Optimize( + std::tuple& objectives, + MatType& iterateIn, + CubeType& paretoFrontIn, + CubeType& paretoSetIn, + CallbackTypes&&... callbacks) { // Make sure for evolution to work at least four candidates are present. if (populationSize < 4 && populationSize % 4 != 0) @@ -78,6 +96,8 @@ typename MatType::elem_type AGEMOEA::Optimize( // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::bmat CubeBaseMatType; BaseMatType& iterate = (BaseMatType&) iterateIn; @@ -104,7 +124,7 @@ typename MatType::elem_type AGEMOEA::Optimize( numVariables = iterate.n_rows; // Cache calculated objectives. - std::vector > calculatedObjectives(populationSize); + std::vector calculatedObjectives(populationSize); // Population size reserved to 2 * populationSize + 1 to accommodate // for the size of intermediate candidate population. @@ -120,8 +140,8 @@ typename MatType::elem_type AGEMOEA::Optimize( std::vector ranks; //! Useful temporaries for float-like comparisons. 
- const BaseMatType castedLowerBound = arma::conv_to::from(lowerBound); - const BaseMatType castedUpperBound = arma::conv_to::from(upperBound); + const BaseMatType castedLowerBound = conv_to::from(lowerBound); + const BaseMatType castedUpperBound = conv_to::from(upperBound); // Controls early termination of the optimization process. bool terminate = false; @@ -131,10 +151,10 @@ typename MatType::elem_type AGEMOEA::Optimize( for (size_t i = 0; i < populationSize; i++) { population.push_back(arma::randu(iterate.n_rows, - iterate.n_cols) - 0.5 + iterate); + iterate.n_cols) - ElemType(0.5) + iterate); // Constrain all genes to be within bounds. - population[i] = arma::min(arma::max(population[i], castedLowerBound), + population[i] = min(max(population[i], castedLowerBound), castedUpperBound); } @@ -152,26 +172,24 @@ typename MatType::elem_type AGEMOEA::Optimize( // Evaluate the objectives for the new population. calculatedObjectives.resize(population.size()); std::fill(calculatedObjectives.begin(), calculatedObjectives.end(), - arma::Col(numObjectives, arma::fill::zeros)); + BaseColType(numObjectives, GetFillType::zeros)); EvaluateObjectives(population, objectives, calculatedObjectives); // Perform fast non dominated sort on P_t ∪ G_t. ranks.resize(population.size()); FastNonDominatedSort(fronts, ranks, calculatedObjectives); - + arma::Col idealPoint(calculatedObjectives[fronts[0][0]]); for (size_t index = 1; index < fronts[0].size(); index++) { - idealPoint = arma::min(idealPoint, - calculatedObjectives[fronts[0][index]]); + idealPoint = min(idealPoint, calculatedObjectives[fronts[0][index]]); } // Perform survival score assignment. 
survivalScore.resize(population.size()); std::fill(survivalScore.begin(), survivalScore.end(), 0.); - double dimension; - arma::Col normalize(numObjectives, - arma::fill::zeros); + ElemType dimension; + BaseColType normalize(numObjectives, GetFillType::zeros); for (size_t fNum = 0; fNum < fronts.size(); fNum++) { SurvivalScoreAssignment(fronts[fNum], idealPoint, @@ -186,16 +204,16 @@ typename MatType::elem_type AGEMOEA::Optimize( size_t idxP{}, idxQ{}; for (size_t i = 0; i < population.size(); i++) { - if (arma::approx_equal(population[i], candidateP, - "absdiff", epsilon)) + if (approx_equal(population[i], candidateP, "absdiff", + ElemType(epsilon))) idxP = i; - if (arma::approx_equal(population[i], candidateQ, - "absdiff", epsilon)) + if (approx_equal(population[i], candidateQ, "absdiff", + ElemType(epsilon))) idxQ = i; } - return SurvivalScoreOperator(idxP, idxQ, ranks, + return SurvivalScoreOperator(idxP, idxQ, ranks, survivalScore); } ); @@ -209,29 +227,24 @@ typename MatType::elem_type AGEMOEA::Optimize( } EvaluateObjectives(population, objectives, calculatedObjectives); // Set the candidates from the Pareto Set as the output. - paretoSet.set_size(population[0].n_rows, population[0].n_cols, + paretoSetIn.set_size(population[0].n_rows, population[0].n_cols, population.size()); // The Pareto Set is stored, can be obtained via ParetoSet() getter. for (size_t solutionIdx = 0; solutionIdx < population.size(); ++solutionIdx) { - paretoSet.slice(solutionIdx) = - arma::conv_to::from(population[solutionIdx]); + paretoSetIn.slice(solutionIdx) = + conv_to::from(population[solutionIdx]); } // Set the candidates from the Pareto Front as the output. - paretoFront.set_size(calculatedObjectives[0].n_rows, + paretoFrontIn.set_size(calculatedObjectives[0].n_rows, calculatedObjectives[0].n_cols, population.size()); - // The Pareto Front is stored, can be obtained via ParetoFront() getter. 
for (size_t solutionIdx = 0; solutionIdx < population.size(); ++solutionIdx) { - paretoFront.slice(solutionIdx) = - arma::conv_to::from(calculatedObjectives[solutionIdx]); + paretoFrontIn.slice(solutionIdx) = + conv_to::from(calculatedObjectives[solutionIdx]); } - // Clear rcFront, in case it is later requested by the user for reverse - // compatibility reasons. - rcFront.clear(); - // Assign iterate to first element of the Pareto Set. iterate = population[fronts[0][0]]; @@ -239,57 +252,62 @@ typename MatType::elem_type AGEMOEA::Optimize( ElemType performance = std::numeric_limits::max(); - for (const arma::Col& objective: calculatedObjectives) - if (arma::accu(objective) < performance) - performance = arma::accu(objective); + for (const BaseColType& objective: calculatedObjectives) + if (accu(objective) < performance) + performance = accu(objective); return performance; } //! No objectives to evaluate. template typename std::enable_if::type AGEMOEA::EvaluateObjectives( - std::vector&, + std::vector&, std::tuple&, - std::vector >&) + std::vector&) { // Nothing to do here. } //! Evaluate the objectives for the entire population. template typename std::enable_if::type AGEMOEA::EvaluateObjectives( - std::vector& population, + std::vector& population, std::tuple& objectives, - std::vector >& calculatedObjectives) + std::vector& calculatedObjectives) { for (size_t i = 0; i < population.size(); i++) { calculatedObjectives[i](I) = std::get(objectives).Evaluate(population[i]); - EvaluateObjectives(population, objectives, + EvaluateObjectives(population, objectives, calculatedObjectives); } } //! Reproduce and generate new candidates. 
-template -inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, - const MatType& lowerBound, - const MatType& upperBound) +template +inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - std::vector children; + std::vector children; while (children.size() < population.size()) { // Choose two random parents for reproduction from the elite population. - size_t indexA = arma::randi(arma::distr_param(0, populationSize - 1)); - size_t indexB = arma::randi(arma::distr_param(0, populationSize - 1)); + size_t indexA = arma::randi( + arma::distr_param(0, populationSize - 1)); + size_t indexB = arma::randi( + arma::distr_param(0, populationSize - 1)); // Make sure that the parents differ. if (indexA == indexB) @@ -301,10 +319,10 @@ inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, } // Initialize the children to the respective parents. - MatType childA = population[indexA], childB = population[indexB]; + InputMatType childA = population[indexA], childB = population[indexB]; if (arma::randu() <= crossoverProb) - Crossover(childA, childB, population[indexA], population[indexB], + Crossover(childA, childB, population[indexA], population[indexB], lowerBound, upperBound); Mutate(childA, 1.0 / static_cast(numVariables), @@ -318,68 +336,74 @@ inline void AGEMOEA::BinaryTournamentSelection(std::vector& population, } // Add the candidates to the elite population. - population.insert(std::end(population), std::begin(children), std::end(children)); + population.insert(std::end(population), std::begin(children), + std::end(children)); } //! Perform simulated binary crossover (SBX) of genes for the children. 
-template -inline void AGEMOEA::Crossover(MatType& childA, - MatType& childB, - const MatType& parentA, - const MatType& parentB, - const MatType& lowerBound, - const MatType& upperBound) +template +inline void AGEMOEA::Crossover(InputMatType& childA, + InputMatType& childB, + const InputMatType& parentA, + const InputMatType& parentB, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - //! Generates a child from two parent individuals - // according to the polynomial probability distribution. - arma::Cube parents(parentA.n_rows, - parentA.n_cols, 2); - parents.slice(0) = parentA; - parents.slice(1) = parentB; - MatType current_min = arma::min(parents, 2); - MatType current_max = arma::max(parents, 2); - - if (arma::accu(parentA - parentB < 1e-14)) - { - childA = parentA; - childB = parentB; - return; - } - MatType current_diff = current_max - current_min; - current_diff.transform( [](typename MatType::elem_type val) - { return (val < 1e-10 ? 1e-10:val); } ); - - // Calculating beta used for the final crossover. - MatType beta1 = 1 + 2.0 * (current_min - lowerBound) / current_diff; - MatType beta2 = 1 + 2.0 * (upperBound - current_max) / current_diff; - MatType alpha1 = 2 - arma::pow(beta1, -(eta + 1)); - MatType alpha2 = 2 - arma::pow(beta2, -(eta + 1)); - - MatType us(arma::size(alpha1), arma::fill::randu); - arma::umat mask1 = us > (1.0 / alpha1); - MatType betaq1 = arma::pow(us % alpha1, 1. / (eta + 1)); - betaq1 = betaq1 % (mask1 != 1.0) + arma::pow((1.0 / (2.0 - us % alpha1)), - 1.0 / (eta + 1)) % mask1; - arma::umat mask2 = us > (1.0 / alpha2); - MatType betaq2 = arma::pow(us % alpha2, 1 / (eta + 1)); - betaq2 = betaq2 % (mask1 != 1.0) + arma::pow((1.0 / (2.0 - us % alpha2)), - 1.0 / (eta + 1)) % mask2; - - // Variables after the cross over for all of them. 
- MatType c1 = 0.5 * ((current_min + current_max) - betaq1 % current_diff); - MatType c2 = 0.5 * ((current_min + current_max) + betaq2 % current_diff); - c1 = arma::min(arma::max(c1, lowerBound), upperBound); - c2 = arma::min(arma::max(c2, lowerBound), upperBound); - - // Decision for the crossover between the two parents for each variable. - us.randu(); - childA = parentA % (us <= 0.5); - childB = parentB % (us <= 0.5); - us.randu(); - childA = childA + c1 % ((us <= 0.5) % (childA == 0)); - childA = childA + c2 % ((us > 0.5) % (childA == 0)); - childB = childB + c2 % ((us <= 0.5) % (childB == 0)); - childB = childB + c1 % ((us > 0.5) % (childB == 0)); + typedef typename InputMatType::elem_type ElemType; + typedef typename ForwardType::bcube BaseCubeType; + typedef typename ForwardType::umat UMatType; + + // Generates a child from two parent individuals + // according to the polynomial probability distribution. + BaseCubeType parents(parentA.n_rows, + parentA.n_cols, 2); + parents.slice(0) = parentA; + parents.slice(1) = parentB; + InputMatType current_min = min(parents, 2); + InputMatType current_max = max(parents, 2); + + if (accu(parentA - parentB < ElemType(1e-14))) + { + childA = parentA; + childB = parentB; + return; + } + InputMatType current_diff = current_max - current_min; + current_diff.transform( [](ElemType val) + { return (val < ElemType(1e-10) ? ElemType(1e-10) : val); } ); + + // Calculating beta used for the final crossover. + InputMatType beta1 = 1 + 2 * (current_min - lowerBound) / current_diff; + InputMatType beta2 = 1 + 2 * (upperBound - current_max) / current_diff; + InputMatType alpha1 = 2 - pow(beta1, -(eta + 1)); + InputMatType alpha2 = 2 - pow(beta2, -(eta + 1)); + + InputMatType us(size(alpha1), GetFillType::randu); + + UMatType mask1 = us > (1 / alpha1); + InputMatType betaq1 = pow(us % alpha1, 1. 
/ (eta + 1)); + betaq1 = betaq1 % (mask1 != 1) + pow((1 / (2 - us % alpha1)), + 1 / (eta + 1)) % mask1; + UMatType mask2 = us > (1 / alpha2); + InputMatType betaq2 = pow(us % alpha2, 1 / (eta + 1)); + betaq2 = betaq2 % (mask1 != 1) + pow((1 / (2 - us % alpha2)), + 1 / (eta + 1)) % mask2; + + // Variables after the cross over for all of them. + InputMatType c1 = ((current_min + current_max) - betaq1 % current_diff) / 2; + InputMatType c2 = ((current_min + current_max) + betaq2 % current_diff) / 2; + c1 = min(max(c1, lowerBound), upperBound); + c2 = min(max(c2, lowerBound), upperBound); + + // Decision for the crossover between the two parents for each variable. + us.randu(); + childA = parentA % (us <= ElemType(0.5)); + childB = parentB % (us <= ElemType(0.5)); + us.randu(); + childA = childA + c1 % ((us <= ElemType(0.5)) % (childA == 0)); + childA = childA + c2 % ((us > ElemType(0.5)) % (childA == 0)); + childB = childB + c2 % ((us <= ElemType(0.5)) % (childB == 0)); + childB = childB + c1 % ((us > ElemType(0.5)) % (childB == 0)); } //! Perform Polynomial mutation of the candidate. @@ -389,39 +413,40 @@ inline void AGEMOEA::Mutate(MatType& candidate, const MatType& lowerBound, const MatType& upperBound) { - const size_t numVariables = candidate.n_rows; - for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) + const size_t numVariables = candidate.n_rows; + for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) + { + // Should this gene be mutated? + if (arma::randu() > mutationRate) + continue; + + const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); + // Normalised distance from the bounds. + const double lowerDelta = (candidate(geneIdx) + - lowerBound(geneIdx)) / geneRange; + const double upperDelta = (upperBound(geneIdx) + - candidate(geneIdx)) / geneRange; + const double mutationPower = 1. 
/ (distributionIndex + 1.0); + const double rand = arma::randu(); + double value, perturbationFactor; + if (rand < 0.5) { - // Should this gene be mutated? - if (arma::randu() > mutationRate) - continue; - - const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); - // Normalised distance from the bounds. - const double lowerDelta = (candidate(geneIdx) - - lowerBound(geneIdx)) / geneRange; - const double upperDelta = (upperBound(geneIdx) - - candidate(geneIdx)) / geneRange; - const double mutationPower = 1. / (distributionIndex + 1.0); - const double rand = arma::randu(); - double value, perturbationFactor; - if (rand < 0.5) - { - value = 2.0 * rand + (1.0 - 2.0 * rand) * - std::pow(upperDelta, distributionIndex + 1.0); - perturbationFactor = std::pow(value, mutationPower) - 1.0; - } - else - { - value = 2.0 * (1.0 - rand) + 2.0 *(rand - 0.5) * - std::pow(lowerDelta, distributionIndex + 1.0); - perturbationFactor = 1.0 - std::pow(value, mutationPower); - } - - candidate(geneIdx) += perturbationFactor * geneRange; + value = 2.0 * rand + (1.0 - 2.0 * rand) * + std::pow(upperDelta, distributionIndex + 1.0); + perturbationFactor = std::pow(value, mutationPower) - 1.0; } - //! Enforce bounds. - candidate = arma::min(arma::max(candidate, lowerBound), upperBound); + else + { + value = 2.0 * (1.0 - rand) + 2.0 *(rand - 0.5) * + std::pow(lowerDelta, distributionIndex + 1.0); + perturbationFactor = 1.0 - std::pow(value, mutationPower); + } + + candidate(geneIdx) += + typename MatType::elem_type(perturbationFactor * geneRange); + } + //! Enforce bounds. 
+ candidate = min(max(candidate, lowerBound), upperBound); } template @@ -431,9 +456,9 @@ inline void AGEMOEA::NormalizeFront( const std::vector& front, const arma::Row& extreme) { - arma::Mat vectorizedObjectives(numObjectives, + arma::Mat vectorizedObjectives(numObjectives, front.size()); - arma::Mat vectorizedExtremes(numObjectives, + arma::Mat vectorizedExtremes(numObjectives, extreme.n_elem); for (size_t i = 0; i < front.size(); i++) { @@ -441,7 +466,7 @@ inline void AGEMOEA::NormalizeFront( } for (size_t i = 0; i < extreme.n_elem; i++) { - vectorizedExtremes.col(i) = calculatedObjectives[front[extreme[i]]]; + vectorizedExtremes.col(i) = calculatedObjectives[front[extreme[i]]]; } if (front.size() < numObjectives) @@ -474,9 +499,9 @@ inline void AGEMOEA::NormalizeFront( } else { - normalization = 1. / hyperplane; + normalization = 1. / hyperplane; if (normalization.has_inf() || normalization.has_nan()) - { + { normalization = arma::max(vectorizedObjectives, 1); } } @@ -484,26 +509,29 @@ inline void AGEMOEA::NormalizeFront( } template -inline double AGEMOEA::GetGeometry( +inline typename MatType::elem_type AGEMOEA::GetGeometry( std::vector >& calculatedObjectives, const std::vector& front, const arma::Row& extreme) { - arma::Row d; - arma::Col zero(numObjectives, arma::fill::zeros); - arma::Col one(numObjectives, arma::fill::ones); + typedef typename MatType::elem_type ElemType; - PointToLineDistance (d, calculatedObjectives, front, zero, one); + arma::Row d; + arma::Col zero(numObjectives, arma::fill::zeros); + arma::Col one(numObjectives, arma::fill::ones); + + PointToLineDistance(d, calculatedObjectives, front, zero, one); for (size_t i = 0; i < extreme.size(); i++) { - d[extreme[i]] = arma::datum::inf; + d[extreme[i]] = arma::Datum::inf; } + size_t index = arma::index_min(d); - double avg = arma::accu(calculatedObjectives[front[index]]) / static_cast (numObjectives); - double p = std::log(numObjectives) / std::log(1.0 / avg); - if (p <= 0.1 || std::isnan(p)) 
- p = 1.0; + ElemType avg = accu(calculatedObjectives[front[index]]) / numObjectives; + ElemType p = std::log(ElemType(numObjectives)) / std::log(1 / avg); + if (p <= ElemType(0.1) || std::isnan(p)) + p = 1; return p; } @@ -514,13 +542,15 @@ inline void AGEMOEA::PairwiseDistance( MatType& f, std::vector >& calculatedObjectives, const std::vector& front, - double dimension) -{ + const typename MatType::elem_type dimension) +{ for (size_t i = 0; i < front.size(); i++) { for (size_t j = i + 1; j < front.size(); j++) { - f(i, j) = std::pow(arma::accu(arma::pow(arma::abs(calculatedObjectives[front[i]] - calculatedObjectives[front[j]]), dimension)), 1.0 / dimension); + f(i, j) = std::pow(accu(pow(abs( + calculatedObjectives[front[i]] - calculatedObjectives[front[j]]), + dimension)), 1 / dimension); f(j, i) = f(i, j); } } @@ -529,12 +559,12 @@ inline void AGEMOEA::PairwiseDistance( //! Find the index of the of the extreme points in the given front. template void AGEMOEA::FindExtremePoints( - arma::Row& indexes, + arma::Row& indexes, std::vector >& calculatedObjectives, const std::vector& front) { typedef typename MatType::elem_type ElemType; - + if (numObjectives >= front.size()) { indexes = arma::linspace>(0, front.size() - 1, front.size()); @@ -567,13 +597,13 @@ void AGEMOEA::PointToLineDistance( { typedef typename MatType::elem_type ElemType; arma::Row distancesTemp(front.size()); - arma::Col ba = pointB - pointA; + arma::Col ba = pointB - pointA; arma::Col pa; for (size_t i = 0; i < front.size(); i++) { size_t ind = front[i]; - + pa = (calculatedObjectives[ind] - pointA); double t = arma::dot(pa, ba) / arma::dot(ba, ba); distancesTemp[i] = arma::accu(arma::pow((pa - t * ba), 2)); @@ -660,7 +690,7 @@ inline bool AGEMOEA::Dominates( allBetterOrEqual = false; // P is better than Q for the i-th objective function. 
- else if (calculatedObjectives[candidateP](i) < + else if (calculatedObjectives[candidateP](i) < calculatedObjectives[candidateQ](i)) atleastOneBetter = true; } @@ -674,7 +704,7 @@ inline typename MatType::elem_type AGEMOEA::DiversityScore( std::set& selected, const MatType& pairwiseDistance, size_t S) -{ +{ typedef typename MatType::elem_type ElemType; ElemType m = arma::datum::inf; ElemType m1 = arma::datum::inf; @@ -682,7 +712,7 @@ inline typename MatType::elem_type AGEMOEA::DiversityScore( for (it = selected.begin(); it != selected.end(); it++) { if (*it == S){ continue; } - if (pairwiseDistance(S, *it) < m) + if (pairwiseDistance(S, *it) < m) { m1 = m; m = pairwiseDistance(S, *it); @@ -705,7 +735,7 @@ inline void AGEMOEA::SurvivalScoreAssignment( std::vector>& calculatedObjectives, std::vector& survivalScore, arma::Col& normalize, - double& dimension, + typename MatType::elem_type& dimension, size_t fNum) { typedef typename MatType::elem_type ElemType; @@ -718,12 +748,12 @@ inline void AGEMOEA::SurvivalScoreAssignment( dimension = 1; arma::Row extreme(numObjectives, arma::fill::zeros); NormalizeFront(calculatedObjectives, normalize, front, extreme); - return; + return; } for (size_t index = 0; index < front.size(); index++) { - calculatedObjectives[front[index]] = calculatedObjectives[front[index]] + calculatedObjectives[front[index]] = calculatedObjectives[front[index]] - idealPoint; } @@ -733,22 +763,21 @@ inline void AGEMOEA::SurvivalScoreAssignment( for (size_t index = 0; index < front.size(); index++) { - calculatedObjectives[front[index]] = calculatedObjectives[front[index]] + calculatedObjectives[front[index]] = calculatedObjectives[front[index]] / normalize; } std::set selected; std::set remaining; - + // Create the selected and remaining sets. 
for (size_t index: extreme) - { + { selected.insert(index); - survivalScore[front[index]] = arma::datum::inf; + survivalScore[front[index]] = arma::Datum::inf; } - dimension = GetGeometry(calculatedObjectives, front, - extreme); + dimension = GetGeometry(calculatedObjectives, front, extreme); for (size_t i = 0; i < front.size(); i++) { if (selected.count(i) == 0) @@ -758,17 +787,17 @@ inline void AGEMOEA::SurvivalScoreAssignment( } arma::Mat pairwise(front.size(), front.size(), arma::fill::zeros); - PairwiseDistance(pairwise,calculatedObjectives,front,dimension); - arma::Row value(front.size(), + PairwiseDistance(pairwise, calculatedObjectives, front, dimension); + arma::Row value(front.size(), arma::fill::zeros); - + // Calculate the diversity and proximity score. for (size_t i = 0; i < front.size(); i++) { - pairwise.col(i) = pairwise.col(i) / std::pow(arma::accu(arma::pow( - arma::abs(calculatedObjectives[front[i]]), dimension)), 1.0 / dimension); + pairwise.col(i) = pairwise.col(i) / std::pow(accu(pow( + arma::abs(calculatedObjectives[front[i]]), dimension)), 1 / dimension); } - + while (remaining.size() > 0) { std::set::iterator it; @@ -789,12 +818,12 @@ inline void AGEMOEA::SurvivalScoreAssignment( { for (size_t i = 0; i < front.size(); i++) { - calculatedObjectives[front[i]] = (calculatedObjectives[front[i]]) / normalize; - survivalScore[front[i]] = 1.0 / std::pow(arma::accu(arma::pow(arma::abs( - calculatedObjectives[front[i]] - idealPoint), dimension)), - 1.0 / dimension); + calculatedObjectives[front[i]] = + (calculatedObjectives[front[i]]) / normalize; + survivalScore[front[i]] = 1 / std::pow(accu(pow(abs( + calculatedObjectives[front[i]] - idealPoint), dimension)), + 1 / dimension); } - } } diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp index 2411e87..f247081 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp +++ 
b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian.hpp @@ -30,7 +30,8 @@ namespace ens { * documentation on function types included with this distribution or on the * ensmallen website. */ -class AugLagrangian +template // TODO: remove for ensmallen 4.x +class AugLagrangianType { public: /** @@ -43,13 +44,13 @@ class AugLagrangian * @param maxIterations Maximum number of iterations of the Augmented * Lagrangian algorithm. 0 indicates no maximum. */ - AugLagrangian(const size_t maxIterations = 1000, - const double penaltyThresholdFactor = 0.25, - const double sigmaUpdateFactor = 10.0, - const L_BFGS& lbfgs = L_BFGS()); + AugLagrangianType(const size_t maxIterations = 1000, + const double penaltyThresholdFactor = 0.25, + const double sigmaUpdateFactor = 10.0, + const L_BFGS& lbfgs = L_BFGS()); /** - * Optimize the function. The value '1' is used for the initial value of each + * Optimize the function. The value '0' is used for the initial value of each * Lagrange multiplier. To set the Lagrange multipliers yourself, use the * other overload of Optimize(). * @@ -66,7 +67,8 @@ class AugLagrangian typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, bool>::type + typename std::enable_if::value && + IsAllNonMatrix::value, bool>::type Optimize(LagrangianFunctionType& function, MatType& coordinates, CallbackTypes&&... callbacks); @@ -75,9 +77,10 @@ class AugLagrangian template - bool Optimize(LagrangianFunctionType& function, - MatType& coordinates, - CallbackTypes&&... callbacks) + typename std::enable_if::value, bool>::type + Optimize(LagrangianFunctionType& function, + MatType& coordinates, + CallbackTypes&&... callbacks) { return Optimize(function, coordinates, @@ -96,29 +99,53 @@ class AugLagrangian * @tparam CallbackTypes Types of callback functions. * @param function The function to optimize. * @param coordinates Output matrix to store the optimized coordinates in. 
-   * @param initLambda Vector of initial Lagrange multipliers. Should have
-   *     length equal to the number of constraints.
-   * @param initSigma Initial penalty parameter.
+   * @param lambda Vector containing initial Lagrange multipliers. Should have
+   *     length equal to the number of constraints. This will be overwritten
+   *     with the Lagrange multipliers that are found during optimization.
+   * @param sigma Initial penalty parameter. This will be overwritten with the
+   *     final penalty value used during optimization.
    * @param callbacks Callback functions.
    */
   template<typename LagrangianFunctionType,
            typename MatType,
            typename GradType,
+           typename InVecType,
            typename... CallbackTypes>
-  typename std::enable_if<IsArmaType<GradType>::value, bool>::type
+  [[deprecated("use Optimize() with non-const lambda/sigma instead")]]
+  typename std::enable_if<IsArmaType<GradType>::value, bool>::type
   Optimize(LagrangianFunctionType& function,
            MatType& coordinates,
-           const arma::vec& initLambda,
+           const InVecType& initLambda,
            const double initSigma,
+           CallbackTypes&&... callbacks)
+  {
+    deprecatedLambda = initLambda;
+    deprecatedSigma = initSigma;
+    return Optimize(function, coordinates, this->deprecatedLambda,
+        this->deprecatedSigma, std::forward<CallbackTypes>(callbacks)...);
+  }
+
+  template<typename LagrangianFunctionType,
+           typename MatType,
+           typename GradType,
+           typename InVecType,
+           typename... CallbackTypes>
+  typename std::enable_if<IsArmaType<GradType>::value, bool>::type
+  Optimize(LagrangianFunctionType& function,
+           MatType& coordinates,
+           InVecType& lambda,
+           double& sigma,
+           CallbackTypes&&... callbacks);

   //! Forward the MatType as GradType.
   template<typename LagrangianFunctionType,
            typename MatType,
+           typename VecType,
            typename... CallbackTypes>
+  [[deprecated("use Optimize() with non-const lambda/sigma instead")]]
   bool Optimize(LagrangianFunctionType& function,
                 MatType& coordinates,
-                const arma::vec& initLambda,
+                const VecType& initLambda,
                 const double initSigma,
                 CallbackTypes&&... callbacks)
   {
@@ -127,20 +154,39 @@ class AugLagrangian
         std::forward<CallbackTypes>(callbacks)...);
   }

+  template<typename LagrangianFunctionType,
+           typename MatType,
+           typename InVecType,
+           typename... CallbackTypes>
+  bool Optimize(LagrangianFunctionType& function,
+                MatType& coordinates,
+                InVecType& lambda,
+                double& sigma,
+                CallbackTypes&&... callbacks)
+  {
+    return Optimize(function, coordinates, lambda, sigma,
+        std::forward<CallbackTypes>(callbacks)...);
+  }
+
   //!
Get the L-BFGS object used for the actual optimization. const L_BFGS& LBFGS() const { return lbfgs; } //! Modify the L-BFGS object used for the actual optimization. L_BFGS& LBFGS() { return lbfgs; } //! Get the Lagrange multipliers. - const arma::vec& Lambda() const { return lambda; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + const VecType& Lambda() const { return deprecatedLambda; } //! Modify the Lagrange multipliers (i.e. set them before optimization). - arma::vec& Lambda() { return lambda; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + VecType& Lambda() { return deprecatedLambda; } //! Get the penalty parameter. - double Sigma() const { return sigma; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + double Sigma() const { return deprecatedSigma; } //! Modify the penalty parameter. - double& Sigma() { return sigma; } + [[deprecated("use Optimize() with lambda/sigma parameters instead")]] + double& Sigma() { return deprecatedSigma; } //! Get the maximum iterations size_t MaxIterations() const { return maxIterations; } @@ -173,11 +219,11 @@ class AugLagrangian //! Controls early termination of the optimization process. bool terminate; + // NOTE: these will be removed in ensmallen 4.x! //! Lagrange multipliers. - arma::vec lambda; - + VecType deprecatedLambda; //! Penalty parameter. - double sigma; + double deprecatedSigma; /** * Internal optimization function: given an initialized AugLagrangianFunction, @@ -185,27 +231,32 @@ class AugLagrangian */ template - typename std::enable_if::value, bool>::type - Optimize(AugLagrangianFunction& augfunc, + typename std::enable_if::value, bool>::type + Optimize(AugLagrangianFunction& augfunc, MatType& coordinates, CallbackTypes&&... callbacks); //! Forward the MatType as GradType. template - bool Optimize(AugLagrangianFunction& function, - MatType& coordinates, - CallbackTypes&&... 
callbacks) + bool Optimize( + AugLagrangianFunction& function, + MatType& coordinates, + CallbackTypes&&... callbacks) { - return Optimize(function, coordinates, std::forward(callbacks)...); } }; +using AugLagrangian = AugLagrangianType; + } // namespace ens #include "aug_lagrangian_impl.hpp" diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp index d9310d9..ae3e1a2 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp +++ b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function.hpp @@ -31,19 +31,10 @@ namespace ens { * * @tparam LagrangianFunction Lagrangian function to be used. */ -template +template class AugLagrangianFunction { public: - /** - * Initialize the AugLagrangianFunction, but don't set the Lagrange - * multipliers or penalty parameters yet. Make sure you set the Lagrange - * multipliers before you use this... - * - * @param function Lagrangian function. - */ - AugLagrangianFunction(LagrangianFunction& function); - /** * Initialize the AugLagrangianFunction with the given LagrangianFunction, * Lagrange multipliers, and initial penalty parameter. @@ -53,8 +44,8 @@ class AugLagrangianFunction * @param sigma Initial penalty parameter. */ AugLagrangianFunction(LagrangianFunction& function, - const arma::vec& lambda, - const double sigma); + VecType& lambda, + double& sigma); /** * Evaluate the objective function of the Augmented Lagrangian function, which * is the standard Lagrangian function evaluation plus a penalty term, which @@ -81,17 +72,12 @@ class AugLagrangianFunction * * @return Initial point. */ - template + template const MatType& GetInitialPoint() const; - //! Get the Lagrange multipliers. - const arma::vec& Lambda() const { return lambda; } - //! Modify the Lagrange multipliers. - arma::vec& Lambda() { return lambda; } - - //! Get sigma (the penalty parameter). 
- double Sigma() const { return sigma; } - //! Modify sigma (the penalty parameter). + // Get the Lagrange multipliers. + VecType& Lambda() { return lambda; } + // Get the penalty parameter. double& Sigma() { return sigma; } //! Get the Lagrangian function. @@ -104,9 +90,9 @@ class AugLagrangianFunction LagrangianFunction& function; //! The Lagrange multipliers. - arma::vec lambda; + VecType& lambda; //! The penalty parameter. - double sigma; + double& sigma; }; } // namespace ens diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp index 092a7c2..ed7675b 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp +++ b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_function_impl.hpp @@ -20,23 +20,11 @@ namespace ens { // Initialize the AugLagrangianFunction. -template -AugLagrangianFunction::AugLagrangianFunction( - LagrangianFunction& function) : - function(function), - lambda(function.NumConstraints()), - sigma(10) -{ - // Initialize lambda vector to all zeroes. - lambda.zeros(); -} - -// Initialize the AugLagrangianFunction. -template -AugLagrangianFunction::AugLagrangianFunction( +template +AugLagrangianFunction::AugLagrangianFunction( LagrangianFunction& function, - const arma::vec& lambda, - const double sigma) : + VecType& lambda, + double& sigma) : function(function), lambda(lambda), sigma(sigma) @@ -45,9 +33,10 @@ AugLagrangianFunction::AugLagrangianFunction( } // Evaluate the AugLagrangianFunction at the given coordinates. 
-template +template template -typename MatType::elem_type AugLagrangianFunction::Evaluate( +typename MatType::elem_type +AugLagrangianFunction::Evaluate( const MatType& coordinates) const { // The augmented Lagrangian is evaluated as @@ -63,20 +52,22 @@ typename MatType::elem_type AugLagrangianFunction::Evaluate( { ElemType constraint = function.EvaluateConstraint(i, coordinates); - objective += (-lambda[i] * constraint) + - sigma * std::pow(constraint, 2) / 2; + objective += (-ElemType(lambda[i]) * constraint) + + ElemType(sigma) * std::pow(constraint, ElemType(2)) / 2; } return objective; } // Evaluate the gradient of the AugLagrangianFunction at the given coordinates. -template +template template -void AugLagrangianFunction::Gradient( +void AugLagrangianFunction::Gradient( const MatType& coordinates, GradType& gradient) const { + typedef typename MatType::elem_type ElemType; + // The augmented Lagrangian's gradient is evaluted as // f'(x) + {(-lambda_i + sigma * c_i(x)) * c'_i(x)} for all constraints gradient.zeros(); @@ -89,16 +80,17 @@ void AugLagrangianFunction::Gradient( // Now calculate scaling factor and add to existing gradient. GradType tmpGradient; - tmpGradient = (-lambda[i] + sigma * + tmpGradient = (ElemType(-lambda[i]) + ElemType(sigma) * function.EvaluateConstraint(i, coordinates)) * constraintGradient; gradient += tmpGradient; } } // Get the initial point. 
-template +template template -const MatType& AugLagrangianFunction::GetInitialPoint() +const MatType& +AugLagrangianFunction::GetInitialPoint() const { return function.template GetInitialPoint(); diff --git a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp index f972eae..75cf58c 100644 --- a/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp +++ b/inst/include/ensmallen_bits/aug_lagrangian/aug_lagrangian_impl.hpp @@ -19,70 +19,90 @@ namespace ens { -inline AugLagrangian::AugLagrangian(const size_t maxIterations, - const double penaltyThresholdFactor, - const double sigmaUpdateFactor, - const L_BFGS& lbfgs) : +template +inline AugLagrangianType::AugLagrangianType( + const size_t maxIterations, + const double penaltyThresholdFactor, + const double sigmaUpdateFactor, + const L_BFGS& lbfgs) : maxIterations(maxIterations), penaltyThresholdFactor(penaltyThresholdFactor), sigmaUpdateFactor(sigmaUpdateFactor), lbfgs(lbfgs), terminate(false), - sigma(0.0) + deprecatedSigma(0.0) { } +template template -typename std::enable_if::value, bool>::type -AugLagrangian::Optimize(LagrangianFunctionType& function, - MatType& coordinates, - const arma::vec& initLambda, - const double initSigma, - CallbackTypes&&... callbacks) +typename std::enable_if::value, bool>::type +AugLagrangianType::Optimize( + LagrangianFunctionType& function, + MatType& coordinates, + InVecType& lambda, + double& sigma, + CallbackTypes&&... callbacks) { - lambda = initLambda; - sigma = initSigma; - - AugLagrangianFunction augfunc(function, - lambda, sigma); + AugLagrangianFunction augfunc( + function, lambda, sigma); return Optimize(augfunc, coordinates, callbacks...); } +template template -typename std::enable_if::value, bool>::type -AugLagrangian::Optimize(LagrangianFunctionType& function, - MatType& coordinates, - CallbackTypes&&... 
callbacks) +typename std::enable_if::value && + IsAllNonMatrix::value, bool>::type +AugLagrangianType::Optimize(LagrangianFunctionType& function, + MatType& coordinates, + CallbackTypes&&... callbacks) { + typedef typename ForwardType::bvec InVecType; + // If the user did not specify the right size for sigma and lambda, we will // use defaults. - if (!lambda.is_empty()) + // TODO: remove this when ensmallen 4.x is released! + if (!deprecatedLambda.is_empty()) { - AugLagrangianFunction augfunc(function, lambda, - sigma); - return Optimize(augfunc, coordinates, callbacks...); + InVecType lambda(conv_to::from(deprecatedLambda)); + + AugLagrangianFunction augfunc(function, + lambda, deprecatedSigma); + const bool result = Optimize(augfunc, coordinates, callbacks...); + deprecatedLambda = conv_to::from(lambda); + + return result; } else { - AugLagrangianFunction augfunc(function); + // Use default values. + InVecType lambda(function.NumConstraints()); + lambda.zeros(); + double sigma = 10; + + AugLagrangianFunction augfunc( + function, lambda, sigma); return Optimize(augfunc, coordinates, callbacks...); } } +template template -typename std::enable_if::value, bool>::type -AugLagrangian::Optimize( - AugLagrangianFunction& augfunc, +typename std::enable_if::value, bool>::type +AugLagrangianType::Optimize( + AugLagrangianFunction& augfunc, MatType& coordinatesIn, CallbackTypes&&... callbacks) { @@ -110,13 +130,14 @@ AugLagrangian::Optimize( // Convergence tolerance---depends on the epsilon of the type we are using for // optimization. - ElemType tolerance = 1e3 * std::numeric_limits::epsilon(); + ElemType tolerance = 1000 * std::numeric_limits::epsilon(); // Then, calculate the current penalty. 
ElemType penalty = 0; for (size_t i = 0; i < function.NumConstraints(); i++) { - const ElemType p = std::pow(function.EvaluateConstraint(i, coordinates), 2); + const ElemType p = std::pow(function.EvaluateConstraint(i, coordinates), + ElemType(2)); terminate |= Callback::EvaluateConstraint(*this, function, coordinates, i, p, callbacks...); @@ -149,9 +170,6 @@ AugLagrangian::Optimize( if (std::abs(lastObjective - objective) < tolerance && augfunc.Sigma() > 500000) { - lambda = std::move(augfunc.Lambda()); - sigma = augfunc.Sigma(); - Callback::EndOptimization(*this, function, coordinates, callbacks...); return true; } @@ -167,7 +185,7 @@ AugLagrangian::Optimize( for (size_t i = 0; i < function.NumConstraints(); i++) { const ElemType p = std::pow(function.EvaluateConstraint(i, coordinates), - 2); + ElemType(2)); terminate |= Callback::EvaluateConstraint(*this, function, coordinates, i, p, callbacks...); @@ -190,12 +208,12 @@ AugLagrangian::Optimize( terminate |= Callback::EvaluateConstraint(*this, function, coordinates, i, p, callbacks...); - augfunc.Lambda()[i] -= augfunc.Sigma() * p; + augfunc.Lambda()[i] -= ElemType(augfunc.Sigma()) * p; } // We also update the penalty threshold to be a factor of the current // penalty. - penaltyThreshold = penaltyThresholdFactor * penalty; + penaltyThreshold = ElemType(penaltyThresholdFactor) * penalty; Info << "Lagrange multiplier estimates updated." << std::endl; } else @@ -208,7 +226,7 @@ AugLagrangian::Optimize( Warn << "AugLagrangian::Optimize(): sigma too large for element type; " << "terminating." 
<< std::endl; Callback::EndOptimization(*this, function, coordinates, callbacks...); - return false; + return true; } } diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp index a7e2816..816a7a0 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/adaptive_stepsize.hpp @@ -69,6 +69,8 @@ class AdaptiveStepsize class Policy { public: + typedef typename MatType::elem_type ElemType; + // Create the instantiated object. Policy(AdaptiveStepsize& parent) : parent(parent) { } @@ -104,7 +106,7 @@ class AdaptiveStepsize backtrackingBatchSize); // Update the iterate. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; // Update Gradient & calculate curvature of quadratic approximation. GradType functionGradient(iterate.n_rows, iterate.n_cols); @@ -132,8 +134,8 @@ class AdaptiveStepsize delta0 = delta1 + (functionGradient - delta1) / k; // Compute sample variance. - vB += arma::norm(functionGradient - delta1, 2.0) * - arma::norm(functionGradient - delta0, 2.0); + vB += norm(functionGradient - delta1, 2.0) * + norm(functionGradient - delta0, 2.0); delta1 = delta0; gradient += functionGradient; @@ -145,13 +147,13 @@ class AdaptiveStepsize // Update sample variance & norm of the gradient. sampleVariance = vB; - gradientNorm = std::pow(arma::norm(gradient / backtrackingBatchSize, 2), + gradientNorm = std::pow(norm(gradient / backtrackingBatchSize, 2), 2.0); // Compute curvature. - double v = arma::trace(arma::trans(iterate - iteratePrev) * + double v = trace(trans(iterate - iteratePrev) * (gradient - gradPrevIterate)) / - std::pow(arma::norm(iterate - iteratePrev, 2), 2.0); + std::pow(norm(iterate - iteratePrev, 2), 2.0); // Update previous iterate. 
iteratePrev = iterate; @@ -205,12 +207,10 @@ class AdaptiveStepsize const size_t offset, const size_t backtrackingBatchSize) { - typedef typename MatType::elem_type ElemType; - ElemType overallObjective = function.Evaluate(iterate, offset, backtrackingBatchSize); - MatType iterateUpdate = iterate - (stepSize * gradient); + MatType iterateUpdate = iterate - (ElemType(stepSize) * gradient); ElemType overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); @@ -220,7 +220,7 @@ class AdaptiveStepsize { stepSize *= parent.backtrackStepSize; - iterateUpdate = iterate - (stepSize * gradient); + iterateUpdate = iterate - (ElemType(stepSize) * gradient); overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); } diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp index 8f271b8..4739019 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/backtracking_line_search.hpp @@ -60,6 +60,8 @@ class BacktrackingLineSearch class Policy { public: + typedef typename MatType::elem_type ElemType; + // Instantiate the policy with the given parent. 
Policy(BacktrackingLineSearch& parent) : parent(parent) { } @@ -94,12 +96,10 @@ class BacktrackingLineSearch if (reset) stepSize *= 2; - typedef typename MatType::elem_type ElemType; - ElemType overallObjective = function.Evaluate(iterate, offset, backtrackingBatchSize); - MatType iterateUpdate = iterate - (stepSize * gradient); + MatType iterateUpdate = iterate - (ElemType(stepSize) * gradient); ElemType overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); @@ -109,7 +109,7 @@ class BacktrackingLineSearch { stepSize /= 2; - iterateUpdate = iterate - (stepSize * gradient); + iterateUpdate = iterate - (ElemType(stepSize) * gradient); overallObjectiveUpdate = function.Evaluate(iterateUpdate, offset, backtrackingBatchSize); } diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp index 4d670f2..ee194be 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd.hpp @@ -125,7 +125,7 @@ class BigBatchSGD typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp index cd88660..39b479f 100644 --- a/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp +++ b/inst/include/ensmallen_bits/bigbatch_sgd/bigbatch_sgd_impl.hpp @@ -50,8 +50,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type BigBatchSGD::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -137,13 +137,13 @@ BigBatchSGD::Optimize( delta0 = delta1 + (functionGradient - delta1) / k; // Compute sample variance. 
- vB += arma::norm(functionGradient - delta1, 2.0) * - arma::norm(functionGradient - delta0, 2.0); + vB += norm(functionGradient - delta1, 2.0) * + norm(functionGradient - delta0, 2.0); delta1 = delta0; gradient += functionGradient; } - double gB = std::pow(arma::norm(gradient / effectiveBatchSize, 2), 2.0); + double gB = std::pow(norm(gradient / effectiveBatchSize, 2), 2.0); // Reset the batch size update process counter. reset = false; @@ -174,13 +174,13 @@ BigBatchSGD::Optimize( delta0 = delta1 + (functionGradient - delta1) / (k + 1); // Compute sample variance. - vB += arma::norm(functionGradient - delta1, 2.0) * - arma::norm(functionGradient - delta0, 2.0); + vB += norm(functionGradient - delta1, 2.0) * + norm(functionGradient - delta0, 2.0); delta1 = delta0; gradient += functionGradient; } - gB = std::pow(arma::norm(gradient / (batchSize + batchOffset), 2), 2.0); + gB = std::pow(norm(gradient / (batchSize + batchOffset), 2), 2.0); // Update the batchSize. batchSize += batchOffset; @@ -199,7 +199,7 @@ BigBatchSGD::Optimize( reset); // Update the iterate. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); const ElemType objective = f.Evaluate(iterate, currentFunction, @@ -244,10 +244,13 @@ BigBatchSGD::Optimize( terminate |= Callback::BeginEpoch(*this, f, iterate, epoch, overallObjective, callbacks...); - // Reset the counter variables. - lastObjective = overallObjective; - overallObjective = 0; - currentFunction = 0; + // Reset the counter variables if we will continue. + if (i != actualMaxIterations) + { + lastObjective = overallObjective; + overallObjective = 0; + currentFunction = 0; + } if (shuffle) // Determine order of visitation. 
      f.Shuffle();
diff --git a/inst/include/ensmallen_bits/callbacks/timer_stop.hpp b/inst/include/ensmallen_bits/callbacks/timer_stop.hpp
index 7b7f80b..542a7a1 100644
--- a/inst/include/ensmallen_bits/callbacks/timer_stop.hpp
+++ b/inst/include/ensmallen_bits/callbacks/timer_stop.hpp
@@ -45,6 +45,28 @@ class TimerStop
     timer.tic();
   }

+  /**
+   * Callback function called when a step is taken.
+   *
+   * @param optimizer The optimizer used to update the function.
+   * @param function Function to optimize.
+   * @param coordinates Current point of the optimization.
+   */
+  template<typename OptimizerType, typename FunctionType, typename MatType>
+  bool StepTaken(OptimizerType& /* optimizer */,
+                 FunctionType& /* function */,
+                 const MatType& /* coordinates */)
+  {
+    if (timer.toc() > duration)
+    {
+      Info << "Timer timeout (" << duration << "s) reached; terminating "
+          << "optimization." << std::endl;
+      return true;
+    }
+
+    return false;
+  }
+
   /**
    * Callback function called at the end of a pass over the data.
    *
@@ -63,7 +87,8 @@
   {
     if (timer.toc() > duration)
     {
-      Info << "Timer timeout reached; terminate optimization." << std::endl;
+      Info << "Timer timeout (" << duration << "s) reached; terminating "
+          << "optimization." << std::endl;
       return true;
     }
diff --git a/inst/include/ensmallen_bits/cd/cd.hpp b/inst/include/ensmallen_bits/cd/cd.hpp
index 062a210..34510a7 100644
--- a/inst/include/ensmallen_bits/cd/cd.hpp
+++ b/inst/include/ensmallen_bits/cd/cd.hpp
@@ -94,7 +94,7 @@ class CD
            typename MatType,
            typename GradType,
            typename...
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(ResolvableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/cd/cd_impl.hpp b/inst/include/ensmallen_bits/cd/cd_impl.hpp index 8f57e27..3d1c7af 100644 --- a/inst/include/ensmallen_bits/cd/cd_impl.hpp +++ b/inst/include/ensmallen_bits/cd/cd_impl.hpp @@ -39,8 +39,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type CD::Optimize( ResolvableFunctionType& function, MatType& iterateIn, @@ -84,7 +84,7 @@ CD::Optimize( break; // Update the decision variable with the partial gradient. - iterate.col(featureIdx) -= stepSize * gradient.col(featureIdx); + iterate.col(featureIdx) -= ElemType(stepSize) * gradient.col(featureIdx); terminate |= Callback::StepTaken(*this, function, iterate, callbacks...); // Check for convergence. diff --git a/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp b/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp index e5eaeb1..e5c57d8 100644 --- a/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp +++ b/inst/include/ensmallen_bits/cd/descent_policies/random_descent.hpp @@ -52,8 +52,11 @@ class RandomDescent const MatType& /* iterate */, const ResolvableFunctionType& function) { + // return randi( + // arma::distr_param(0, function.NumFeatures() - 1)); + return arma::as_scalar(arma::randi( - 1, arma::distr_param(0, function.NumFeatures() - 1))); + 1, arma::distr_param(0, function.NumFeatures() - 1))); } }; diff --git a/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp b/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp index 1edec07..f56f50f 100644 --- a/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp +++ b/inst/include/ensmallen_bits/cmaes/active_cmaes.hpp @@ -3,8 +3,8 @@ * @author Marcus Edel * @author Suvarsha Chennareddy * - * Definition of 
the Active Covariance Matrix Adaptation Evolution Strategy - * as proposed by G.A Jastrebski and D.V Arnold in "Improving Evolution + * Definition of the Active Covariance Matrix Adaptation Evolution Strategy + * as proposed by G.A Jastrebski and D.V Arnold in "Improving Evolution * Strategies through Active Covariance Matrix Adaptation". * * ensmallen is free software; you may redistribute it and/or modify it under @@ -26,25 +26,25 @@ namespace ens { * Active CMA-ES is a variant of the stochastic search algorithm * CMA-ES - Covariance Matrix Adaptation Evolution Strategy. * Active CMA-ES actively reduces the uncertainty in unfavourable directions by - * exploiting the information about bad mutations in the covariance matrix - * update step. This isn't for the purpose of accelerating progress, but - * instead for speeding up the adaptation of the covariance matrix (which, in + * exploiting the information about bad mutations in the covariance matrix + * update step. This isn't for the purpose of accelerating progress, but + * instead for speeding up the adaptation of the covariance matrix (which, in * turn, will lead to faster progress). * * For more information, please refer to: * * @code * @INPROCEEDINGS{1688662, - * author={Jastrebski, G.A. and Arnold, D.V.}, - * booktitle={2006 IEEE International Conference on Evolutionary - Computation}, - * title={Improving Evolution Strategies through Active Covariance - Matrix Adaptation}, - * year={2006}, - * volume={}, - * number={}, - * pages={2814-2821}, - * doi={10.1109/CEC.2006.1688662}} + * author = {Jastrebski, G.A. and Arnold, D.V.}, + * booktitle = {2006 IEEE International Conference on Evolutionary + * Computation}, + * title = {Improving Evolution Strategies through Active Covariance + * Matrix Adaptation}, + * year = {2006}, + * volume = {}, + * number = {}, + * pages = {2814-2821}, + * doi = {10.1109/CEC.2006.1688662}} * @endcode * * Active CMA-ES can optimize separable functions. 
For more details, see the @@ -52,7 +52,7 @@ namespace ens { * ensmallen website. * * @tparam SelectionPolicy The selection strategy used for the evaluation step. - * @tparam TransformationPolicy The transformation strategy used to + * @tparam TransformationPolicy The transformation strategy used to * map decision variables to the desired domain during fitness evaluation * and termination. Use EmptyTransformation if the domain isn't bounded. */ @@ -62,15 +62,15 @@ class ActiveCMAES { public: /** - * Construct the Active CMA-ES optimizer with the given function and parameters. The - * defaults here are not necessarily good for the given problem, so it is - * suggested that the values used be tailored to the task at hand. The - * maximum number of iterations refers to the maximum number of points that - * are processed (i.e., one iteration equals one point; one iteration does not - * equal one pass over the dataset). + * Construct the Active CMA-ES optimizer with the given function and + * parameters. The defaults here are not necessarily good for the given + * problem, so it is suggested that the values used be tailored to the task at + * hand. The maximum number of iterations refers to the maximum number of + * points that are processed (i.e., one iteration equals one point; one + * iteration does not equal one pass over the dataset). * * @param lambda The population size (0 use the default size). - * @param transformationPolicy Instantiated transformation policy used to + * @param transformationPolicy Instantiated transformation policy used to * map the coordinates to the desired domain. * @param batchSize Batch size to use for the objective calculation. 
* @param maxIterations Maximum number of iterations allowed (0 means no @@ -82,7 +82,7 @@ class ActiveCMAES */ ActiveCMAES( const size_t lambda = 0, - const TransformationPolicyType& + const TransformationPolicyType& transformationPolicy = TransformationPolicyType(), const size_t batchSize = 32, const size_t maxIterations = 1000, @@ -91,38 +91,9 @@ class ActiveCMAES double stepSize = 0); /** - * Construct the Active CMA-ES optimizer with the given function and parameters - * (including lower and upper bounds). The defaults here are not necessarily - * good for the given problem, so it is suggested that the values used be - * tailored to the task at hand. The maximum number of iterations refers to - * the maximum number of points that are processed (i.e., one iteration - * equals one point; one iteration does not equal one pass over the dataset). - * - * @param lambda The population size(0 use the default size). - * @param lowerBound Lower bound of decision variables. - * @param upperBound Upper bound of decision variables. - * @param batchSize Batch size to use for the objective calculation. - * @param maxIterations Maximum number of iterations allowed(0 means no - limit). - * @param tolerance Maximum absolute tolerance to terminate algorithm. - * @param selectionPolicy Instantiated selection policy used to calculate the - * objective. - * @param stepSize Starting sigma/step size (will be modified). - */ - ActiveCMAES( - const size_t lambda = 0, - const double lowerBound = -10, - const double upperBound = 10, - const size_t batchSize = 32, - const size_t maxIterations = 1000, - const double tolerance = 1e-5, - const SelectionPolicyType& selectionPolicy = SelectionPolicyType(), - double stepSize = 0); - - /** - * Optimize the given function using Active CMA-ES. The given starting point will be - * modified to store the finishing point of the algorithm, and the final - * objective value is returned. + * Optimize the given function using Active CMA-ES. 
The given starting point + * will be modified to store the finishing point of the algorithm, and the + * final objective value is returned. * * @tparam SeparableFunctionType Type of the function to be optimized. * @tparam MatType Type of matrix to optimize. @@ -169,7 +140,7 @@ class ActiveCMAES const TransformationPolicyType& TransformationPolicy() const { return transformationPolicy; } //! Modify the transformation policy. - TransformationPolicyType& TransformationPolicy() + TransformationPolicyType& TransformationPolicy() { return transformationPolicy; } //! Get the step size. @@ -196,7 +167,7 @@ class ActiveCMAES SelectionPolicyType selectionPolicy; //! The transformationPolicy used to map coordinates to the suitable domain - //! while evaluating fitness. This mapping is also done after optimization + //! while evaluating fitness. This mapping is also done after optimization //! has completed. TransformationPolicyType transformationPolicy; diff --git a/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp b/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp index 047de85..cd7d9ed 100644 --- a/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp +++ b/inst/include/ensmallen_bits/cmaes/active_cmaes_impl.hpp @@ -18,7 +18,6 @@ // In case it hasn't been included yet. #include "active_cmaes.hpp" -#include "not_empty_transformation.hpp" #include namespace ens { @@ -42,29 +41,6 @@ ActiveCMAES::ActiveCMAES( stepSize(stepSizeIn) { /* Nothing to do. 
*/ } -template -ActiveCMAES::ActiveCMAES( - const size_t lambda, - const double lowerBound, - const double upperBound, - const size_t batchSize, - const size_t maxIterations, - const double tolerance, - const SelectionPolicyType& selectionPolicy, - double stepSizeIn) : - lambda(lambda), - batchSize(batchSize), - maxIterations(maxIterations), - tolerance(tolerance), - selectionPolicy(selectionPolicy), - stepSize(stepSizeIn) -{ - Warn << "This is a deprecated constructor and will be removed in a " - "future version of ensmallen" << std::endl; - NotEmptyTransformation> d; - d.Assign(transformationPolicy, lowerBound, upperBound); -} - //! Optimize the function (minimize). template template::BaseMatType BaseMatType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::uvec UVecType; + // Make sure that we have the methods that we need. Long name... traits::CheckArbitrarySeparableFunctionTypeAPI< SeparableFunctionType, BaseMatType>(); @@ -105,21 +84,23 @@ typename MatType::elem_type ActiveCMAES mPosition(2, BaseMatType(iterate.n_rows, iterate.n_cols)); @@ -163,13 +144,13 @@ typename MatType::elem_type ActiveCMAES eigval; + BaseColType eigval; BaseMatType eigvec; BaseMatType eigvalZero(iterate.n_elem, 1); // eigvalZero is vector-shaped. eigvalZero.zeros(); // The current visitation order (sorted by population objectives). - arma::uvec idx = arma::linspace(0, lambda - 1, lambda); + UVecType idx = linspace(0, lambda - 1, lambda); // Now iterate! 
Callback::BeginOptimization(*this, function, transformedIterate, @@ -191,21 +172,22 @@ typename MatType::elem_type ActiveCMAES::epsilon(); - arma::eig_sym(eigval, eigvec, C[idx0]); + eig_sym(eigval, eigvec, C[idx0]); for (size_t j = 0; j < lambda; ++j) { if (iterate.n_rows > iterate.n_cols) { pStep[idx(j)] = covLower * - arma::randn(iterate.n_rows, iterate.n_cols); + randn(iterate.n_rows, iterate.n_cols); } else { - pStep[idx(j)] = arma::randn(iterate.n_rows, iterate.n_cols) + pStep[idx(j)] = randn(iterate.n_rows, iterate.n_cols) * covLower.t(); } @@ -218,7 +200,7 @@ typename MatType::elem_type ActiveCMAES 1e14) @@ -308,8 +290,8 @@ typename MatType::elem_type ActiveCMAES namespace ens { template CMAES::CMAES(const size_t lambda, - const TransformationPolicyType& + const TransformationPolicyType& transformationPolicy, const size_t batchSize, const size_t maxIterations, @@ -41,35 +40,12 @@ CMAES::CMAES(const size_t lambda, stepSize(stepSizeIn) { /* Nothing to do. */ } -template -CMAES::CMAES(const size_t lambda, - const double lowerBound, - const double upperBound, - const size_t batchSize, - const size_t maxIterations, - const double tolerance, - const SelectionPolicyType& selectionPolicy, - double stepSizeIn) : - lambda(lambda), - batchSize(batchSize), - maxIterations(maxIterations), - tolerance(tolerance), - selectionPolicy(selectionPolicy), - stepSize(stepSizeIn) -{ - Warn << "This is a deprecated constructor and will be removed in a " - "future version of ensmallen" << std::endl; - NotEmptyTransformation> d; - d.Assign(transformationPolicy, lowerBound, upperBound); -} - - //! Optimize the function (minimize). 
template template -typename MatType::elem_type CMAES::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -77,7 +53,10 @@ typename MatType::elem_type CMAES::BaseMatType BaseMatType; + + typedef typename ForwardType::bcol bcol; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::bmat BaseMatType; // Make sure that we have the methods that we need. Long name... traits::CheckArbitrarySeparableFunctionTypeAPI< @@ -95,18 +74,18 @@ typename MatType::elem_type CMAES(0, mu - 1, mu) + 1.0); - w /= arma::accu(w); + BaseMatType w = std::log(mu + 0.5) - log( + linspace(0, mu - 1, mu) + 1.0); + w /= accu(w); // Number of effective solutions. - const double muEffective = 1 / arma::accu(arma::pow(w, 2)); + const double muEffective = 1 / accu(pow(w, 2)); // Step size control parameters. BaseMatType sigma(2, 1); // sigma is vector-shaped. - if (stepSize == 0) + if (stepSize == 0) sigma(0) = transformationPolicy.InitialStepSize(); - else + else sigma(0) = stepSize; const double cs = (muEffective + 2) / (iterate.n_elem + muEffective + 5); @@ -151,7 +130,6 @@ typename MatType::elem_type CMAES::max(); @@ -170,13 +148,13 @@ typename MatType::elem_type CMAES eigval; // TODO: might need a more general type. + bcol eigval; // TODO: might need a more general type. BaseMatType eigvec; BaseMatType eigvalZero(iterate.n_elem, 1); // eigvalZero is vector-shaped. eigvalZero.zeros(); // The current visitation order (sorted by population objectives). - arma::uvec idx = arma::linspace(0, lambda - 1, lambda); + UVecType idx = linspace(0, lambda - 1, lambda); // Now iterate! 
Callback::BeginOptimization(*this, function, transformedIterate, @@ -196,22 +174,24 @@ typename MatType::elem_type CMAES::epsilon(); - arma::eig_sym(eigval, eigvec, C[idx0]); + eig_sym(eigval, eigvec, C[idx0]); for (size_t j = 0; j < lambda; ++j) { if (iterate.n_rows > iterate.n_cols) { - pStep[idx(j)] = covLower * - arma::randn(iterate.n_rows, iterate.n_cols); + pStep[idx(j)] = covLower * BaseMatType( + iterate.n_rows, iterate.n_cols, GetFillType::randn); } else { - pStep[idx(j)] = arma::randn(iterate.n_rows, iterate.n_cols) - * covLower.t(); + pStep[idx(j)] = BaseMatType( + iterate.n_rows, iterate.n_cols, GetFillType::randn) * + covLower.t(); } pPosition[idx(j)] = mPosition[idx0] + sigma(idx0) * pStep[idx(j)]; @@ -223,7 +203,7 @@ typename MatType::elem_type CMAES iterate.n_cols) { ps[idx1] = (1 - cs) * ps[idx0] + std::sqrt( - cs * (2 - cs) * muEffective) * - eigvec * diagmat(1 / eigval) * eigvec.t() * step; + cs * (2 - cs) * muEffective) * eigvec * + diagmat(1 / eigval) * eigvec.t() * step; } else { ps[idx1] = (1 - cs) * ps[idx0] + std::sqrt( - cs * (2 - cs) * muEffective) * step * - eigvec * diagmat(1 / eigval) * eigvec.t(); + cs * (2 - cs) * muEffective) * step * eigvec * + diagmat(1 / eigval) * eigvec.t(); } - const ElemType psNorm = arma::norm(ps[idx1]); + const ElemType psNorm = norm(ps[idx1]); sigma(idx1) = sigma(idx0) * std::exp(cs / ds * (psNorm / enn - 1)); if (std::isnan(sigma(idx1)) || sigma(idx1) > 1e14) { Warn << "The step size diverged to " << sigma(idx1) << "; " - << "terminating with failure. Try a smaller step size?" << std::endl; + << "terminating with failure. Try a smaller step size?" 
<< std::endl; iterate = transformationPolicy.Transform(iterate); @@ -278,20 +256,20 @@ typename MatType::elem_type CMAES iterate.n_cols) { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * - (pc[idx1] * pc[idx1].t()); + (pc[idx1] * pc[idx1].t()); } else { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * - (pc[idx1].t() * pc[idx1]); + (pc[idx1].t() * pc[idx1]); } } else @@ -301,12 +279,12 @@ typename MatType::elem_type CMAES iterate.n_cols) { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * (pc[idx1] * - pc[idx1].t() + (cc * (2 - cc)) * C[idx0]); + pc[idx1].t() + (cc * (2 - cc)) * C[idx0]); } else { C[idx1] = (1 - c1 - cmu) * C[idx0] + c1 * - (pc[idx1].t() * pc[idx1] + (cc * (2 - cc)) * C[idx0]); + (pc[idx1].t() * pc[idx1] + (cc * (2 - cc)) * C[idx0]); } } @@ -314,21 +292,19 @@ typename MatType::elem_type CMAES patience) { Info << "CMA-ES: minimized within tolerance " << tolerance << "; " - << "terminating optimization." << std::endl; + << "terminating optimization." << std::endl; iterate = transformationPolicy.Transform(iterate); Callback::EndOptimization(*this, function, iterate, callbacks...); diff --git a/inst/include/ensmallen_bits/cmaes/not_empty_transformation.hpp b/inst/include/ensmallen_bits/cmaes/not_empty_transformation.hpp deleted file mode 100644 index 5252a42..0000000 --- a/inst/include/ensmallen_bits/cmaes/not_empty_transformation.hpp +++ /dev/null @@ -1,42 +0,0 @@ -/** - * @file not_empty_transformation.hpp - * @author Suvarsha Chennareddy - * - * Check whether TransformationPolicyType is EmptyTransformation. - * - * ensmallen is free software; you may redistribute it and/or modify it under - * the terms of the 3-clause BSD license. You should have received a copy of - * the 3-clause BSD license along with ensmallen. If not, see - * http://www.opensource.org/licenses/BSD-3-Clause for more information. 
- */ -#ifndef ENSMALLEN_CMAES_NOT_EMPTY_TRANSFORMATION -#define ENSMALLEN_CMAES_NOT_EMPTY_TRANSFORMATION - -/** - * This partial specialization is used to throw an exception when the - * TransformationPolicyType is EmptyTransformation and call a constructor with - * parameters 'lowerBound' and 'upperBound' otherwise. This shall be removed - * when the deprecated constructor is removed in the next major version of - * ensmallen. - */ -template -struct NotEmptyTransformation : std::true_type -{ - void Assign(T1& obj, double lowerBound, double upperBound) - { - obj = T1(lowerBound, upperBound); - } -}; - -template class T, typename... A, typename... B> -struct NotEmptyTransformation, T> : std::false_type -{ - void Assign(T& /* obj */, - double /* lowerBound */, - double /* upperBound */) - { - throw std::logic_error("TransformationPolicyType is EmptyTransformation"); - } -}; - -#endif diff --git a/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp b/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp index 7614305..17e1bcc 100644 --- a/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp +++ b/inst/include/ensmallen_bits/cmaes/pop_cmaes.hpp @@ -6,7 +6,7 @@ * Definition of the IPOP Covariance Matrix Adaptation Evolution Strategy * as proposed by A. Auger and N. Hansen in "A Restart CMA Evolution * Strategy With Increasing Population Size" and BIPOP Covariance Matrix - * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking + * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking * a BI-population CMA-ES on the BBOB-2009 function testbed". * * ensmallen is free software; you may redistribute it and/or modify it under @@ -24,55 +24,59 @@ namespace ens { /** * Population-based CMA-ES (POP-CMA-ES) that can operate as either IPOP-CMA-ES * or BIPOP-CMA-ES based on a flag. - * + * * IPOP CMA-ES is a variant of the stochastic search algorithm * CMA-ES - Covariance Matrix Adaptation Evolution Strategy. 
- * IPOP CMA-ES, also known as CMAES with increasing population size, + * IPOP CMA-ES, also known as CMAES with increasing population size, * incorporates a restart strategy that involves gradually increasing - * the population size. This approach is specifically designed to + * the population size. This approach is specifically designed to * enhance the performance of CMA-ES on multi-modal functions. * * For more information, please refer to: * * @code * @INPROCEEDINGS{1554902, - * author={Auger, A. and Hansen, N.}, - * booktitle={2005 IEEE Congress on Evolutionary Computation}, - * title={A restart CMA evolution strategy with increasing population size}, - * year={2005}, - * volume={2}, - * number={}, - * pages={1769-1776 Vol. 2}, - * doi={10.1109/CEC.2005.1554902}} + * author = {Auger, A. and Hansen, N.}, + * booktitle = {2005 IEEE Congress on Evolutionary Computation}, + * title = {A restart CMA evolution strategy with increasing population + * size}, + * year = {2005}, + * volume = {2}, + * number = {}, + * pages = {1769-1776 Vol. 2}, + * doi = {10.1109/CEC.2005.1554902}} * @endcode - * + * * IPOP CMA-ES can optimize separable functions. For more details, see the * documentation on function types included with this distribution or on the * ensmallen website. - * + * * BI-Population CMA-ES is a variant of the stochastic search algorithm * CMA-ES - Covariance Matrix Adaptation Evolution Strategy. - * It implements a dual restart strategy with varying population sizes: one + * It implements a dual restart strategy with varying population sizes: one * increasing and one with smaller, varied sizes. This BI-population approach - * is designed to optimize performance on multi-modal function testbeds by + * is designed to optimize performance on multi-modal function testbeds by * leveraging different exploration and exploitation dynamics. 
* * For more information, please refer to: * * @code * @inproceedings{hansen2009benchmarking, - * title={Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed}, - * author={Hansen, Nikolaus}, - * booktitle={Proceedings of the 11th annual conference companion on genetic and evolutionary computation conference: late breaking papers}, - * pages={2389--2396}, - * year={2009}} + * title = {Benchmarking a BI-population CMA-ES on the BBOB-2009 function + * testbed}, + * author = {Hansen, Nikolaus}, + * booktitle = {Proceedings of the 11th annual conference companion on genetic + * and evolutionary computation conference: late breaking + * papers}, + * pages = {2389--2396}, + * year = {2009}} * @endcode * * BI-Population CMA-ES can efficiently handle separable, multimodal, and weak - * structure functions across various dimensions, as demonstrated in the + * structure functions across various dimensions, as demonstrated in the * comprehensive results of the BBOB-2009 function testbed. The optimizer - * utilizes an interlaced multistart strategy to balance between broad - * exploration and intensive exploitation, adjusting population sizes and + * utilizes an interlaced multistart strategy to balance between broad + * exploration and intensive exploitation, adjusting population sizes and * step-sizes dynamically. */ template public: /** * Construct the POP-CMA-ES optimizer with the given parameters. - * Other than the same CMA-ES parameters, it also adds the maximum number of - * restarts, the increase in population factor, the maximum number of + * Other than the same CMA-ES parameters, it also adds the maximum number of + * restarts, the increase in population factor, the maximum number of * evaluations, as well as a flag indicating to use BIPOP or not. * The suggested values are not necessarily good for the given problem, so it * is suggested that the values used be tailored to the task at hand. 
The * maximum number of iterations refers to the maximum number of points that * are processed (i.e., one iteration equals one point; one iteration does not * equal one pass over the dataset). - * + * * @param lambda The initial population size (0 use the default size). * @param transformationPolicy Instantiated transformation policy used to * map the coordinates to the desired domain. @@ -107,7 +111,7 @@ class POP_CMAES : public CMAES * @param maxFunctionEvaluations Maximum number of function evaluations. */ POP_CMAES(const size_t lambda = 0, - const TransformationPolicyType& transformationPolicy = + const TransformationPolicyType& transformationPolicy = TransformationPolicyType(), const size_t batchSize = 32, const size_t maxIterations = 1000, @@ -161,11 +165,13 @@ class POP_CMAES : public CMAES // Define IPOP_CMAES and BIPOP_CMAES using the POP_CMAES template template> -using IPOP_CMAES = POP_CMAES; +using IPOP_CMAES = POP_CMAES< + SelectionPolicyType, TransformationPolicyType, false>; template> -using BIPOP_CMAES = POP_CMAES; +using BIPOP_CMAES = POP_CMAES< + SelectionPolicyType, TransformationPolicyType, true>; } // namespace ens diff --git a/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp b/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp index 38b8011..2fa26c1 100644 --- a/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp +++ b/inst/include/ensmallen_bits/cmaes/pop_cmaes_impl.hpp @@ -6,7 +6,7 @@ * Implementation of the IPOP Covariance Matrix Adaptation Evolution Strategy * as proposed by A. Auger and N. Hansen in "A Restart CMA Evolution * Strategy With Increasing Population Size" and BIPOP Covariance Matrix - * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking + * Adaptation Evolution Strategy as proposed by N. Hansen in "Benchmarking * a BI-population CMA-ES on the BBOB-2009 function testbed". 
* * ensmallen is free software; you may redistribute it and/or modify it under @@ -48,7 +48,7 @@ template template -typename MatType::elem_type POP_CMAES::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -65,9 +65,9 @@ typename MatType::elem_type POP_CMAES::Optimize(function, iterate, sbc, - callbacks...); + ElemType overallObjective = CMAES::Optimize(function, iterate, sbc, + callbacks...); overallSBC = sbc; ElemType objective; @@ -85,7 +85,7 @@ typename MatType::elem_type POP_CMAESPopulationSize() << "." << std::endl; - + iterate = iterateIn; // Optimize using the CMAES object. - objective = CMAES::Optimize(function, iterate, sbc, + objective = CMAES::Optimize(function, iterate, sbc, callbacks...); evaluations = this->FunctionEvaluations(); @@ -110,10 +110,10 @@ typename MatType::elem_type POP_CMAES(); - size_t smallLambda = static_cast(defaultLambda * std::pow(0.5 * + size_t smallLambda = static_cast(defaultLambda * std::pow(0.5 * currentLargeLambda / defaultLambda, u * u)); double stepSizeSmall = 2 * std::pow(10, -2 * arma::randu()); - + this->PopulationSize() = smallLambda; this->StepSize() = stepSizeSmall; @@ -121,10 +121,10 @@ typename MatType::elem_type POP_CMAESPopulationSize() << "." << std::endl; iterate = iterateIn; - + // Optimize using the CMAES object. - objective = CMAES::Optimize(function, iterate, sbc, + objective = CMAES::Optimize(function, iterate, sbc, callbacks...); evaluations = this->FunctionEvaluations(); @@ -160,4 +160,4 @@ typename MatType::elem_type POP_CMAES + template void Reproduce(std::vector& population, const MatType& fitnessValues, - arma::uvec& index); + IndexType& index); //! Modify weights with some noise for the evolution of next generation. - template - void Mutate(std::vector& population, arma::uvec& index); + template + void Mutate(std::vector& population, IndexType& index); /** * Crossover parents and create new childs. Two parents create two new childs. 
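The `cne.hpp` hunk above generalizes `Reproduce()` and `Mutate()` from a hard-coded `arma::uvec` index to a template `IndexType` parameter, so the sort permutation can live in whatever index container the matrix backend supplies. A minimal stand-alone sketch of the same pattern, using `std::vector<std::size_t>` as a hypothetical index type (a stand-in for illustration, not ensmallen's actual API):

```c++
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Stand-in for arma::sort_index(): fill `index` with the permutation that
// orders `fitnessValues` ascending (smaller fitness means a better candidate,
// as in CNE::Reproduce()). Written against any index container offering
// resize(), begin()/end(), and operator[], mirroring the IndexType template
// parameter introduced in the hunk above.
template<typename ValueType, typename IndexType>
void SortIndex(const std::vector<ValueType>& fitnessValues, IndexType& index)
{
  index.resize(fitnessValues.size());
  std::iota(index.begin(), index.end(), std::size_t{0});
  std::sort(index.begin(), index.end(),
      [&fitnessValues](std::size_t a, std::size_t b)
      { return fitnessValues[a] < fitnessValues[b]; });
}
```

With `IndexType = std::vector<std::size_t>` this compiles against the standard library alone; the same call sites keep working when a backend supplies a different index vector type.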
diff --git a/inst/include/ensmallen_bits/cne/cne_impl.hpp b/inst/include/ensmallen_bits/cne/cne_impl.hpp index 24d1812..6e74a75 100644 --- a/inst/include/ensmallen_bits/cne/cne_impl.hpp +++ b/inst/include/ensmallen_bits/cne/cne_impl.hpp @@ -47,6 +47,7 @@ typename MatType::elem_type CNE::Optimize(ArbitraryFunctionType& function, // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::uvec UVecType; // Make sure that we have the methods that we need. Long name... traits::CheckArbitraryFunctionTypeAPI population; for (size_t i = 0 ; i < populationSize; ++i) { - population.push_back(arma::randn(iterate.n_rows, - iterate.n_cols) + iterate); + population.push_back(BaseMatType(iterate.n_rows, iterate.n_cols, + GetFillType::randn) + iterate); } // Store the number of elements in the objective matrix. @@ -164,13 +165,13 @@ typename MatType::elem_type CNE::Optimize(ArbitraryFunctionType& function, } //! Reproduce candidates to create the next generation. -template +template inline void CNE::Reproduce(std::vector& population, const MatType& fitnessValues, - arma::uvec& index) + IndexType& index) { // Sort fitness values. Smaller fitness value means better performance. - index = arma::sort_index(fitnessValues); + index = sort_index(fitnessValues); // First parent. size_t mom; @@ -241,17 +242,20 @@ inline void CNE::Crossover(std::vector& population, } //! Modify weights with some noise for the evolution of next generation. -template -inline void CNE::Mutate(std::vector& population, arma::uvec& index) +template +inline void CNE::Mutate(std::vector& population, IndexType& index) { + typedef typename MatType::elem_type ElemType; + // Mutate the whole matrix with the given rate and probability. // The best candidate is not altered. 
for (size_t i = 1; i < populationSize; i++) { - population[index(i)] += (arma::randu(population[index(i)].n_rows, - population[index(i)].n_cols) < mutationProb) % - (mutationSize * arma::randn(population[index(i)].n_rows, - population[index(i)].n_cols)); + population[index(i)] += conv_to::from( + randu(population[index(i)].n_rows, + population[index(i)].n_cols) < ElemType(mutationProb)) % + (ElemType(mutationSize) * MatType(population[index(i)].n_rows, + population[index(i)].n_cols, GetFillType::randn)); } } diff --git a/inst/include/ensmallen_bits/de/de.hpp b/inst/include/ensmallen_bits/de/de.hpp index 93c41aa..75449a6 100644 --- a/inst/include/ensmallen_bits/de/de.hpp +++ b/inst/include/ensmallen_bits/de/de.hpp @@ -45,10 +45,10 @@ namespace ens { * * @code * @techreport{storn1995, - * title = {Differential Evolution—a simple and efficient adaptive scheme - * for global optimization over continuous spaces}, - * author = {Storn, Rainer and Price, Kenneth}, - * year = 1995 + * title = {Differential Evolution—a simple and efficient adaptive scheme + * for global optimization over continuous spaces}, + * author = {Storn, Rainer and Price, Kenneth}, + * year = 1995 * } * @endcode * diff --git a/inst/include/ensmallen_bits/de/de_impl.hpp b/inst/include/ensmallen_bits/de/de_impl.hpp index 09e55a0..c44cef6 100644 --- a/inst/include/ensmallen_bits/de/de_impl.hpp +++ b/inst/include/ensmallen_bits/de/de_impl.hpp @@ -40,14 +40,16 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::vec ColType; BaseMatType& iterate = (BaseMatType&) iterateIn; // Population matrix. Each column is a candidate. std::vector population; population.resize(populationSize); + // Vector of fitness values corresponding to each candidate. 
- arma::Col fitnessValues; + ColType fitnessValues; // Make sure that we have the methods that we need. Long name... traits::CheckArbitraryFunctionTypeAPI< @@ -57,13 +59,13 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, // Population Size must be at least 3 for DE to work. if (populationSize < 3) { - throw std::logic_error("CNE::Optimize(): population size should be at least" + throw std::logic_error("DE::Optimize(): population size should be at least" " 3!"); } // Initialize helper variables. fitnessValues.set_size(populationSize); - ElemType lastBestFitness = DBL_MAX; + ElemType lastBestFitness = std::numeric_limits::max(); BaseMatType bestElement; // Controls early termination of the optimization process. @@ -82,7 +84,7 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, if (fitnessValues[i] < lastBestFitness) { - lastBestFitness = fitnessValues[i]; + lastBestFitness = ElemType(fitnessValues[i]); bestElement = population[i]; } } @@ -111,16 +113,17 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, while (m == member && m == l); // Generate new "mutant" from two randomly chosen members. - BaseMatType mutant = bestElement + differentialWeight * + BaseMatType mutant = bestElement + ElemType(differentialWeight) * (population[l] - population[m]); // Perform crossover. - const BaseMatType cr = arma::randu(iterate.n_rows); + BaseMatType cr; + cr.randu(iterate.n_rows, 1); for (size_t it = 0; it < iterate.n_rows; it++) { - if (cr[it] >= crossoverRate) + if (cr[it] >= ElemType(crossoverRate)) { - mutant[it] = iterate[it]; + mutant(it) = ElemType(iterate(it)); } } @@ -158,7 +161,7 @@ typename MatType::elem_type DE::Optimize(FunctionType& function, } // Update helper variables. 
- lastBestFitness = fitnessValues.min(); + lastBestFitness = ElemType(fitnessValues.min()); for (size_t it = 0; it < populationSize; it++) { if (fitnessValues[it] == lastBestFitness) diff --git a/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp b/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp index e524531..e37b24e 100644 --- a/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp +++ b/inst/include/ensmallen_bits/demon_adam/demon_adam.hpp @@ -31,11 +31,11 @@ namespace ens { * * @code * @misc{ - * title = {Decaying momentum helps neural network training}, - * author = {John Chen and Cameron Wolfe and Zhao Li - * and Anastasios Kyrillidis}, - * url = {https://arxiv.org/abs/1910.04952} - * year = {2019} + * title = {Decaying momentum helps neural network training}, + * author = {John Chen and Cameron Wolfe and Zhao Li + * and Anastasios Kyrillidis}, + * url = {https://arxiv.org/abs/1910.04952} + * year = {2019} * } * * DemonAdam can optimize differentiable separable functions. For more details, diff --git a/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp b/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp index 47f6b36..b7581da 100644 --- a/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp +++ b/inst/include/ensmallen_bits/demon_adam/demon_adam_update.hpp @@ -90,6 +90,7 @@ class DemonAdamUpdate // Convenient typedef. 
typedef typename UpdateRule::template Policy InstUpdateRuleType; + typedef typename MatType::elem_type ElemType; /** * This constructor is called by the SGD Optimize() method before the start @@ -103,7 +104,8 @@ class DemonAdamUpdate const size_t rows, const size_t cols) : parent(parent), - adamUpdate(new InstUpdateRuleType(parent.adamUpdateInst, rows, cols)) + adamUpdate(new InstUpdateRuleType(parent.adamUpdateInst, rows, cols)), + betaInit(ElemType(parent.betaInit)) { /* Nothing to do here */ } /** @@ -125,12 +127,12 @@ class DemonAdamUpdate const double stepSize, const GradType& gradient) { - double decayRate = 1; + ElemType decayRate = 1; if (parent.t > 0) - decayRate = 1.0 - (double) parent.t / (double) parent.T; + decayRate = 1 - ElemType((double) parent.t / (double) parent.T); - const double betaDecay = parent.betaInit * decayRate; - const double beta = betaDecay / ((1.0 - parent.betaInit) + betaDecay); + const ElemType betaDecay = betaInit * decayRate; + const ElemType beta = betaDecay / ((1 - betaInit) + betaDecay); // Perform the update. iterate *= beta; @@ -143,11 +145,14 @@ class DemonAdamUpdate } private: - //! Instantiated parent object. + // Instantiated parent object. DemonAdamUpdate& parent; - //! The update policy. + // The update policy. InstUpdateRuleType* adamUpdate; + + // Optimizer parameter converted to the element type of the optimization. 
+    ElemType betaInit;
   };

  private:
diff --git a/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp b/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp
index ddbf1d2..4c8d514 100644
--- a/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp
+++ b/inst/include/ensmallen_bits/demon_sgd/demon_sgd.hpp
@@ -25,11 +25,11 @@ namespace ens {
 *
 * @code
 * @misc{
- *   title  = {Decaying momentum helps neural network training},
- *   author = {John Chen and Cameron Wolfe and Zhao Li
- *             and Anastasios Kyrillidis},
- *   url    = {https://arxiv.org/abs/1910.04952}
- *   year   = {2019}
+ *   title  = {Decaying momentum helps neural network training},
+ *   author = {John Chen and Cameron Wolfe and Zhao Li
+ *             and Anastasios Kyrillidis},
+ *   url    = {https://arxiv.org/abs/1910.04952},
+ *   year   = {2019}
 * }
 *
 * DemonSGD can optimize differentiable separable functions. For more details,
diff --git a/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp b/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp
index dc8b7c5..41638a3 100644
--- a/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp
+++ b/inst/include/ensmallen_bits/demon_sgd/demon_sgd_update.hpp
@@ -78,6 +78,8 @@ class DemonSGDUpdate
   class Policy
   {
    public:
+    typedef typename MatType::elem_type ElemType;
+
     /**
      * This constructor is called by the SGD Optimize() method before the start
      * of the iteration update process.
@@ -89,7 +91,8 @@ class DemonSGDUpdate Policy(DemonSGDUpdate& parent, const size_t /* rows */, const size_t /* cols */) : - parent(parent) + parent(parent), + betaInit(ElemType(parent.betaInit)) { /* Nothing to do here */ } /** @@ -103,34 +106,37 @@ class DemonSGDUpdate const double stepSize, const GradType& gradient) { - double decayRate = 1; + ElemType decayRate = 1; if (parent.t > 0) - decayRate = 1.0 - (double) parent.t / (double) parent.T; + decayRate = 1 - ElemType((double) parent.t / (double) parent.T); - const double betaDecay = parent.betaInit * decayRate; - const double beta = betaDecay / ((1.0 - parent.betaInit) + betaDecay); + const ElemType betaDecay = betaInit * decayRate; + const ElemType beta = betaDecay / ((1 - betaInit) + betaDecay); // Perform the update. iterate *= beta; - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; // Increment the iteration counter variable. ++parent.t; } private: - //! Instantiated parent object. + // Instantiated parent object. DemonSGDUpdate& parent; + + // Optimizer parameter converted to the element type of the optimization. + ElemType betaInit; }; private: - //! The number of momentum iterations. + // The number of momentum iterations. size_t T; - //! Initial momentum coefficient. + // Initial momentum coefficient. double betaInit; - //! The number of iterations. + // The number of iterations. size_t t; }; diff --git a/inst/include/ensmallen_bits/ens_version.hpp b/inst/include/ensmallen_bits/ens_version.hpp index 86c29ba..d6bf013 100644 --- a/inst/include/ensmallen_bits/ens_version.hpp +++ b/inst/include/ensmallen_bits/ens_version.hpp @@ -12,20 +12,20 @@ // This follows the Semantic Versioning pattern defined in https://semver.org/. -#define ENS_VERSION_MAJOR 2 +#define ENS_VERSION_MAJOR 3 // The minor version is two digits so regular numerical comparisons of versions // work right. The first minor version of a release is always 10. 
-#define ENS_VERSION_MINOR 22 -#define ENS_VERSION_PATCH 1 +#define ENS_VERSION_MINOR 10 +#define ENS_VERSION_PATCH 0 // If this is a release candidate, it will be reflected in the version name // (i.e. the version name will be "RC1", "RC2", etc.). Otherwise the version // name will typically be a seemingly arbitrary set of words that does not // contain the capitalized string "RC". -#define ENS_VERSION_NAME "E-Bike Excitement" +#define ENS_VERSION_NAME "Unexpected Rain" // Incorporate the date the version was released. -#define ENS_VERSION_YEAR "2024" -#define ENS_VERSION_MONTH "12" -#define ENS_VERSION_DAY "02" +#define ENS_VERSION_YEAR "2025" +#define ENS_VERSION_MONTH "09" +#define ENS_VERSION_DAY "25" namespace ens { diff --git a/inst/include/ensmallen_bits/eve/eve.hpp b/inst/include/ensmallen_bits/eve/eve.hpp index cc38591..c240c6d 100644 --- a/inst/include/ensmallen_bits/eve/eve.hpp +++ b/inst/include/ensmallen_bits/eve/eve.hpp @@ -106,7 +106,7 @@ class Eve typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/eve/eve_impl.hpp b/inst/include/ensmallen_bits/eve/eve_impl.hpp index 3237a4e..7cf4c7d 100644 --- a/inst/include/ensmallen_bits/eve/eve_impl.hpp +++ b/inst/include/ensmallen_bits/eve/eve_impl.hpp @@ -49,8 +49,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type Eve::Optimize(SeparableFunctionType& function, MatType& iterateIn, CallbackTypes&&... 
callbacks) @@ -126,29 +126,37 @@ Eve::Optimize(SeparableFunctionType& function, if (terminate) break; - m *= beta1; - m += (1 - beta1) * gradient; + m *= ElemType(beta1); + m += (1 - ElemType(beta1)) * gradient; - v *= beta2; - v += (1 - beta2) * (gradient % gradient); + v *= ElemType(beta2); + v += (1 - ElemType(beta2)) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(beta1, (double) (i + 1)); - const double biasCorrection2 = 1.0 - std::pow(beta2, (double) (i + 1)); + const ElemType biasCorrection1 = + 1 - std::pow(ElemType(beta1), ElemType(i + 1)); + const ElemType biasCorrection2 = + 1 - std::pow(ElemType(beta2), ElemType(i + 1)); if (i > 0) { const ElemType d = std::abs(objective - lastObjective) / - (std::min(objective, lastObjective) + epsilon); + (std::min(objective, lastObjective) + ElemType(epsilon)); - dt *= beta3; - dt += (1 - beta3) * std::min(std::max(d, ElemType(1.0 / clip)), + dt *= ElemType(beta3); + dt += (1 - ElemType(beta3)) * std::min(std::max(d, ElemType(1.0 / clip)), ElemType(clip)); } lastObjective = objective; - iterate -= stepSize / dt * (m / biasCorrection1) / - (arma::sqrt(v / biasCorrection2) + epsilon); + // TODO: remove in ensmallen 4.0.0. + #if defined(ENS_OLD_SEPARABLE_STEP_BEHAVIOR) + iterate -= ElemType(stepSize) / dt * (m / biasCorrection1) / + (sqrt(v / biasCorrection2) + ElemType(epsilon)); + #else + iterate -= (ElemType(stepSize) / (dt * effectiveBatchSize)) * + (m / biasCorrection1) / (sqrt(v / biasCorrection2) + ElemType(epsilon)); + #endif terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); diff --git a/inst/include/ensmallen_bits/fasta/fasta.hpp b/inst/include/ensmallen_bits/fasta/fasta.hpp new file mode 100644 index 0000000..3f41815 --- /dev/null +++ b/inst/include/ensmallen_bits/fasta/fasta.hpp @@ -0,0 +1,220 @@ +/** + * @file fasta.hpp + * @author Ryan Curtin + * + * An implementation of FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm). 
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FASTA_FASTA_HPP
+#define ENSMALLEN_FASTA_FASTA_HPP
+
+#include "../fbs/l1_penalty.hpp"
+#include "../fbs/l1_constraint.hpp"
+
+namespace ens {
+
+/**
+ * FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm) is a proximal
+ * gradient optimization technique for optimizing a function of the form
+ *
+ *   h(x) = f(x) + g(x)
+ *
+ * where f(x) is a differentiable function and g(x) is an arbitrary
+ * non-differentiable function. In such a situation, standard gradient descent
+ * techniques cannot work because of the non-differentiability of g(x). To work
+ * around this, FASTA takes a _forward step_ that is just a gradient descent
+ * step on f(x), and then a _backward step_ that is the _proximal operator_
+ * corresponding to g(x). This continues until convergence.
+ *
+ * This implementation of FASTA allows specification of the backward step (or
+ * proximal operator) via the `BackwardStepType` template parameter. When using
+ * FASTA, the differentiable `FunctionType` given to `Optimize()` should be
+ * f(x), *not* the combined function h(x). g(x) should be specified by the
+ * choice of `BackwardStepType` (e.g. `L1Penalty` or `L1Maximum`). The
+ * `Optimize()` function will then return optimized coordinates for h(x), not
+ * f(x).
+ *
+ * For more information, see the following paper:
+ *
+ * ```
+ * @article{goldstein2014field,
+ *   title={A field guide to forward-backward splitting with a FASTA
+ *          implementation},
+ *   author={Goldstein, Tom and Studer, Christoph and Baraniuk, Richard},
+ *   journal={arXiv preprint arXiv:1411.3406},
+ *   year={2014}
+ * }
+ * ```
+ */
+template<typename BackwardStepType>
+class FASTA
+{
+ public:
+  /**
+   * Construct the FASTA optimizer with the given options, using a
+   * default-constructed BackwardStepType.
+   */
+  FASTA(const size_t maxIterations = 10000,
+        const double tolerance = 1e-7,
+        const size_t maxLineSearchSteps = 50,
+        const double stepSizeAdjustment = 2.0,
+        const size_t lineSearchLookback = 10,
+        const bool estimateStepSize = true,
+        const size_t estimateTrials = 10,
+        const double maxStepSize = 0.001);
+
+  /**
+   * Construct the FASTA optimizer with the given options.
+   */
+  FASTA(BackwardStepType backwardStepType,
+        const size_t maxIterations = 10000,
+        const double tolerance = 1e-7,
+        const size_t maxLineSearchSteps = 50,
+        const double stepSizeAdjustment = 2.0,
+        const size_t lineSearchLookback = 10,
+        const bool estimateStepSize = true,
+        const size_t estimateTrials = 10,
+        const double maxStepSize = 0.001);
+
+  /**
+   * Optimize the given function using FASTA. The given starting point will be
+   * modified to store the finishing point of the algorithm, and the final
+   * objective value is returned.
+   *
+   * The FunctionType template class must provide the following functions:
+   *
+   *   double Evaluate(const arma::mat& coordinates);
+   *   void Gradient(const arma::mat& coordinates,
+   *                 arma::mat& gradient);
+   *
+   * @tparam FunctionType Type of function to be optimized.
+   * @tparam MatType Type of objective matrix.
+   * @tparam GradType Type of gradient matrix (default is MatType).
+   * @tparam CallbackTypes Types of callback functions.
+   * @param function Function to be optimized.
+   * @param iterate Input with starting point, and will be modified to save
+   *     the output optimal solution coordinates.
+   * @param callbacks Callback functions.
+   * @return Objective value at the final solution.
+   */
+  template<typename FunctionType,
+           typename MatType,
+           typename GradType,
+           typename... CallbackTypes>
+  typename std::enable_if<IsArmaType<GradType>::value,
+      typename MatType::elem_type>::type
+  Optimize(FunctionType& function,
+           MatType& iterate,
+           CallbackTypes&&... callbacks);
+
+  //! Forward the MatType as GradType.
+  template<typename FunctionType, typename MatType, typename... CallbackTypes>
+  typename MatType::elem_type Optimize(FunctionType& function,
+                                       MatType& iterate,
+                                       CallbackTypes&&... callbacks)
+  {
+    return Optimize<FunctionType, MatType, MatType, CallbackTypes...>(
+        function, iterate, std::forward<CallbackTypes>(callbacks)...);
+  }
+
+  //! Get the backward step object.
+  const BackwardStepType& BackwardStep() const { return backwardStep; }
+  //! Modify the backward step object.
+  BackwardStepType& BackwardStep() { return backwardStep; }
+
+  //! Get the maximum number of iterations (0 indicates no limit).
+  size_t MaxIterations() const { return maxIterations; }
+  //! Modify the maximum number of iterations (0 indicates no limit).
+  size_t& MaxIterations() { return maxIterations; }
+
+  //! Get the tolerance on the gradient norm for termination.
+  double Tolerance() const { return tolerance; }
+  //! Modify the tolerance on the gradient norm for termination.
+  double& Tolerance() { return tolerance; }
+
+  //! Get the maximum number of line search steps.
+  size_t MaxLineSearchSteps() const { return maxLineSearchSteps; }
+  //! Modify the maximum number of line search steps.
+  size_t& MaxLineSearchSteps() { return maxLineSearchSteps; }
+
+  //! Get the step size adjustment parameter.
+  double StepSizeAdjustment() const { return stepSizeAdjustment; }
+  //! Modify the step size adjustment parameter.
+  double& StepSizeAdjustment() { return stepSizeAdjustment; }
+
+  //! Get the maximum number of iterations to look back during a line search.
+  size_t LineSearchLookback() const { return lineSearchLookback; }
+  //! Modify the maximum number of iterations to look back during a line search.
+  size_t& LineSearchLookback() { return lineSearchLookback; }
+
+  //! Get whether or not to estimate the initial step size.
+  bool EstimateStepSize() const { return estimateStepSize; }
+  //! Modify whether or not to estimate the initial step size.
+  bool& EstimateStepSize() { return estimateStepSize; }
+
+  //! Get the number of trials to use for Lipschitz constant estimation.
+  size_t EstimateTrials() const { return estimateTrials; }
+  //! Modify the number of trials to use for Lipschitz constant estimation.
+  size_t& EstimateTrials() { return estimateTrials; }
+
+  //! Get the maximum step size. If Optimize() has been called, this will
+  //! contain the estimated maximum step size value.
+  double MaxStepSize() const { return maxStepSize; }
+  //! Modify the maximum step size (ignored if EstimateStepSize() is true).
+  double& MaxStepSize() { return maxStepSize; }
+
+ private:
+  //! Utility function: fill with random values.
+  template<typename MatType>
+  static void RandomFill(MatType& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const typename MatType::elem_type maxVal);
+
+  template<typename eT>
+  static void RandomFill(arma::SpMat<eT>& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const eT maxVal);
+
+  template<typename FunctionType, typename MatType>
+  void EstimateLipschitzStepSize(FunctionType& f, const MatType& x);
+
+  //! The instantiated backward step object.
+  BackwardStepType backwardStep;
+
+  //! The maximum number of allowed iterations.
+  size_t maxIterations;
+
+  //! The tolerance for termination.
+  double tolerance;
+
+  //! The maximum number of line search trials.
+  size_t maxLineSearchSteps;
+
+  //! The step size adjustment parameter for the line search.
+  double stepSizeAdjustment;
+
+  //! The maximum number of iterations to look back during a line search.
+  size_t lineSearchLookback;
+
+  //! Whether or not to try and estimate the initial step size.
+  bool estimateStepSize;
+
+  //! Number of trials to use for initial step size estimation.
+  size_t estimateTrials;
+
+  //! The maximum step size to use (estimated if estimateStepSize is true).
+  double maxStepSize;
+};
+
+} // namespace ens
+
+// Include implementation.
+#include "fasta_impl.hpp"
+
+#endif
diff --git a/inst/include/ensmallen_bits/fasta/fasta_impl.hpp b/inst/include/ensmallen_bits/fasta/fasta_impl.hpp
new file mode 100644
index 0000000..ee2331a
--- /dev/null
+++ b/inst/include/ensmallen_bits/fasta/fasta_impl.hpp
@@ -0,0 +1,546 @@
+/**
+ * @file fasta_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FASTA_FASTA_IMPL_HPP
+#define ENSMALLEN_FASTA_FASTA_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "fasta.hpp"
+
+#include
+
+namespace ens {
+
+//! Constructor of the FASTA class.
+template<typename BackwardStepType>
+FASTA<BackwardStepType>::FASTA(const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const size_t lineSearchLookback,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    lineSearchLookback(lineSearchLookback),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check estimateSteps parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+
+  if (lineSearchLookback == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): lineSearchLookback cannot be "
+        "0!");
+  }
+}
+
+template<typename BackwardStepType>
+FASTA<BackwardStepType>::FASTA(BackwardStepType backwardStep,
+                               const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const size_t lineSearchLookback,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    backwardStep(std::move(backwardStep)),
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    lineSearchLookback(lineSearchLookback),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check estimateSteps parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+
+  if (lineSearchLookback == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): lineSearchLookback cannot be "
+        "0!");
+  }
+}
+
+//! Optimize the function (minimize).
+template<typename BackwardStepType>
+template<typename FunctionType,
+         typename MatType,
+         typename GradType,
+         typename... CallbackTypes>
+typename std::enable_if<IsArmaType<GradType>::value,
+    typename MatType::elem_type>::type
+FASTA<BackwardStepType>::Optimize(FunctionType& function,
+                                  MatType& iterateIn,
+                                  CallbackTypes&&... callbacks)
+{
+  // Convenience typedefs.
+  typedef typename MatType::elem_type ElemType;
+  typedef typename MatTypeTraits<MatType>::BaseMatType BaseMatType;
+  typedef typename MatTypeTraits<GradType>::BaseMatType BaseGradType;
+
+  typedef Function<FunctionType, BaseMatType, BaseGradType> FullFunctionType;
+  FullFunctionType& f = static_cast<FullFunctionType&>(function);
+
+  // Make sure we have all necessary functions.
+  traits::CheckFunctionTypeAPI<FullFunctionType, BaseMatType, BaseGradType>();
+  RequireFloatingPointType<BaseMatType>();
+  RequireFloatingPointType<BaseGradType>();
+  RequireSameInternalTypes<BaseMatType, BaseGradType>();
+
+  // Sanity check: make sure lineSearchLookback is valid.
+  if (lineSearchLookback == 0)
+  {
+    throw std::invalid_argument("FASTA::FASTA(): lineSearchLookback cannot be "
+        "0!");
+  }
+
+  // Here we make a copy because we will use std::move() internally, and if
+  // iterateIn is an alias, this is unsafe. We will copy the final result back
+  // to iterateIn at the end.
+  BaseMatType x(iterateIn);
+
+  // To keep track of the function value.
+  ElemType currentFObj = f.Evaluate(x);
+  ElemType currentGObj = backwardStep.Evaluate(x);
+  ElemType currentObj = currentFObj + currentGObj;
+
+  // This will be the denominator of the normalized residual termination
+  // condition.
+  ElemType firstResidual = ElemType(0);
+
+  // This will be used in the non-monotone line search, to track the last
+  // several function values.
+  arma::Col<ElemType> lastFObjs(lineSearchLookback);
+  lastFObjs.fill(std::numeric_limits<ElemType>::min());
+  size_t currentObjPos = 0;
+
+  BaseGradType g(x.n_rows, x.n_cols);
+  BaseMatType lastXHat; // Used for residual checks.
+  BaseMatType lastX;    // Used for residual and alpha reset checks.
+  BaseMatType xHat;     // Used for residual checks.
+  BaseMatType lpaX = x; // Used for alpha reset check.
+  ElemType alpha = ElemType(1); // Initialize alpha^1 = 1.
+  ElemType lastAlpha = alpha;
+
+  // Controls early termination of the optimization process.
+  bool terminate = false;
+
+  // First, estimate the Lipschitz constant to set the initial/maximum step
+  // size, if the user asked us to.
+  if (estimateStepSize)
+    EstimateLipschitzStepSize(f, x);
+
+  // Keep track of the last step size we used.
+  ElemType currentStepSize = (ElemType) maxStepSize;
+  ElemType lastStepSize = (ElemType) maxStepSize;
+
+  Callback::BeginOptimization(*this, f, x, callbacks...);
+  for (size_t i = 1; i != maxIterations && !terminate; ++i)
+  {
+    // During this optimization, we want to optimize h(x) = f(x) + g(x).
+    // f(x) is `f`, but g(x) is specified by `BackwardStepType`.
+
+    // The first step is to compute a step size via a non-monotone line search.
+ // To do this, we need to compute the gradient f'(y) as required by the line + // search condition in Eq. (38). Note that our code does a little sleight + // of hand, and so `x` stores what the paper calls `y^k` here. (See the + // code for the adaptive step below.) + currentFObj = f.EvaluateWithGradient(x, g); + terminate |= Callback::EvaluateWithGradient(*this, f, x, currentFObj, g, + callbacks...); + + // Use backtracking non-monotone line search to find the best step size. + // This is the version from the FASTA paper, but with a minor modification: + // we start our search at the last step size, and allow the search to + // increase the step size up to the maximum step size if it can. This is a + // more effective heuristic than simply starting at the largest allowable + // step size and shrinking from there, especially in regions where the + // gradient norm is small. It is also more effective than simply starting + // at the last step size and shrinking from there, as it prevents getting + // "stuck" with a very small step size. + bool lsDone = false; + size_t lsTrial = 0; + bool increasing = false; // Will be set during the first iteration. + ElemType lastFObj = ElemType(0); + BaseMatType lsLastX; // Only used in increasing mode. + BaseMatType lsLastXHat; // Only used in increasing mode. + BaseMatType xDiff; + + lastX = std::move(x); + lastStepSize = currentStepSize; + currentStepSize = std::min(currentStepSize, (ElemType) maxStepSize); + + // Ensure that the last `lineSearchLookback` objective values are recorded + // properly. 
+ lastFObjs[currentObjPos] = currentFObj; + currentObjPos = (currentObjPos + 1) % lineSearchLookback; + const ElemType strictMaxFObj = currentFObj; + const ElemType maxFObj = lastFObjs.max(); + + while (!lsDone && !terminate) + { + if (lsTrial == maxLineSearchSteps) + { + if (increasing) + { + Warn << "FASTA::Optimize(): line search reached maximum number of " + << "steps (" << maxLineSearchSteps << "); using step size " + << currentStepSize << "." << std::endl; + break; // The step size is still valid. + } + else + { + Warn << "FASTA::Optimize(): could not find valid step size in range " + << "(0, " << maxStepSize << "]! Terminating optimization." + << std::endl; + terminate = true; + break; + } + } + + // If the step size has converged to zero, we are done. + if (currentStepSize == ElemType(0)) + { + Warn << "FASTA::Optimize(): computed zero step size; terminating " + << "optimization." << std::endl; + terminate = true; + break; + } + + // Perform forward update into x. + xHat = lastX - currentStepSize * g; + // (We must store xHat separately for the residual, so this copy is + // necessary.) + x = xHat; + backwardStep.ProximalStep(x, currentStepSize); + + // Compute objective of new point. + const ElemType fObj = f.Evaluate(x); + terminate |= Callback::Evaluate(*this, f, x, fObj, callbacks...); + + // Compute the quadratic approximation of the objective (the condition in + // Eq. (38)). + xDiff = (x - lastX); + + // Note: since we allow the step size to increase, we have to modify the + // non-monotone line search a little bit to keep things from diverging. + // Specifically, if we are increasing the step size, then we force a + // monotone line search (by looking only at the previous function value). + // It is only when we are decreasing the step size that we allow + // relaxation. 
+ const ElemType relaxedCond = maxFObj + dot(xDiff, g) + + (1 / (2 * currentStepSize)) * dot(xDiff, xDiff); + const ElemType strictCond = strictMaxFObj + dot(xDiff, g) + + (1 / (2 * currentStepSize)) * dot(xDiff, xDiff); + + // If we're on the first iteration, we don't know if we should be + // searching for a step size by increasing or decreasing the step size. + // (Remember that our valid ranges of step sizes are [0, maxStepSize], and + // we are starting at lastStepSize.) + // + // Thus, if the condition is satisfied, let's try increasing the step size + // until it's no longer satisfied. Otherwise, we will have to decrease + // the step size. + if (lsTrial == 0) + { + increasing = ((fObj <= strictCond) && (std::isfinite(fObj))); + } + + if (increasing) + { + // If we are in "increasing" mode, then termination occurs on the first + // iteration when the strict condition is *not* satisfied (and we use + // the last step size). + if ((fObj > strictCond) || (!std::isfinite(fObj))) + { + lsDone = true; + x = std::move(lsLastX); + xHat = std::move(lsLastXHat); + currentFObj = lastFObj; + currentStepSize = lastStepSize; // Take one step backwards. + } + else if (currentStepSize == (ElemType) maxStepSize) + { + // The condition is still satisfied, but the step size will be too big + // if we take another step. Go back to the maximum step size. + lsDone = true; + currentFObj = fObj; + } + else + { + // The condition is still satisfied; increase the step size. + lastStepSize = currentStepSize; + currentStepSize *= ElemType(stepSizeAdjustment); + lsLastX = std::move(x); + lsLastXHat = std::move(xHat); + lastFObj = fObj; + ++lsTrial; + } + } + else + { + // If we are in "decreasing" mode, then termination occurs on the first + // iteration when the relaxed condition is satisfied. + if ((fObj <= relaxedCond) && (std::isfinite(fObj))) + { + lsDone = true; + currentFObj = fObj; + } + else + { + // The condition is not yet satisfied; decrease the step size. 
+ currentStepSize /= ElemType(stepSizeAdjustment); + ++lsTrial; + } + } + } + + if (!lsDone) + { + // The line search failed, so terminate. + Warn << "FASTA::Optimize(): non-monotone line search failed after " + << maxLineSearchSteps << " steps; terminating optimization." + << std::endl; + x = std::move(lastX); + terminate = true; + } + + // If we terminated during the line search, we are done. + if (terminate) + break; + + // Now that we have taken a step, compute the full objective by computing + // g(x). + currentGObj = backwardStep.Evaluate(x); + currentObj = currentFObj + currentGObj; + + // Output current objective function. + Info << "FASTA::Optimize(): iteration " << i << ", combined objective " + << currentObj << " (f(x) = " << currentFObj << ", g(x) = " + << currentGObj << "), step size " << currentStepSize << "." + << std::endl; + + // Sanity check for divergence. + if ((i > 1) && !std::isfinite(currentObj)) + { + Warn << "FASTA::Optimize(): objective diverged to " + << currentObj << "; terminating optimization." << std::endl; + terminate = true; + break; + } + + // Now, check for convergence. The FASTA convergence check uses both the + // normalized residual and the relative residual, stopping when either + // becomes sufficiently small. The check depends on x before and after the + // proximal step. + + // Compute residual. This is Eq. (40) in the paper. + const ElemType residual = norm(g + (xHat - x) / currentStepSize, 2); + + // If this is the first iteration, store the residual as the first residual. + if (i == 1) + firstResidual = residual; + + // First, check the normalized residual for convergence. This is Eq. (43) + // in the paper. 
+    const ElemType eps = 20 * std::numeric_limits<ElemType>::epsilon();
+    const ElemType normalizedResidual = residual / (firstResidual + eps);
+
+    if ((i < 10) && (normalizedResidual < ElemType(1e-5)))
+    {
+      // Heuristic: sometimes the optimization starts in such an awful place
+      // that we are able to make huge amounts of progress in the first few
+      // iterations. In this case, reset the firstResidual to the slightly
+      // better point we get to by the tenth iterate.
+      firstResidual = residual;
+    }
+    else if ((i > 10) && (normalizedResidual < tolerance))
+    {
+      Info << "FASTA::Optimize(): normalized residual minimized within "
+          << "tolerance " << tolerance << "; terminating optimization."
+          << std::endl;
+      break;
+    }
+
+    // Next, check the relative residual for convergence. This is Eq. (42) in
+    // the paper.
+    const ElemType gNorm = norm(g, 2);
+    const ElemType proxStepNorm = norm((xHat - x) / currentStepSize, 2);
+
+    const ElemType relativeResidual = residual /
+        (std::max(gNorm, proxStepNorm) + 20 * eps);
+
+    if (relativeResidual < tolerance)
+    {
+      Info << "FASTA::Optimize(): relative residual minimized within "
+          << "tolerance " << tolerance << "; terminating optimization."
+          << std::endl;
+      break;
+    }
+
+    // Compute updated prediction parameter alpha.
+    lastAlpha = alpha;
+    alpha = (1 + std::sqrt(1 + 4 * std::pow(alpha, ElemType(2)))) / 2;
+
+    // Take a predictive step.
+    BaseMatType y = x + ((lastAlpha - 1) / alpha) * (x - lpaX);
+
+    // Sometimes alpha can get to be too large; this restart scheme is taken
+    // originally from O'Donoghue and Candes, "Adaptive restart for accelerated
+    // gradient schemes", 2012.
+    //
+    // The notation is confusing here when compared with Eq. (37) in the paper.
+    // This is because the paper is poorly notated, although it's not clear much
+    // has been done here to improve things. To translate:
+    //
+    //   Paper    Code    Explanation
+    //
+    //   y^k      lastX   This is the result of the predictive step on the
+    //                    previous iteration.
+    //                    In our code, we apply the predictive step to x,
+    //                    which next iteration becomes lastX.
+    //
+    //   x^k      x       This is the iterate before the predictive step, this
+    //                    iteration.
+    //
+    //   x^k-1    lpaX    "Last Pre-Accelerated X"---we have to take a specific
+    //                    step to store this.
+    //
+    const ElemType restartCheck = dot(lastX - x, x - lpaX);
+    if (restartCheck > 0)
+    {
+      Info << "FASTA::Optimize(): alpha too large (" << alpha << "); reset to "
+          << "1." << std::endl;
+      alpha = ElemType(1);
+      lastAlpha = ElemType(1);
+    }
+
+    lpaX = std::move(x);
+    x = std::move(y);
+
+    terminate |= Callback::StepTaken(*this, f, x, callbacks...);
+  }
+
+  if (!terminate)
+  {
+    Info << "FASTA::Optimize(): maximum iterations (" << maxIterations
+        << ") reached; terminating optimization." << std::endl;
+  }
+
+  Callback::EndOptimization(*this, f, x, callbacks...);
+
+  ((BaseMatType&) iterateIn) = x;
+  return currentObj;
+} // Optimize()
+
+template<typename BackwardStepType>
+template<typename MatType>
+void FASTA<BackwardStepType>::RandomFill(
+    MatType& x,
+    const size_t rows,
+    const size_t cols,
+    const typename MatType::elem_type maxVal)
+{
+  x.randu(rows, cols);
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename eT>
+void FASTA<BackwardStepType>::RandomFill(
+    arma::SpMat<eT>& x,
+    const size_t rows,
+    const size_t cols,
+    const eT maxVal)
+{
+  // Try and keep the matrix from having too many elements. (Check the largest
+  // sizes first so the smaller thresholds do not shadow them.)
+  eT density = eT(0.1);
+  if (rows * cols > 10000000)
+    density = eT(0.0001);
+  else if (rows * cols > 1000000)
+    density = eT(0.001);
+  else if (rows * cols > 100000)
+    density = eT(0.01);
+
+  x.sprandu(rows, cols, density);
+
+  // Make sure we got at least some nonzero elements...
+  while (x.n_nonzero == 0)
+  {
+    if (x.n_elem < 10)
+      x.sprandu(rows, cols, 1.0);
+    else
+      x.sprandu(rows, cols, 0.5);
+  }
+
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType>
+void FASTA<BackwardStepType>::EstimateLipschitzStepSize(
+    FunctionType& f,
+    const MatType& x)
+{
+  typedef typename MatType::elem_type ElemType;
+
+  // Sanity check for estimateSteps parameter.
+  if (estimateTrials == 0)
+  {
+    throw std::invalid_argument("FASTA::Optimize(): estimateTrials must be "
+        "greater than 0!");
+  }
+
+  const ElemType xMax = std::max(ElemType(1), 2 * x.max());
+  ElemType sum = ElemType(0);
+  MatType x1, x2, gx1, gx2;
+
+  for (size_t t = 0; t < estimateTrials; ++t)
+  {
+    RandomFill(x1, x.n_rows, x.n_cols, xMax);
+    RandomFill(x2, x.n_rows, x.n_cols, xMax);
+
+    f.Gradient(x1, gx1);
+    f.Gradient(x2, gx2);
+
+    // Compute a Lipschitz constant estimate.
+    const ElemType lEst = norm(gx1 - gx2, 2) / norm(x1 - x2, 2);
+    sum += lEst;
+  }
+
+  sum /= estimateTrials;
+  if (sum == 0)
+    maxStepSize = std::numeric_limits<double>::max();
+  else
+    maxStepSize = (10 / sum);
+
+  Info << "FASTA::Optimize(): estimated a maximum step size of "
+      << maxStepSize << "." << std::endl;
+}
+
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/fbs/fbs.hpp b/inst/include/ensmallen_bits/fbs/fbs.hpp
new file mode 100644
index 0000000..5cb6b0d
--- /dev/null
+++ b/inst/include/ensmallen_bits/fbs/fbs.hpp
@@ -0,0 +1,153 @@
+/**
+ * @file fbs.hpp
+ * @author Ryan Curtin
+ *
+ * An implementation of Forward-Backward Splitting (FBS).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FBS_FBS_HPP
+#define ENSMALLEN_FBS_FBS_HPP
+
+#include "l1_penalty.hpp"
+#include "l1_constraint.hpp"
+
+namespace ens {
+
+/**
+ * Forward-Backward Splitting is a proximal gradient optimization technique for
+ * optimizing a function of the form
+ *
+ *   h(x) = f(x) + g(x)
+ *
+ * where f(x) is a differentiable function and g(x) is an arbitrary
+ * non-differentiable function. In such a situation, standard gradient descent
+ * techniques cannot work because of the non-differentiability of g(x).
To work
+ * around this, FBS takes a _forward step_ that is just a gradient descent step
+ * on f(x), and then a _backward step_ that is the _proximal operator_
+ * corresponding to g(x).  This continues until convergence.
+ *
+ * This implementation of FBS allows specification of the backward step (or
+ * proximal operator) via the `BackwardStepType` template parameter.  When
+ * using FBS, the differentiable `FunctionType` given to `Optimize()` should be
+ * f(x), *not* the combined function h(x).  g(x) should be specified by the
+ * choice of `BackwardStepType` (e.g. `L1Penalty` or `L1Constraint`).  The
+ * `Optimize()` function will then return optimized coordinates for h(x), not
+ * f(x).
+ *
+ * For more information, see the following paper:
+ *
+ * ```
+ * @article{goldstein2014field,
+ *   title={A field guide to forward-backward splitting with a FASTA
+ *          implementation},
+ *   author={Goldstein, Tom and Studer, Christoph and Baraniuk, Richard},
+ *   journal={arXiv preprint arXiv:1411.3406},
+ *   year={2014}
+ * }
+ * ```
+ */
+template<typename BackwardStepType = L1Penalty>
+class FBS
+{
+ public:
+  /**
+   * Construct the FBS optimizer with the given options, using a
+   * default-constructed BackwardStepType.
+   */
+  FBS(const double stepSize = 0.001,
+      const size_t maxIterations = 10000,
+      const double tolerance = 1e-10);
+
+  /**
+   * Construct the FBS optimizer with the given options.
+   */
+  FBS(BackwardStepType backwardStepType,
+      const double stepSize = 0.001,
+      const size_t maxIterations = 10000,
+      const double tolerance = 1e-10);
+
+  /**
+   * Optimize the given function using FBS.  The given starting point will be
+   * modified to store the finishing point of the algorithm, and the final
+   * objective value is returned.
+   *
+   * The FunctionType template class must provide the following functions:
+   *
+   *   double Evaluate(const arma::mat& coordinates);
+   *   void Gradient(const arma::mat& coordinates,
+   *                 arma::mat& gradient);
+   *
+   * @tparam FunctionType Type of function to be optimized.
+   * @tparam MatType Type of objective matrix.
+   * @tparam GradType Type of gradient matrix (default is MatType).
+   * @tparam CallbackTypes Types of callback functions.
+   * @param function Function to be optimized.
+   * @param iterate Input with starting point; will be modified to hold the
+   *     output optimal solution coordinates.
+   * @param callbacks Callback functions.
+   * @return Objective value at the final solution.
+   */
+  template<typename FunctionType,
+           typename MatType,
+           typename GradType,
+           typename... CallbackTypes>
+  typename std::enable_if<IsArmaType<GradType>::value,
+      typename MatType::elem_type>::type
+  Optimize(FunctionType& function,
+           MatType& iterate,
+           CallbackTypes&&... callbacks);
+
+  //! Forward the MatType as GradType.
+  template<typename FunctionType, typename MatType, typename... CallbackTypes>
+  typename MatType::elem_type Optimize(FunctionType& function,
+                                       MatType& iterate,
+                                       CallbackTypes&&... callbacks)
+  {
+    return Optimize<FunctionType, MatType, MatType, CallbackTypes...>(
+        function, iterate, std::forward<CallbackTypes>(callbacks)...);
+  }
+
+  //! Get the backward step object.
+  const BackwardStepType& BackwardStep() const { return backwardStep; }
+  //! Modify the backward step object.
+  BackwardStepType& BackwardStep() { return backwardStep; }
+
+  //! Get the step size.
+  double StepSize() const { return stepSize; }
+  //! Modify the step size.
+  double& StepSize() { return stepSize; }
+
+  //! Get the maximum number of iterations (0 indicates no limit).
+  size_t MaxIterations() const { return maxIterations; }
+  //! Modify the maximum number of iterations (0 indicates no limit).
+  size_t& MaxIterations() { return maxIterations; }
+
+  //! Get the tolerance for termination.
+  double Tolerance() const { return tolerance; }
+  //! Modify the tolerance for termination.
+  double& Tolerance() { return tolerance; }
+
+ private:
+  //! The instantiated backward step object.
+  BackwardStepType backwardStep;
+
+  //! The step size for FBS steps.
+  double stepSize;
+
+  //! The maximum number of allowed iterations.
+  size_t maxIterations;
+
+  //! The tolerance for termination.
+  double tolerance;
+};
+
+} // namespace ens
+
+// Include implementation.
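As a quick illustration of the forward/backward split described above, here is a self-contained scalar sketch minimizing h(x) = 0.5 (x - a)^2 + lambda |x|, whose exact minimizer is the soft-thresholding of a. This is our own illustrative code (`softThreshold` and `fbsScalar` are hypothetical names, not part of the ensmallen API), using only the C++ standard library:

```c++
// Illustrative sketch only -- softThreshold() and fbsScalar() are our names,
// not part of the ensmallen API.
#include <cassert>
#include <cmath>
#include <cstddef>

// The proximal operator of g(x) = t * |x| (soft-thresholding).
double softThreshold(const double x, const double t)
{
  if (x > t) return x - t;
  if (x < -t) return x + t;
  return 0.0;
}

// Forward-backward iterations on h(x) = 0.5 * (x - a)^2 + lambda * |x|.
double fbsScalar(const double a, const double lambda,
                 const double stepSize, const size_t maxIter)
{
  double x = 0.0;
  for (size_t i = 0; i < maxIter; ++i)
  {
    x -= stepSize * (x - a);                  // Forward (gradient) step on f.
    x = softThreshold(x, lambda * stepSize);  // Backward (proximal) step on g.
  }
  return x;
}
```

For a > lambda the iterates converge to a - lambda, the closed-form solution of this one-dimensional lasso-style problem; for |a| <= lambda they collapse to zero.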
+#include "fbs_impl.hpp"
+
+#endif
diff --git a/inst/include/ensmallen_bits/fbs/fbs_impl.hpp b/inst/include/ensmallen_bits/fbs/fbs_impl.hpp
new file mode 100644
index 0000000..d28d6df
--- /dev/null
+++ b/inst/include/ensmallen_bits/fbs/fbs_impl.hpp
@@ -0,0 +1,141 @@
+/**
+ * @file fbs_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of Forward-Backward Splitting (FBS).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FBS_FBS_IMPL_HPP
+#define ENSMALLEN_FBS_FBS_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "fbs.hpp"
+
+#include <ensmallen_bits/function.hpp>
+
+namespace ens {
+
+//! Constructor of the FBS class.
+template<typename BackwardStepType>
+FBS<BackwardStepType>::FBS(const double stepSize,
+                           const size_t maxIterations,
+                           const double tolerance) :
+    stepSize(stepSize),
+    maxIterations(maxIterations),
+    tolerance(tolerance)
+{ /* Nothing to do. */ }
+
+template<typename BackwardStepType>
+FBS<BackwardStepType>::FBS(BackwardStepType backwardStep,
+                           const double stepSize,
+                           const size_t maxIterations,
+                           const double tolerance) :
+    backwardStep(std::move(backwardStep)),
+    stepSize(stepSize),
+    maxIterations(maxIterations),
+    tolerance(tolerance)
+{ /* Nothing to do. */ }
+
+//! Optimize the function (minimize).
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType, typename GradType,
+         typename... CallbackTypes>
+typename std::enable_if<IsArmaType<GradType>::value,
+    typename MatType::elem_type>::type
+FBS<BackwardStepType>::Optimize(FunctionType& function,
+                                MatType& iterateIn,
+                                CallbackTypes&&... callbacks)
+{
+  // Convenience typedefs.
+  typedef typename MatType::elem_type ElemType;
+  typedef typename MatTypeTraits<MatType>::BaseMatType BaseMatType;
+  typedef typename MatTypeTraits<GradType>::BaseMatType BaseGradType;
+
+  typedef Function<FunctionType, BaseMatType, BaseGradType> FullFunctionType;
+  FullFunctionType& f = static_cast<FullFunctionType&>(function);
+
+  // Make sure we have all necessary functions.
+  traits::CheckFunctionTypeAPI<FullFunctionType, BaseMatType, BaseGradType>();
+  RequireFloatingPointType<BaseMatType>();
+  RequireFloatingPointType<BaseGradType>();
+  RequireSameInternalTypes<BaseMatType, BaseGradType>();
+
+  BaseMatType& iterate = (BaseMatType&) iterateIn;
+
+  // To keep track of the function value.
+  ElemType currentObjective = std::numeric_limits<ElemType>::max();
+  ElemType currentFObjective = currentObjective;
+  ElemType currentGObjective = currentObjective;
+  ElemType lastObjective = currentObjective;
+
+  BaseGradType gradient(iterate.n_rows, iterate.n_cols);
+
+  // Controls early termination of the optimization process.
+  bool terminate = false;
+
+  Callback::BeginOptimization(*this, f, iterate, callbacks...);
+  for (size_t i = 1; i != maxIterations && !terminate; ++i)
+  {
+    // During this optimization, we want to optimize h(x) = f(x) + g(x).
+    // f(x) is `f`, but g(x) is specified by `BackwardStepType`.
+
+    // First compute f(x) and f'(x).
+    currentFObjective = f.EvaluateWithGradient(iterate, gradient);
+    // Now compute g(x) to get the full objective.
+    currentGObjective = backwardStep.Evaluate(iterate);
+
+    lastObjective = currentObjective;
+    currentObjective = currentFObjective + currentGObjective;
+
+    terminate |= Callback::EvaluateWithGradient(*this, f, iterate,
+        currentObjective, gradient, callbacks...);
+
+    // Output the current objective value.
+    Info << "FBS::Optimize(): iteration " << i << ", combined objective "
+        << currentObjective << " (f(x) = " << currentFObjective << ", g(x) = "
+        << currentGObjective << ")." << std::endl;
+
+    // Check for convergence.
+    if ((i > 1) && (std::abs(currentObjective - lastObjective) < tolerance))
+    {
+      Info << "FBS::Optimize(): minimized within objective tolerance "
+          << tolerance << "; terminating optimization." << std::endl;
+
+      Callback::EndOptimization(*this, f, iterate, callbacks...);
+      return currentObjective;
+    }
+
+    if ((i > 1) && !std::isfinite(currentObjective))
+    {
+      Warn << "FBS::Optimize(): objective diverged to " << currentObjective
+          << "; terminating optimization." << std::endl;
+
+      Callback::EndOptimization(*this, f, iterate, callbacks...);
+      return currentObjective;
+    }
+
+    // Perform the forward update.
+    iterate -= ElemType(stepSize) * gradient;
+    // Now perform the backward step (proximal update).
+    backwardStep.ProximalStep(iterate, stepSize);
+
+    terminate |= Callback::StepTaken(*this, f, iterate, callbacks...);
+  }
+
+  if (!terminate)
+  {
+    Info << "FBS::Optimize(): maximum iterations (" << maxIterations
+        << ") reached; terminating optimization." << std::endl;
+  }
+
+  Callback::EndOptimization(*this, f, iterate, callbacks...);
+  return currentObjective;
+} // Optimize()
+
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/fbs/l1_constraint.hpp b/inst/include/ensmallen_bits/fbs/l1_constraint.hpp
new file mode 100644
index 0000000..9513eff
--- /dev/null
+++ b/inst/include/ensmallen_bits/fbs/l1_constraint.hpp
@@ -0,0 +1,81 @@
+/**
+ * @file l1_constraint.hpp
+ * @author Ryan Curtin
+ *
+ * An implementation of the proximal operator for the L1 constraint.
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FBS_L1_CONSTRAINT_HPP
+#define ENSMALLEN_FBS_L1_CONSTRAINT_HPP
+
+namespace ens {
+
+/**
+ * The L1Constraint applies a specific constraint that the L1 norm of the
+ * parameters must be less than or equal to the given lambda value.
+ *
+ * Implementationally, this means that the proximal step is a projection onto
+ * the L1 ball of radius lambda.  If the constraint is satisfied, `Evaluate()`
+ * will return 0.  Otherwise, it will return infinity.
+ *
+ * This class is meant to be used with the FBS optimizer, and any other
+ * optimizer that uses a proximal operator/step.
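The projection onto the L1 ball described here can be sketched with the simpler sort-based method (O(n log n)); this is an illustrative stand-in for the faster pivot-based variant used in the implementation, and `projectL1` is our own hypothetical name, not part of the ensmallen API:

```c++
// Illustrative sketch only -- projectL1() is our name, not the ensmallen API.
// Projects v onto the L1 ball of radius lambda using the sort-based method.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

std::vector<double> projectL1(std::vector<double> v, const double lambda)
{
  double l1 = 0.0;
  for (const double x : v) l1 += std::fabs(x);
  if (l1 <= lambda) return v;  // Already inside the ball; nothing to do.

  // Sort the absolute values in decreasing order and find the threshold theta.
  std::vector<double> u(v.size());
  for (size_t i = 0; i < v.size(); ++i) u[i] = std::fabs(v[i]);
  std::sort(u.begin(), u.end(), std::greater<double>());

  double cumSum = 0.0, theta = 0.0;
  for (size_t j = 0; j < u.size(); ++j)
  {
    cumSum += u[j];
    const double t = (cumSum - lambda) / double(j + 1);
    if (t < u[j]) theta = t; else break;
  }

  // Soft-threshold by theta; the result has L1 norm (approximately) lambda.
  for (double& x : v)
    x = (x > 0.0 ? std::max(x - theta, 0.0) : std::min(x + theta, 0.0));
  return v;
}
```

The pivot-based scheme in the header avoids the full sort, but computes the same threshold theta and the same final shrinkage.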
+ */ +class L1Constraint +{ + public: + /** + * Construct an L1Constraint with the given maximum L1 norm for the + * coordinates (lambda). + */ + L1Constraint(const double lambda = 0.0); + + /** + * If the L1 norm of the coordinates is less than or equal to lambda, this + * returns 0. Otherwise, it returns infinity. + */ + template + typename MatType::elem_type Evaluate(const MatType& coordinates) const; + + /** + * Apply a proximal step to the given `coordinates`, assuming that the forward + * step took a step of size `stepSize`. This projects `coordinates` back onto + * the surface of the L1-ball with radius `lambda`, if the L1 norm of + * `coordinates` is greater than `lambda`. + * + * This may apply the proximal step multiple times to account for numerical + * stability issues during projection. + */ + template + void ProximalStep(MatType& coordinates, const double stepSize) const; + + //! Get the L1 constraint to use when applying the proximal step. + double Lambda() const { return lambda; } + //! Modify the L1 constraint to use when applying the proximal step. + double& Lambda() { return lambda; } + + private: + //! The L1 constraint value to use. + double lambda; + + //! Helper function: extract only nonzero elements from sparse objects, or + //! extract the entire dense object. + template + inline arma::Col ExtractNonzeros( + const MatType& coordinates) const; + + template + inline arma::Col ExtractNonzeros(const arma::SpMat& coordinates) + const; +}; + +} // namespace ens + +// Include implementation. +#include "l1_constraint_impl.hpp" + +#endif diff --git a/inst/include/ensmallen_bits/fbs/l1_constraint_impl.hpp b/inst/include/ensmallen_bits/fbs/l1_constraint_impl.hpp new file mode 100644 index 0000000..aff3ed2 --- /dev/null +++ b/inst/include/ensmallen_bits/fbs/l1_constraint_impl.hpp @@ -0,0 +1,201 @@ +/** + * @file l1_constraint_impl.hpp + * @author Ryan Curtin + * + * An implementation of the proximal operator for the L1 constraint. 
+ * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_FBS_L1_CONSTRAINT_IMPL_HPP +#define ENSMALLEN_FBS_L1_CONSTRAINT_IMPL_HPP + +// In case it hasn't been included yet. +#include "l1_constraint.hpp" + +namespace ens { + +inline L1Constraint::L1Constraint(const double lambda) : lambda(lambda) +{ + // Nothing to do. +} + +template +typename MatType::elem_type L1Constraint::Evaluate(const MatType& coordinates) + const +{ + typedef typename MatType::elem_type eT; + + // Allow some amount of tolerance for floating-point errors. + const eT l1Norm = norm(vectorise(coordinates), 1); + if (l1Norm <= lambda) + return eT(0); + else if (std::numeric_limits::has_infinity) + return std::numeric_limits::infinity(); + else + return std::numeric_limits::max(); +} + +template +void L1Constraint::ProximalStep(MatType& coordinates, + const double /* stepSize */) + const +{ + // First determine whether projection is necessary. + if (norm(vectorise(coordinates), 1) <= lambda) + { + return; + } + + // An empty vector can't be projected. + if (coordinates.n_elem == 0) + { + return; + } + + // We use the algorithm denoted in Figure 2 of the following paper: + // + // ``` + // @inproceedings{duchi2008efficient, + // title={Efficient projections onto the L1-ball for learning in high + // dimensions}, + // author={Duchi, John and Shalev-Shwartz, Shai and Singer, Yoram and + // Chandra, Tushar}, + // booktitle={Proceedings of the 25th international conference on + // Machine learning}, + // pages={272--279}, + // year={2008} + // } + // ``` + // + // This is an iterative algorithm that has a quicksort feel, where we try to + // determine the "pivot" element that tells us how much we need to shrink. 
In + // the original paper, they maintain lists indicating whether a point is above + // or below the pivot, but it is more expedient (and efficient) to simply copy + // the coordinates array and partially sort it in-place. + + typedef typename MatType::elem_type eT; + arma::Col work = ExtractNonzeros(coordinates); + size_t firstUpperElement = 0; + size_t lastUpperElement = work.n_elem; + eT rho = eT(0); // This is the quantity we aim to find to perform the projection. + eT s = eT(0); + + while (lastUpperElement > firstUpperElement) + { + const size_t k = arma::randi( + arma::distr_param((int) firstUpperElement, (int) lastUpperElement - 1)); + const eT v = work[k]; + + // Now perform a half-quicksort such that all elements greater than v are in + // the first part of the array. + size_t left = firstUpperElement; + size_t right = lastUpperElement - 1; + while (left <= right) + { + while ((left < lastUpperElement) && (work[left] >= v)) + ++left; + while ((right > firstUpperElement) && (work[right] < v)) + --right; + + if (left >= right) + break; + + // work[left] is less than v, and work[right] is not. Since we want all + // elements greater than or equal to v on the left, swap. + const eT tmp = work[left]; + work[left] = work[right]; + work[right] = tmp; + } + + // Now, work[0] through work[left - 1] are in the greater set G. + const eT sDelta = accu(work.subvec(firstUpperElement, left - 1)); + const size_t rhoDelta = (left - firstUpperElement); + + if ((s + sDelta) - ((eT) (rho + rhoDelta)) * v < eT(lambda)) + { + s += sDelta; + rho += rhoDelta; + firstUpperElement = left; + } + else + { + // v was an element that was less than rho, so, shrink the array and try + // again with larger elements. We actually want to shrink the array so + // that it does not include v, so we need to find the first element that + // is v (since there may be duplicates). 
+ size_t firstVIndex = left - 1; + while ((work[firstVIndex] == v) && (firstVIndex >= firstUpperElement)) + --firstVIndex; + lastUpperElement = firstVIndex + 1; + } + } + + const eT theta = (s - eT(lambda)) / rho; + // This is a single-line implementation of the .transform() below; we use the + // single-line implementation so it works with Bandicoot. + // + // coordinates.transform( + // [theta](eT val) + // { + // if (val > 0) + // return std::max(val - theta, eT(0)); + // else + // return std::min(val + theta, eT(0)); + // }); + coordinates = sign(coordinates) % clamp( + abs(coordinates) - theta, eT(0), std::numeric_limits::max()); + + // Sanity check: ensure we actually ended up inside the L1 ball. This might + // not happen due to floating-point inaccuracies. If so, try again. + const eT newNorm = norm(coordinates, 1); + if (newNorm > eT(lambda) && eT(lambda) > eT(0)) + { + // Shrink the L1 ball by the amount of the error. + eT newLambda = (eT(lambda) - 2 * (newNorm - eT(lambda))); + if (newLambda == eT(lambda)) + { + // Make sure we at least remove a few ULPs. + newLambda = eT(lambda) - + 5 * (eT(lambda) - eT(std::nexttoward(lambda, 0.0))); + } + + L1Constraint newConstraint(newLambda); + newConstraint.ProximalStep(coordinates, 0.0 /* ignored */); + } +} + +// Helper function: extract only nonzero elements from sparse objects, or +// extract the entire dense object. +template +inline arma::Col L1Constraint::ExtractNonzeros( + const MatType& coordinates) const +{ + typedef typename MatType::elem_type ElemType; + return conv_to>::from(vectorise(abs(coordinates))); +} + +template +inline arma::Col L1Constraint::ExtractNonzeros( + const arma::SpMat& coordinates) const +{ + arma::Col result(coordinates.n_nonzero); + typename arma::SpMat::const_iterator it = coordinates.begin(); + size_t i = 0; + while (it != coordinates.end()) + { + // Extract only nonzero values. 
Note we use the absolute value because that + // is what the algorithm requires (not the original value). + result[i] = std::abs(*it); + ++it; + ++i; + } + + return result; +} + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/fbs/l1_penalty.hpp b/inst/include/ensmallen_bits/fbs/l1_penalty.hpp new file mode 100644 index 0000000..3cd45f6 --- /dev/null +++ b/inst/include/ensmallen_bits/fbs/l1_penalty.hpp @@ -0,0 +1,63 @@ +/** + * @file l1_penalty.hpp + * @author Ryan Curtin + * + * An implementation of the proximal operator for the L1 penalty (also known as + * the shrinkage operator). + * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_FBS_L1_PENALTY_HPP +#define ENSMALLEN_FBS_L1_PENALTY_HPP + +namespace ens { + +/** + * The L1Penalty applies a non-differentiable L1-norm penalty to the coordinates + * during optimization: + * + * `lambda * || coordinates ||_1` + * + * This class is meant to be used with the FBS optimizer, and any other + * optimizer that uses a proximal operator/step. + */ +class L1Penalty +{ + public: + /** + * Construct an L1Penalty object with a given penalty `lambda`. + */ + L1Penalty(const double lambda = 0.0); + + /** + * Evaluate the L1 penalty function: `lambda * || coordinates ||_1`. + */ + template + typename MatType::elem_type Evaluate(const MatType& coordinates) const; + + /** + * After taking a forward step of size `stepSize`, apply a backwards step / + * proximal operator that applies the L1 penalty to `coordinates`. + */ + template + void ProximalStep(MatType& coordinates, const double stepSize) const; + + //! Get the L1 penalty to use when applying the proximal step. + double Lambda() const { return lambda; } + //! 
Modify the L1 penalty to use when applying the proximal step. + double& Lambda() { return lambda; } + + private: + //! The L1 penalty value to use. + double lambda; +}; + +} // namespace ens + +// Include implementation. +#include "l1_penalty_impl.hpp" + +#endif diff --git a/inst/include/ensmallen_bits/fbs/l1_penalty_impl.hpp b/inst/include/ensmallen_bits/fbs/l1_penalty_impl.hpp new file mode 100644 index 0000000..3112118 --- /dev/null +++ b/inst/include/ensmallen_bits/fbs/l1_penalty_impl.hpp @@ -0,0 +1,63 @@ +/** + * @file l1_penalty_impl.hpp + * @author Ryan Curtin + * + * An implementation of the proximal operator for the L1 penalty (also known as + * the shrinkage operator). + * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_FBS_L1_PENALTY_IMPL_HPP +#define ENSMALLEN_FBS_L1_PENALTY_IMPL_HPP + +// In case it hasn't been included yet. +#include "l1_penalty.hpp" + +namespace ens { + +inline L1Penalty::L1Penalty(const double lambda) : lambda(lambda) +{ + // Nothing to do. +} + +template +typename MatType::elem_type L1Penalty::Evaluate(const MatType& coordinates) + const +{ + // Compute the L1 penalty. + return norm(vectorise(coordinates), 1) * typename MatType::elem_type(lambda); +} + +template +void L1Penalty::ProximalStep(MatType& coordinates, + const double stepSize) const +{ + // Apply the backwards step coordinate-wise. If `MatType` is sparse, this + // only applies to nonzero elements, which is just fine. 
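To see that this elementwise shrinkage really is the proximal operator of y ↦ s·lambda·|y|, one can compare it against a brute-force minimization of the proximal objective 0.5 (y - x)^2 + s·lambda·|y|. The sketch below is our own illustration (`shrink` and `proxByGrid` are hypothetical names, not part of the ensmallen API):

```c++
// Illustrative sketch only -- shrink() and proxByGrid() are our names.
#include <algorithm>
#include <cassert>
#include <cmath>

// Elementwise shrinkage: sign(x) * max(|x| - t, 0), written branch-wise.
double shrink(const double x, const double t)
{
  return (x > 0.0 ? std::max(x - t, 0.0) : std::min(x + t, 0.0));
}

// The scalar proximal objective: 0.5 * (y - x)^2 + s * lambda * |y|.
double proxObjective(const double y, const double x,
                     const double lambda, const double s)
{
  return 0.5 * (y - x) * (y - x) + s * lambda * std::fabs(y);
}

// Brute-force the minimizer of the proximal objective on a fine grid.
double proxByGrid(const double x, const double lambda, const double s)
{
  double best = -2.0, bestObj = proxObjective(-2.0, x, lambda, s);
  for (double y = -2.0; y <= 2.0; y += 1e-4)
  {
    const double obj = proxObjective(y, x, lambda, s);
    if (obj < bestObj) { bestObj = obj; best = y; }
  }
  return best;
}
```

The grid minimizer agrees with `shrink(x, lambda * s)` to within the grid resolution, which is exactly why the single `sign(...) % clamp(...)` expression suffices as the backward step.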
+  typedef typename MatType::elem_type eT;
+
+  // This is equivalent to the following .transform() implementation (which is
+  // easier to read but will not work with Bandicoot):
+  //
+  // coordinates.transform([this, stepSize](eT val) { return (val > eT(0)) ?
+  //     (std::max(eT(0), val - eT(lambda * stepSize))) :
+  //     (std::min(eT(0), val + eT(lambda * stepSize))); });
+  coordinates = sign(coordinates) % clamp(
+      abs(coordinates) - eT(lambda * stepSize), eT(0),
+      std::numeric_limits<eT>::max());
+}
+
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/fista/fista.hpp b/inst/include/ensmallen_bits/fista/fista.hpp
new file mode 100644
index 0000000..abea032
--- /dev/null
+++ b/inst/include/ensmallen_bits/fista/fista.hpp
@@ -0,0 +1,214 @@
+/**
+ * @file fista.hpp
+ * @author Ryan Curtin
+ *
+ * An implementation of FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FISTA_FISTA_HPP
+#define ENSMALLEN_FISTA_FISTA_HPP
+
+#include "../fbs/l1_penalty.hpp"
+#include "../fbs/l1_constraint.hpp"
+
+namespace ens {
+
+/**
+ * FISTA (Fast Iterative Shrinkage-Thresholding Algorithm) is a proximal
+ * gradient optimization technique for optimizing a function of the form
+ *
+ *   h(x) = f(x) + g(x)
+ *
+ * where f(x) is a differentiable function and g(x) is an arbitrary
+ * non-differentiable function.
In such a situation, standard gradient descent + * techniques cannot work because of the non-differentiability of g(x). To work + * around this, FISTA takes a _forward step_ that is just a gradient descent + * step on f(x), and then a _backward step_ that is the _proximal operator_ + * corresponding to g(x). This continues until convergence. + * + * This implementation of FISTA allows specification of the backward step (or + * proximal operator) via the `BackwardStepType` template parameter. When using + * FBS, the differentiable `FunctionType` given to `Optimize()` should be f(x), + * *not* the combined function h(x). g(x) should be specified by the choice of + * `BackwardStepType` (e.g. `L1Penalty` or `L1Maximum`). The `Optimize()` + * function will then return optimized coordinates for h(x), not f(x). + * + * For more information, see the following paper: + * + * ``` + * @article{beck2009fast, + * title={A fast iterative shrinkage-thresholding algorithm for linear inverse + * problems}, + * author={Beck, Amir and Teboulle, Marc}, + * journal={SIAM Journal On Imaging Sciences}, + * volume={2}, + * number={1}, + * pages={183--202}, + * year={2009}, + * publisher={SIAM} + * } + * ``` + */ +template +class FISTA +{ + public: + /** + * Construct the FISTA optimizer with the given options, using a + * default-constructed BackwardStepType. + */ + FISTA(const size_t maxIterations = 10000, + const double tolerance = 1e-10, + const size_t maxLineSearchSteps = 50, + const double stepSizeAdjustment = 2.0, + const bool estimateStepSize = true, + const size_t estimateTrials = 10, + const double maxStepSize = 0.001); + + /** + * Construct the FISTA optimizer with the given options. 
+ */ + FISTA(BackwardStepType backwardStepType, + const size_t maxIterations = 10000, + const double tolerance = 1e-10, + const size_t maxLineSearchSteps = 50, + const double stepSizeAdjustment = 2.0, + const bool estimateStepSize = true, + const size_t estimateTrials = 10, + const double maxStepSize = 0.001); + + /** + * Optimize the given function using FISTA. The given starting + * point will be modified to store the finishing point of the algorithm, + * the final objective value is returned. + * + * The FunctionType template class must provide the following functions: + * + * double Evaluate(const arma::mat& coordinates); + * void Gradient(const arma::mat& coordinates, + * arma::mat& gradient); + * + * @tparam FunctionType Type of function to be optimized. + * @tparam MatType Type of objective matrix. + * @tparam GradType Type of gradient matrix (default is MatType). + * @tparam CallbackTypes Types of callback functions. + * @param function Function to be optimized. + * @param iterate Input with starting point, and will be modified to save + * the output optimial solution coordinates. + * @param callbacks Callback functions. + * @return Objective value at the final solution. + */ + template + typename std::enable_if::value, + typename MatType::elem_type>::type + Optimize(FunctionType& function, + MatType& iterate, + CallbackTypes&&... callbacks); + + //! Forward the MatType as GradType. + template + typename MatType::elem_type Optimize(FunctionType& function, + MatType& iterate, + CallbackTypes&&... callbacks) + { + return Optimize(function, iterate, + std::forward(callbacks)...); + } + + //! Get the backward step object. + const BackwardStepType& BackwardStep() const { return backwardStep; } + //! Modify the backward step object. + BackwardStepType& BackwardStep() { return backwardStep; } + + //! Get the maximum number of iterations (0 indicates no limit). + size_t MaxIterations() const { return maxIterations; } + //! 
Modify the maximum number of iterations (0 indicates no limit). + size_t& MaxIterations() { return maxIterations; } + + //! Get the tolerance on the gradient norm for termination. + double Tolerance() const { return tolerance; } + //! Modify the tolerance on the gradient norm for termination. + double& Tolerance() { return tolerance; } + + //! Get the maximum number of line search steps. + size_t MaxLineSearchSteps() const { return maxLineSearchSteps; } + //! Modify the maximum number of line search steps. + size_t& MaxLineSearchSteps() { return maxLineSearchSteps; } + + //! Get the step size adjustment parameter. + double StepSizeAdjustment() const { return stepSizeAdjustment; } + //! Modify the step size adjustment parameter. + double& StepSizeAdjustment() { return stepSizeAdjustment; } + + //! Get whether or not to estimate the initial step size. + bool EstimateStepSize() const { return estimateStepSize; } + //! Modify whether or not to estimate the initial step size. + bool& EstimateStepSize() { return estimateStepSize; } + + //! Get the number of trials to use for Lipschitz constant estimation. + size_t EstimateTrials() const { return estimateTrials; } + //! Modify the number of trials to use for Lipschitz constant estimation. + size_t& EstimateTrials() { return estimateTrials; } + + //! Get the maximum step size. If Optimize() has been called, this will + //! contain the estimated maximum step size value. + double MaxStepSize() const { return maxStepSize; } + //! Modify the step size (ignored if EstimateStepSize() is true). + double& MaxStepSize() { return maxStepSize; } + + private: + //! Utility function: fill with random values. 
+  template<typename MatType>
+  static void RandomFill(MatType& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const typename MatType::elem_type maxVal);
+
+  template<typename eT>
+  static void RandomFill(arma::SpMat<eT>& x,
+                         const size_t rows,
+                         const size_t cols,
+                         const eT maxVal);
+
+  template<typename FunctionType, typename MatType>
+  void EstimateLipschitzStepSize(FunctionType& f, const MatType& x);
+
+  //! The instantiated backward step object.
+  BackwardStepType backwardStep;
+
+  //! The maximum number of allowed iterations.
+  size_t maxIterations;
+
+  //! The tolerance for termination.
+  double tolerance;
+
+  //! The maximum number of line search trials.
+  size_t maxLineSearchSteps;
+
+  //! The step size adjustment parameter for the line search.
+  double stepSizeAdjustment;
+
+  //! Whether or not to try and estimate the initial step size.
+  bool estimateStepSize;
+
+  //! Number of trials to use for initial step size estimation.
+  size_t estimateTrials;
+
+  //! The maximum step size to use (estimated if estimateStepSize is true).
+  double maxStepSize;
+};
+
+} // namespace ens
+
+// Include implementation.
+#include "fista_impl.hpp"
+
+#endif
diff --git a/inst/include/ensmallen_bits/fista/fista_impl.hpp b/inst/include/ensmallen_bits/fista/fista_impl.hpp
new file mode 100644
index 0000000..09cf69c
--- /dev/null
+++ b/inst/include/ensmallen_bits/fista/fista_impl.hpp
@@ -0,0 +1,445 @@
+/**
+ * @file fista_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license.  You should have received a copy of
+ * the 3-clause BSD license along with ensmallen.  If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_FISTA_FISTA_IMPL_HPP
+#define ENSMALLEN_FISTA_FISTA_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "fista.hpp"
+
+#include <ensmallen_bits/function.hpp>
+
+namespace ens {
+
+//! Constructor of the FISTA class.
+template<typename BackwardStepType>
+FISTA<BackwardStepType>::FISTA(const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check the estimateTrials parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FISTA::FISTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+}
+
+template<typename BackwardStepType>
+FISTA<BackwardStepType>::FISTA(BackwardStepType backwardStep,
+                               const size_t maxIterations,
+                               const double tolerance,
+                               const size_t maxLineSearchSteps,
+                               const double stepSizeAdjustment,
+                               const bool estimateStepSize,
+                               const size_t estimateTrials,
+                               const double maxStepSize) :
+    backwardStep(std::move(backwardStep)),
+    maxIterations(maxIterations),
+    tolerance(tolerance),
+    maxLineSearchSteps(maxLineSearchSteps),
+    stepSizeAdjustment(stepSizeAdjustment),
+    estimateStepSize(estimateStepSize),
+    estimateTrials(estimateTrials),
+    maxStepSize(maxStepSize)
+{
+  // Check the estimateTrials parameter.
+  if (estimateStepSize && estimateTrials == 0)
+  {
+    throw std::invalid_argument("FISTA::FISTA(): estimateTrials must be greater"
+        " than 0!");
+  }
+}
+
+//! Optimize the function (minimize).
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType, typename GradType,
+         typename... CallbackTypes>
+typename std::enable_if<IsArmaType<GradType>::value,
+    typename MatType::elem_type>::type
+FISTA<BackwardStepType>::Optimize(FunctionType& function,
+                                  MatType& iterateIn,
+                                  CallbackTypes&&... callbacks)
+{
+  // Convenience typedefs.
+  typedef typename MatType::elem_type ElemType;
+  typedef typename MatTypeTraits<MatType>::BaseMatType BaseMatType;
+  typedef typename MatTypeTraits<GradType>::BaseMatType BaseGradType;
+
+  typedef Function<FunctionType, BaseMatType, BaseGradType> FullFunctionType;
+  FullFunctionType& f = static_cast<FullFunctionType&>(function);
+
+  // Make sure we have all necessary functions.
+  traits::CheckFunctionTypeAPI<FullFunctionType, BaseMatType, BaseGradType>();
+  RequireFloatingPointType<BaseMatType>();
+  RequireFloatingPointType<BaseGradType>();
+  RequireSameInternalTypes<BaseMatType, BaseGradType>();
+
+  // Match the notation of the paper. We force a copy here, since we use
+  // std::move() internally and this may be an alias. We copy back to
+  // `iterateIn` at the end.
+  BaseMatType x(iterateIn);
+
+  // To keep track of the function value.
+  ElemType lastObj = std::numeric_limits<ElemType>::max();
+  ElemType currentFObj = f.Evaluate(x);
+  ElemType currentGObj = backwardStep.Evaluate(x);
+  ElemType currentObj = currentFObj + currentGObj;
+
+  BaseGradType g(x.n_rows, x.n_cols); // Gradient.
+  BaseMatType y = x; // Initialize y_1 = x_0.
+  BaseMatType lastX;
+  ElemType t = 1; // Initialize t_1 = 1.
+  ElemType lastT = t;
+
+  // Controls early termination of the optimization process.
+  bool terminate = false;
+
+  // First, estimate the Lipschitz constant to set the initial/maximum step
+  // size, if the user asked us to.
+  if (estimateStepSize)
+    EstimateLipschitzStepSize(f, x); // Sets `maxStepSize`.
+
+  // Keep track of the last step size we used.
+  ElemType currentStepSize = (ElemType) maxStepSize;
+  ElemType lastStepSize = (ElemType) maxStepSize;
+
+  Callback::BeginOptimization(*this, f, x, callbacks...);
+  for (size_t i = 1; i != maxIterations && !terminate; ++i)
+  {
+    // During this optimization, we want to optimize h(x) = f(x) + g(x).
+    // f(x) is `f`, but g(x) is specified by `BackwardStepType`.
+
+    // Notation (compare with Beck and Teboulle):
+    //   `i` represents `k`, the iteration number.
+    //   `x` represents `x_k` in the paper.
+    //   `y` represents `y_k` in the paper.
+
+    // The first step is to compute a step size via a line search. To do this,
+    // we need to compute the gradient f'(y) as required by the quadratic
+    // approximation Q_L(x, y) (Eq. 2.5).
+    //
+    // We will also need the objective f(y), so we will compute that
+    // simultaneously.
+ const ElemType yObj = f.EvaluateWithGradient(y, g); + terminate |= Callback::EvaluateWithGradient(*this, f, y, yObj, g, + callbacks...); + + // Use backtracking line search to find the best step size. This is not the + // version from the FASTA paper (non-monotone line search) but instead the + // version proposed by Beck and Teboulle, with a minor modification: we + // start our search at the last step size, and allow the search to increase + // the step size up to the maximum step size if it can. This is a more + // effective heuristic than simply starting at the largest allowable step + // size and shrinking from there, especially in regions where the gradient + // norm is small. It is also more effective than simply starting at the + // last step size and shrinking from there, as it prevents getting "stuck" + // with a very small step size. + bool lsDone = false; + size_t lsTrial = 0; + bool increasing = false; // Will be set during the first iteration. + ElemType lastFObj = ElemType(0); + ElemType lastGObj = ElemType(0); + BaseMatType lsLastX; // Only used in increasing mode. + BaseMatType xDiff; + + lastX = std::move(x); + lastStepSize = currentStepSize; + currentStepSize = std::min(currentStepSize, (ElemType) maxStepSize); + + while (!lsDone && !terminate) + { + if (lsTrial == maxLineSearchSteps) + { + if (increasing) + { + Warn << "FISTA::Optimize(): line search reached maximum number of " + << "steps (" << maxLineSearchSteps << "); using step size " + << currentStepSize << "." << std::endl; + break; // The step size is still valid. + } + else + { + Warn << "FISTA::Optimize(): could not find valid step size in range " + << "(0, " << maxStepSize << "]! Terminating optimization." + << std::endl; + x = std::move(lastX); // Revert to previous coordinates. + terminate = true; + break; + } + } + + // If the step size has converged to zero, we are done. 
+ if (currentStepSize == ElemType(0)) + { + Warn << "FISTA::Optimize(): computed zero step size; terminating " + << "optimization." << std::endl; + x = std::move(lastX); // Revert to previous coordinates. + terminate = true; + break; + } + + // Perform forward update into x. + x = y - currentStepSize * g; + backwardStep.ProximalStep(x, currentStepSize); + + // Compute F(x) = f(x) + g(x). + const ElemType fObj = f.Evaluate(x); + const ElemType gObj = backwardStep.Evaluate(x); + const ElemType lsObj = fObj + gObj; + terminate |= Callback::Evaluate(*this, f, x, fObj, callbacks...); + + // Compute Q_L(x, y) (the quadratic approximation), Eq. (2.5). + xDiff = x - y; + const ElemType q = yObj + dot(xDiff, g) + + (1 / (2 * currentStepSize)) * dot(xDiff, xDiff) + gObj; + + // If we're on the first iteration, we don't know if we should be + // searching for a step size by increasing or decreasing the step size. + // (Remember that our valid ranges of step sizes are [0, maxStepSize], and + // we are starting at lastStepSize.) + // + // Thus, if the condition is satisfied, let's try increasing the step size + // until it's no longer satisfied. Otherwise, we will have to decrease + // the step size. + if (lsTrial == 0) + { + increasing = (lsObj <= q); + } + + if (increasing) + { + // If we are in "increasing" mode, then termination occurs on the first + // iteration when the condition is *not* satisfied (and we use the last + // step size). + if ((lsObj > q) || (!std::isfinite(lsObj))) + { + lsDone = true; + if (lsTrial != 0) + x = std::move(lsLastX); + currentFObj = lastFObj; + currentGObj = lastGObj; + lastObj = currentObj; + currentObj = currentFObj + currentGObj; + currentStepSize = lastStepSize; // Take one step backwards. + } + else if (currentStepSize == (ElemType) maxStepSize) + { + // The condition is still satisfied, but the step size will be too big + // if we take another step. Go back to the maximum step size. 
+ lsDone = true; + currentFObj = fObj; + currentGObj = gObj; + lastObj = currentObj; + currentObj = currentFObj + currentGObj; + } + else + { + // The condition is still satisfied; increase the step size. + lastStepSize = currentStepSize; + currentStepSize *= ElemType(stepSizeAdjustment); + lsLastX = std::move(x); + lastFObj = fObj; + lastGObj = gObj; + ++lsTrial; + } + } + else + { + // If we are in "decreasing" mode, then termination occurs on the first + // iteration when the condition is satisfied. + if ((lsObj <= q) && (std::isfinite(lsObj))) + { + lsDone = true; + currentFObj = fObj; + currentGObj = gObj; + lastObj = currentObj; + currentObj = currentFObj + currentGObj; + } + else + { + // The condition is not yet satisfied; decrease the step size. + currentStepSize /= ElemType(stepSizeAdjustment); + ++lsTrial; + } + } + } + + // If we terminated during the line search, we are done. + if (terminate) + break; + + if (!lsDone) + { + // The line search failed, so terminate. + Warn << "FISTA::Optimize(): line search failed after " + << maxLineSearchSteps << " steps; terminating optimization." + << std::endl; + x = std::move(lastX); + terminate = true; + break; + } + + // Output current objective function. + Info << "FISTA::Optimize(): iteration " << i << ", combined objective " + << currentObj << " (f(x) = " << currentFObj << ", g(x) = " + << currentGObj << "), step size " << currentStepSize << "." + << std::endl; + + if ((i > 1) && !std::isfinite(currentObj)) + { + Warn << "FISTA::Optimize(): objective diverged to " << currentObj + << "; terminating optimization." << std::endl; + terminate = true; + break; + } + + // Check for convergence. This is a simple check on the objective. + if ((i > 1) && (std::abs(currentObj - lastObj) < tolerance)) + { + Info << "FISTA::Optimize(): minimized within objective tolerance " + << tolerance << "; terminating optimization." << std::endl; + terminate = true; + } + + // Compute updated prediction parameter t. 
+    lastT = t;
+    t = (1 + std::sqrt(1 + 4 * std::pow(t, ElemType(2)))) / 2;
+
+    // Sometimes t can get to be too large; this restart scheme is taken
+    // originally from O'Donoghue and Candes, "Adaptive restart for accelerated
+    // gradient schemes", 2012.
+    const ElemType restartCheck = dot(y - x, x - lastX);
+    if (restartCheck > 0)
+    {
+      Info << "FISTA::Optimize(): t too large (" << t << "); reset to 1."
+          << std::endl;
+      t = 1;
+      lastT = 1;
+    }
+
+    // Update prediction y.
+    y = x + ((lastT - 1) / t) * (x - lastX);
+
+    terminate |= Callback::StepTaken(*this, f, y, callbacks...);
+  }
+
+  if (!terminate)
+  {
+    Info << "FISTA::Optimize(): maximum iterations (" << maxIterations
+        << ") reached; terminating optimization." << std::endl;
+  }
+
+  Callback::EndOptimization(*this, f, x, callbacks...);
+
+  ((BaseMatType&) iterateIn) = x;
+  return currentObj;
+} // Optimize()
+
+template<typename BackwardStepType>
+template<typename MatType>
+void FISTA<BackwardStepType>::RandomFill(
+    MatType& x,
+    const size_t rows,
+    const size_t cols,
+    const typename MatType::elem_type maxVal)
+{
+  x.randu(rows, cols);
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename eT>
+void FISTA<BackwardStepType>::RandomFill(
+    arma::SpMat<eT>& x,
+    const size_t rows,
+    const size_t cols,
+    const eT maxVal)
+{
+  // Try and keep the matrix from having too many elements.
+  eT density = eT(0.1);
+  if (rows * cols > 10000000)
+    density = eT(0.0001);
+  else if (rows * cols > 1000000)
+    density = eT(0.001);
+  else if (rows * cols > 100000)
+    density = eT(0.01);
+
+  x.sprandu(rows, cols, density);
+
+  // Make sure we got at least some nonzero elements...
+  while (x.n_nonzero == 0)
+  {
+    if (x.n_elem < 10)
+      x.sprandu(rows, cols, 1.0);
+    else
+      x.sprandu(rows, cols, 0.5);
+  }
+
+  x *= maxVal;
+}
+
+template<typename BackwardStepType>
+template<typename FunctionType, typename MatType>
+void FISTA<BackwardStepType>::EstimateLipschitzStepSize(
+    FunctionType& f,
+    const MatType& x)
+{
+  typedef typename MatType::elem_type ElemType;
+
+  // Sanity check for the estimateTrials parameter.
+ if (estimateTrials == 0) + { + throw std::invalid_argument("FISTA::Optimize(): estimateTrials must be " + "greater than 0!"); + } + + const ElemType xMax = std::max(ElemType(1), 2 * x.max()); + ElemType sum = ElemType(0); + MatType x1, x2, gx1, gx2; + + for (size_t t = 0; t < estimateTrials; ++t) + { + RandomFill(x1, x.n_rows, x.n_cols, xMax); + RandomFill(x2, x.n_rows, x.n_cols, xMax); + + f.Gradient(x1, gx1); + f.Gradient(x2, gx2); + + // Compute a Lipschitz constant estimate. + const ElemType lEst = norm(gx1 - gx2, 2) / norm(x1 - x2, 2); + sum += lEst; + } + + sum /= estimateTrials; + if (sum == 0) + maxStepSize = std::numeric_limits::max(); + else + maxStepSize = (10 / sum); + + Info << "FISTA::Optimize(): estimated a maximum step size of " + << maxStepSize << "." << std::endl; +} + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/ftml/ftml.hpp b/inst/include/ensmallen_bits/ftml/ftml.hpp index 26c4183..4fcb53f 100644 --- a/inst/include/ensmallen_bits/ftml/ftml.hpp +++ b/inst/include/ensmallen_bits/ftml/ftml.hpp @@ -98,7 +98,7 @@ class FTML typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/ftml/ftml_update.hpp b/inst/include/ensmallen_bits/ftml/ftml_update.hpp index 5db2b05..a135420 100644 --- a/inst/include/ensmallen_bits/ftml/ftml_update.hpp +++ b/inst/include/ensmallen_bits/ftml/ftml_update.hpp @@ -78,6 +78,8 @@ class FTMLUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -87,11 +89,18 @@ class FTMLUpdate * @param cols Number of columns in the gradient matrix. 
*/ Policy(FTMLUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)) { v.zeros(rows, cols); z.zeros(rows, cols); d.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -109,19 +118,19 @@ class FTMLUpdate ++iteration; // And update the iterate. - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); - MatType sigma = -parent.beta1 * d; - d = biasCorrection1 / stepSize * - (arma::sqrt(v / biasCorrection2) + parent.epsilon); + MatType sigma = -beta1 * d; + d = biasCorrection1 / ElemType(stepSize) * + (sqrt(v / biasCorrection2) + epsilon); sigma += d; - z *= parent.beta1; - z += (1 - parent.beta1) * gradient - sigma % iterate; + z *= beta1; + z += (1 - beta1) * gradient - sigma % iterate; iterate = -z / d; } @@ -140,6 +149,11 @@ class FTMLUpdate // The number of iterations. size_t iteration; + + // Optimization parameters converted to the type of the optimization. 
+ ElemType epsilon; + ElemType beta1; + ElemType beta2; }; private: diff --git a/inst/include/ensmallen_bits/function/arma_traits.hpp b/inst/include/ensmallen_bits/function/arma_traits.hpp index e13297b..cf9c67c 100644 --- a/inst/include/ensmallen_bits/function/arma_traits.hpp +++ b/inst/include/ensmallen_bits/function/arma_traits.hpp @@ -122,6 +122,17 @@ template<> inline void RequireDenseFloatingPointType() { } template<> inline void RequireDenseFloatingPointType() { } +#if defined(ARMA_HAVE_FP16) +template<> +inline void RequireDenseFloatingPointType() { } +#endif + +#ifdef ENS_HAVE_COOT +template<> +inline void RequireDenseFloatingPointType() { } +template<> +inline void RequireDenseFloatingPointType() { } +#endif template void RequireFloatingPointType() @@ -144,6 +155,19 @@ template<> inline void RequireFloatingPointType() { } template<> inline void RequireFloatingPointType() { } +#if defined(ARMA_HAVE_FP16) +template<> +inline void RequireFloatingPointType() { } +template<> +inline void RequireFloatingPointType() { } +#endif + +#ifdef ENS_HAVE_COOT +template<> +inline void RequireFloatingPointType() { } +template<> +inline void RequireFloatingPointType() { } +#endif /** * Require that the internal element type of the matrix type and gradient type diff --git a/inst/include/ensmallen_bits/fw/atoms.hpp b/inst/include/ensmallen_bits/fw/atoms.hpp index ffef05f..6ba9a92 100644 --- a/inst/include/ensmallen_bits/fw/atoms.hpp +++ b/inst/include/ensmallen_bits/fw/atoms.hpp @@ -96,6 +96,7 @@ class Atoms // Find possible atom to be deleted. arma::vec gap = sqTerm - currentCoeffs % trans(gradient.t() * currentAtoms); + arma::uword ind = gap.index_min(); // Try deleting the atom. 
diff --git a/inst/include/ensmallen_bits/fw/constr_lpball.hpp b/inst/include/ensmallen_bits/fw/constr_lpball.hpp index 9cddbcb..df09907 100644 --- a/inst/include/ensmallen_bits/fw/constr_lpball.hpp +++ b/inst/include/ensmallen_bits/fw/constr_lpball.hpp @@ -49,7 +49,8 @@ namespace ens { * \f] * */ -class ConstrLpBallSolver +template +class ConstrLpBallSolverType { public: /** @@ -58,7 +59,7 @@ class ConstrLpBallSolver * * @param p The constraint is unit lp ball. */ - ConstrLpBallSolver(const double p) : p(p) + ConstrLpBallSolverType(const double p) : p(p) { /* Do nothing. */ } /** @@ -68,7 +69,7 @@ class ConstrLpBallSolver * @param p The constraint is unit lp ball. * @param lambda Regularization parameter. */ - ConstrLpBallSolver(const double p, const arma::vec lambda) : + ConstrLpBallSolverType(const double p, const VecType lambda) : p(p), regFlag(true), lambda(lambda) { /* Do nothing. */ } @@ -80,52 +81,51 @@ class ConstrLpBallSolver * @param s Output optimal solution in the constrained domain (lp ball). */ template - void Optimize(const MatType& v, - MatType& s) + void Optimize(const MatType& v, MatType& s) { - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::uword UWordType; if (p == std::numeric_limits::infinity()) { // l-inf ball. - s = -arma::sign(v); + s = -sign(v); if (regFlag) { // Do element-wise division. - s /= arma::conv_to>::from(lambda); + s /= conv_to::from(lambda); } } else if (p > 1.0) { // lp ball with 1>::from(lambda); + s = v / conv_to::from(lambda); else s = v; double q = 1 / (1.0 - 1.0 / p); - s = -arma::sign(v) % arma::pow(arma::abs(s), q - 1); - s = arma::normalise(s, p); + s = -sign(v) % pow(abs(s), q - 1); + s = normalise(s, p); if (regFlag) - s = s / arma::conv_to>::from(lambda); + s = s / conv_to::from(lambda); } else if (p == 1.0) { // l1 ball, also used in OMP. 
if (regFlag) - s = arma::abs(v / arma::conv_to>::from(lambda)); + s = abs(v / conv_to::from(lambda)); else - s = arma::abs(v); + s = abs(v); // k is the linear index of the largest element. - arma::uword k = s.index_max(); + UWordType k = s.index_max(); s.zeros(); // Take the sign of v(k). s(k) = -((0.0 < v(k)) - (v(k) < 0.0)); if (regFlag) - s = s / arma::conv_to>::from(lambda); + s = s / conv_to::from(lambda); } else { @@ -146,9 +146,9 @@ class ConstrLpBallSolver bool& RegFlag() { return regFlag; } //! Get the regularization parameter. - arma::vec Lambda() const { return lambda; } + VecType Lambda() const { return lambda; } //! Modify the regularization parameter. - arma::vec& Lambda() { return lambda; } + VecType& Lambda() { return lambda; } private: //! lp norm, 1<=p<=inf; @@ -159,9 +159,11 @@ class ConstrLpBallSolver bool regFlag = false; //! Regularization parameter. - arma::vec lambda; + VecType lambda; }; +using ConstrLpBallSolver = ConstrLpBallSolverType; + } // namespace ens #endif diff --git a/inst/include/ensmallen_bits/fw/frank_wolfe.hpp b/inst/include/ensmallen_bits/fw/frank_wolfe.hpp index 2694566..561fdc0 100644 --- a/inst/include/ensmallen_bits/fw/frank_wolfe.hpp +++ b/inst/include/ensmallen_bits/fw/frank_wolfe.hpp @@ -126,7 +126,7 @@ class FrankWolfe */ template - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(FunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp b/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp index 32c2fa3..b60ccf4 100644 --- a/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp +++ b/inst/include/ensmallen_bits/fw/frank_wolfe_impl.hpp @@ -41,12 +41,12 @@ template< typename UpdateRuleType> template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type FrankWolfe::Optimize( - FunctionType& function, - MatType& iterateIn, - 
CallbackTypes&&... callbacks) + FunctionType& function, + MatType& iterateIn, + CallbackTypes&&... callbacks) { // Convenience typedefs. typedef typename MatType::elem_type ElemType; @@ -95,7 +95,7 @@ FrankWolfe::Optimize( if (gap < tolerance) { Info << "FrankWolfe::Optimize(): minimized within tolerance " - << tolerance << "; " << "terminating optimization." << std::endl; + << tolerance << "; terminating optimization." << std::endl; Callback::EndOptimization(*this, f, iterate, callbacks...); return currentObjective; @@ -109,8 +109,11 @@ FrankWolfe::Optimize( terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); } - Info << "FrankWolfe::Optimize(): maximum iterations (" << maxIterations - << ") reached; " << "terminating optimization." << std::endl; + if (!terminate) + { + Info << "FrankWolfe::Optimize(): maximum iterations (" << maxIterations + << ") reached; terminating optimization." << std::endl; + } Callback::EndOptimization(*this, f, iterate, callbacks...); return currentObjective; diff --git a/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp b/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp index 50d62dd..752ee7a 100644 --- a/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp +++ b/inst/include/ensmallen_bits/fw/line_search/line_search_impl.hpp @@ -106,7 +106,7 @@ typename MatType::elem_type LineSearch::Derivative(FunctionType& function, { GradType gradient(x0.n_rows, x0.n_cols); function.Gradient(x0 + gamma * deltaX, gradient); - return arma::dot(gradient, deltaX); + return dot(gradient, deltaX); } } // namespace ens diff --git a/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp b/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp index f607c71..88792f8 100644 --- a/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp +++ b/inst/include/ensmallen_bits/fw/proximal/proximal_impl.hpp @@ -35,14 +35,24 @@ namespace ens { template inline void Proximal::ProjectToL1Ball(MatType& v, double 
tau) { - MatType simplexSol = arma::abs(v); + MatType simplexSol = abs(v); // Already with L1 norm <= tau. - if (arma::accu(simplexSol) <= tau) + if (accu(simplexSol) <= tau) return; - simplexSol = arma::sort(simplexSol, "descend"); - MatType simplexSum = arma::cumsum(simplexSol); + simplexSol = sort(simplexSol, "descend"); + // MatType simplexSum = arma::cumsum(simplexSol); + MatType simplexSum(simplexSol.n_rows, simplexSol.n_cols); + for (size_t col = 0; col < simplexSol.n_cols; ++col) + { + simplexSum(0, col) = simplexSol(0, col); + for (size_t row = 1; row < simplexSol.n_rows; ++row) + { + simplexSum(row, col) = simplexSum(row - 1, col) + + simplexSol(row, col); + } + } double nu = 0; size_t rho = simplexSol.n_rows - 1; @@ -72,10 +82,15 @@ inline void Proximal::ProjectToL1Ball(MatType& v, double tau) template inline void Proximal::ProjectToL0Ball(MatType& v, int tau) { - arma::uvec indices = arma::sort_index(arma::abs(v)); - arma::uword numberToKill = v.n_elem - tau; + typedef typename ForwardType::uword UWordType; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::bvec VecType; + + const VecType vTemp = conv_to::from(abs(v)); + UVecType indices = sort_index(vTemp); + UWordType numberToKill = v.n_elem - tau; - for (arma::uword i = 0; i < numberToKill; i++) + for (UWordType i = 0; i < numberToKill; i++) v(indices(i)) = 0.0; } diff --git a/inst/include/ensmallen_bits/fw/update_full_correction.hpp b/inst/include/ensmallen_bits/fw/update_full_correction.hpp index e486fed..1fdc04c 100644 --- a/inst/include/ensmallen_bits/fw/update_full_correction.hpp +++ b/inst/include/ensmallen_bits/fw/update_full_correction.hpp @@ -78,7 +78,7 @@ class UpdateFullCorrection atoms.ProjectedGradientEnhancement(function, tau, stepSize); arma::mat tmp; atoms.RecoverVector(tmp); - newCoords = arma::conv_to::from(tmp); + newCoords = conv_to::from(tmp); } private: diff --git a/inst/include/ensmallen_bits/fw/update_span.hpp 
b/inst/include/ensmallen_bits/fw/update_span.hpp index 1c113d8..7ba21dc 100644 --- a/inst/include/ensmallen_bits/fw/update_span.hpp +++ b/inst/include/ensmallen_bits/fw/update_span.hpp @@ -63,7 +63,7 @@ class UpdateSpan // to the original size. arma::mat tmp; atoms.RecoverVector(tmp); - newCoords = arma::conv_to::from(tmp); + newCoords = conv_to::from(tmp); // Prune the support. if (isPrune) @@ -72,7 +72,7 @@ class UpdateSpan double F = 0.25 * oldF + 0.75 * function.Evaluate(newCoords); atoms.PruneSupport(F, function); atoms.RecoverVector(tmp); - newCoords = arma::conv_to::from(tmp); + newCoords = conv_to::from(tmp); } } diff --git a/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp b/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp index b9c08a3..024c1bd 100644 --- a/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp +++ b/inst/include/ensmallen_bits/gradient_descent/gradient_descent.hpp @@ -77,7 +77,7 @@ class GradientDescent typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(FunctionType& function, MatType& iterate, @@ -140,9 +140,9 @@ class GradientDescent const arma::Row& numCategories, CallbackTypes&&... callbacks) { - return Optimize(function, iterate, categoricalDimensions, - numCategories, std::forward(callbacks)...); + return Optimize(function, + iterate, categoricalDimensions, numCategories, + std::forward(callbacks)...); } //! Get the step size. 
diff --git a/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp b/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp index 5301002..813244c 100644 --- a/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp +++ b/inst/include/ensmallen_bits/gradient_descent/gradient_descent_impl.hpp @@ -34,8 +34,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type GradientDescent::Optimize(FunctionType& function, MatType& iterateIn, CallbackTypes&&... callbacks) @@ -101,7 +101,7 @@ GradientDescent::Optimize(FunctionType& function, lastObjective = overallObjective; // And update the iterate. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); } diff --git a/inst/include/ensmallen_bits/iqn/iqn.hpp b/inst/include/ensmallen_bits/iqn/iqn.hpp index 3bd4f63..a1dbece 100644 --- a/inst/include/ensmallen_bits/iqn/iqn.hpp +++ b/inst/include/ensmallen_bits/iqn/iqn.hpp @@ -87,11 +87,11 @@ class IQN typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, - MatType& iterate, - CallbackTypes&&... callbacks); + MatType& iterate, + CallbackTypes&&... callbacks); //! Forward the MatType as GradType. template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type IQN::Optimize(SeparableFunctionType& functionIn, MatType& iterateIn, CallbackTypes&&... 
callbacks) @@ -46,6 +46,7 @@ IQN::Optimize(SeparableFunctionType& functionIn, typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; typedef typename MatTypeTraits::BaseMatType BaseGradType; + typedef typename ForwardType::bmat ProxyMatType; typedef Function FullFunctionType; @@ -81,8 +82,8 @@ IQN::Optimize(SeparableFunctionType& functionIn, iterate.n_cols)); std::vector Q(numBatches, BaseMatType(iterate.n_elem, iterate.n_elem)); - BaseMatType initialIterate = arma::randn>(iterate.n_rows, - iterate.n_cols); + BaseMatType initialIterate = ProxyMatType(iterate.n_rows, iterate.n_cols, + GetFillType::randn); BaseGradType B(iterate.n_elem, iterate.n_elem); B.eye(); @@ -103,7 +104,7 @@ IQN::Optimize(SeparableFunctionType& functionIn, Q[f].eye(); g += y[f]; - y[f] /= (double) effectiveBatchSize; + y[f] /= (ElemType) effectiveBatchSize; i += effectiveBatchSize; } @@ -124,7 +125,7 @@ IQN::Optimize(SeparableFunctionType& functionIn, const size_t effectiveBatchSize = std::min(batchSize, numFunctions - it * batchSize); - if (arma::norm(iterate - t[it]) > 0) + if (norm(iterate - t[it]) > 0) { function.Gradient(iterate, it * batchSize, gradient, effectiveBatchSize); @@ -133,31 +134,34 @@ IQN::Optimize(SeparableFunctionType& functionIn, terminate |= Callback::Gradient(*this, function, iterate, gradient, callbacks...); - const BaseMatType s = arma::vectorise(iterate - t[it]); - const BaseGradType yy = arma::vectorise(gradient - y[it]); + const BaseMatType s = vectorise(iterate - t[it]); + const BaseGradType yy = vectorise(gradient - y[it]); const BaseGradType stochasticHessian = Q[it] + yy * yy.t() / - arma::as_scalar(yy.t() * s) - Q[it] * s * s.t() * - Q[it] / arma::as_scalar(s.t() * Q[it] * s); + as_scalar(yy.t() * s) - Q[it] * s * s.t() * + Q[it] / as_scalar(s.t() * Q[it] * s); + + const ElemType negBatches = 1 / ElemType(numBatches); // Update aggregate Hessian approximation. 
- B += (1.0 / numBatches) * (stochasticHessian - Q[it]); + B += negBatches * (stochasticHessian - Q[it]); // Update aggregate Hessian-variable product. - u += arma::reshape((1.0 / numBatches) * (stochasticHessian * - arma::vectorise(iterate) - Q[it] * arma::vectorise(t[it])), - u.n_rows, u.n_cols);; + u += reshape(negBatches * (stochasticHessian * + vectorise(iterate) - Q[it] * vectorise(t[it])), + u.n_rows, u.n_cols); // Update aggregate gradient. - g += (1.0 / numBatches) * (gradient - y[it]); + g += negBatches * (gradient - y[it]); // Update the function information tables. Q[it] = std::move(stochasticHessian); y[it] = std::move(gradient); t[it] = iterate; - iterate = arma::reshape(stepSize * B.i() * (u.t() - arma::vectorise(g)), - iterate.n_rows, iterate.n_cols) + (1 - stepSize) * iterate; + iterate = reshape(ElemType(stepSize) * pinv(B) * (u.t() - vectorise(g)), + iterate.n_rows, iterate.n_cols) + + (1 - ElemType(stepSize)) * iterate; terminate |= Callback::StepTaken(*this, function, iterate, callbacks...); diff --git a/inst/include/ensmallen_bits/katyusha/katyusha.hpp b/inst/include/ensmallen_bits/katyusha/katyusha.hpp index d416f8a..2126d92 100644 --- a/inst/include/ensmallen_bits/katyusha/katyusha.hpp +++ b/inst/include/ensmallen_bits/katyusha/katyusha.hpp @@ -93,7 +93,7 @@ class KatyushaType typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp b/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp index 53c3043..ddca745 100644 --- a/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp +++ b/inst/include/ensmallen_bits/katyusha/katyusha_impl.hpp @@ -45,8 +45,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type KatyushaType::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -80,20 +80,20 @@ KatyushaType::Optimize( if (numFunctions % batchSize != 0) ++numBatches; // Capture last few. - const double tau1 = std::min(0.5, - std::sqrt(batchSize * convexity / (3.0 * lipschitz))); - const double tau2 = 0.5; - const double alpha = 1.0 / (3.0 * tau1 * lipschitz); - const double r = 1.0 + std::min(alpha * convexity, 1.0 / - (4.0 / innerIterations)); + const ElemType tau1 = ElemType(std::min(0.5, + std::sqrt(batchSize * convexity / (3 * lipschitz)))); + const ElemType tau2 = ElemType(0.5); + const ElemType alpha = 1 / (3 * tau1 * ElemType(lipschitz)); + const ElemType r = 1 + std::min(alpha * ElemType(convexity), + ElemType(innerIterations) / 4); // sum_{j=0}^{m-1} 1 + std::min(alpha * convexity, 1 / (4 * m)^j). - double normalizer = 1; + ElemType normalizer = 1; for (size_t i = 0; i < numBatches; i++) { - normalizer = r * (normalizer + 1.0); + normalizer = r * (normalizer + 1); } - normalizer = 1.0 / normalizer; + normalizer = 1 / normalizer; // To keep track of where we are and how things are going. ElemType overallObjective = 0; @@ -168,10 +168,10 @@ KatyushaType::Optimize( f += effectiveBatchSize; } - fullGradient /= (double) numFunctions; + fullGradient /= (ElemType) numFunctions; // To keep track of where we are and how things are going. 
- double cw = 1; + ElemType cw = 1; w.zeros(); for (size_t f = 0, currentFunction = 0; (f < innerIterations) && !terminate; @@ -208,7 +208,7 @@ KatyushaType::Optimize( // By the minimality definition of z_{k + 1}, we have that: // z_{k+1} − z_k + \alpha * \sigma_{k+1} + \alpha g = 0. BaseMatType zNew = z - alpha * (fullGradient + (gradient - gradient0) / - (double) batchSize); + (ElemType) batchSize); // Proximal update, choose between Option I and Option II. Shift relative // to the Lipschitz constant or take a constant step using the given step @@ -221,7 +221,7 @@ KatyushaType::Optimize( // yk = x0 − 1 / (3L) * \delta3 - ((1 - tau) / (3L)) + tau * alpha) // * \delta2 - ((1-tau)^2 / (3L) + (1 - (1 - tau)^2) * alpha) * \delta1, // k = 3. - y = iterate + 1.0 / (3.0 * lipschitz) * w; + y = iterate + 1 / (3 * ElemType(lipschitz)) * w; } else { diff --git a/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp b/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp index 8f58eee..d26b4ac 100644 --- a/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp +++ b/inst/include/ensmallen_bits/lbfgs/lbfgs.hpp @@ -80,7 +80,7 @@ class L_BFGS typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(FunctionType& function, MatType& iterate, @@ -177,10 +177,11 @@ class L_BFGS * @param y Differences between the gradient and the old gradient matrix. */ template - double ChooseScalingFactor(const size_t iterationNum, - const MatType& gradient, - const CubeType& s, - const CubeType& y); + typename MatType::elem_type ChooseScalingFactor( + const size_t iterationNum, + const MatType& gradient, + const CubeType& s, + const CubeType& y); /** * Perform a back-tracking line search along the search direction to @@ -208,7 +209,7 @@ class L_BFGS GradType& gradient, MatType& newIterateTmp, const GradType& searchDirection, - double& finalStepSize, + ElemType& finalStepSize, CallbackTypes&... 
callbacks); /** @@ -224,7 +225,7 @@ class L_BFGS template void SearchDirection(const MatType& gradient, const size_t iterationNum, - const double scalingFactor, + const typename MatType::elem_type scalingFactor, const CubeType& s, const CubeType& y, MatType& searchDirection); diff --git a/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp b/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp index 5d15401..b5ac443 100644 --- a/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp +++ b/inst/include/ensmallen_bits/lbfgs/lbfgs_impl.hpp @@ -72,34 +72,48 @@ inline L_BFGS::L_BFGS(const size_t numBasis, * @param y Differences between the gradient and the old gradient matrix. */ template -double L_BFGS::ChooseScalingFactor(const size_t iterationNum, - const MatType& gradient, - const CubeType& s, - const CubeType& y) +typename MatType::elem_type L_BFGS::ChooseScalingFactor( + const size_t iterationNum, + const MatType& gradient, + const CubeType& s, + const CubeType& y) { - typedef typename CubeType::elem_type CubeElemType; + typedef typename CubeType::elem_type ElemType; + typedef typename ForwardType::bmat BaseMatType; - constexpr const CubeElemType tol = - 100 * std::numeric_limits::epsilon(); + constexpr const ElemType tol = + 100 * std::numeric_limits::epsilon(); - double scalingFactor; + ElemType scalingFactor; if (iterationNum > 0) { int previousPos = (iterationNum - 1) % numBasis; // Get s and y matrices once instead of multiple times. - const arma::Mat& sMat = s.slice(previousPos); - const arma::Mat& yMat = y.slice(previousPos); + const BaseMatType& sMat = s.slice(previousPos); + const BaseMatType& yMat = y.slice(previousPos); - const CubeElemType tmp = arma::dot(yMat, yMat); - const CubeElemType denom = (tmp >= tol) ? tmp : CubeElemType(1); + const ElemType tmp = dot(yMat, yMat); + const ElemType denom = (tmp >= tol) ? 
tmp : ElemType(1); + if (std::isinf(tmp)) + { + Warn << "L-BFGS: squared 2-norm of gradient difference is infinite; " + << "try using a higher-precision element type or setting MaxStep() " + << "to a smaller value." << std::endl; + } - scalingFactor = arma::dot(sMat, yMat) / denom; + scalingFactor = dot(sMat, yMat) / denom; } else { - const CubeElemType tmp = arma::norm(gradient, "fro"); + const ElemType tmp = norm(gradient, "fro"); + if (std::isinf(tmp)) + { + Warn << "L-BFGS: Frobenius norm of gradient difference is infinite; " + << "try using a higher-precision element type or an initial point " + << "with a smaller gradient value." << std::endl; + } - scalingFactor = (tmp >= tol) ? (1.0 / tmp) : 1.0; + scalingFactor = (tmp >= tol) ? (1 / tmp) : 1; } return scalingFactor; @@ -118,37 +132,38 @@ double L_BFGS::ChooseScalingFactor(const size_t iterationNum, template void L_BFGS::SearchDirection(const MatType& gradient, const size_t iterationNum, - const double scalingFactor, + const typename MatType::elem_type scalingFactor, const CubeType& s, const CubeType& y, MatType& searchDirection) { + typedef typename CubeType::elem_type ElemType; + typedef typename ForwardType::bmat BaseMatType; + typedef typename ForwardType::bcol BaseColType; + // Start from this point. searchDirection = gradient; // See "A Recursive Formula to Compute H * g" in "Updating quasi-Newton // matrices with limited storage" (Nocedal, 1980). - typedef typename CubeType::elem_type CubeElemType; // Temporary variables. - arma::Col rho(numBasis); - arma::Col alpha(numBasis); + BaseColType rho(numBasis); + BaseColType alpha(numBasis); size_t limit = (numBasis > iterationNum) ? 
0 : (iterationNum - numBasis); for (size_t i = iterationNum; i != limit; i--) { int translatedPosition = (i + (numBasis - 1)) % numBasis; + const BaseMatType& sMat = s.slice(translatedPosition); + const BaseMatType& yMat = y.slice(translatedPosition); - const arma::Mat& sMat = s.slice(translatedPosition); - const arma::Mat& yMat = y.slice(translatedPosition); + const ElemType tmp = dot(yMat, sMat); - const CubeElemType tmp = arma::dot(yMat, sMat); - - rho[iterationNum - i] = (tmp != CubeElemType(0)) ? (1.0 / tmp) : - CubeElemType(1); + rho[iterationNum - i] = (tmp != ElemType(0)) ? (1 / tmp) : 1; alpha[iterationNum - i] = rho[iterationNum - i] * - arma::dot(sMat, searchDirection); + dot(sMat, searchDirection); searchDirection -= alpha[iterationNum - i] * yMat; } @@ -158,8 +173,8 @@ void L_BFGS::SearchDirection(const MatType& gradient, for (size_t i = limit; i < iterationNum; i++) { int translatedPosition = i % numBasis; - double beta = rho[iterationNum - i - 1] * - arma::dot(y.slice(translatedPosition), searchDirection); + ElemType beta = rho[iterationNum - i - 1] * + dot(y.slice(translatedPosition), searchDirection); searchDirection += (alpha[iterationNum - i - 1] - beta) * s.slice(translatedPosition); } @@ -222,23 +237,27 @@ bool L_BFGS::LineSearch(FunctionType& function, GradType& gradient, MatType& newIterateTmp, const GradType& searchDirection, - double& finalStepSize, + ElemType& finalStepSize, CallbackTypes&... callbacks) { // Default first step size of 1.0. - double stepSize = 1.0; - finalStepSize = 0.0; // Set only when we take the step. + ElemType stepSize = 1; + if (stepSize > ElemType(maxStep)) + stepSize = ElemType(maxStep); + if (stepSize < ElemType(minStep)) + stepSize = ElemType(minStep); + finalStepSize = 0; // Set only when we take the step. // The initial linear term approximation in the direction of the // search direction. 
ElemType initialSearchDirectionDotGradient = - arma::dot(gradient, searchDirection); + dot(gradient, searchDirection); // If it is not a descent direction, just report failure. - if ( (initialSearchDirectionDotGradient > 0.0) + if ( (initialSearchDirectionDotGradient > 0) || (std::isfinite(initialSearchDirectionDotGradient) == false) ) { - Warn << "L-BFGS line search direction is not a descent direction " + Warn << "L-BFGS: line search direction is not a descent direction " << "(terminating)!" << std::endl; return false; } @@ -247,17 +266,17 @@ bool L_BFGS::LineSearch(FunctionType& function, ElemType initialFunctionValue = functionValue; // Unit linear approximation to the decrease in function value. - ElemType linearApproxFunctionValueDecrease = armijoConstant * + ElemType linearApproxFunctionValueDecrease = ElemType(armijoConstant) * initialSearchDirectionDotGradient; // The number of iteration in the search. size_t numIterations = 0; // Armijo step size scaling factor for increase and decrease. - const double inc = 2.1; - const double dec = 0.5; - double width = 0; - double bestStepSize = 1.0; + const ElemType inc = ElemType(2.1); + const ElemType dec = ElemType(0.5); + ElemType width = 0; + ElemType bestStepSize = 1; ElemType bestObjective = std::numeric_limits::max(); while (true) @@ -270,7 +289,7 @@ bool L_BFGS::LineSearch(FunctionType& function, if (std::isnan(functionValue)) { - Warn << "L-BFGS objective value is NaN (terminating)!" << std::endl; + Warn << "L-BFGS: objective value is NaN (terminating)!" << std::endl; return false; } @@ -292,7 +311,7 @@ bool L_BFGS::LineSearch(FunctionType& function, else { // Check Wolfe's condition. 
- ElemType searchDirectionDotGradient = arma::dot(gradient, + ElemType searchDirectionDotGradient = dot(gradient, searchDirection); if (searchDirectionDotGradient < wolfe * @@ -346,8 +365,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type L_BFGS::Optimize(FunctionType& function, MatType& iterateIn, CallbackTypes&&... callbacks) @@ -376,8 +395,10 @@ L_BFGS::Optimize(FunctionType& function, const size_t cols = iterate.n_cols; BaseMatType newIterateTmp(rows, cols); - arma::Cube s(rows, cols, numBasis); - arma::Cube y(rows, cols, numBasis); + + typedef typename ForwardType::bcube BaseCubeType; + BaseCubeType s(rows, cols, numBasis); + BaseCubeType y(rows, cols, numBasis); // The old iterate to be saved. BaseMatType oldIterate(iterate.n_rows, iterate.n_cols); @@ -403,6 +424,7 @@ L_BFGS::Optimize(FunctionType& function, functionValue, gradient, callbacks...); ElemType prevFunctionValue; + Info << "L-BFGS: initial objective " << functionValue << "." << std::endl; // The main optimization loop. Callback::BeginOptimization(*this, f, iterate, callbacks...); @@ -417,9 +439,10 @@ L_BFGS::Optimize(FunctionType& function, // least one descent step. // TODO: to speed this up, investigate use of arma::norm2est() in Armadillo // 12.4 - if (arma::norm(gradient, 2) < minGradientNorm) + const ElemType gradNorm = norm(gradient, 2); + if (gradNorm < minGradientNorm) { - Info << "L-BFGS gradient norm too small (terminating successfully)." + Info << "L-BFGS: gradient norm too small (terminating successfully)." << std::endl; break; } @@ -427,24 +450,24 @@ L_BFGS::Optimize(FunctionType& function, // Break if the objective is not a number. if (std::isnan(functionValue)) { - Warn << "L-BFGS terminated with objective " << functionValue << "; " + Warn << "L-BFGS: terminated with objective " << functionValue << "; " << "are the objective and gradient functions implemented correctly?" 
<< std::endl; break; } // Choose the scaling factor. - double scalingFactor = ChooseScalingFactor(itNum, gradient, s, y); - if (scalingFactor == 0.0) + ElemType scalingFactor = ChooseScalingFactor(itNum, gradient, s, y); + if (scalingFactor == 0) { - Info << "L-BFGS scaling factor computed as 0 (terminating successfully)." + Info << "L-BFGS: scaling factor computed as 0 (terminating successfully)." << std::endl; break; } if (std::isfinite(scalingFactor) == false) { - Warn << "L-BFGS scaling factor is not finite. Stopping optimization." + Warn << "L-BFGS: scaling factor is not finite. Stopping optimization." << std::endl; break; } @@ -457,31 +480,34 @@ L_BFGS::Optimize(FunctionType& function, oldIterate = iterate; oldGradient = gradient; - double stepSize; // Set by LineSearch(). + ElemType stepSize; // Set by LineSearch(). if (!LineSearch(f, functionValue, iterate, gradient, newIterateTmp, searchDirection, stepSize, callbacks...)) { - Warn << "Line search failed. Stopping optimization." << std::endl; + Warn << "L-BFGS: line search failed. Stopping optimization." + << std::endl; break; // The line search failed; nothing else to try. } // It is possible that the difference between the two coordinates is zero. // In this case we terminate successfully. - if (stepSize == 0.0) + if (stepSize == 0) { - Info << "L-BFGS step size of 0 (terminating successfully)." + Info << "L-BFGS: computed step size of 0 (terminating successfully)." << std::endl; break; } + Info << "L-BFGS: iteration " << itNum << ", objective " << functionValue + << ", step size " << stepSize << "." << std::endl; + // If we can't make progress on the gradient, then we'll also accept // a stable function value. 
- const double denom = std::max( - std::max(std::abs(prevFunctionValue), std::abs(functionValue)), - (ElemType) 1.0); + const ElemType denom = std::max(ElemType(1), + std::max(std::abs(prevFunctionValue), std::abs(functionValue))); if ((prevFunctionValue - functionValue) / denom <= factr) { - Info << "L-BFGS function value stable (terminating successfully)." + Info << "L-BFGS: function value stable (terminating successfully)." << std::endl; break; } @@ -499,4 +525,3 @@ L_BFGS::Optimize(FunctionType& function, } // namespace ens #endif // ENSMALLEN_LBFGS_LBFGS_IMPL_HPP - diff --git a/inst/include/ensmallen_bits/lookahead/lookahead.hpp b/inst/include/ensmallen_bits/lookahead/lookahead.hpp index d7cd2cf..65da6c7 100644 --- a/inst/include/ensmallen_bits/lookahead/lookahead.hpp +++ b/inst/include/ensmallen_bits/lookahead/lookahead.hpp @@ -131,7 +131,7 @@ class Lookahead typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp b/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp index cdb42b6..0d776b1 100644 --- a/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp +++ b/inst/include/ensmallen_bits/lookahead/lookahead_impl.hpp @@ -66,14 +66,32 @@ inline Lookahead::~Lookahead() instDecayPolicy.Clean(); } +template +size_t GetBatchSize( + const BaseOptimizerType& baseOptimizer, + const typename std::enable_if_t::value>* = 0) +{ + return baseOptimizer.BatchSize(); +} + +template +size_t GetBatchSize( + const BaseOptimizerType& baseOptimizer, + const typename std::enable_if_t::value>* = 0) +{ + return 1; +} + //! Optimize the function (minimize). 
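The `GetBatchSize()` pair added to `lookahead_impl.hpp` above selects between two overloads with `enable_if` on a has-`BatchSize()` trait (the trait's template arguments were stripped in this copy of the diff). A self-contained sketch of the idiom, with a hypothetical `void_t`-based detector standing in for ensmallen's `traits::HasBatchSizeSignature`:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for ensmallen's HasBatchSizeSignature trait:
// detects a const BatchSize() member via the void_t detection idiom.
template<typename T, typename = void>
struct HasBatchSize : std::false_type { };

template<typename T>
struct HasBatchSize<T,
    std::void_t<decltype(std::declval<const T&>().BatchSize())>>
  : std::true_type { };

// Overload taken when the base optimizer exposes BatchSize().
template<typename BaseOptimizerType>
std::size_t GetBatchSize(
    const BaseOptimizerType& baseOptimizer,
    const std::enable_if_t<HasBatchSize<BaseOptimizerType>::value>* = 0)
{
  return baseOptimizer.BatchSize();
}

// Fallback for base optimizers without a BatchSize() method.
template<typename BaseOptimizerType>
std::size_t GetBatchSize(
    const BaseOptimizerType& /* baseOptimizer */,
    const std::enable_if_t<!HasBatchSize<BaseOptimizerType>::value>* = 0)
{
  return 1;
}

// Toy optimizers used only to exercise both overloads.
struct WithBatch { std::size_t BatchSize() const { return 32; } };
struct WithoutBatch { };
```

Exactly one overload survives substitution for any given optimizer type, which is why the hunk can delete the runtime `HasBatchSizeSignature` branch it replaces.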
template template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type Lookahead::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -111,8 +129,9 @@ Lookahead::Optimize( if (traits::HasResetPolicySignature::value && baseOptimizer.ResetPolicy()) { - Warn << "Parameters are reset before every Optimize call; set " - << "ResetPolicy() to false."; + Warn << "Lookahead: base optimizer parameters are reset before every " + << "Optimize() call; set ResetPolicy() of the base optimizer to false " + << "to fix this problem." << std::endl; baseOptimizer.ResetPolicy() = resetPolicy; } @@ -169,9 +188,12 @@ Lookahead::Optimize( return overallObjective; } - iterate += stepSize * (iterateModel - iterate); + iterate += ElemType(stepSize) * (iterateModel - iterate); terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); + Info << "Lookahead: iteration " << i << ", objective " << overallObjective + << "." << std::endl; + // Save the current objective. lastOverallObjective = overallObjective; } @@ -185,11 +207,9 @@ Lookahead::Optimize( // Find the number of functions to use. const size_t numFunctions = f.NumFunctions(); - size_t batchSize = 1; // Check if the optimizer implements the BatchSize() method and use the // parameter for the objective calculation. 
- if (traits::HasBatchSizeSignature::value) - batchSize = baseOptimizer.BatchSize(); + size_t batchSize = GetBatchSize(baseOptimizer); overallObjective = 0; for (size_t i = 0; i < numFunctions; i += batchSize) diff --git a/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp b/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp index cb4f87b..8da4197 100644 --- a/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp +++ b/inst/include/ensmallen_bits/moead/decomposition_policies/pbi_decomposition.hpp @@ -63,11 +63,12 @@ class PenaltyBoundaryIntersection { typedef typename VecType::elem_type ElemType; //! A unit vector in the same direction as the provided weight vector. - const VecType referenceDirection = weight / arma::norm(weight); + const VecType referenceDirection = weight / norm(weight); //! Distance of F(x) from the idealPoint along the reference direction. - const ElemType d1 = arma::dot(candidateFitness - idealPoint, referenceDirection); + const ElemType d1 = dot(candidateFitness - idealPoint, referenceDirection); //! The perpendicular distance of F(x) from reference direction. 
- const ElemType d2 = arma::norm(candidateFitness - (idealPoint + d1 * referenceDirection)); + const ElemType d2 = norm(candidateFitness - (idealPoint + d1 * + referenceDirection)); return d1 + static_cast(theta) * d2; } diff --git a/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp b/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp index 0507c03..e40678c 100644 --- a/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp +++ b/inst/include/ensmallen_bits/moead/decomposition_policies/tchebycheff_decomposition.hpp @@ -57,7 +57,7 @@ class Tchebycheff const VecType& idealPoint, const VecType& candidateFitness) { - return arma::max(weight % arma::abs(candidateFitness - idealPoint)); + return max(weight % abs(candidateFitness - idealPoint)); } }; diff --git a/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp b/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp index 8007395..a04a830 100644 --- a/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp +++ b/inst/include/ensmallen_bits/moead/decomposition_policies/weighted_decomposition.hpp @@ -53,7 +53,7 @@ class WeightedAverage const VecType& /* idealPoint */, const VecType& candidateFitness) { - return arma::dot(weight, candidateFitness); + return dot(weight, candidateFitness); } }; diff --git a/inst/include/ensmallen_bits/moead/moead.hpp b/inst/include/ensmallen_bits/moead/moead.hpp index f9c8e24..b271ca2 100644 --- a/inst/include/ensmallen_bits/moead/moead.hpp +++ b/inst/include/ensmallen_bits/moead/moead.hpp @@ -28,24 +28,25 @@ namespace ens { /** - * MOEA/D-DE (Multi Objective Evolutionary Algorithm based on Decompositon - - * Differential Variant) is a multiobjective optimization algorithm. This class - * implements the said optimizer. 
+ * MOEA/D-DE (Multi Objective Evolutionary Algorithm based on Decompositon - + * Differential Variant) is a multiobjective optimization algorithm. This class + * implements the said optimizer. * - * The algorithm works by generating a candidate population from a fixed starting point. - * Reference directions are generated to guide the optimization process towards the Pareto Front. - * Further, a decomposition function is defined to decompose the problem to a scalar optimization - * objective. Utilizing genetic operators, offsprings are generated with better decomposition values - * to replace the neighboring parent solutions. + * The algorithm works by generating a candidate population from a fixed starting point. + * Reference directions are generated to guide the optimization process towards the Pareto Front. + * Further, a decomposition function is defined to decompose the problem to a scalar optimization + * objective. Utilizing genetic operators, offsprings are generated with better decomposition values + * to replace the neighboring parent solutions. * * For more information, see the following: * @code * @article{li2008multiobjective, - * title={Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II}, - * author={Li, Hui and Zhang, Qingfu}, - * journal={IEEE transactions on evolutionary computation}, - * pages={284--302}, - * year={2008}, + * title = {Multiobjective optimization problems with complicated Pareto + * sets, MOEA/D and NSGA-II}, + * author = {Li, Hui and Zhang, Qingfu}, + * journal = {IEEE transactions on evolutionary computation}, + * pages = {284--302}, + * year = {2008}, * @endcode */ template - typename MatType::elem_type Optimize(std::tuple& objectives, - MatType& iterate, - CallbackTypes&&... callbacks); + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CallbackTypes&&... callbacks); + + /** + * Optimize a set of objectives. 
The initial population is generated + * using the initial point. The output is the best generated front. + * + * @tparam MatType The type of matrix used to store coordinates. + * @tparam CubeType The type of cube used to store the front and Pareto set. + * @tparam ArbitraryFunctionType The type of objective function. + * @tparam CallbackTypes Types of callback function. + * @param objectives std::tuple of the objective functions. + * @param iterate The initial reference point for generating population. + * @param front The generated front. + * @param paretoSet The generated Pareto set. + * @param callbacks The callback functions. + */ + template + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CubeType& front, + CubeType& paretoSet, + CallbackTypes&&... callbacks); //! Retrieve population size. size_t PopulationSize() const { return populationSize; } @@ -201,14 +228,6 @@ class MOEAD { //! Modify value of upperBound. arma::vec& UpperBound() { return upperBound; } - //! Retrieve the Pareto optimal points in variable space. This returns an empty cube - //! until `Optimize()` has been called. - const arma::cube& ParetoSet() const { return paretoSet; } - - //! Retrieve the best front (the Pareto frontier). This returns an empty cube until - //! `Optimize()` has been called. - const arma::cube& ParetoFront() const { return paretoFront; } - //! Get the weight initialization policy. const InitPolicyType& InitPolicy() const { return initPolicy; } //! Modify the weight initialization policy. @@ -227,8 +246,9 @@ class MOEAD { * @param neighborSize A matrix containing indices of the neighbors. * @return std::tuple The chosen pair of indices. */ + template std::tuple Mating(size_t subProblemIdx, - const arma::umat& neighborSize, + const UMatType& neighborSize, bool sampleNeighbor); /** @@ -253,27 +273,28 @@ class MOEAD { * * @tparam ArbitraryFunctionType std::tuple of multiple function types. * @tparam MatType Type of matrix to optimize. 
+ * @tparam ColType Type of column vector to store objectives. * @param population The elite population. * @param objectives The set of objectives. * @param calculatedObjectives Vector to store calculated objectives. */ template typename std::enable_if::type - EvaluateObjectives( - std::vector&, + EvaluateObjectives(std::vector&, std::tuple&, - std::vector >&); + std::vector&); template typename std::enable_if::type - EvaluateObjectives( - std::vector& population, + EvaluateObjectives(std::vector& population, std::tuple& objectives, - std::vector >& + std::vector& calculatedObjectives); //! Size of the population. diff --git a/inst/include/ensmallen_bits/moead/moead_impl.hpp b/inst/include/ensmallen_bits/moead/moead_impl.hpp index 6aab814..dab45ea 100644 --- a/inst/include/ensmallen_bits/moead/moead_impl.hpp +++ b/inst/include/ensmallen_bits/moead/moead_impl.hpp @@ -78,27 +78,52 @@ MOEAD(const size_t populationSize, decompPolicy(decompPolicy) { /* Nothing to do here. */ } + //! Optimize the function. +template +template +typename MatType::elem_type MOEAD:: +Optimize(std::tuple& objectives, + MatType& iterateIn, + CallbackTypes&&... callbacks) +{ + typedef typename ForwardType::bcube CubeType; + CubeType paretoFront, paretoSet; + return Optimize(objectives, iterateIn, paretoFront, paretoSet, + std::forward(callbacks)...); +} + //! Optimize the function. template template typename MatType::elem_type MOEAD:: Optimize(std::tuple& objectives, MatType& iterateIn, + CubeType& paretoFrontIn, + CubeType& paretoSetIn, CallbackTypes&&... callbacks) { // Population Size must be at least 3 for MOEA/D-DE to work. if (populationSize < 3) { - throw std::logic_error("MOEA/D-DE::Optimize(): population size should be at least" - " 3!"); + throw std::logic_error("MOEA/D-DE::Optimize(): population size should be " + "at least 3!"); } // Convenience typedefs. 
typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::umat UMatType; + typedef typename ForwardType::brow BaseRowType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::bmat CubeBaseMatType; + BaseMatType& iterate = (BaseMatType&) iterateIn; // Make sure that we have the methods that we need. Long name... @@ -137,69 +162,79 @@ Optimize(std::tuple& objectives, assert(upperBound.n_rows == iterate.n_rows && "The dimensions of " "upperBound are not the same as the dimensions of iterate."); + //! Useful temporaries for float-like comparisons. + const BaseMatType castedLowerBound = conv_to::from(lowerBound); + const BaseMatType castedUpperBound = conv_to::from(upperBound); + const size_t numObjectives = sizeof...(ArbitraryFunctionType); const size_t numVariables = iterate.n_rows; - //! Useful temporaries for float-like comparisons. - const BaseMatType castedLowerBound = arma::conv_to::from(lowerBound); - const BaseMatType castedUpperBound = arma::conv_to::from(upperBound); - // Controls early termination of the optimization process. bool terminate = false; - // The weight matrix. Each vector represents a decomposition subproblem (M X N). + // The weight matrix. Each vector represents a decomposition + // subproblem (M X N). const BaseMatType weights = initPolicy.template Generate( numObjectives, populationSize, epsilon); // 1.1 Storing the indices of nearest neighbors of each weight vector. - arma::umat neighborIndices(neighborSize, populationSize); + UMatType neighborIndices(neighborSize, populationSize); for (size_t i = 0; i < populationSize; ++i) { // Cache the distance between weights[i] and other weights. 
- const arma::Row distances = - arma::sqrt(arma::sum(arma::square(weights.col(i) - weights.each_col()))); - arma::uvec sortedIndices = arma::stable_sort_index(distances); + const BaseRowType distances = + conv_to::from( + sqrt(sum(square(weights.col(i) - weights.each_col())))); + UVecType sortedIndices = stable_sort_index(distances); // Ignore distance from self. - neighborIndices.col(i) = sortedIndices(arma::span(1, neighborSize)); + neighborIndices.col(i) = sortedIndices( + typename GetProxyType::span(1, neighborSize), 0); } // 1.2 Random generation of the initial population. std::vector population(populationSize); for (BaseMatType& individual : population) { - individual = arma::randu( - iterate.n_rows, iterate.n_cols) - 0.5 + iterate; + individual = randu( + iterate.n_rows, iterate.n_cols) - ElemType(0.5) + iterate; // Constrain all genes to be within bounds. - individual = arma::min(arma::max(individual, castedLowerBound), castedUpperBound); + individual = min(max(individual, castedLowerBound), castedUpperBound); } - Info << "MOEA/D-DE initialized successfully. Optimization started." << std::endl; + Info << "MOEA/D-DE initialized successfully. Optimization started." + << std::endl; - std::vector> populationFitness(populationSize); - std::fill(populationFitness.begin(), populationFitness.end(), - arma::Col(numObjectives, arma::fill::zeros)); + std::vector populationFitness(populationSize); + for (size_t i = 0; i < populationSize; ++i) + { + populationFitness[i].set_size(numObjectives); + populationFitness[i].zeros(); + } EvaluateObjectives(population, objectives, populationFitness); // 1.3 Initialize the ideal point z. 
- arma::Col idealPoint(numObjectives); + BaseColType idealPoint(numObjectives); idealPoint.fill(std::numeric_limits::max()); - for (const arma::Col& individualFitness : populationFitness) - idealPoint = arma::min(idealPoint, individualFitness); + for (const BaseColType& individualFitness : populationFitness) + idealPoint = min(idealPoint, individualFitness); Callback::BeginOptimization(*this, objectives, iterate, callbacks...); // 2 The main loop. - for (size_t generation = 1; generation <= maxGenerations && !terminate; ++generation) + for (size_t generation = 1; + generation <= maxGenerations && !terminate; ++generation) { // Shuffle indexes of subproblems. - const arma::uvec shuffle = arma::shuffle( - arma::linspace(0, populationSize - 1, populationSize)); - for (size_t subProblemIdx : shuffle) + const UVecType shuffleTemp = shuffle( + linspace(0, populationSize - 1, populationSize)); + + for (size_t i = 0; i < shuffleTemp.n_elem; ++i) { - // 2.1 Randomly select two indices in neighborIndices[subProblemIdx] and use them - // to make a child. + const size_t subProblemIdx = shuffleTemp(i); + // 2.1 Randomly select two indices in neighborIndices[subProblemIdx] + // and use them to make a child. size_t r1, r2, r3; r1 = subProblemIdx; // Randomly choose to sample from the population or the neighbors. @@ -216,19 +251,21 @@ Optimize(std::tuple& objectives, if (arma::randu() < crossoverProb) { candidate(geneIdx) = population[r1](geneIdx) + - differentialWeight * (population[r2](geneIdx) - - population[r3](geneIdx)); + ElemType(differentialWeight) * (population[r2](geneIdx) - + population[r3](geneIdx)); // Boundary conditions. 
if (candidate(geneIdx) < castedLowerBound(geneIdx)) { candidate(geneIdx) = castedLowerBound(geneIdx) + - arma::randu() * (population[r1](geneIdx) - castedLowerBound(geneIdx)); + arma::randu() * + (population[r1](geneIdx) - castedLowerBound(geneIdx)); } if (candidate(geneIdx) > castedUpperBound(geneIdx)) { candidate(geneIdx) = castedUpperBound(geneIdx) - - arma::randu() * (castedUpperBound(geneIdx) - population[r1](geneIdx)); + arma::randu() * + (castedUpperBound(geneIdx) - population[r1](geneIdx)); } } else @@ -238,10 +275,10 @@ Optimize(std::tuple& objectives, Mutate(candidate, 1.0 / static_cast(numVariables), castedLowerBound, castedUpperBound); - arma::Col candidateFitness(numObjectives); + BaseColType candidateFitness(numObjectives); //! Creating temp vectors to pass to EvaluateObjectives. std::vector candidateContainer { candidate }; - std::vector> fitnessContainer { candidateFitness }; + std::vector fitnessContainer { candidateFitness }; EvaluateObjectives(candidateContainer, objectives, fitnessContainer); candidateFitness = std::move(fitnessContainer[0]); //! Flush out the dummy containers. @@ -249,17 +286,18 @@ Optimize(std::tuple& objectives, candidateContainer.clear(); // 2.4 Update of ideal point. - idealPoint = arma::min(idealPoint, candidateFitness); + idealPoint = min(idealPoint, candidateFitness); // 2.5 Update of the population. size_t replaceCounter = 0; const size_t sampleSize = sampleNeighbor ? neighborSize : populationSize; - const arma::uvec idxShuffle = arma::shuffle( - arma::linspace(0, sampleSize - 1, sampleSize)); + const arma::uvec idxShuffle = shuffle( + linspace(0, sampleSize - 1, sampleSize)); - for (size_t idx : idxShuffle) + for (size_t i = 0; i < idxShuffle.n_elem; ++i) { + const size_t idx = idxShuffle(i); // Preserve diversity by controlling replacement of neighbors // by child solution. 
if (replaceCounter >= maxReplace) @@ -269,9 +307,11 @@ Optimize(std::tuple& objectives, neighborIndices(idx, subProblemIdx) : idx; const ElemType candidateDecomposition = decompPolicy.template - Apply>(weights.col(pick), idealPoint, candidateFitness); - const ElemType parentDecomposition = decompPolicy.template - Apply>(weights.col(pick), idealPoint, populationFitness[pick]); + Apply(conv_to::from(weights.col(pick)), + idealPoint, candidateFitness); + const ElemType parentDecomposition = decompPolicy.template + Apply(conv_to::from(weights.col(pick)), + idealPoint, populationFitness[pick]); if (candidateDecomposition < parentDecomposition) { @@ -291,24 +331,24 @@ Optimize(std::tuple& objectives, } // End of pass over all the generations. // Set the candidates from the Pareto Set as the output. - paretoSet.set_size(population[0].n_rows, population[0].n_cols, population.size()); + paretoSetIn.set_size( + population[0].n_rows, population[0].n_cols, population.size()); - // The Pareto Front is stored, can be obtained via ParetoSet() getter. for (size_t solutionIdx = 0; solutionIdx < population.size(); ++solutionIdx) { - paretoSet.slice(solutionIdx) = - arma::conv_to::from(population[solutionIdx]); + paretoSetIn.slice(solutionIdx) = + conv_to::from(population[solutionIdx]); } // Set the candidates from the Pareto Front as the output. - paretoFront.set_size(populationFitness[0].n_rows, populationFitness[0].n_cols, - populationFitness.size()); + paretoFrontIn.set_size(populationFitness[0].n_rows, + populationFitness[0].n_cols, populationFitness.size()); - // The Pareto Front is stored, can be obtained via ParetoFront() getter. 
- for (size_t solutionIdx = 0; solutionIdx < populationFitness.size(); ++solutionIdx) + for (size_t solutionIdx = 0; + solutionIdx < populationFitness.size(); ++solutionIdx) { - paretoFront.slice(solutionIdx) = - arma::conv_to::from(populationFitness[solutionIdx]); + paretoFrontIn.slice(solutionIdx) = + conv_to::from(populationFitness[solutionIdx]); } // Assign iterate to first element of the Pareto Set. @@ -320,19 +360,20 @@ Optimize(std::tuple& objectives, for (size_t geneIdx = 0; geneIdx < numObjectives; ++geneIdx) { - if (arma::accu(populationFitness[geneIdx]) < performance) - performance = arma::accu(populationFitness[geneIdx]); + if (accu(populationFitness[geneIdx]) < performance) + performance = accu(populationFitness[geneIdx]); } return performance; } //! Randomly chooses to select from parents or neighbors. -template +template +template inline std::tuple MOEAD:: Mating(size_t subProblemIdx, - const arma::umat& neighborIndices, + const UMatType& neighborIndices, bool sampleNeighbor) { //! Indexes of two points from the sample space. @@ -368,50 +409,55 @@ Mutate(MatType& candidate, const MatType& lowerBound, const MatType& upperBound) { - const size_t numVariables = candidate.n_rows; - for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) - { - // Should this gene be mutated? - if (arma::randu() > mutationRate) - continue; - - const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); - // Normalised distance from the bounds. - const double lowerDelta = (candidate(geneIdx) - lowerBound(geneIdx)) / geneRange; - const double upperDelta = (upperBound(geneIdx) - candidate(geneIdx)) / geneRange; - const double mutationPower = 1. 
/ (distributionIndex + 1.0); - const double rand = arma::randu(); - double value, perturbationFactor; - if (rand < 0.5) - { - value = 2.0 * rand + (1.0 - 2.0 * rand) * - std::pow(upperDelta, distributionIndex + 1.0); - perturbationFactor = std::pow(value, mutationPower) - 1.0; - } - else - { - value = 2.0 * (1.0 - rand) + 2.0 *(rand - 0.5) * - std::pow(lowerDelta, distributionIndex + 1.0); - perturbationFactor = 1.0 - std::pow(value, mutationPower); - } + typedef typename MatType::elem_type ElemType; - candidate(geneIdx) += perturbationFactor * geneRange; + const size_t numVariables = candidate.n_rows; + for (size_t geneIdx = 0; geneIdx < numVariables; ++geneIdx) + { + // Should this gene be mutated? + if (arma::randu() > mutationRate) + continue; + + const double geneRange = upperBound(geneIdx) - lowerBound(geneIdx); + // Normalised distance from the bounds. + const double lowerDelta = (candidate(geneIdx) - lowerBound(geneIdx)) / + geneRange; + const double upperDelta = (upperBound(geneIdx) - candidate(geneIdx)) / + geneRange; + const double mutationPower = 1. / (distributionIndex + 1.0); + const double rand = arma::randu(); + double value, perturbationFactor; + if (rand < 0.5) + { + value = 2.0 * rand + (1.0 - 2.0 * rand) * + std::pow(upperDelta, distributionIndex + 1.0); + perturbationFactor = std::pow(value, mutationPower) - 1.0; } - //! Enforce bounds. - candidate = arma::min(arma::max(candidate, lowerBound), upperBound); + else + { + value = 2.0 * (1.0 - rand) + 2.0 * (rand - 0.5) * + std::pow(lowerDelta, distributionIndex + 1.0); + perturbationFactor = 1.0 - std::pow(value, mutationPower); + } + + candidate(geneIdx) += ElemType(perturbationFactor * geneRange); + } + //! Enforce bounds. + candidate = min(max(candidate, lowerBound), upperBound); } //! No objectives to evaluate. 
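The `Mutate()` hunk above implements polynomial mutation, where `distributionIndex` (often written eta) controls how tightly perturbations concentrate around the current gene value. A minimal scalar sketch of the same math, with the uniform draw passed in explicitly so the behavior is deterministic (function name and parameters are illustrative, not ensmallen API):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Polynomial mutation of one gene.  `u` is a uniform draw in [0, 1);
// `eta` is the distribution index (larger -> smaller perturbations).
double PolynomialMutate(double gene, double lower, double upper,
                        double eta, double u)
{
  const double range = upper - lower;
  // Normalised distances from the bounds, as in the optimizer code.
  const double lowerDelta = (gene - lower) / range;
  const double upperDelta = (upper - gene) / range;
  const double power = 1.0 / (eta + 1.0);

  double factor;
  if (u < 0.5)
  {
    const double value = 2.0 * u +
        (1.0 - 2.0 * u) * std::pow(upperDelta, eta + 1.0);
    factor = std::pow(value, power) - 1.0;  // Negative: move toward lower.
  }
  else
  {
    const double value = 2.0 * (1.0 - u) +
        2.0 * (u - 0.5) * std::pow(lowerDelta, eta + 1.0);
    factor = 1.0 - std::pow(value, power);  // Positive: move toward upper.
  }

  // Perturb and clamp to the bounds, mirroring the final min/max above.
  return std::clamp(gene + factor * range, lower, upper);
}
```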
-template +template template typename std::enable_if::type -MOEAD:: -EvaluateObjectives( - std::vector&, +MOEAD::EvaluateObjectives( + std::vector&, std::tuple&, - std::vector >&) + std::vector&) { // Nothing to do here. } @@ -419,20 +465,21 @@ EvaluateObjectives( //! Evaluate the objectives for the entire population. template template typename std::enable_if::type MOEAD:: EvaluateObjectives( - std::vector& population, + std::vector& population, std::tuple& objectives, - std::vector >& calculatedObjectives) + std::vector& calculatedObjectives) { for (size_t i = 0; i < population.size(); i++) { calculatedObjectives[i](I) = std::get(objectives).Evaluate(population[i]); - EvaluateObjectives(population, objectives, - calculatedObjectives); + EvaluateObjectives(population, objectives, calculatedObjectives); } } diff --git a/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp b/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp index b877dd4..0fc6294 100644 --- a/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp +++ b/inst/include/ensmallen_bits/moead/weight_init_policies/bbs_init.hpp @@ -53,17 +53,17 @@ class BayesianBootstrap const size_t numPoints, const double epsilon) { - typedef typename MatType::elem_type ElemType; - typedef typename arma::Col VecType; + typedef typename ForwardType::bvec VecType; MatType weights(numObjectives, numPoints); for (size_t pointIdx = 0; pointIdx < numPoints; ++pointIdx) { - VecType referenceDirection(numObjectives + 1, arma::fill::randu); + VecType referenceDirection(numObjectives + 1, + GetFillType::randu); referenceDirection(0) = 0; referenceDirection(numObjectives) = 1; - referenceDirection = arma::sort(referenceDirection); - referenceDirection = arma::diff(referenceDirection); + referenceDirection = sort(referenceDirection); + referenceDirection = diff(referenceDirection); weights.col(pointIdx) = std::move(referenceDirection) + epsilon; } diff --git 
a/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp b/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp index 695b417..2788eb6 100644 --- a/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp +++ b/inst/include/ensmallen_bits/moead/weight_init_policies/dirichlet_init.hpp @@ -15,8 +15,8 @@ namespace ens { /** - * The Dirichlet method for initializing weights. Sampling a - * Dirichlet distribution with parameters set to one returns + * The Dirichlet method for initializing weights. Sampling a + * Dirichlet distribution with parameters set to one returns * point lying on unit simplex with uniform distribution. */ class Dirichlet @@ -43,10 +43,15 @@ class Dirichlet const size_t numPoints, const double epsilon) { - MatType weights = arma::randg(numObjectives, numPoints, - arma::distr_param(1.0, 1.0)) + epsilon; + // TODO: Replace with randg once Bandicoot supports it. Simulate randg using + // inverse transform sampling. + // arma::mat weights = arma::randg(numObjectives, numPoints, + // arma::distr_param(1.0, 1.0)) + epsilon; + MatType weights = -log(1.0 - randu( + numObjectives, numPoints)) + epsilon; + // Normalize each column. - return arma::normalise(weights, 1, 0); + return normalise(weights, 1, 0); } }; diff --git a/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp b/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp index 002033a..db63159 100644 --- a/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp +++ b/inst/include/ensmallen_bits/moead/weight_init_policies/uniform_init.hpp @@ -59,7 +59,8 @@ class Uniform //! The requested number of points is not matching any partition number. 
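The dirichlet_init.hpp hunk above replaces `arma::randg` with inverse transform sampling: Gamma(1, 1) is Exponential(1), whose inverse CDF gives `-log(1 - u)` for `u ~ U(0, 1)`, and normalizing such draws yields a Dirichlet(1, ..., 1) sample. A standalone sketch of the same idea using the standard library (helper name is illustrative):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Draw one Dirichlet(1, ..., 1) weight vector by normalizing Gamma(1, 1)
// draws, each simulated via inverse transform sampling as -log(1 - u).
std::vector<double> DirichletWeights(std::size_t numObjectives,
                                     std::mt19937& rng)
{
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<double> w(numObjectives);
  double sum = 0.0;
  for (double& wi : w)
  {
    wi = -std::log(1.0 - unif(rng));
    sum += wi;
  }
  for (double& wi : w)  // Normalize so the weights sum to one.
    wi /= sum;
  return w;
}
```

The normalization step corresponds to the `normalise(weights, 1, 0)` call in the hunk, which L1-normalizes each column.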
if (numPoints != validNumPoints) { - size_t nextValidNumPoints = FindNumUniformPoints(numObjectives, numPartitions + 1); + size_t nextValidNumPoints = FindNumUniformPoints( + numObjectives, numPartitions + 1); std::ostringstream oss; oss << "DasDennis::Generate(): " << "The requested numPoints " << numPoints << " cannot be generated uniformly.\n " << "Either choose numPoints as " @@ -128,8 +129,7 @@ class Uniform /** * A helper function for DasDennis */ - template + template void DasDennisHelper(AuxInfoStackType& progressStack, MatType& weights, const size_t numObjectives, @@ -138,10 +138,10 @@ class Uniform const double epsilon) { typedef typename MatType::elem_type ElemType; - typedef typename arma::Row RowType; + typedef typename ForwardType::brow RowType; size_t counter = 0; - const ElemType delta = 1.0 / (ElemType)numPartitions; + const ElemType delta = 1 / (ElemType) numPartitions; while ((counter < numPoints) && !progressStack.empty()) { @@ -154,7 +154,7 @@ class Uniform { point.insert_rows(point.n_rows, RowType(1).fill( delta * static_cast(beta))); - weights.col(counter) = point + epsilon; + weights.col(counter) = point + ElemType(epsilon); ++counter; } @@ -189,7 +189,7 @@ class Uniform //! Init the progress stack. progressStack.push_back({{}, numPartitions}); MatType weights(numObjectives, numPoints); - weights.fill(arma::datum::nan); + weights.fill(arma::Datum::nan); DasDennisHelper( progressStack, weights, diff --git a/inst/include/ensmallen_bits/nsga2/nsga2.hpp b/inst/include/ensmallen_bits/nsga2/nsga2.hpp index 06035dd..4af8612 100644 --- a/inst/include/ensmallen_bits/nsga2/nsga2.hpp +++ b/inst/include/ensmallen_bits/nsga2/nsga2.hpp @@ -40,10 +40,10 @@ namespace ens { * * @code * @article{10.1109/4235.996017, - * author = {Deb, K. and Pratap, A. and Agarwal, S. and Meyarivan, T.}, - * title = {A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II}, - * year = {2002}, - * url = {https://doi.org/10.1109/4235.996017}, + * author = {Deb, K. 
and Pratap, A. and Agarwal, S. and Meyarivan, T.}, + * title = {A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II}, + * year = {2002}, + * url = {https://doi.org/10.1109/4235.996017}, * journal = {Trans. Evol. Comp}} * @endcode * @@ -125,10 +125,37 @@ class NSGA2 template - typename MatType::elem_type Optimize( - std::tuple& objectives, - MatType& iterate, - CallbackTypes&&... callbacks); + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CallbackTypes&&... callbacks); + + /** + * Optimize a set of objectives. The initial population is generated using the + * starting point. The output is the best generated front. + * + * @tparam ArbitraryFunctionType std::tuple of multiple objectives. + * @tparam MatType Type of matrix to optimize. + * @tparam CubeType The type of cube used to store the front and Pareto set. + * @tparam CallbackTypes Types of callback functions. + * @param objectives Vector of objective functions to optimize for. + * @param iterate Starting point. + * @param front The generated front. + * @param paretoSet The generated Pareto set. + * @param callbacks Callback functions. + * @return MatType::elem_type The minimum of the accumulated sum over the + * objective values in the best front. + */ + template + typename MatType::elem_type Optimize( + std::tuple& objectives, + MatType& iterate, + CubeType& front, + CubeType& paretoSet, + CallbackTypes&&... callbacks); //! Get the population size. size_t PopulationSize() const { return populationSize; } @@ -170,60 +197,34 @@ class NSGA2 //! Modify value of upperBound. arma::vec& UpperBound() { return upperBound; } - //! Retrieve the Pareto optimal points in variable space. This returns an empty cube - //! until `Optimize()` has been called. - const arma::cube& ParetoSet() const { return paretoSet; } - - //! Retrieve the best front (the Pareto frontier). This returns an empty cube until - //! `Optimize()` has been called. 
- const arma::cube& ParetoFront() const { return paretoFront; } - - /** - * Retrieve the best front (the Pareto frontier). This returns an empty - * vector until `Optimize()` has been called. Note that this function is - * deprecated and will be removed in ensmallen 3.x! Use `ParetoFront()` - * instead. - */ - [[deprecated("use ParetoFront() instead")]] const std::vector& Front() - { - if (rcFront.size() == 0) - { - // Match the old return format. - for (size_t i = 0; i < paretoFront.n_slices; ++i) - { - rcFront.push_back(arma::mat(paretoFront.slice(i))); - } - } - - return rcFront; - } - private: /** * Evaluate objectives for the elite population. * * @tparam ArbitraryFunctionType std::tuple of multiple function types. - * @tparam MatType Type of matrix to optimize. + * @tparam InputMatType Type of matrix to optimize. * @param population The elite population. * @param objectives The set of objectives. - * @param calculatedObjectives Vector to store calculated objectives. + * @param calculatedObjectives Matrix to store calculated objectives (numObjectives x 1 x populationSize). */ template typename std::enable_if::type EvaluateObjectives(std::vector&, std::tuple&, - std::vector >&); + ObjectiveMatType&); template typename std::enable_if::type - EvaluateObjectives(std::vector& population, - std::tuple& objectives, - std::vector >& - calculatedObjectives); + EvaluateObjectives( + std::vector& population, + std::tuple& objectives, + ObjectiveMatType& calculatedObjectives); /** * Reproduce candidates from the elite population to generate a new @@ -235,10 +236,11 @@ class NSGA2 * @param lowerBound Lower bound of the coordinates of the initial population. * @param upperBound Upper bound of the coordinates of the initial population. 
*/ - template - void BinaryTournamentSelection(std::vector& population, - const MatType& lowerBound, - const MatType& upperBound); + template + void BinaryTournamentSelection( + std::vector& population, + const InputMatType& lowerBound, + const InputMatType& upperBound); /** * Crossover two parents to create a pair of new children. @@ -249,11 +251,12 @@ class NSGA2 * @param parentA First parent from elite population. * @param parentB Second parent from elite population. */ - template - void Crossover(MatType& childA, - MatType& childB, - const MatType& parentA, - const MatType& parentB); + template + void Crossover( + InputMatType& childA, + InputMatType& childB, + const InputMatType& parentA, + const InputMatType& parentB); /** * Mutate the coordinates for a candidate. @@ -264,10 +267,11 @@ class NSGA2 * @param lowerBound Lower bound of the coordinates of the initial population. * @param upperBound Upper bound of the coordinates of the initial population. */ - template - void Mutate(MatType& child, - const MatType& lowerBound, - const MatType& upperBound); + template + void Mutate( + InputMatType& child, + const InputMatType& lowerBound, + const InputMatType& upperBound); /** * Sort the candidate population using their domination count and the set of @@ -283,7 +287,7 @@ class NSGA2 void FastNonDominatedSort( std::vector >& fronts, std::vector& ranks, - std::vector >& calculatedObjectives); + MatType& calculatedObjectives); /** * Operator to check if one candidate Pareto-dominates the other. @@ -300,7 +304,7 @@ class NSGA2 */ template bool Dominates( - std::vector >& calculatedObjectives, + MatType& calculatedObjectives, size_t candidateP, size_t candidateQ); @@ -315,7 +319,7 @@ class NSGA2 template void CrowdingDistanceAssignment( const std::vector& front, - std::vector>& calculatedObjectives, + MatType& calculatedObjectives, std::vector& crowdingDistance); /** @@ -334,16 +338,17 @@ class NSGA2 * the population. 
* @return true if the first candidate is preferred, otherwise, false. */ - template - bool CrowdingOperator(size_t idxP, - size_t idxQ, - const std::vector& ranks, - const std::vector& crowdingDistance); + template + bool CrowdingOperator( + size_t idxP, + size_t idxQ, + const std::vector& ranks, + const std::vector& crowdingDistance); //! The number of objectives being optimised for. size_t numObjectives; - //! The numbeer of variables used per objectives. + //! The number of variables used per objectives. size_t numVariables; //! The number of candidates in the population. @@ -369,19 +374,6 @@ class NSGA2 //! Upper bound of the initial swarm. arma::vec upperBound; - - //! The set of all the Pareto optimal points. - //! Stored after Optimize() is called. - arma::cube paretoSet; - - //! The set of all the Pareto optimal objective vectors. - //! Stored after Optimize() is called. - arma::cube paretoFront; - - //! A different representation of the Pareto front, for reverse compatibility - //! purposes. This can be removed when ensmallen 3.x is released! (Along - //! with `Front()`.) This is only populated when `Front()` is called. - std::vector rcFront; }; } // namespace ens diff --git a/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp b/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp index 00c63b0..e75e878 100644 --- a/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp +++ b/inst/include/ensmallen_bits/nsga2/nsga2_impl.hpp @@ -68,6 +68,24 @@ typename MatType::elem_type NSGA2::Optimize( std::tuple& objectives, MatType& iterateIn, CallbackTypes&&... callbacks) +{ + typedef typename ForwardType::bcube CubeType; + CubeType paretoFront, paretoSet; + return Optimize(objectives, iterateIn, paretoFront, paretoSet, + std::forward(callbacks)...); +} + +//! Optimize the function. +template +typename MatType::elem_type NSGA2::Optimize( + std::tuple& objectives, + MatType& iterateIn, + CubeType& paretoFrontIn, + CubeType& paretoSetIn, + CallbackTypes&&... 
callbacks) { // Make sure for evolution to work at least four candidates are present. if (populationSize < 4 && populationSize % 4 != 0) @@ -79,6 +97,7 @@ typename MatType::elem_type NSGA2::Optimize( // Convenience typedefs. typedef typename MatType::elem_type ElemType; typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::bmat CubeBaseMatType; BaseMatType& iterate = (BaseMatType&) iterateIn; @@ -104,8 +123,9 @@ typename MatType::elem_type NSGA2::Optimize( numObjectives = sizeof...(ArbitraryFunctionType); numVariables = iterate.n_rows; - // Cache calculated objectives. - std::vector > calculatedObjectives(populationSize); + // Cache calculated objectives as a matrix: (numObjectives x populationSize). + arma::Mat calculatedObjectives(numObjectives, populationSize, + arma::fill::zeros); // Population size reserved to 2 * populationSize + 1 to accommodate // for the size of intermediate candidate population. @@ -121,8 +141,10 @@ typename MatType::elem_type NSGA2::Optimize( std::vector ranks; //! Useful temporaries for float-like comparisons. - const BaseMatType castedLowerBound = arma::conv_to::from(lowerBound); - const BaseMatType castedUpperBound = arma::conv_to::from(upperBound); + const BaseMatType castedLowerBound = conv_to::from( + lowerBound); + const BaseMatType castedUpperBound = conv_to::from( + upperBound); // Controls early termination of the optimization process. bool terminate = false; @@ -131,11 +153,11 @@ typename MatType::elem_type NSGA2::Optimize( // starting point. for (size_t i = 0; i < populationSize; i++) { - population.push_back(arma::randu(iterate.n_rows, - iterate.n_cols) - 0.5 + iterate); + population.push_back(randu(iterate.n_rows, + iterate.n_cols) - ElemType(0.5) + iterate); // Constrain all genes to be within bounds. 
- population[i] = arma::min(arma::max(population[i], castedLowerBound), castedUpperBound); + population[i] = min(max(population[i], castedLowerBound), castedUpperBound); } Info << "NSGA2 initialized successfully. Optimization started." << std::endl; @@ -143,7 +165,8 @@ typename MatType::elem_type NSGA2::Optimize( // Iterate until maximum number of generations is obtained. Callback::BeginOptimization(*this, objectives, iterate, callbacks...); - for (size_t generation = 1; generation <= maxGenerations && !terminate; generation++) + for (size_t generation = 1; generation <= maxGenerations && !terminate; + generation++) { Info << "NSGA2: iteration " << generation << "." << std::endl; @@ -152,22 +175,20 @@ typename MatType::elem_type NSGA2::Optimize( BinaryTournamentSelection(population, castedLowerBound, castedUpperBound); // Evaluate the objectives for the new population. - calculatedObjectives.resize(population.size()); - std::fill(calculatedObjectives.begin(), calculatedObjectives.end(), - arma::Col(numObjectives, arma::fill::zeros)); + calculatedObjectives.zeros(numObjectives, population.size()); EvaluateObjectives(population, objectives, calculatedObjectives); // Perform fast non dominated sort on P_t ∪ G_t. ranks.resize(population.size()); - FastNonDominatedSort(fronts, ranks, calculatedObjectives); + FastNonDominatedSort(fronts, ranks, calculatedObjectives); // Perform crowding distance assignment. crowdingDistance.resize(population.size()); std::fill(crowdingDistance.begin(), crowdingDistance.end(), 0.); for (size_t fNum = 0; fNum < fronts.size(); fNum++) { - CrowdingDistanceAssignment( - fronts[fNum], calculatedObjectives, crowdingDistance); + CrowdingDistanceAssignment(fronts[fNum], calculatedObjectives, + crowdingDistance); } // Sort based on crowding distance. 
@@ -178,14 +199,17 @@ typename MatType::elem_type NSGA2::Optimize( size_t idxP{}, idxQ{}; for (size_t i = 0; i < population.size(); i++) { - if (arma::approx_equal(population[i], candidateP, "absdiff", epsilon)) + if (approx_equal(population[i], candidateP, "absdiff", + ElemType(epsilon))) idxP = i; - if (arma::approx_equal(population[i], candidateQ, "absdiff", epsilon)) + if (approx_equal(population[i], candidateQ, "absdiff", + ElemType(epsilon))) idxQ = i; } - return CrowdingOperator(idxP, idxQ, ranks, crowdingDistance); + return CrowdingOperator(idxP, idxQ, ranks, + crowdingDistance); } ); @@ -198,28 +222,23 @@ typename MatType::elem_type NSGA2::Optimize( } // Set the candidates from the Pareto Set as the output. - paretoSet.set_size(population[0].n_rows, population[0].n_cols, fronts[0].size()); + paretoSetIn.set_size(population[0].n_rows, population[0].n_cols, + fronts[0].size()); // The Pareto Set is stored, can be obtained via ParetoSet() getter. for (size_t solutionIdx = 0; solutionIdx < fronts[0].size(); ++solutionIdx) { - paretoSet.slice(solutionIdx) = - arma::conv_to::from(population[fronts[0][solutionIdx]]); + paretoSetIn.slice(solutionIdx) = conv_to::from( + population[fronts[0][solutionIdx]]); } // Set the candidates from the Pareto Front as the output. - paretoFront.set_size(calculatedObjectives[0].n_rows, calculatedObjectives[0].n_cols, - fronts[0].size()); - // The Pareto Front is stored, can be obtained via ParetoFront() getter. + paretoFrontIn.set_size(calculatedObjectives.n_rows, 1, fronts[0].size()); for (size_t solutionIdx = 0; solutionIdx < fronts[0].size(); ++solutionIdx) { - paretoFront.slice(solutionIdx) = - arma::conv_to::from(calculatedObjectives[fronts[0][solutionIdx]]); + paretoFrontIn.slice(solutionIdx) = conv_to::from( + calculatedObjectives.col(fronts[0][solutionIdx])); } - // Clear rcFront, in case it is later requested by the user for reverse - // compatibility reasons. 
- rcFront.clear(); - // Assign iterate to first element of the Pareto Set. iterate = population[fronts[0][0]]; @@ -227,9 +246,8 @@ typename MatType::elem_type NSGA2::Optimize( ElemType performance = std::numeric_limits::max(); - for (const arma::Col& objective: calculatedObjectives) - if (arma::accu(objective) < performance) - performance = arma::accu(objective); + for (size_t i = 0; i < calculatedObjectives.n_cols; ++i) + performance = std::min(performance, arma::accu(calculatedObjectives.col(i))); return performance; } @@ -237,12 +255,13 @@ typename MatType::elem_type NSGA2::Optimize( //! No objectives to evaluate. template typename std::enable_if::type NSGA2::EvaluateObjectives( std::vector&, std::tuple&, - std::vector >&) + ObjectiveMatType&) { // Nothing to do here. } @@ -250,34 +269,39 @@ NSGA2::EvaluateObjectives( //! Evaluate the objectives for the entire population. template typename std::enable_if::type NSGA2::EvaluateObjectives( std::vector& population, std::tuple& objectives, - std::vector >& calculatedObjectives) + ObjectiveMatType& calculatedObjectives) { for (size_t i = 0; i < populationSize; i++) { - calculatedObjectives[i](I) = std::get(objectives).Evaluate(population[i]); - EvaluateObjectives(population, objectives, - calculatedObjectives); + calculatedObjectives(I, i) = + std::get(objectives).Evaluate(population[i]); + EvaluateObjectives(population, objectives, calculatedObjectives); } } //! Reproduce and generate new candidates. -template -inline void NSGA2::BinaryTournamentSelection(std::vector& population, - const MatType& lowerBound, - const MatType& upperBound) +template +void NSGA2::BinaryTournamentSelection( + std::vector& population, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - std::vector children; + std::vector children; while (children.size() < population.size()) { // Choose two random parents for reproduction from the elite population. 
- size_t indexA = arma::randi(arma::distr_param(0, populationSize - 1)); - size_t indexB = arma::randi(arma::distr_param(0, populationSize - 1)); + size_t indexA = arma::randi( + arma::distr_param(0, populationSize - 1)); + size_t indexB = arma::randi( + arma::distr_param(0, populationSize - 1)); // Make sure that the parents differ. if (indexA == indexB) @@ -289,7 +313,7 @@ inline void NSGA2::BinaryTournamentSelection(std::vector& population, } // Initialize the children to the respective parents. - MatType childA = population[indexA], childB = population[indexB]; + InputMatType childA = population[indexA], childB = population[indexB]; Crossover(childA, childB, population[indexA], population[indexB]); @@ -302,18 +326,23 @@ inline void NSGA2::BinaryTournamentSelection(std::vector& population, } // Add the candidates to the elite population. - population.insert(std::end(population), std::begin(children), std::end(children)); + population.insert( + std::end(population), std::begin(children), std::end(children)); } //! Perform crossover of genes for the children. -template -inline void NSGA2::Crossover(MatType& childA, - MatType& childB, - const MatType& parentA, - const MatType& parentB) +template +void NSGA2::Crossover( + InputMatType& childA, + InputMatType& childB, + const InputMatType& parentA, + const InputMatType& parentB) { + typedef typename InputMatType::elem_type ElemType; + // Indices at which crossover is to occur. - const arma::umat idx = arma::randu(childA.n_rows, childA.n_cols) < crossoverProb; + const InputMatType idx = conv_to::from(randu( + childA.n_rows, childA.n_cols) < ElemType(crossoverProb)); // Use traits from parentA for indices where idx is 1 and parentB otherwise. childA = parentA % idx + parentB % (1 - idx); @@ -322,24 +351,30 @@ inline void NSGA2::Crossover(MatType& childA, } //! Perform mutation of the candidates weights with some noise. 
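The `Crossover()` hunk above builds a random 0/1 mask and mixes genes as `parentA % idx + parentB % (1 - idx)`. The same uniform crossover on plain `std::vector`s, with the mask logic made explicit (a sketch; names are illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Uniform crossover: each gene goes to childA from parentA where a uniform
// draw falls below crossoverProb, and from parentB otherwise; childB gets
// the mirrored assignment.
std::pair<std::vector<double>, std::vector<double>>
UniformCrossover(const std::vector<double>& parentA,
                 const std::vector<double>& parentB,
                 double crossoverProb, std::mt19937& rng)
{
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<double> childA(parentA.size()), childB(parentB.size());
  for (std::size_t i = 0; i < parentA.size(); ++i)
  {
    const bool mask = unif(rng) < crossoverProb;  // 1 entry of `idx` above.
    childA[i] = mask ? parentA[i] : parentB[i];
    childB[i] = mask ? parentB[i] : parentA[i];
  }
  return {childA, childB};
}
```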
-template -inline void NSGA2::Mutate(MatType& child, - const MatType& lowerBound, - const MatType& upperBound) +template +void NSGA2::Mutate( + InputMatType& child, + const InputMatType& lowerBound, + const InputMatType& upperBound) { - child += (arma::randu(child.n_rows, child.n_cols) < mutationProb) % - (mutationStrength * arma::randn(child.n_rows, child.n_cols)); + typedef typename InputMatType::elem_type ElemType; + + child += conv_to::from( + InputMatType(child.n_rows, child.n_cols, + GetFillType::randu) < ElemType(mutationProb)) % + (ElemType(mutationStrength) * InputMatType(child.n_rows, child.n_cols, + GetFillType::randn)); // Constrain all genes to be between bounds. - child = arma::min(arma::max(child, lowerBound), upperBound); + child = min(max(child, lowerBound), upperBound); } //! Sort population into Pareto fronts. template -inline void NSGA2::FastNonDominatedSort( +void NSGA2::FastNonDominatedSort( std::vector >& fronts, std::vector& ranks, - std::vector >& calculatedObjectives) + MatType& calculatedObjectives) { std::map dominationCount; std::map > dominated; @@ -398,22 +433,24 @@ inline void NSGA2::FastNonDominatedSort( //! Check if a candidate Pareto dominates another candidate. template inline bool NSGA2::Dominates( - std::vector >& calculatedObjectives, + MatType& calculatedObjectives, size_t candidateP, size_t candidateQ) { bool allBetterOrEqual = true; bool atleastOneBetter = false; - size_t n_objectives = calculatedObjectives[0].n_elem; + const size_t n_objectives = calculatedObjectives.n_rows; for (size_t i = 0; i < n_objectives; i++) { // P is worse than Q for the i-th objective function. - if (calculatedObjectives[candidateP](i) > calculatedObjectives[candidateQ](i)) + if (calculatedObjectives(i, candidateP) > + calculatedObjectives(i, candidateQ)) allBetterOrEqual = false; // P is better than Q for the i-th objective function. 
- else if (calculatedObjectives[candidateP](i) < calculatedObjectives[candidateQ](i)) + else if (calculatedObjectives(i, candidateP) < + calculatedObjectives(i, candidateQ)) atleastOneBetter = true; } @@ -422,34 +459,29 @@ inline bool NSGA2::Dominates( //! Assign crowding distance to the population. template -inline void NSGA2::CrowdingDistanceAssignment( +void NSGA2::CrowdingDistanceAssignment( const std::vector& front, - std::vector>& calculatedObjectives, + MatType& calculatedObjectives, std::vector& crowdingDistance) { // Convenience typedefs. typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::uvec UVecType; + typedef typename ForwardType::bcol BaseColType; size_t fSize = front.size(); // Stores the sorted indices of the fronts. - arma::uvec sortedIdx = arma::regspace(0, 1, fSize - 1); + UVecType sortedIdx = regspace(0, 1, fSize - 1); for (size_t m = 0; m < numObjectives; m++) { // Cache fValues of individuals for current objective. - arma::Col fValues(fSize); - std::transform(front.begin(), front.end(), fValues.begin(), - [&](const size_t& individual) - { - return calculatedObjectives[individual](m); - }); + BaseColType fValues(fSize); + for (size_t k = 0; k < fSize; ++k) + fValues(k) = calculatedObjectives(m, size_t(front[k])); // Sort front indices by ascending fValues for current objective. - std::sort(sortedIdx.begin(), sortedIdx.end(), - [&](const size_t& frontIdxA, const size_t& frontIdxB) - { - return (fValues(frontIdxA) < fValues(frontIdxB)); - }); + sortedIdx = sort_index(fValues, "ascend"); crowdingDistance[front[sortedIdx(0)]] = std::numeric_limits::max(); @@ -458,7 +490,7 @@ inline void NSGA2::CrowdingDistanceAssignment( ElemType minFval = fValues(sortedIdx(0)); ElemType maxFval = fValues(sortedIdx(fSize - 1)); ElemType scale = - std::abs(maxFval - minFval) == 0. ? 1. : std::abs(maxFval - minFval); + std::abs(maxFval - minFval) == 0 ? 
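The `Dominates()` check completed above encodes standard Pareto dominance: P dominates Q if P is no worse on every objective and strictly better on at least one. A self-contained version with a vector-of-columns standing in for the `(numObjectives x populationSize)` matrix:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pareto dominance for minimization: objectives[k] holds the objective
// values of candidate k (one "column" of the objectives matrix).
bool Dominates(const std::vector<std::vector<double>>& objectives,
               std::size_t p, std::size_t q)
{
  bool allBetterOrEqual = true;
  bool atLeastOneBetter = false;
  for (std::size_t i = 0; i < objectives[p].size(); ++i)
  {
    if (objectives[p][i] > objectives[q][i])
      allBetterOrEqual = false;       // P is worse on objective i.
    else if (objectives[p][i] < objectives[q][i])
      atLeastOneBetter = true;        // P is strictly better on objective i.
  }
  return allBetterOrEqual && atLeastOneBetter;
}
```

Note that a candidate never dominates itself, since equality on every objective leaves `atLeastOneBetter` false.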
1 : std::abs(maxFval - minFval); for (size_t i = 1; i < fSize - 1; i++) { @@ -469,16 +501,22 @@ inline void NSGA2::CrowdingDistanceAssignment( } //! Comparator for crowding distance based sorting. -template -inline bool NSGA2::CrowdingOperator(size_t idxP, - size_t idxQ, - const std::vector& ranks, - const std::vector& crowdingDistance) +template +bool NSGA2::CrowdingOperator( + size_t idxP, + size_t idxQ, + const std::vector& ranks, + const std::vector& crowdingDistance) { if (ranks[idxP] < ranks[idxQ]) + { return true; - else if (ranks[idxP] == ranks[idxQ] && crowdingDistance[idxP] > crowdingDistance[idxQ]) + } + else if (ranks[idxP] == ranks[idxQ] && + crowdingDistance[idxP] > crowdingDistance[idxQ]) + { return true; + } return false; } diff --git a/inst/include/ensmallen_bits/padam/padam_update.hpp b/inst/include/ensmallen_bits/padam/padam_update.hpp index a4a6924..3881403 100644 --- a/inst/include/ensmallen_bits/padam/padam_update.hpp +++ b/inst/include/ensmallen_bits/padam/padam_update.hpp @@ -85,6 +85,8 @@ class PadamUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -95,11 +97,19 @@ class PadamUpdate */ Policy(PadamUpdate& parent, const size_t rows, const size_t cols) : parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), + partial(ElemType(parent.partial)), iteration(0) { m.zeros(rows, cols); v.zeros(rows, cols); vImproved.zeros(rows, cols); + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -117,50 +127,57 @@ class PadamUpdate ++iteration; // And update the iterate. 
- m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); // Element wise maximum of past and present squared gradients. - vImproved = arma::max(vImproved, v); + vImproved = max(vImproved, v); - iterate -= (stepSize * std::sqrt(biasCorrection2) / biasCorrection1) * - m / arma::pow(vImproved + parent.epsilon, parent.partial); + iterate -= (ElemType(stepSize) * + std::sqrt(biasCorrection2) / biasCorrection1) * + m / pow(vImproved + epsilon, partial); } private: - //! Instantiated parent object. + // Instantiated parent object. PadamUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; - //! The exponential moving average of squared gradient values. + // The exponential moving average of squared gradient values. GradType v; - //! The optimal sqaured gradient value. + // The optimal squared gradient value. GradType vImproved; - //! The number of iterations. + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + ElemType partial; + + // The number of iterations. size_t iteration; }; private: - //! The epsilon value used to initialise the squared gradient parameter. + // The epsilon value used to initialise the squared gradient parameter. double epsilon; - //! The smoothing parameter. + // The smoothing parameter. double beta1; - //! The second moment coefficient. + // The second moment coefficient. double beta2; - //! Partial adaptive parameter. 
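The Padam update above is AMSGrad with a partially adaptive exponent: the element-wise maximum of past second moments is raised to `partial` (in (0, 1/2]) rather than the fixed square root of Adam. A scalar sketch of one step, with default hyperparameters chosen only for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

struct PadamState
{
  double m = 0.0;     // First-moment (mean) estimate.
  double v = 0.0;     // Second-moment estimate.
  double vMax = 0.0;  // Running max of v (AMSGrad-style).
};

// One scalar Padam step; `iteration` starts at 1 for bias correction.
double PadamStep(double iterate, double gradient, PadamState& s,
                 std::size_t iteration, double stepSize = 0.1,
                 double beta1 = 0.9, double beta2 = 0.999,
                 double epsilon = 1e-8, double partial = 0.25)
{
  s.m = beta1 * s.m + (1.0 - beta1) * gradient;
  s.v = beta2 * s.v + (1.0 - beta2) * gradient * gradient;

  const double biasCorrection1 = 1.0 - std::pow(beta1, double(iteration));
  const double biasCorrection2 = 1.0 - std::pow(beta2, double(iteration));

  // Element-wise maximum of past and present squared gradients.
  s.vMax = std::max(s.vMax, s.v);

  return iterate - (stepSize * std::sqrt(biasCorrection2) / biasCorrection1)
      * s.m / std::pow(s.vMax + epsilon, partial);
}
```

With `partial = 0.5` this reduces to AMSGrad; smaller values interpolate toward plain SGD with momentum.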
+ // Partial adaptive parameter. double partial; }; diff --git a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp index 75610d3..711b6ee 100644 --- a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp +++ b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd.hpp @@ -157,4 +157,4 @@ class ParallelSGD // Include implementation. #include "parallel_sgd_impl.hpp" -#endif +#endif \ No newline at end of file diff --git a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp index e9e4429..867af58 100644 --- a/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp +++ b/inst/include/ensmallen_bits/parallel_sgd/parallel_sgd_impl.hpp @@ -132,7 +132,7 @@ typename MatType::elem_type>::type ParallelSGD::Optimize( return overallObjective; } - // Get the stepsize for this iteration + // Get the stepsize for this iteration. double stepSize = decayPolicy.StepSize(i); // Shuffle for uniform sampling of functions by each thread. @@ -180,7 +180,9 @@ typename MatType::elem_type>::type ParallelSGD::Optimize( // Call out to utility function to use the right type of OpenMP // lock. - UpdateLocation(iterate, row, i, stepSize * value); + // TODO: if batch size support > 1 is added, `stepSize` will need to + // be updated here. 
+ UpdateLocation(iterate, row, i, ElemType(stepSize) * value); } } terminate |= Callback::StepTaken(*this, function, iterate, diff --git a/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp b/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp index 1fdaa8f..384a017 100644 --- a/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/ackley_function_impl.hpp @@ -38,8 +38,9 @@ typename MatType::elem_type AckleyFunction::Evaluate( const ElemType x2 = coordinates(1); const ElemType objective = -20 * std::exp( - -0.2 * std::sqrt(0.5 * (x1 * x1 + x2 * x2))) - - std::exp(0.5 * (std::cos(c * x1) + std::cos(c * x2))) + std::exp(1) + 20; + -(std::sqrt((x1 * x1 + x2 * x2) / 2)) / 5) - + std::exp((std::cos(ElemType(c) * x1) + std::cos(ElemType(c) * x2)) / 2) + + std::exp(ElemType(1)) + 20; return objective; } @@ -65,14 +66,14 @@ inline void AckleyFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); // Aliases for different terms in the expression of the gradient. 
- const ElemType t0 = std::sqrt(0.5 * (x1 * x1 + x2 * x2)); - const ElemType t1 = 2.0 * std::exp(- 0.2 * t0) / (t0 + epsilon); - const ElemType t2 = 0.5 * c * - std::exp(0.5 * (std::cos(c * x1) + std::cos(c * x2))); + const ElemType t0 = std::sqrt((x1 * x1 + x2 * x2) / 2); + const ElemType t1 = 2 * std::exp(-t0 / 5) / (t0 + ElemType(epsilon)); + const ElemType t2 = ElemType(c) / 2 * + std::exp((std::cos(ElemType(c) * x1) + std::cos(ElemType(c) * x2)) / 2); gradient.set_size(2, 1); - gradient(0) = (x1 * t1) + (t2 * std::sin(c * x1)); - gradient(1) = (x2 * t1) + (t2 * std::sin(c * x2)); + gradient(0) = (x1 * t1) + (t2 * std::sin(ElemType(c) * x1)); + gradient(1) = (x2 * t1) + (t2 * std::sin(ElemType(c) * x2)); } template diff --git a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp index e368160..43058db 100644 --- a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp +++ b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions.hpp @@ -23,26 +23,28 @@ namespace test { * The minimum that satisfies the constraint is x = [1, 4], with an objective * value of 70. 
*/ +template<typename MatType> class AugLagrangianTestFunction { public: AugLagrangianTestFunction(); - AugLagrangianTestFunction(const arma::mat& initial_point); + AugLagrangianTestFunction(const MatType& initial_point); - double Evaluate(const arma::mat& coordinates); - void Gradient(const arma::mat& coordinates, arma::mat& gradient); + typename MatType::elem_type Evaluate(const MatType& coordinates); + void Gradient(const MatType& coordinates, MatType& gradient); size_t NumConstraints() const { return 1; } - double EvaluateConstraint(const size_t index, const arma::mat& coordinates); + typename MatType::elem_type EvaluateConstraint(const size_t index, + const MatType& coordinates); void GradientConstraint(const size_t index, - const arma::mat& coordinates, - arma::mat& gradient); + const MatType& coordinates, + MatType& gradient); - const arma::mat& GetInitialPoint() const { return initialPoint; } + const MatType& GetInitialPoint() const { return initialPoint; } private: - arma::mat initialPoint; + MatType initialPoint; }; /** @@ -83,7 +85,7 @@ class GockenbachFunction template<typename MatType> MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } private: diff --git a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp index 56a401b..ee6802c 100644 --- a/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp +++ b/inst/include/ensmallen_bits/problems/aug_lagrangian_test_functions_impl.hpp @@ -20,29 +20,37 @@ namespace test { // // AugLagrangianTestFunction // -inline AugLagrangianTestFunction::AugLagrangianTestFunction() +template<typename MatType> +inline AugLagrangianTestFunction<MatType>::AugLagrangianTestFunction() { // Set the initial point to be (0, 0).
initialPoint.zeros(2, 1); } -inline AugLagrangianTestFunction::AugLagrangianTestFunction( - const arma::mat& initialPoint) : +template<typename MatType> +inline AugLagrangianTestFunction<MatType>::AugLagrangianTestFunction( + const MatType& initialPoint) : initialPoint(initialPoint) { // Nothing to do. } -inline double AugLagrangianTestFunction::Evaluate(const arma::mat& coordinates) +template<typename MatType> +inline typename MatType::elem_type AugLagrangianTestFunction<MatType>::Evaluate( + const MatType& coordinates) { + typedef typename MatType::elem_type ElemType; + // f(x) = 6 x_1^2 + 4 x_1 x_2 + 3 x_2^2 - return ((6 * std::pow(coordinates[0], 2)) + + return ((6 * std::pow(coordinates[0], ElemType(2))) + (4 * (coordinates[0] * coordinates[1])) + - (3 * std::pow(coordinates[1], 2))); + (3 * std::pow(coordinates[1], ElemType(2)))); } -inline void AugLagrangianTestFunction::Gradient(const arma::mat& coordinates, - arma::mat& gradient) +template<typename MatType> +inline void AugLagrangianTestFunction<MatType>::Gradient( + const MatType& coordinates, + MatType& gradient) { // f'_x1(x) = 12 x_1 + 4 x_2 // f'_x2(x) = 4 x_1 + 6 x_2 @@ -52,8 +60,11 @@ inline void AugLagrangianTestFunction::Gradient(const arma::mat& coordinates, gradient[1] = 4 * coordinates[0] + 6 * coordinates[1]; } -inline double AugLagrangianTestFunction::EvaluateConstraint(const size_t index, - const arma::mat& coordinates) +template<typename MatType> +inline typename MatType::elem_type +AugLagrangianTestFunction<MatType>::EvaluateConstraint( + const size_t index, + const MatType& coordinates) { // We return 0 if the index is wrong (not 0).
if (index != 0) @@ -63,9 +74,11 @@ inline double AugLagrangianTestFunction::EvaluateConstraint(const size_t index, return (coordinates[0] + coordinates[1] - 5); } -inline void AugLagrangianTestFunction::GradientConstraint(const size_t index, - const arma::mat& /* coordinates */, - arma::mat& gradient) +template<typename MatType> +inline void AugLagrangianTestFunction<MatType>::GradientConstraint( + const size_t index, + const MatType& /* coordinates */, + MatType& gradient) { // If the user passed an invalid index (not 0), we will return a zero // gradient. @@ -99,10 +112,12 @@ template<typename MatType> inline typename MatType::elem_type GockenbachFunction::Evaluate( const MatType& coordinates) { + typedef typename MatType::elem_type ElemType; + // f(x) = (x_1 - 1)^2 + 2 (x_2 + 2)^2 + 3(x_3 + 3)^2 - return ((std::pow(coordinates[0] - 1, 2)) + - (2 * std::pow(coordinates[1] + 2, 2)) + - (3 * std::pow(coordinates[2] + 3, 2))); + return ((std::pow(coordinates[0] - 1, ElemType(2))) + + (2 * std::pow(coordinates[1] + 2, ElemType(2))) + + (3 * std::pow(coordinates[2] + 3, ElemType(2)))); } template<typename MatType> @@ -124,20 +139,21 @@ inline typename MatType::elem_type GockenbachFunction::EvaluateConstraint( const size_t index, const MatType& coordinates) { - typename MatType::elem_type constraint = 0; + typedef typename MatType::elem_type ElemType; + + ElemType constraint = 0; switch (index) { case 0: // g(x) = (x_3 - x_2 - x_1 - 1) = 0 - constraint = (coordinates[2] - coordinates[1] - coordinates[0] - - typename MatType::elem_type(1)); + constraint = (coordinates[2] - coordinates[1] - coordinates[0] - 1); break; case 1: // h(x) = (x_3 - x_1^2) >= 0 // To deal with the inequality, the constraint will simply evaluate to 0 // when h(x) >= 0.
- constraint = std::min(typename MatType::elem_type(0), (coordinates[2] - - std::pow(coordinates[0], typename MatType::elem_type(2)))); + constraint = std::min(ElemType(0), (coordinates[2] - + std::pow(coordinates[0], ElemType(2)))); break; } @@ -322,7 +338,7 @@ inline const arma::mat& LovaszThetaSDP::GetInitialPoint() // and because m is always positive, // r = 0.5 + sqrt(0.25 + 2m) float m = NumConstraints(); - float r = 0.5 + sqrt(0.25 + 2 * m); + float r = 0.5 + std::sqrt(0.25 + 2 * m); if (ceil(r) > vertices) r = vertices; // An upper bound on the dimension. @@ -335,9 +351,10 @@ inline const arma::mat& LovaszThetaSDP::GetInitialPoint() for (size_t j = 0; j < (size_t) vertices; j++) { if (i == j) - initialPoint(i, j) = sqrt(1.0 / r) + sqrt(1.0 / (vertices * m)); + initialPoint(i, j) = std::sqrt(1.0 / r) + + std::sqrt(1.0 / (vertices * m)); else - initialPoint(i, j) = sqrt(1.0 / (vertices * m)); + initialPoint(i, j) = std::sqrt(1.0 / (vertices * m)); } } diff --git a/inst/include/ensmallen_bits/problems/beale_function_impl.hpp b/inst/include/ensmallen_bits/problems/beale_function_impl.hpp index a382005..9833303 100644 --- a/inst/include/ensmallen_bits/problems/beale_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/beale_function_impl.hpp @@ -35,9 +35,11 @@ typename MatType::elem_type BealeFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(1.5 - x1 + x1 * x2, 2) + - std::pow(2.25 - x1 + x1 * x2 * x2, 2) + - std::pow(2.625 - x1 + x1 * pow(x2, 3), 2); + const ElemType objective = + std::pow(ElemType(1.5) - x1 + x1 * x2, ElemType(2)) + + std::pow(ElemType(2.25) - x1 + x1 * x2 * x2, ElemType(2)) + + std::pow(ElemType(2.625) - x1 + x1 * std::pow(x2, ElemType(3)), + ElemType(2)); return objective; } @@ -64,15 +66,15 @@ inline void BealeFunction::Gradient(const MatType& coordinates, // Aliases for different terms in the expression of the gradient. 
const ElemType x2Sq = x2 * x2; - const ElemType x2Cub = pow(x2, 3); + const ElemType x2Cub = std::pow(x2, ElemType(3)); gradient.set_size(2, 1); - gradient(0) = ((2 * x2 - 2) * (x1 * x2 - x1 + 1.5)) + - ((2 * x2Sq - 2) * (x1 * x2Sq - x1 + 2.25)) + - ((2 * x2Cub - 2) * (x1 * x2Cub - x1 + 2.625)); - gradient(1) = (6 * x1 * x2Sq * (x1 * x2Cub - x1 + 2.625)) + - (4 * x1 * x2 * (x1 * x2Sq - x1 + 2.25)) + - (2 * x1 * (x1 * x2 - x1 + 1.5)); + gradient(0) = ((2 * x2 - 2) * (x1 * x2 - x1 + ElemType(1.5))) + + ((2 * x2Sq - 2) * (x1 * x2Sq - x1 + ElemType(2.25))) + + ((2 * x2Cub - 2) * (x1 * x2Cub - x1 + ElemType(2.625))); + gradient(1) = (6 * x1 * x2Sq * (x1 * x2Cub - x1 + ElemType(2.625))) + + (4 * x1 * x2 * (x1 * x2Sq - x1 + ElemType(2.25))) + + (2 * x1 * (x1 * x2 - x1 + ElemType(1.5))); } template diff --git a/inst/include/ensmallen_bits/problems/booth_function_impl.hpp b/inst/include/ensmallen_bits/problems/booth_function_impl.hpp index d80c999..515aed6 100644 --- a/inst/include/ensmallen_bits/problems/booth_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/booth_function_impl.hpp @@ -35,8 +35,8 @@ typename MatType::elem_type BoothFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(x1 + 2 * x2 - 7, 2) + - std::pow(2 * x1 + x2 - 5, 2); + const ElemType objective = std::pow(x1 + 2 * x2 - 7, ElemType(2)) + + std::pow(2 * x1 + x2 - 5, ElemType(2)); return objective; } diff --git a/inst/include/ensmallen_bits/problems/colville_function_impl.hpp b/inst/include/ensmallen_bits/problems/colville_function_impl.hpp index e3726d5..8e1ebfe 100644 --- a/inst/include/ensmallen_bits/problems/colville_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/colville_function_impl.hpp @@ -37,10 +37,12 @@ typename MatType::elem_type ColvilleFunction::Evaluate( const ElemType x3 = coordinates(2); const ElemType x4 = coordinates(3); - const ElemType objective = 100 * std::pow(std::pow(x1, 2) - x2, 2) 
+ - std::pow(x1 - 1, 2) + std::pow(x3 - 1, 2) + 90 * - std::pow(std::pow(x3, 2) - x4, 2) + 10.1 * (std::pow(x2 - 1, 2) + - std::pow(x4 - 1, 2)) + 19.8 * (x2 - 1) * (x4 - 1); + const ElemType objective = + 100 * std::pow(std::pow(x1, ElemType(2)) - x2, ElemType(2)) + + std::pow(x1 - 1, ElemType(2)) + std::pow(x3 - 1, ElemType(2)) + + 90 * std::pow(std::pow(x3, ElemType(2)) - x4, ElemType(2)) + + ElemType(10.1) * (std::pow(x2 - 1, ElemType(2)) + + std::pow(x4 - 1, ElemType(2))) + ElemType(19.8) * (x2 - 1) * (x4 - 1); return objective; } @@ -68,10 +70,12 @@ inline void ColvilleFunction::Gradient(const MatType& coordinates, const ElemType x4 = coordinates(3); gradient.set_size(4, 1); - gradient(0) = 2 * (200 * x1 * (std::pow(x1, 2) - x2) + x1 - 1); - gradient(1) = 19.8 * x4 - 200 * std::pow(x1, 2) + 220.2 * x2 - 40; - gradient(2) = 2 * (180 * x3 * (std::pow(x3, 2) - x4) + x3 - 1); - gradient(3) = 200.2 * x4 + 19.8 * x2 - 180 * std::pow(x3, 2) - 40; + gradient(0) = 2 * (200 * x1 * (std::pow(x1, ElemType(2)) - x2) + x1 - 1); + gradient(1) = ElemType(19.8) * x4 - 200 * std::pow(x1, ElemType(2)) + + ElemType(220.2) * x2 - 40; + gradient(2) = 2 * (180 * x3 * (std::pow(x3, ElemType(2)) - x4) + x3 - 1); + gradient(3) = ElemType(200.2) * x4 + ElemType(19.8) * x2 - + 180 * std::pow(x3, ElemType(2)) - 40; } template diff --git a/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp b/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp index e5814f0..e4bf4f0 100644 --- a/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/cross_in_tray_function_impl.hpp @@ -35,10 +35,12 @@ typename MatType::elem_type CrossInTrayFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = -0.0001 * std::pow(std::abs(std::sin(x1) * - std::sin(x2) * std::exp(std::abs(100 - (std::sqrt(std::pow(x1, 2) + - std::pow(x2, 2)) / arma::datum::pi))) + 1), 
0.1); - return objective; + // Compute objective in higher precision, then cast down. + const double objective = -0.0001 * std::pow(std::abs(std::sin(double(x1)) * + std::sin(double(x2)) * + std::exp(std::abs(100 - (std::sqrt(std::pow(double(x1), 2) + + std::pow(double(x2), 2)) / arma::datum::pi))) + 1), 0.1); + return ElemType(objective); } template diff --git a/inst/include/ensmallen_bits/problems/easom_function_impl.hpp b/inst/include/ensmallen_bits/problems/easom_function_impl.hpp index a63dd89..2a9436a 100644 --- a/inst/include/ensmallen_bits/problems/easom_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/easom_function_impl.hpp @@ -36,8 +36,8 @@ typename MatType::elem_type EasomFunction::Evaluate( const ElemType x2 = coordinates(1); const ElemType objective = -std::cos(x1) * std::cos(x2) * - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)); + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))); return objective; } @@ -63,20 +63,20 @@ inline void EasomFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = 2 * (x1 - arma::datum::pi) * - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + gradient(0) = 2 * (x1 - arma::Datum<ElemType>::pi) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::cos(x1) * std::cos(x2) + - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::sin(x1) * std::cos(x2); - gradient(1) = 2 * (x2 - arma::datum::pi) * - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + gradient(1) = 2 * (x2 - arma::Datum<ElemType>::pi) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - +
std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::cos(x1) * std::cos(x2) + - std::exp(-1.0 * std::pow(x1 - arma::datum::pi, 2) - - std::pow(x2 - arma::datum::pi, 2)) * + std::exp(-1 * std::pow(x1 - arma::Datum<ElemType>::pi, ElemType(2)) - + std::pow(x2 - arma::Datum<ElemType>::pi, ElemType(2))) * std::cos(x1) * std::sin(x2); } diff --git a/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp b/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp index 34bc95b..1de61db 100644 --- a/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp +++ b/inst/include/ensmallen_bits/problems/fonseca_fleming_function.hpp @@ -46,14 +46,12 @@ class FonsecaFlemingFunction * Evaluate the objectives with the given coordinate. * * @param coords The function coordinates. - * @return arma::Col<typename MatType::elem_type> + * @return Col */ - arma::Col<typename MatType::elem_type> Evaluate(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - arma::Col<ElemType> objectives(numObjectives); + typename ForwardType<MatType>::bvec Evaluate(const MatType& coords) + { + typename ForwardType<MatType>::bvec objectives(numObjectives); objectives(0) = objectiveA.Evaluate(coords); objectives(1) = objectiveB.Evaluate(coords); @@ -64,21 +62,18 @@ class FonsecaFlemingFunction //! Get the starting point. MatType GetInitialPoint() { - // Convenience typedef.
- typedef typename MatType::elem_type ElemType; - - return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros); + return MatType(numVariables, 1, GetFillType<MatType>::zeros); } struct ObjectiveA { typename MatType::elem_type Evaluate(const MatType& coords) { - return 1.0 - exp( - -pow(static_cast<double>(coords[0]) - 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[1]) - 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[2]) - 1.0 / sqrt(3.0), 2.0) - ); + return typename MatType::elem_type(1.0 - std::exp( + -std::pow(static_cast<double>(coords[0]) - 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[1]) - 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[2]) - 1.0 / std::sqrt(3.0), 2.0) + )); } } objectiveA; @@ -86,11 +81,11 @@ { typename MatType::elem_type Evaluate(const MatType& coords) { - return 1.0 - exp( - -pow(static_cast<double>(coords[0]) + 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[1]) + 1.0 / sqrt(3.0), 2.0) - -pow(static_cast<double>(coords[2]) + 1.0 / sqrt(3.0), 2.0) - ); + return typename MatType::elem_type(1.0 - std::exp( + -std::pow(static_cast<double>(coords[0]) + 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[1]) + 1.0 / std::sqrt(3.0), 2.0) + -std::pow(static_cast<double>(coords[2]) + 1.0 / std::sqrt(3.0), 2.0) + )); } } objectiveB; @@ -100,6 +95,7 @@ return std::make_tuple(objectiveA, objectiveB); } }; + } // namespace test } // namespace ens diff --git a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp index 3ec963f..196d540 100644 --- a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp +++ b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function.hpp @@ -113,14 +113,14 @@ class GeneralizedRosenbrockFunction template<typename MatType> const MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } //! Get the final point.
template const MatType GetFinalPoint() const { - return arma::ones(initialPoint.n_rows, initialPoint.n_cols); + return ones(initialPoint.n_rows, initialPoint.n_cols); } //! Get the final objective. diff --git a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp index b332f13..82b0b3d 100644 --- a/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/generalized_rosenbrock_function_impl.hpp @@ -51,12 +51,15 @@ typename MatType::elem_type GeneralizedRosenbrockFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - objective += 100 * std::pow((std::pow(coordinates[p], 2) - - coordinates[p + 1]), 2) + std::pow(1 - coordinates[p], 2); + objective += 100 * std::pow((std::pow(coordinates[p], ElemType(2)) - + coordinates[p + 1]), ElemType(2)) + + std::pow(1 - coordinates[p], ElemType(2)); } return objective; @@ -66,11 +69,14 @@ template typename MatType::elem_type GeneralizedRosenbrockFunction::Evaluate( const MatType& coordinates) const { - typename MatType::elem_type fval = 0; + typedef typename MatType::elem_type ElemType; + + ElemType fval = 0; for (size_t i = 0; i < (n - 1); i++) { - fval += 100 * std::pow(std::pow(coordinates[i], 2) - - coordinates[i + 1], 2) + std::pow(1 - coordinates[i], 2); + fval += 100 * std::pow(std::pow(coordinates[i], ElemType(2)) - + coordinates[i + 1], ElemType(2)) + + std::pow(1 - coordinates[i], ElemType(2)); } return fval; @@ -83,13 +89,16 @@ inline void GeneralizedRosenbrockFunction::Gradient( GradType& gradient, const size_t batchSize) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(n); for (size_t j = begin; j < 
begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - gradient[p] = 400 * (std::pow(coordinates[p], 3) - coordinates[p] * - coordinates[p + 1]) + 2 * (coordinates[p] - 1); - gradient[p + 1] = 200 * (coordinates[p + 1] - std::pow(coordinates[p], 2)); + gradient[p] = 400 * (std::pow(coordinates[p], ElemType(3)) - + coordinates[p] * coordinates[p + 1]) + 2 * (coordinates[p] - 1); + gradient[p + 1] = + 200 * (coordinates[p + 1] - std::pow(coordinates[p], ElemType(2))); } } @@ -98,18 +107,23 @@ inline void GeneralizedRosenbrockFunction::Gradient( const MatType& coordinates, GradType& gradient) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(n); for (size_t i = 0; i < (n - 1); i++) { - gradient[i] = 400 * (std::pow(coordinates[i], 3) - coordinates[i] * - coordinates[i + 1]) + 2 * (coordinates[i] - 1); + gradient[i] = 400 * (std::pow(coordinates[i], ElemType(3)) - + coordinates[i] * coordinates[i + 1]) + 2 * (coordinates[i] - 1); if (i > 0) - gradient[i] += 200 * (coordinates[i] - std::pow(coordinates[i - 1], 2)); + { + gradient[i] += + 200 * (coordinates[i] - std::pow(coordinates[i - 1], ElemType(2))); + } } gradient[n - 1] = 200 * (coordinates[n - 1] - - std::pow(coordinates[n - 2], 2)); + std::pow(coordinates[n - 2], ElemType(2))); } } // namespace test diff --git a/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp b/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp index 29d7e55..913f58c 100644 --- a/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/goldstein_price_function_impl.hpp @@ -36,12 +36,13 @@ typename MatType::elem_type GoldsteinPriceFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType x1Sq = std::pow(x1, 2); - const ElemType x2Sq = std::pow(x2, 2); + const ElemType x1Sq = std::pow(x1, ElemType(2)); + const ElemType x2Sq = std::pow(x2, ElemType(2)); const ElemType 
x1x2 = x1 * x2; - const ElemType objective = (1 + std::pow(x1 + x2 + 1, 2) * (19 - 14 * x1 + 3 * - x1Sq - 14 * x2 + 6 * x1x2 + 3 * x2Sq)) * (30 + std::pow(2 * x1 - 3 * x2, - 2) * (18 - 32 * x1 + 12 * x1Sq + 48 * x2 - 36 * x1x2 + 27 * x2Sq)); + const ElemType objective = (1 + std::pow(x1 + x2 + 1, ElemType(2)) * + (19 - 14 * x1 + 3 * x1Sq - 14 * x2 + 6 * x1x2 + 3 * x2Sq)) * + (30 + std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (18 - 32 * x1 + 12 * x1Sq + 48 * x2 - 36 * x1x2 + 27 * x2Sq)); return objective; } @@ -67,22 +68,26 @@ inline void GoldsteinPriceFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = (std::pow(2 * x1 - 3 * x2, 2) * (24 * x1 - 36 * x2 - 32) + (8 * - x1 - 12 * x2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + - 48 * x2 + 18)) * (std::pow(x1 + x2 + 1, 2) * (3 * x1 * x1 + 6 * x1 * x2 - - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + (std::pow(2 * x1 - 3 * x2, - 2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + - 18) + 30) * (std::pow(x1 + x2 + 1, 2) * (6 * x1 + 6 * x2 - 14) + (2 * x1 + - 2 * x2 + 2) * (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * - x2 + 19)); + gradient(0) = (std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (24 * x1 - 36 * x2 - 32) + (8 * x1 - 12 * x2) * + (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + 18)) * + (std::pow(x1 + x2 + 1, ElemType(2)) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + + (std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + 18) + + 30) * (std::pow(x1 + x2 + 1, ElemType(2)) * (6 * x1 + 6 * x2 - 14) + + (2 * x1 + 2 * x2 + 2) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19)); gradient(1) = ((- 12 * x1 + 18 * x2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * - x1 + 27 * x2 * x2 + 48 * x2 + 18) + std::pow(2 * x1 - 3 * x2, 2) * (-36 * - x1 + 54 * x2 + 48)) * (std::pow(x1 + x2 + 1, 2) * (3 * 
x1 * x1 + 6 * x1 * - x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + (std::pow(2 * x1 - 3 * - x2, 2) * (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + - 18) + 30) * (std::pow(x1 + x2 + 1, 2) * (6 * x1 + 6 * x2 - 14) + (2 * x1 + - 2 * x2 + 2) * (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * - x2 + 19)); + x1 + 27 * x2 * x2 + 48 * x2 + 18) + + std::pow(2 * x1 - 3 * x2, ElemType(2)) * (-36 * x1 + 54 * x2 + 48)) * + (std::pow(x1 + x2 + 1, ElemType(2)) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19) + 1) + + (std::pow(2 * x1 - 3 * x2, ElemType(2)) * + (12 * x1 * x1 - 36 * x1 * x2 - 32 * x1 + 27 * x2 * x2 + 48 * x2 + 18) + + 30) * (std::pow(x1 + x2 + 1, ElemType(2)) * (6 * x1 + 6 * x2 - 14) + + (2 * x1 + 2 * x2 + 2) * + (3 * x1 * x1 + 6 * x1 * x2 - 14 * x1 + 3 * x2 * x2 - 14 * x2 + 19)); } template diff --git a/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp b/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp index cf38ee6..b1c447f 100644 --- a/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/gradient_descent_test_function_impl.hpp @@ -21,7 +21,7 @@ template inline typename MatType::elem_type GDTestFunction::Evaluate( const MatType& coordinates) const { - MatType temp = arma::trans(coordinates) * coordinates; + MatType temp = trans(coordinates) * coordinates; return temp(0, 0); } diff --git a/inst/include/ensmallen_bits/problems/himmelblau_function.hpp b/inst/include/ensmallen_bits/problems/himmelblau_function.hpp index b34f942..26f35f9 100644 --- a/inst/include/ensmallen_bits/problems/himmelblau_function.hpp +++ b/inst/include/ensmallen_bits/problems/himmelblau_function.hpp @@ -60,6 +60,12 @@ class HimmelblauFunction template MatType GetInitialPoint() const { return MatType("5; -5"); } + //! Get the final point of the optimization. 
+ template<typename MatType> + MatType GetFinalPoint() const { return MatType("3; 2"); } + + double GetFinalObjective() const { return 0.0; } + /** * Evaluate a function for a particular batch-size. * diff --git a/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp b/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp index 52dfc1f..da1cae8 100644 --- a/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/himmelblau_function_impl.hpp @@ -35,8 +35,8 @@ typename MatType::elem_type HimmelblauFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(x1 * x1 + x2 - 11 , 2) + - std::pow(x1 + x2 * x2 - 7, 2); + const ElemType objective = std::pow(x1 * x1 + x2 - 11, ElemType(2)) + + std::pow(x1 + x2 * x2 - 7, ElemType(2)); return objective; } diff --git a/inst/include/ensmallen_bits/problems/levy_function_n13.hpp b/inst/include/ensmallen_bits/problems/levy_function_n13.hpp index 06fc85a..f19d338 100644 --- a/inst/include/ensmallen_bits/problems/levy_function_n13.hpp +++ b/inst/include/ensmallen_bits/problems/levy_function_n13.hpp @@ -36,7 +36,7 @@ namespace test { class LevyFunctionN13 { public: - //! Initialize the BealeFunction. + //! Initialize the LevyFunctionN13 object.
LevyFunctionN13(); /** diff --git a/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp b/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp index d554254..1c691c4 100644 --- a/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp +++ b/inst/include/ensmallen_bits/problems/levy_function_n13_impl.hpp @@ -35,11 +35,12 @@ typename MatType::elem_type LevyFunctionN13::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = std::pow(std::sin(3 * arma::datum::pi * x1), 2) + - (std::pow(x1 - 1, 2) * (1 + std::pow( - std::sin(3 * arma::datum::pi * x2), 2))) + - (std::pow(x2 - 1, 2) * (1 + std::pow( - std::sin(2 * arma::datum::pi * x2), 2))); + const ElemType objective = + std::pow(std::sin(3 * arma::Datum<ElemType>::pi * x1), ElemType(2)) + + (std::pow(x1 - 1, ElemType(2)) * (1 + std::pow( + std::sin(3 * arma::Datum<ElemType>::pi * x2), ElemType(2)))) + + (std::pow(x2 - 1, ElemType(2)) * (1 + std::pow( + std::sin(2 * arma::Datum<ElemType>::pi * x2), ElemType(2)))); return objective; } @@ -65,15 +66,19 @@ inline void LevyFunctionN13::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = (2 * x1 - 2) * (std::pow(std::sin(3 * arma::datum::pi * x2), - 2) + 1) + 6 * arma::datum::pi * std::sin(3 * arma::datum::pi * x1) * - std::cos(3 * arma::datum::pi * x1); - - gradient(1) = 6 * arma::datum::pi * std::pow(x1 - 1, 2) * std::sin(3 * - arma::datum::pi * x2) * std::cos(3 * arma::datum::pi * x2) + - 4 * arma::datum::pi * std::pow(x2 - 1, 2) * std::sin(2 * - arma::datum::pi * x2) * std::cos(2 * arma::datum::pi * x2) + - (2 * x2 - 2) * (std::pow(std::sin(2 * arma::datum::pi * x2), 2) + 1); + gradient(0) = (2 * x1 - 2) * + (std::pow(std::sin(3 * arma::Datum<ElemType>::pi * x2), ElemType(2)) + + 1) + 6 * arma::Datum<ElemType>::pi * + std::sin(3 * arma::Datum<ElemType>::pi * x1) * + std::cos(3 * arma::Datum<ElemType>::pi * x1); + + gradient(1) = 6 * arma::Datum<ElemType>::pi * std::pow(x1 - 1, ElemType(2)) * +
std::sin(3 * arma::Datum<ElemType>::pi * x2) * + std::cos(3 * arma::Datum<ElemType>::pi * x2) + + 4 * arma::Datum<ElemType>::pi * std::pow(x2 - 1, ElemType(2)) * + std::sin(2 * arma::Datum<ElemType>::pi * x2) * + std::cos(2 * arma::Datum<ElemType>::pi * x2) + (2 * x2 - 2) * + (std::pow(std::sin(2 * arma::Datum<ElemType>::pi * x2), ElemType(2)) + 1); } template diff --git a/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp b/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp index 53c69df..770aa0e 100644 --- a/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp +++ b/inst/include/ensmallen_bits/problems/logistic_regression_function.hpp @@ -26,29 +26,34 @@ template<typename MatType> class LogisticRegressionFunction { public: + typedef typename MatType::elem_type ElemType; + typedef typename ForwardType<MatType>::brow BaseRowType; + + template<typename LabelsType> LogisticRegressionFunction(MatType& predictors, - arma::Row<size_t>& responses, + LabelsType& responses, const double lambda = 0); + template<typename LabelsType> LogisticRegressionFunction(MatType& predictors, - arma::Row<size_t>& responses, + LabelsType& responses, MatType& initialPoint, const double lambda = 0); - //! Return the initial point for the optimization. + // Return the initial point for the optimization. const MatType& InitialPoint() const { return initialPoint; } - //! Modify the initial point for the optimization. + // Modify the initial point for the optimization. MatType& InitialPoint() { return initialPoint; } - //! Return the regularization parameter (lambda). - const double& Lambda() const { return lambda; } - //! Modify the regularization parameter (lambda). - double& Lambda() { return lambda; } + // Return the regularization parameter (lambda). + const ElemType& Lambda() const { return lambda; } + // Modify the regularization parameter (lambda). + ElemType& Lambda() { return lambda; } - //! Return the matrix of predictors. + // Return the matrix of predictors. const MatType& Predictors() const { return predictors; } //! Return the vector of responses.
- const arma::Row& Responses() const { return responses; } + const BaseRowType& Responses() const { return responses; } /** * Shuffle the order of function visitation. This may be called by the @@ -67,7 +72,7 @@ class LogisticRegressionFunction * * @param parameters Vector of logistic regression parameters. */ - typename MatType::elem_type Evaluate(const MatType& parameters) const; + ElemType Evaluate(const MatType& parameters) const; /** * Evaluate the logistic regression log-likelihood function with the given @@ -86,9 +91,9 @@ class LogisticRegressionFunction * @param batchSize Number of points to be passed at a time to use for * objective function evaluation. */ - typename MatType::elem_type Evaluate(const MatType& parameters, - const size_t begin, - const size_t batchSize = 1) const; + ElemType Evaluate(const MatType& parameters, + const size_t begin, + const size_t batchSize = 1) const; /** * Evaluate the gradient of the logistic regression log-likelihood function @@ -130,33 +135,34 @@ class LogisticRegressionFunction * be computed. * @param gradient Sparse matrix to output gradient into. */ + template void PartialGradient(const MatType& parameters, const size_t j, - arma::sp_mat& gradient) const; + GradType& gradient) const; /** * Evaluate the objective function and gradient of the logistic regression * log-likelihood function simultaneously with the given parameters. */ template - typename MatType::elem_type EvaluateWithGradient( + ElemType EvaluateWithGradient( const MatType& parameters, GradType& gradient) const; template - typename MatType::elem_type EvaluateWithGradient( + ElemType EvaluateWithGradient( const MatType& parameters, const size_t begin, GradType& gradient, const size_t batchSize = 1) const; - //! Return the initial point for the optimization. + // Return the initial point for the optimization. const MatType& GetInitialPoint() const { return initialPoint; } - //! Return the number of separable functions (the number of predictor points). 
+ // Return the number of separable functions (the number of predictor points). size_t NumFunctions() const { return predictors.n_cols; } - //! Return the number of features(add 1 for the intercept term). + // Return the number of features(add 1 for the intercept term). size_t NumFeatures() const { return predictors.n_rows + 1; } /** @@ -174,8 +180,9 @@ class LogisticRegressionFunction * @param decisionBoundary Decision boundary (default 0.5). * @return Percentage of responses that are predicted correctly. */ + template double ComputeAccuracy(const MatType& predictors, - const arma::Row& responses, + const LabelsType& responses, const MatType& parameters, const double decisionBoundary = 0.5) const; @@ -191,22 +198,25 @@ class LogisticRegressionFunction * @param parameters Vector of logistic regression parameters. * @param decisionBoundary Decision boundary (default 0.5). */ + template void Classify(const MatType& dataset, - arma::Row& labels, + LabelsType& labels, const MatType& parameters, const double decisionBoundary = 0.5) const; private: - //! The initial point, from which to start the optimization. + // The initial point, from which to start the optimization. MatType initialPoint; - //! The matrix of data points (predictors). This is an alias until shuffling - //! is done. + // The matrix of data points (predictors). This is an alias until shuffling + // is done. MatType& predictors; - //! The vector of responses to the input data points. This is an alias until - //! shuffling is done. - arma::Row& responses; - //! The regularization parameter for L2-regularization. - double lambda; + // The vector of responses to the input data points, converted to the same + // type as the data. + BaseRowType responses; + // The regularization parameter for L2-regularization. + ElemType lambda; + // This is lambda/2, cached for convenience. + ElemType halfLambda; }; // Convenience typedefs. 
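The header diff above declares the quantity that `LogisticRegressionFunction::Evaluate()` returns: the negative log-likelihood plus an L2 penalty of `halfLambda` (i.e. lambda/2, now cached) times the squared norm of the non-intercept weights. A minimal stdlib-only sketch of that objective for dense double data follows; the function names and plain-`std::vector` interface here are illustrative only and not part of ensmallen's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative scalar version of the regularized logistic regression
// objective sketched in the class documentation above.
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

double objective(const std::vector<std::vector<double>>& x,  // one point per row
                 const std::vector<int>& y,                   // 0/1 labels
                 const std::vector<double>& w,                // non-intercept weights
                 double intercept,
                 double lambda)
{
  const double halfLambda = lambda / 2.0;  // cached, as in the patched class
  double reg = 0.0;
  for (double wi : w)
    reg += wi * wi;
  reg *= halfLambda;  // (lambda / 2) * ||w||^2, intercept excluded

  double loglik = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    double z = intercept;
    for (std::size_t j = 0; j < w.size(); ++j)
      z += w[j] * x[i][j];
    const double s = sigmoid(z);
    // Same unified form used by the patched Evaluate():
    // log(1 - y + s * (2y - 1)) equals log(s) for y = 1 and
    // log(1 - s) for y = 0, so no branching on the label is needed.
    loglik += std::log(1.0 - y[i] + s * (2.0 * y[i] - 1.0));
  }
  // Invert the log-likelihood, because the optimizer minimizes.
  return reg - loglik;
}
```

At the all-zero starting point every point contributes -log(0.5) regardless of its label, which is a convenient sanity check for implementations of this objective.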
diff --git a/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp b/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp index e9a9148..291ad29 100644 --- a/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/logistic_regression_function_impl.hpp @@ -19,17 +19,26 @@ namespace ens { namespace test { template +template LogisticRegressionFunction::LogisticRegressionFunction( MatType& predictors, - arma::Row& responses, - const double lambda) : + LabelsType& responsesIn, + const double lambdaIn) : // We promise to be well-behaved... the elements won't be modified. predictors(predictors), - responses(responses), - lambda(lambda) + // On old Armadillo versions, we cannot do both a sparse-to-dense conversion + // and element type conversion in one shot. + #if ARMA_VERSION_MAJOR < 12 || \ + (ARMA_VERSION_MAJOR == 12 && ARMA_VERSION_MINOR < 8) + responses(conv_to::from(conv_to::bmat>::from(responsesIn))), + #else + responses(conv_to::from(responsesIn)), + #endif + lambda(ElemType(lambdaIn)), + halfLambda(ElemType(lambdaIn / 2.0)) { - initialPoint = arma::Row(predictors.n_rows + 1, - arma::fill::zeros); + initialPoint = arma::Row(predictors.n_rows + 1, arma::fill::zeros); // Sanity check. if (responses.n_elem != predictors.n_cols) @@ -44,21 +53,55 @@ LogisticRegressionFunction::LogisticRegressionFunction( } template +template LogisticRegressionFunction::LogisticRegressionFunction( MatType& predictors, - arma::Row& responses, + LabelsType& responsesIn, MatType& initialPoint, - const double lambda) : + const double lambdaIn) : initialPoint(initialPoint), predictors(predictors), - responses(responses), - lambda(lambda) + // On old Armadillo versions, we cannot do both a sparse-to-dense conversion + // and element type conversion in one shot. 
+ #if ARMA_VERSION_MAJOR < 12 || \ + (ARMA_VERSION_MAJOR == 12 && ARMA_VERSION_MINOR < 8) + responses(conv_to::from(conv_to::bmat>::from(responsesIn))), + #else + responses(conv_to::from(responsesIn)), + #endif + lambda(ElemType(lambdaIn)), + halfLambda(ElemType(lambdaIn / 2.0)) { // To check if initialPoint is compatible with predictors. if (initialPoint.n_rows != (predictors.n_rows + 1) || initialPoint.n_cols != 1) - this->initialPoint = arma::Row( - predictors.n_rows + 1, arma::fill::zeros); + { + this->initialPoint = arma::Row(predictors.n_rows + 1, + arma::fill::zeros); + } +} + +template +void ShuffleImpl(MatType& predictors, MatType& responses, + const typename std::enable_if_t::value>* = 0) +{ + MatType allData = shuffle(join_cols(predictors, responses), 1); + + predictors = allData.rows(0, allData.n_rows - 2); + responses = allData.row(allData.n_rows - 1); +} + +template +void ShuffleImpl(MatType& predictors, BaseRowType& responses, + const typename std::enable_if_t::value>* = 0) +{ + // For sparse data shuffle() is not available. + arma::uvec ordering = shuffle(linspace(0, predictors.n_cols - 1, + predictors.n_cols)); + + predictors = predictors.cols(ordering); + responses = responses.cols(ordering); } /** @@ -67,20 +110,7 @@ LogisticRegressionFunction::LogisticRegressionFunction( template void LogisticRegressionFunction::Shuffle() { - MatType newPredictors; - arma::Row newResponses; - - arma::uvec ordering = arma::shuffle(arma::linspace(0, - predictors.n_cols - 1, predictors.n_cols)); - - newPredictors.set_size(predictors.n_rows, predictors.n_cols); - for (size_t i = 0; i < predictors.n_cols; ++i) - newPredictors.col(i) = predictors.col(ordering[i]); - newResponses = responses.cols(ordering); - - // Take ownership of the new data. 
- predictors = std::move(newPredictors); - responses = std::move(newResponses); + ShuffleImpl(predictors, responses); } /** @@ -97,19 +127,18 @@ typename MatType::elem_type LogisticRegressionFunction::Evaluate( // f(w) = sum(y log(sig(w'x)) + (1 - y) log(sig(1 - w'x))). // We want to minimize this function. L2-regularization is just lambda // multiplied by the squared l2-norm of the parameters then divided by two. - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::brow BaseRowType; // For the regularization, we ignore the first term, which is the intercept // term and take every term except the last one in the decision variable. - const ElemType regularization = 0.5 * lambda * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType regularization = halfLambda * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate vectors of sigmoids. The intercept term is parameters(0, 0) and // does not need to be multiplied by any of the predictors. - const arma::Row sigmoid = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * predictors))); + const BaseRowType sigmoid = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * predictors))); // Assemble full objective function. Often the objective function and the // regularization as given are divided by the number of features, but this @@ -117,9 +146,8 @@ typename MatType::elem_type LogisticRegressionFunction::Evaluate( // terms for computational efficiency. Note that the conversion causes some // copy and slowdown, but this is so negligible compared to the rest of the // calculation it is not worth optimizing for. 
- const ElemType result = arma::accu(arma::log(1.0 - - arma::conv_to>::from(responses) + sigmoid % - (2 * arma::conv_to>::from(responses) - 1.0))); + const ElemType result = accu( + log(1 - responses + sigmoid % (2 * responses - 1))); // Invert the result, because it's a minimization. return regularization - result; @@ -135,25 +163,23 @@ typename MatType::elem_type LogisticRegressionFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::brow BaseRowType; // Calculate the regularization term. - const ElemType regularization = lambda * - (batchSize / (2.0 * predictors.n_cols)) * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType regularization = halfLambda * + (batchSize / ElemType(predictors.n_cols)) * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate the sigmoid function values. - const arma::Row sigmoid = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * - predictors.cols(begin, begin + batchSize - 1)))); + const BaseRowType sigmoid = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * + predictors.cols(begin, begin + batchSize - 1)))); // Compute the objective for the given batch size from a given point. - arma::Row respD = arma::conv_to>::from( - responses.subvec(begin, begin + batchSize - 1)); - const ElemType result = arma::accu(arma::log(1.0 - respD + sigmoid % - (2 * respD - 1.0))); + const ElemType result = accu(log( + 1 - responses.subvec(begin, begin + batchSize - 1) + + sigmoid % (2 * responses.subvec(begin, begin + batchSize - 1) - 1))); // Invert the result, because it's a minimization. 
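The batched `Evaluate()` in the hunk above weights the regularizer by `halfLambda * (batchSize / n_cols)`, so that the per-batch penalties sum to the full-dataset penalty. A small stdlib-only sketch of just that weighting (the function name is hypothetical, for illustration only):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Per-batch share of the L2 penalty: a batch of size b out of N total
// points contributes (lambda / 2) * (b / N) * ||w||^2, so summing the
// contributions of all batches reproduces (lambda / 2) * ||w||^2.
double batchRegularization(double lambda, double normSquaredW,
                           std::size_t batchSize, std::size_t numPoints)
{
  const double halfLambda = lambda / 2.0;
  return halfLambda *
      (static_cast<double>(batchSize) / static_cast<double>(numPoints)) *
      normSquaredW;
}
```

With this scaling, four disjoint batches of 25 points out of 100 contribute exactly the same total penalty as one pass over all 100 points.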
return regularization - result; @@ -166,16 +192,14 @@ void LogisticRegressionFunction::Gradient( const MatType& parameters, GradType& gradient) const { - typedef typename MatType::elem_type ElemType; // Regularization term. - MatType regularization; - regularization = lambda * parameters.tail_cols(parameters.n_elem - 1); + MatType regularization = lambda * parameters.tail_cols(parameters.n_elem - 1); - const arma::Row sigmoids = (1 / (1 + arma::exp(-parameters(0, 0) + const BaseRowType sigmoids = (1 / (1 + exp(-parameters(0, 0) - parameters.tail_cols(parameters.n_elem - 1) * predictors))); - gradient.set_size(arma::size(parameters)); - gradient[0] = -arma::accu(responses - sigmoids); + gradient.set_size(size(parameters)); + gradient[0] = -accu(responses - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - responses) * predictors.t() + regularization; } @@ -185,26 +209,24 @@ void LogisticRegressionFunction::Gradient( template template void LogisticRegressionFunction::Gradient( - const MatType& parameters, - const size_t begin, - GradType& gradient, - const size_t batchSize) const + const MatType& parameters, + const size_t begin, + GradType& gradient, + const size_t batchSize) const { - typedef typename MatType::elem_type ElemType; - // Regularization term. - MatType regularization; - regularization = lambda * parameters.tail_cols(parameters.n_elem - 1) + MatType regularization = lambda * parameters.tail_cols(parameters.n_elem - 1) / predictors.n_cols * batchSize; - const arma::Row exponents = parameters(0, 0) + + const BaseRowType exponents = parameters(0, 0) + parameters.tail_cols(parameters.n_elem - 1) * predictors.cols(begin, begin + batchSize - 1); + // Calculating the sigmoid function values. 
- const arma::Row sigmoids = 1.0 / (1.0 + arma::exp(-exponents)); + const BaseRowType sigmoids = 1 / (1 + exp(-exponents)); gradient.set_size(parameters.n_rows, parameters.n_cols); - gradient[0] = -arma::accu(responses.subvec(begin, begin + batchSize - 1) - + gradient[0] = -accu(responses.subvec(begin, begin + batchSize - 1) - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - responses.subvec(begin, begin + batchSize - 1)) * @@ -215,27 +237,26 @@ void LogisticRegressionFunction::Gradient( * Evaluate the partial gradient of the logistic regression objective * function with respect to the individual features in the parameter. */ -template +template +template void LogisticRegressionFunction::PartialGradient( const MatType& parameters, const size_t j, - arma::sp_mat& gradient) const + GradType& gradient) const { - const arma::Row diffs = responses - - (1 / (1 + arma::exp(-parameters(0, 0) - - parameters.tail_cols(parameters.n_elem - 1) * - predictors))); + const BaseRowType diffs = responses - (1 / (1 + exp(-parameters(0, 0) - + parameters.tail_cols(parameters.n_elem - 1) * predictors))); - gradient.set_size(arma::size(parameters)); + gradient.set_size(size(parameters)); if (j == 0) { - gradient[j] = -arma::accu(diffs); + gradient[j] = -accu(diffs); } else { - gradient[j] = arma::dot(-predictors.row(j - 1), diffs) + lambda * - parameters(0, j); + gradient[j] = dot(-predictors.row(j - 1), diffs) + lambda * + parameters(0, j); } } @@ -246,30 +267,24 @@ LogisticRegressionFunction::EvaluateWithGradient( const MatType& parameters, GradType& gradient) const { - typedef typename MatType::elem_type ElemType; - // Regularization term. 
- MatType regularization = lambda * - parameters.tail_cols(parameters.n_elem - 1); + MatType regularization = lambda * parameters.tail_cols(parameters.n_elem - 1); - const ElemType objectiveRegularization = lambda / 2.0 * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType objectiveRegularization = halfLambda * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate the sigmoid function values. - const arma::Row sigmoids = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * predictors))); + const BaseRowType sigmoids = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * predictors))); - gradient.set_size(arma::size(parameters)); - gradient[0] = -arma::accu(responses - sigmoids); + gradient.set_size(size(parameters)); + gradient[0] = -accu(responses - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - responses) * predictors.t() + regularization; // Now compute the objective function using the sigmoids. - ElemType result = arma::accu(arma::log(1.0 - - arma::conv_to>::from(responses) + sigmoids % - (2 * arma::conv_to>::from(responses) - 1.0))); + ElemType result = accu(log(1 - responses + sigmoids % (2 * responses - 1))); // Invert the result, because it's a minimization. return objectiveRegularization - result; @@ -284,65 +299,64 @@ LogisticRegressionFunction::EvaluateWithGradient( GradType& gradient, const size_t batchSize) const { - typedef typename MatType::elem_type ElemType; + typedef typename ForwardType::brow BaseRowType; // Regularization term. 
- MatType regularization = - lambda * parameters.tail_cols(parameters.n_elem - 1) / predictors.n_cols * + MatType regularization = lambda * + parameters.tail_cols(parameters.n_elem - 1) / predictors.n_cols * batchSize; - const ElemType objectiveRegularization = lambda * - (batchSize / (2.0 * predictors.n_cols)) * - arma::dot(parameters.tail_cols(parameters.n_elem - 1), - parameters.tail_cols(parameters.n_elem - 1)); + const ElemType objectiveRegularization = halfLambda * + (batchSize / ElemType(predictors.n_cols)) * + dot(parameters.tail_cols(parameters.n_elem - 1), + parameters.tail_cols(parameters.n_elem - 1)); // Calculate the sigmoid function values. - const arma::Row sigmoids = 1.0 / (1.0 + - arma::exp(-(parameters(0, 0) + - parameters.tail_cols(parameters.n_elem - 1) * - predictors.cols(begin, begin + batchSize - 1)))); + const BaseRowType sigmoids = 1 / (1 + exp(-(parameters(0, 0) + + parameters.tail_cols(parameters.n_elem - 1) * + predictors.cols(begin, begin + batchSize - 1)))); gradient.set_size(parameters.n_rows, parameters.n_cols); - gradient[0] = -arma::accu(responses.subvec(begin, begin + batchSize - 1) - + gradient[0] = -accu(responses.subvec(begin, begin + batchSize - 1) - sigmoids); gradient.tail_cols(parameters.n_elem - 1) = (sigmoids - - responses.subvec(begin, begin + batchSize - 1)) * + responses.cols(begin, begin + batchSize - 1)) * predictors.cols(begin, begin + batchSize - 1).t() + regularization; // Now compute the objective function using the sigmoids. - arma::Row respD = arma::conv_to>::from( - responses.subvec(begin, begin + batchSize - 1)); - const ElemType result = arma::accu(arma::log(1.0 - respD + sigmoids % - (2 * respD - 1.0))); + const ElemType result = accu(log( + 1 - responses.subvec(begin, begin + batchSize - 1) + + sigmoids % (2 * responses.subvec(begin, begin + batchSize - 1) - 1))); // Invert the result, because it's a minimization. 
return objectiveRegularization - result; } template +template void LogisticRegressionFunction::Classify( const MatType& dataset, - arma::Row& labels, + LabelsType& labels, const MatType& parameters, const double decisionBoundary) const { - // Calculate sigmoid function for each point. The (1.0 - decisionBoundary) + // Calculate sigmoid function for each point. The (1 - decisionBoundary) // term correctly sets an offset so that floor() returns 0 or 1 correctly. - labels = arma::conv_to>::from((1.0 / - (1.0 + arma::exp(-parameters(0) - + labels = conv_to::from((1 / (1 + exp(-parameters(0) - parameters.tail_cols(parameters.n_elem - 1) * dataset))) + - (1.0 - decisionBoundary)); + ElemType(1 - decisionBoundary)); } template +template double LogisticRegressionFunction::ComputeAccuracy( const MatType& predictors, - const arma::Row& responses, + const LabelsType& responses, const MatType& parameters, const double decisionBoundary) const { // Predict responses using the current model. - arma::Row tempResponses; + LabelsType tempResponses; Classify(predictors, tempResponses, parameters, decisionBoundary); // Count the number of responses that were correct. diff --git a/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp index 0b2f133..884df10 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf1_function.hpp @@ -23,8 +23,8 @@ namespace test { * \f[ * x_M = [x_i, n - M + 1 <= i <= n] * g(x) = \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - * - * f_1(x) = 1 - x_1 * x_2 * ... x_M-1 * (1 + g(x_M)) + * + * f_1(x) = 1 - x_1 * x_2 * ... x_M-1 * (1 + g(x_M)) * f_2(x) = 1 - x_1 * x_2 * ... (1 - x_M-1) * (1 + g(x_M)) * . * . @@ -50,139 +50,136 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF1 +template +class MAF1 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). 
+ size_t numObjectives {3}; + size_t numVariables {12}; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + */ + MAF1() : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/* Nothing to do here */} + + // Get the private variables. + size_t GetNumObjectives() { return numObjectives; } + + size_t GetNumVariables() { return numVariables; } + + // Get the starting point. + arma::Col GetInitialPoint() + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, arma::fill::ones); + } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Row g(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Row innerSum(size(coords)[1], arma::fill::zeros); + + for (size_t i = numObjectives - 1;i < numVariables;i++) + { + innerSum += arma::pow((coords.row(i) - 0.5), 2); + } + + return innerSum; + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1]); + arma::Row G = g(coords); + arma::Row value(coords.n_cols, arma::fill::ones); + for (size_t i = 0;i < numObjectives - 1;i++) + { + objectives.row(i) = (1 - value % (1.0 - coords.row(i))) % (1. + G); + value = value % coords.row(i); + } + objectives.row(numObjectives - 1) = (1 - value) % (1. + G); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF1Objective { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). 
- size_t numObjectives {3}; - size_t numVariables {12}; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - */ - MAF1() : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/* Nothing to do here */} - - // Get the private variables. - size_t GetNumObjectives() - { return this -> numObjectives; } - - size_t GetNumVariables() - { return this -> numVariables; } - - // Get the starting point. - arma::Col GetInitialPoint() + MAF1Objective(size_t stop, MAF1& maf): maf(maf), stop(stop) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + if (stop == 0) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, arma::fill::ones); + return coords[0] * (1. + maf.g(coords)[0]); } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1;i < numVariables;i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2); - } - - return innerSum; - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Row G = g(coords); - arma::Row value(coords.n_cols, arma::fill::ones); - for (size_t i = 0;i < numObjectives - 1;i++) - { - objectives.row(i) = (1 - value % (1.0 - coords.row(i))) % (1. + G); - value = value % coords.row(i); - } - objectives.row(numObjectives - 1) = (1 - value) % (1. + G); - return objectives; + value = value * coords[i]; } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF1Objective - { - MAF1Objective(size_t stop, MAF1& maf): stop(stop), maf(maf) - {/* Nothing to do here. */} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. - if(stop == 0) - { - return coords[0] * (1. + maf.g(coords)[0]); - } - typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * coords[i]; - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * (1. - coords[stop]); - } - - value = (1.0 - value) * (1. + maf.g(coords)[0]); - return value; - } - - MAF1& maf; - size_t stop; - }; - - //! Get objective functions. - std::tuple GetObjectives() + + if(stop != maf.GetNumObjectives() - 1) { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + value = value * (1. - coords[stop]); } - MAF1Objective objectiveF1; - MAF1Objective objectiveF2; - MAF1Objective objectiveF3; + value = (1.0 - value) * (1. + maf.g(coords)[0]); + return value; + } + + MAF1& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + //! Get objective functions. 
+ std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF1Objective objectiveF1; + MAF1Objective objectiveF2; + MAF1Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp index 697b26b..9140f9a 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf2_function.hpp @@ -9,7 +9,6 @@ * the 3-clause BSD license along with ensmallen. If not, see * http://www.opensource.org/licenses/BSD-3-Clause for more information. */ - #ifndef ENSMALLEN_PROBLEMS_MAF_TWO_FUNCTION_HPP #define ENSMALLEN_PROBLEMS_MAF_TWO_FUNCTION_HPP @@ -22,8 +21,8 @@ namespace test { * theta_M = [theta_i, n - M + 1 <= i <= n] * g_i(x) = \Sigma{i = M + (i - 1) * (n - M + 1) / N}^ * {M - 1 + (i) * (n - M + 1) / N} (x_i - 0.5)^2 * 0.25 - * - * f_1(x) = cos(theta_1) * cos(theta_2) * ... cos(theta_M-1) * (1 + g_1(theta_M)) + * + * f_1(x) = cos(theta_1) * cos(theta_2) * ... cos(theta_M-1) * (1 + g_1(theta_M)) * f_2(x) = cos(theta_1) * cos(theta_2) * ... sin(theta_M-1) * (1 + g_2(theta_M)) * . * . @@ -32,12 +31,12 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * Where theta_i = 0.5 * (1 + 2 * g(X_M) * x_i) / (1 + g(X_M)) - * - * + * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -49,156 +48,154 @@ namespace test { * publisher={Springer} * } * @endcode - * + * * @tparam MatType Type of matrix to optimize. */ - template - class MAF2 +template +class MAF2 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + size_t numParetoPoints; + + public: + /** + * Object Constructor. 
+ * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + */ + MAF2() : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - size_t numParetoPoints; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - */ - MAF2() : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables; } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Mat g(const MatType& coords) - { - size_t k = numVariables - numObjectives + 1; - size_t c = std::floor(k / numObjectives); - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat innerSum(numObjectives, size(coords)[1], - arma::fill::zeros); - - for (size_t i = 0; i < numObjectives; i++) - { - size_t j = numObjectives - 1 + (i * c); - for(; j < numVariables - 1 + (i + 1) *c && j < numObjectives; j++) - { - innerSum.row(i) += arma::pow((coords.row(i) - 0.5), 2) * 0.25; - } - } - - return innerSum; - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. 
- * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Mat g(const MatType& coords) + { + size_t k = numVariables - numObjectives + 1; + size_t c = std::floor(k / numObjectives); + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat innerSum(numObjectives, size(coords)[1], + arma::fill::zeros); + + for (size_t i = 0; i < numObjectives; i++) + { + size_t j = numObjectives - 1 + (i * c); + for(; j < numVariables - 1 + (i + 1) *c && j < numObjectives; j++) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Mat G = g(coords); - arma::Row value(size(coords)[1], arma::fill::ones); - arma::Row theta; - for (size_t i = 0; i < numObjectives - 1; i++) - { - theta = arma::datum::pi * 0.5 * ((coords.row(i) / 2) + 0.25); - objectives.row(i) = value % - arma::sin(theta) % (1.0 + G.row(numObjectives - 1 - i)); - value = value % arma::cos(theta); - } - objectives.row(numObjectives - 1) = value % - (1.0 + G.row(0)); - return objectives; + innerSum.row(i) += arma::pow((coords.row(i) - 0.5), 2) * 0.25; } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF2Objective + } + + return innerSum; + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. 
+ typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1]); + arma::Mat G = g(coords); + arma::Row value(size(coords)[1], arma::fill::ones); + arma::Row theta; + for (size_t i = 0; i < numObjectives - 1; i++) + { + theta = arma::datum::pi * 0.5 * ((coords.row(i) / 2) + 0.25); + objectives.row(i) = value % + arma::sin(theta) % (1.0 + G.row(numObjectives - 1 - i)); + value = value % arma::cos(theta); + } + objectives.row(numObjectives - 1) = value % + (1.0 + G.row(0)); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF2Objective + { + MAF2Objective(size_t stop, MAF2& maf): stop(stop), maf(maf) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + ElemType theta; + arma::Col G = maf.g(coords).col(0); + for (size_t i = 0; i < stop; i++) { - MAF2Objective(size_t stop, MAF2& maf): stop(stop), maf(maf) - {/* Nothing to do here. */} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - ElemType theta; - arma::Col G = maf.g(coords).col(0); - for (size_t i = 0; i < stop; i++) - { - theta = arma::datum::pi * 0.5 * ((coords[i] / 2) + 0.25); - value = value * std::cos(theta); - } - theta = arma::datum::pi * 0.5 * ((coords[stop] / 2) + 0.25); - if(stop != maf.numObjectives - 1) - { - value = value * std::sin(theta); - } - - value = value * (1.0 + G[maf.GetNumObjectives() - 1 - stop]); - return value; - } - - MAF2& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() + theta = arma::datum::pi * 0.5 * ((coords[i] / 2) + 0.25); + value = value * std::cos(theta); + } + + theta = arma::datum::pi * 0.5 * ((coords[stop] / 2) + 0.25); + if (stop != maf.numObjectives - 1) { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + value = value * std::sin(theta); } - MAF2Objective objectiveF1; - MAF2Objective objectiveF2; - MAF2Objective objectiveF3; + value = value * (1.0 + G[maf.GetNumObjectives() - 1 - stop]); + return value; + } + + MAF2& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + // Return back a tuple of objective functions. 
+ std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF2Objective objectiveF1; + MAF2Objective objectiveF2; + MAF2Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp index 4ea0963..af68a27 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf3_function.hpp @@ -22,9 +22,9 @@ namespace test { * The MAF3 function, defined by: * \f[ * x_M = [x_i, n - M + 1 <= i <= n] - * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * + * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * * (x_i - 0.5))] - * + * * f_1(x) = (cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... cos(x_2 * pi * 0.5) * (1 + g(x_M)))^4 * f_2(x) = (cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... sin(x_M-1 * pi * 0.5) * (1 + g(x_M)))^4 * . @@ -34,9 +34,9 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -51,146 +51,146 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF3 +template +class MAF3 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 12, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + */ + MAF3() : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. 
of Objectives and Variables(|x| = 12, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - */ - MAF3() : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables;} - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) - { - size_t k = numVariables - numObjectives + 1; - - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1; i < numVariables; i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2) - - arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5)); - } - - return 100 * (k + innerSum); - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + /** + * Evaluate the G(x) with the given coordinate. 
+ * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Row g(const MatType& coords) + { + size_t k = numVariables - numObjectives + 1; + + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Row innerSum(size(coords)[1], arma::fill::zeros); + + for (size_t i = numObjectives - 1; i < numVariables; i++) + { + innerSum += arma::pow((coords.row(i) - 0.5), 2) - + arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5)); + } + + return 100 * (k + innerSum); + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1], + arma::fill::ones); + arma::Row G = g(coords); + arma::Row value = (1.0 + G); + for (size_t i = 0; i < numObjectives - 1; i++) + { + objectives.row(i) = arma::pow(value, i == 0 ? 2:4) % + arma::pow(arma::sin(coords.row(i) * arma::datum::pi * 0.5), + i == 0 ? 2:4); + value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5); + } + objectives.row(numObjectives - 1) = arma::pow(value, 4); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF3Objective + { + MAF3Objective(size_t stop, MAF3& maf): maf(maf), stop(stop) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1], arma::fill::ones); - arma::Row G = g(coords); - arma::Row value = (1.0 + G); - for (size_t i = 0; i < numObjectives - 1; i++) - { - objectives.row(i) = arma::pow(value, i == 0 ? 2:4) % - arma::pow(arma::sin(coords.row(i) * arma::datum::pi * 0.5), i == 0 ? 2:4); - value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5); - } - objectives.row(numObjectives - 1) = arma::pow(value, 4); - return objectives; + value = value * std::cos(coords[i] * arma::datum::pi * 0.5); } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF3Objective + + if (stop != maf.GetNumObjectives() - 1) { - MAF3Objective(size_t stop, MAF3& maf): stop(stop), maf(maf) - {/* Nothing to do here. */} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * std::cos(coords[i] * arma::datum::pi * 0.5); - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); - } - - value = value * (1. + maf.g(coords)[0]); - - if(stop == 0) { - return std::pow(value, 2); - } - return std::pow(value, 4); - } - - MAF3& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() + value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); + } + + value = value * (1. 
+ maf.g(coords)[0]); + + if (stop == 0) { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); - } + return std::pow(value, 2); + } + return std::pow(value, 4); + } - MAF3Objective objectiveF1; - MAF3Objective objectiveF2; - MAF3Objective objectiveF3; + MAF3& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + // Return back a tuple of objective functions. + std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF3Objective objectiveF1; + MAF3Objective objectiveF2; + MAF3Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp index 9258468..aa4bcd1 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf4_function.hpp @@ -22,10 +22,10 @@ namespace test { * The MAF4 function, defined by: * \f[ * x_M = [x_i, n - M + 1 <= i <= n] - * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * + * g(x) = 100 * [|x_M| + \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - cos(20 * pi * * (x_i - 0.5))] - * - * f_1(x) = a * (1 - cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... cos(x_2 * pi * 0.5))* (1 + g(x_M)) + * + * f_1(x) = a * (1 - cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... cos(x_2 * pi * 0.5))* (1 + g(x_M)) * f_2(x) = a^2 * (1 - cos(x_1 * pi * 0.5) * cos(x_2 * pi * 0.5) * ... sin(x_M-1 * pi * 0.5)) * (1 + g(x_M)) * . * . @@ -34,9 +34,9 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -51,161 +51,156 @@ namespace test { * * @tparam MatType Type of matrix to optimize. 
*/ - template - class MAF4 +template +class MAF4 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + double a; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + * @param a The scale factor of the objectives. + */ + MAF4(double a = 2) : + a(a), + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - double a; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param numParetoPoint No. of pareto points in the reference front. - * @param a The scale factor of the objectives. - */ - MAF4(double a = 2) : - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this), - a(a) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables;} - - //Get the scaling parameter a. - size_t GetA() - { return this -> a; } - - /** - * Set the scale factor of the objectives. - * - * @param a The scale factor a of the objectives. - */ - void SetA(double a) - { this -> a = a; } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) + // Convenience typedef. 
+    typedef typename MatType::elem_type ElemType;
+    return arma::Col(numVariables, 1, arma::fill::zeros);
+  }
+
+  // Get the private variables.
+
+  // Get the number of objectives.
+  size_t GetNumObjectives() { return numObjectives; }
+
+  // Get the number of variables.
+  size_t GetNumVariables() { return numVariables; }
+
+  // Get the scaling parameter a.
+  double GetA() { return a; }
+
+  /**
+   * Set the scale factor of the objectives.
+   *
+   * @param a The scale factor a of the objectives.
+   */
+  void SetA(double a) { this->a = a; }
+
+  /**
+   * Evaluate the G(x) with the given coordinate.
+   *
+   * @param coords The function coordinates.
+   * @return arma::Row
+   */
+  arma::Row g(const MatType& coords)
+  {
+    size_t k = numVariables - numObjectives + 1;
+
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Row innerSum(size(coords)[1], arma::fill::zeros);
+
+    for (size_t i = numObjectives - 1; i < numVariables; i++)
+    {
+      innerSum += arma::pow((coords.row(i) - 0.5), 2) -
+          arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5));
+    }
+
+    return 100 * (k + innerSum);
+  }
+
+  /**
+   * Evaluate the objectives with the given coordinate.
+   *
+   * @param coords The function coordinates.
+   * @return arma::Mat
+   */
+  arma::Mat Evaluate(const MatType& coords)
+  {
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Mat objectives(numObjectives, size(coords)[1]);
+    arma::Row G = g(coords);
+    arma::Row value(coords.n_cols, arma::fill::ones);
+    for (size_t i = 0; i < numObjectives - 1; i++)
+    {
+      objectives.row(i) = (1.0 - value %
+          arma::sin(coords.row(i) * arma::datum::pi * 0.5)) % (1. + G) *
+          std::pow(a, numObjectives - i);
+      value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5);
+    }
+    objectives.row(numObjectives - 1) = (1 - value) % (1. + G) *
+        std::pow(a, 1);
+    return objectives;
+  }
+
+  // Individual Objective function.
+  // Changes based on stop variable provided.
+ struct MAF4Objective + { + MAF4Objective(size_t stop, MAF4& maf): maf(maf), stop(stop) + {/* Nothing to do here. */} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { - size_t k = numVariables - numObjectives + 1; - - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1; i < numVariables; i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2) - - arma::cos(20 * arma::datum::pi * (coords.row(i) - 0.5)); - } - - return 100 * (k + innerSum); + value = value * std::cos(coords[i] * arma::datum::pi * 0.5); } - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + if(stop != maf.GetNumObjectives() - 1) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Row G = g(coords); - arma::Row value(coords.n_cols, arma::fill::ones); - for (size_t i = 0; i < numObjectives - 1; i++) - { - objectives.row(i) = (1.0 - value % - arma::sin(coords.row(i) * arma::datum::pi * 0.5)) % (1. + G) * - std::pow(a, numObjectives - i); - value = value % arma::cos(coords.row(i) * arma::datum::pi * 0.5); - } - objectives.row(numObjectives - 1) = (1 - value) % (1. + G) * - std::pow(a, 1); - return objectives; + value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF4Objective - { - MAF4Objective(size_t stop, MAF4& maf): stop(stop), maf(maf) - {/* Nothing to do here. 
*/} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * std::cos(coords[i] * arma::datum::pi * 0.5); - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * std::sin(coords[stop] * arma::datum::pi * 0.5); - } - - value = std::pow(maf.GetA(), maf.GetNumObjectives() - stop) * - (1 - value) * (1. + maf.g(coords)[0]); - - return value; - } - - MAF4& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() - { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); - } - MAF4Objective objectiveF1; - MAF4Objective objectiveF2; - MAF4Objective objectiveF3; + value = std::pow(maf.GetA(), maf.GetNumObjectives() - stop) * + (1 - value) * (1. + maf.g(coords)[0]); + + return value; + } + + MAF4& maf; + size_t stop; }; - } //namespace test - } //namespace ens + + // Return back a tuple of objective functions. + std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF4Objective objectiveF1; + MAF4Objective objectiveF2; + MAF4Objective objectiveF3; +}; + +} // namespace test +} // namespace ens #endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp index db8c16f..bb1b8c3 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf5_function.hpp @@ -23,8 +23,8 @@ namespace test { * \f[ * x_M = [x_i, n - M + 1 <= i <= n] * g(x) = \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - * - * f_1(x) = a^M * cos(x_1^alpha * pi * 0.5) * cos(x_2^alpha * pi * 0.5) * ... 
cos(x_2^alpha * pi * 0.5) * (1 + g(x_M)) + * + * f_1(x) = a^M * cos(x_1^alpha * pi * 0.5) * cos(x_2^alpha * pi * 0.5) * ... cos(x_2^alpha * pi * 0.5) * (1 + g(x_M)) * f_2(x) = a^M-1 * cos(x_1^alpha * pi * 0.5) * cos(x_2^alpha * pi * 0.5) * ... sin(x_M-1^alpha * pi * 0.5) * (1 + g(x_M)) * . * . @@ -35,9 +35,9 @@ namespace test { * 0 <= x_i <= 1 for i = 1,...,n. * * This should be optimized to x_i = 0.5 (for all x_i in x_M), at: - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -52,175 +52,169 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF5 +template +class MAF5 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + size_t alpha; + size_t a; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param alpha The power which each variable is raised to. + * @param numParetoPoint No. of pareto points in the reference front. + * @param a The scale factor of the objectives. + */ + MAF5(size_t alpha = 100, double a = 2) : + alpha(alpha), + a(a), + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() { - private: - - // A fixed no. of Objectives and Variables(|x| = 7, M = 3). - size_t numObjectives {3}; - size_t numVariables {12}; - size_t alpha; - size_t a; - - public: - - /** - * Object Constructor. - * Initializes the individual objective functions. - * - * @param alpha The power which each variable is raised to. - * @param numParetoPoint No. of pareto points in the reference front. - * @param a The scale factor of the objectives. 
- */ - MAF5(size_t alpha = 100, double a = 2) : - alpha(alpha), - a(a), - objectiveF1(0, *this), - objectiveF2(1, *this), - objectiveF3(2, *this) - {/*Nothing to do here.*/} - - //! Get the starting point. - arma::Col GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - return arma::Col(numVariables, 1, arma::fill::zeros); - } - - // Get the private variables. - - // Get the number of objectives. - size_t GetNumObjectives() - { return this -> numObjectives; } - - // Get the number of variables. - size_t GetNumVariables() - { return this -> numVariables; } - - // Get the scale factor a. - double GetA() - { return this -> a; } - - // Get the power alpha of each variable. - size_t GetAlpha() - { return this -> alpha; } - - /** - * Set the scale factor a. - * - * @param a The scale factor of the objectives. - */ - void SetA(double a) - { this -> a = a; } - - /** - * Set the power of each variable alpha. - * - * @param alpha The power of each variable. - */ - void SetAlpha(size_t alpha) - { this -> alpha = alpha; } - - /** - * Evaluate the G(x) with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Row - */ - arma::Row g(const MatType& coords) + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + // Get the scale factor a. + double GetA() { return a; } + + // Get the power alpha of each variable. + size_t GetAlpha() { return alpha; } + + /** + * Set the scale factor a. + * + * @param a The scale factor of the objectives. + */ + void SetA(double a) { this->a = a; } + + /** + * Set the power of each variable alpha. + * + * @param alpha The power of each variable. 
+ */ + void SetAlpha(size_t alpha) { this->alpha = alpha; } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Row + */ + arma::Row g(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Row innerSum(size(coords)[1], arma::fill::zeros); + + for (size_t i = numObjectives - 1; i < numVariables; i++) + { + innerSum += arma::pow((coords.row(i) - 0.5), 2); + } + + return innerSum; + } + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Mat + */ + arma::Mat Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + arma::Mat objectives(numObjectives, size(coords)[1]); + arma::Row G = g(coords); + arma::Row value = (1.0 + G); + for (size_t i = 0; i < numObjectives - 1; i++) + { + objectives.row(i) = std::pow(a, i + 1) * arma::pow(value, 4) % + arma::pow(arma::sin(arma::pow(coords.row(i), alpha) * + arma::datum::pi * 0.5), 4); + value = value % arma::cos(arma::pow(coords.row(i), alpha) * + arma::datum::pi * 0.5); + } + objectives.row(numObjectives - 1) = arma::pow(value, 4) * std::pow(a, + numObjectives); + return objectives; + } + + // Individual Objective function. + // Changes based on stop variable provided. + struct MAF5Objective + { + MAF5Objective(size_t stop, MAF5& maf): stop(stop), maf(maf) + {/* Nothing to do here.*/} + + /** + * Evaluate one objective with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col + */ + typename MatType::elem_type Evaluate(const MatType& coords) + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + ElemType value = 1.0; + for (size_t i = 0; i < stop; i++) { + value = value * std::cos(std::pow(coords[i], maf.GetAlpha()) + * arma::datum::pi * 0.5); + } - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - - arma::Row innerSum(size(coords)[1], arma::fill::zeros); - - for (size_t i = numObjectives - 1; i < numVariables; i++) - { - innerSum += arma::pow((coords.row(i) - 0.5), 2); - } - - return innerSum; - } - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Mat - */ - arma::Mat Evaluate(const MatType& coords) + if (stop != maf.GetNumObjectives() - 1) { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - - arma::Mat objectives(numObjectives, size(coords)[1]); - arma::Row G = g(coords); - arma::Row value = (1.0 + G); - for (size_t i = 0; i < numObjectives - 1; i++) - { - objectives.row(i) = std::pow(a, i + 1) * arma::pow(value, 4) % - arma::pow(arma::sin(arma::pow(coords.row(i), alpha) * - arma::datum::pi * 0.5), 4); - value = value % arma::cos(arma::pow(coords.row(i), alpha) * arma::datum::pi * 0.5); - } - objectives.row(numObjectives - 1) = arma::pow(value, 4) * std::pow(a, numObjectives); - return objectives; + value = value * std::sin(std::pow(coords[stop], maf.GetAlpha()) + * arma::datum::pi * 0.5); } - - // Individual Objective function. - // Changes based on stop variable provided. - struct MAF5Objective - { - MAF5Objective(size_t stop, MAF5& maf): stop(stop), maf(maf) - {/* Nothing to do here.*/} - - /** - * Evaluate one objective with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col - */ - typename MatType::elem_type Evaluate(const MatType& coords) - { - // Convenience typedef. 
- typedef typename MatType::elem_type ElemType; - ElemType value = 1.0; - for (size_t i = 0; i < stop; i++) - { - value = value * std::cos(std::pow(coords[i], maf.GetAlpha()) - * arma::datum::pi * 0.5); - } - - if(stop != maf.GetNumObjectives() - 1) - { - value = value * std::sin(std::pow(coords[stop], maf.GetAlpha()) - * arma::datum::pi * 0.5); - } - - value = value * (1 + maf.g(coords)[0]); - value = std::pow(value, 4); - value = value * std::pow(maf.GetA(), stop + 1); - return value; - } - - MAF5& maf; - size_t stop; - }; - - // Return back a tuple of objective functions. - std::tuple GetObjectives() - { - return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); - } - MAF5Objective objectiveF1; - MAF5Objective objectiveF2; - MAF5Objective objectiveF3; + value = value * (1 + maf.g(coords)[0]); + value = std::pow(value, 4); + value = value * std::pow(maf.GetA(), stop + 1); + return value; + } + + MAF5& maf; + size_t stop; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + // Return back a tuple of objective functions. + std::tuple GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2, objectiveF3); + } + + MAF5Objective objectiveF1; + MAF5Objective objectiveF2; + MAF5Objective objectiveF3; +}; + +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp b/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp index 2ace864..55de247 100644 --- a/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp +++ b/inst/include/ensmallen_bits/problems/maf/maf6_function.hpp @@ -21,8 +21,8 @@ namespace test { * \f[ * theta_M = [theta_i, n - M + 1 <= i <= n] * g(x) = \Sigma{i = n - M + 1}^n (x_i - 0.5)^2 - * - * f_1(x) = 0.5 * cos(theta_1 * pi * 0.5) * cos(theta_2 * pi * 0.5) * ... cos(theta_2 * pi * 0.5) * (1 + g(theta_M)) + * + * f_1(x) = 0.5 * cos(theta_1 * pi * 0.5) * cos(theta_2 * pi * 0.5) * ... 
cos(theta_2 * pi * 0.5) * (1 + g(theta_M)) * f_2(x) = 0.5 * cos(theta_1 * pi * 0.5) * cos(theta_2 * pi * 0.5) * ... sin(theta_M-1 * pi * 0.5) * (1 + g(theta_M)) * . * . @@ -31,13 +31,13 @@ namespace test { * * Bounds of the variable space is: * 0 <= x_i <= 1 for i = 1,...,n. - * + * * Where theta_i = 0.5 * (1 + 2 * g(X_M) * x_i) / (1 + g(X_M)) - * + * * This should be optimized to x_i = 0.5 (for all x_i in X_M), at: - * + * * For more information, please refer to: - * + * * @code * @article{cheng2017benchmark, * title={A benchmark test suite for evolutionary many-objective optimization}, @@ -52,183 +52,177 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template - class MAF6 +template +class MAF6 +{ + private: + // A fixed no. of Objectives and Variables(|x| = 7, M = 3). + size_t numObjectives {3}; + size_t numVariables {12}; + size_t I; + + public: + /** + * Object Constructor. + * Initializes the individual objective functions. + * + * @param numParetoPoint No. of pareto points in the reference front. + * @param I The manifold dimension (zero indexed). + */ + MAF6(size_t I = 2) : + objectiveF1(0, *this), + objectiveF2(1, *this), + objectiveF3(2, *this), + I(I) + {/*Nothing to do here.*/} + + //! Get the starting point. + arma::Col GetInitialPoint() + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + return arma::Col(numVariables, 1, arma::fill::zeros); + } + + // Get the private variables. + + // Get the number of objectives. + size_t GetNumObjectives() { return numObjectives; } + + // Get the number of variables. + size_t GetNumVariables() { return numVariables; } + + // Get the manifold dimension. + size_t GetI() { return I; } + + /** + * Set the no. of pareto points. + * + * @param I The manifold dimension (0 indexed). + */ + void SetI(size_t I) { this->I = I; } + + /** + * Evaluate the G(x) with the given coordinate. + * + * @param coords The function coordinates. 
+   * @return arma::Row<typename MatType::elem_type>
+   */
+  arma::Row<typename MatType::elem_type> g(const MatType& coords)
   {
- private:
-
-  // A fixed no. of Objectives and Variables(|x| = 7, M = 3).
-  size_t numObjectives {3};
-  size_t numVariables {12};
-  size_t I;
-
- public:
-
-  /**
-   * Object Constructor.
-   * Initializes the individual objective functions.
-   *
-   * @param numParetoPoint No. of pareto points in the reference front.
-   * @param I The manifold dimension (zero indexed).
-   */
-  MAF6(size_t I = 2) :
-      objectiveF1(0, *this),
-      objectiveF2(1, *this),
-      objectiveF3(2, *this),
-      I(I)
-  {/*Nothing to do here.*/}
-
-  //! Get the starting point.
-  arma::Col<typename MatType::elem_type> GetInitialPoint()
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Row<ElemType> innerSum(size(coords)[1], arma::fill::zeros);
+
+    for (size_t i = numObjectives - 1; i < numVariables; i++)
+    {
+      innerSum += arma::pow((coords.row(i) - 0.5), 2);
+    }
+
+    return innerSum;
+  }
+
+  /**
+   * Evaluate the objectives with the given coordinate.
+   *
+   * @param coords The function coordinates.
+   * @return arma::Mat<typename MatType::elem_type>
+   */
+  arma::Mat<typename MatType::elem_type> Evaluate(const MatType& coords)
+  {
+    // Convenience typedef.
+    typedef typename MatType::elem_type ElemType;
+
+    arma::Mat<ElemType> objectives(numObjectives, size(coords)[1]);
+    arma::Row<ElemType> G = g(coords);
+    arma::Row<ElemType> value = (1.0 + 100 * G);
+    arma::Row<ElemType> theta;
+    for (size_t i = 0; i < numObjectives - 1; i++)
+    {
+      if (i < I)
   {
-    // Convenience typedef.
-    typedef typename MatType::elem_type ElemType;
-    return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros);
+      theta = coords.row(i) * arma::datum::pi * 0.5;
   }
-
-  // Get the private variables.
-
-  // Get the number of objectives.
-  size_t GetNumObjectives()
-  { return this -> numObjectives; }
-
-  // Get the number of variables.
-  size_t GetNumVariables()
-  { return this -> numVariables; }
-
-  // Get the manifold dimension.
-  size_t GetI()
-  { return this -> I; }
-
-  /**
-   * Set the no. of pareto points.
-   *
-   * @param I The manifold dimension (0 indexed).
-   */
-  void SetI(size_t I)
-  { this -> I = I; }
-
-  /**
-   * Evaluate the G(x) with the given coordinate.
-   *
-   * @param coords The function coordinates.
-   * @return arma::Row<typename MatType::elem_type>
-   */
-  arma::Row<typename MatType::elem_type> g(const MatType& coords)
+      else
   {
-
-    // Convenience typedef.
-    typedef typename MatType::elem_type ElemType;
-
-    arma::Row<ElemType> innerSum(size(coords)[1], arma::fill::zeros);
-
-    for (size_t i = numObjectives - 1; i < numVariables; i++)
-    {
-      innerSum += arma::pow((coords.row(i) - 0.5), 2);
-    }
-
-    return innerSum;
-  }
-
-  /**
-   * Evaluate the objectives with the given coordinate.
-   *
-   * @param coords The function coordinates.
-   * @return arma::Mat<typename MatType::elem_type>
-   */
-  arma::Mat<typename MatType::elem_type> Evaluate(const MatType& coords)
+        theta = 0.25 * (1.0 + 2.0 * coords.row(i) % G) / (1.0 + G);
+      }
+      objectives.row(i) = value %
+          arma::sin(theta);
+      value = value % arma::cos(theta);
+    }
+    objectives.row(numObjectives - 1) = value;
+    return objectives;
+  }
+
+  // Individual Objective function.
+  // Changes based on stop variable provided.
+  struct MAF6Objective
+  {
+    MAF6Objective(size_t stop, MAF6& maf): stop(stop), maf(maf)
+    {/* Nothing to do here. */}
+
+    /**
+     * Evaluate one objective with the given coordinate.
+     *
+     * @param coords The function coordinates.
+     * @return arma::Col
+     */
+    typename MatType::elem_type Evaluate(const MatType& coords)
+    {
+      // Convenience typedef.
+      typedef typename MatType::elem_type ElemType;
+      ElemType value = 1.0;
+      ElemType theta;
+      ElemType G = maf.g(coords)[0];
+      for (size_t i = 0; i < stop; i++)
   {
-    // Convenience typedef.
-    typedef typename MatType::elem_type ElemType;
-
-    arma::Mat<ElemType> objectives(numObjectives, size(coords)[1]);
-    arma::Row<ElemType> G = g(coords);
-    arma::Row<ElemType> value = (1.0 + 100 * G);
-    arma::Row<ElemType> theta;
-    for (size_t i = 0; i < numObjectives - 1; i++)
+        if (i < maf.GetI())
+        {
+          theta = arma::datum::pi * coords[i] * 0.5;
+        }
+        else
     {
-      if(i < I)
-      {
-        theta = coords.row(i) * arma::datum::pi * 0.5;
-      }
-      else
-      {
-        theta = 0.25 * (1.0 + 2.0 * coords.row(i) % G) / (1.0 + G);
-      }
-      objectives.row(i) = value %
-          arma::sin(theta);
-      value = value % arma::cos(theta);
+          theta = 0.25 * (1.0 + 2.0 * coords[i] * G) / (1.0 + G);
     }
-    objectives.row(numObjectives - 1) = value;
-    return objectives;
+        value = value * std::cos(theta);
   }
-
-  // Individual Objective function.
-  // Changes based on stop variable provided.
-  struct MAF6Objective
+
+      if (stop < maf.GetI())
   {
-    MAF6Objective(size_t stop, MAF6& maf): stop(stop), maf(maf)
-    {/* Nothing to do here. */}
-
-    /**
-     * Evaluate one objective with the given coordinate.
-     *
-     * @param coords The function coordinates.
-     * @return arma::Col
-     */
-    typename MatType::elem_type Evaluate(const MatType& coords)
-    {
-      // Convenience typedef.
-      typedef typename MatType::elem_type ElemType;
-      ElemType value = 1.0;
-      ElemType theta;
-      ElemType G = maf.g(coords)[0];
-      for (size_t i = 0; i < stop; i++)
-      {
-        if(i < maf.GetI())
-        {
-          theta = arma::datum::pi * coords[i] * 0.5;
-        }
-        else
-        {
-          theta = 0.25 * (1.0 + 2.0 * coords[i] * G) / (1.0 + G);
-        }
-        value = value * std::cos(theta);
-      }
-
-      if(stop < maf.GetI())
-      {
-        theta = arma::datum::pi * coords[stop] * 0.5;
-      }
-      else
-      {
-        theta = 0.25 * (1.0 + 2.0 * coords[stop] * G) / (1.0 + G);
-      }
-
-      if (stop != maf.GetNumObjectives() - 1)
-      {
-        value = value * std::sin(theta);
-      }
-
-      value = value * (1.0 + 100 * G);
-      return value;
-    }
-
-    MAF6& maf;
-    size_t stop;
-  };
-
-  // Return back a tuple of objective functions.
-  std::tuple<MAF6Objective, MAF6Objective, MAF6Objective> GetObjectives()
+        theta = arma::datum::pi * coords[stop] * 0.5;
+      }
+      else
+      {
+        theta = 0.25 * (1.0 + 2.0 * coords[stop] * G) / (1.0 + G);
+      }
+
+      if (stop != maf.GetNumObjectives() - 1)
   {
-    return std::make_tuple(objectiveF1, objectiveF2, objectiveF3);
+        value = value * std::sin(theta);
   }
-  MAF6Objective objectiveF1;
-  MAF6Objective objectiveF2;
-  MAF6Objective objectiveF3;
+      value = value * (1.0 + 100 * G);
+      return value;
+    }
+
+    MAF6& maf;
+    size_t stop;
   };
- } //namespace test
- } //namespace ens
-#endif
\ No newline at end of file
+  // Return back a tuple of objective functions.
+  std::tuple<MAF6Objective, MAF6Objective, MAF6Objective> GetObjectives()
+  {
+    return std::make_tuple(objectiveF1, objectiveF2, objectiveF3);
+  }
+
+  MAF6Objective objectiveF1;
+  MAF6Objective objectiveF2;
+  MAF6Objective objectiveF3;
+};
+
+} // namespace test
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp b/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp
index 5fbdd29..6b1ac13 100644
--- a/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp
+++ b/inst/include/ensmallen_bits/problems/matyas_function_impl.hpp
@@ -35,8 +35,8 @@ typename MatType::elem_type MatyasFunction::Evaluate(
   const ElemType x1 = coordinates(0);
   const ElemType x2 = coordinates(1);
 
-  const double objective = 0.26 * (pow(x1, 2) + std::pow(x2, 2)) -
-      0.48 * x1 * x2;
+  const ElemType objective = ElemType(0.26) * (std::pow(x1, ElemType(2)) +
+      std::pow(x2, ElemType(2))) - ElemType(0.48) * x1 * x2;
 
   return objective;
 }
@@ -62,8 +62,8 @@ inline void MatyasFunction::Gradient(const MatType& coordinates,
   const ElemType x2 = coordinates(1);
 
   gradient.set_size(2, 1);
-  gradient(0) = 0.52 * x1 - 48 * x2;
-  gradient(1) = 0.52 * x2 - 0.48 * x1;
+  gradient(0) = ElemType(0.52) * x1 - ElemType(0.48) * x2;
+  gradient(1) = ElemType(0.52) * x2 - ElemType(0.48) * x1;
 }
 
 template<typename MatType, typename GradType>
diff --git a/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp
b/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp index f1e47ed..e060372 100644 --- a/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/mc_cormick_function_impl.hpp @@ -28,12 +28,16 @@ typename MatType::elem_type McCormickFunction::Evaluate( const size_t /* begin */, const size_t /* batchSize */) const { + typedef typename MatType::elem_type ElemType; + // For convenience; we assume these temporaries will be optimized out. - const typename MatType::elem_type x1 = coordinates(0); - const typename MatType::elem_type x2 = coordinates(1); + const ElemType x1 = coordinates(0); + const ElemType x2 = coordinates(1); - const typename MatType::elem_type objective = std::sin(x1 + x2) + - std::pow(x1 - x2, 2) - 1.5 * x1 + 2.5 * x2 + 1; + const ElemType objective = std::sin(x1 + x2) + + std::pow(x1 - x2, ElemType(2)) - + ElemType(1.5) * x1 + + ElemType(2.5) * x2 + 1; return objective; } @@ -51,13 +55,15 @@ inline void McCormickFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t /* batchSize */) const { + typedef typename MatType::elem_type ElemType; + // For convenience; we assume these temporaries will be optimized out. 
-  const typename MatType::elem_type x1 = coordinates(0);
-  const typename MatType::elem_type x2 = coordinates(1);
+  const ElemType x1 = coordinates(0);
+  const ElemType x2 = coordinates(1);
 
   gradient.set_size(2, 1);
-  gradient(0) = std::cos(x1 + x2) + 2 * x1 - 2 * x2 - 1.5;
-  gradient(1) = std::cos(x1 + x2) - 2 * x1 + 2 * x2 + 2.5;
+  gradient(0) = std::cos(x1 + x2) + 2 * x1 - 2 * x2 - ElemType(1.5);
+  gradient(1) = std::cos(x1 + x2) - 2 * x1 + 2 * x2 + ElemType(2.5);
 }
 
 template<typename MatType, typename GradType>
diff --git a/inst/include/ensmallen_bits/problems/problems.hpp b/inst/include/ensmallen_bits/problems/problems.hpp
index ad2a9a2..f4acdd0 100644
--- a/inst/include/ensmallen_bits/problems/problems.hpp
+++ b/inst/include/ensmallen_bits/problems/problems.hpp
@@ -28,6 +28,7 @@
 #include "logistic_regression_function.hpp"
 #include "matyas_function.hpp"
 #include "mc_cormick_function.hpp"
+#include "quadratic_function.hpp"
 #include "rastrigin_function.hpp"
 #include "rosenbrock_function.hpp"
 #include "rosenbrock_wood_function.hpp"
diff --git a/inst/include/ensmallen_bits/problems/quadratic_function.hpp b/inst/include/ensmallen_bits/problems/quadratic_function.hpp
new file mode 100644
index 0000000..982d9e5
--- /dev/null
+++ b/inst/include/ensmallen_bits/problems/quadratic_function.hpp
@@ -0,0 +1,108 @@
+/**
+ * @file quadratic_function.hpp
+ * @author Ryan Curtin
+ *
+ * Definition of QuadraticFunction, f(x) = x^2.
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_HPP
+#define ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_HPP
+
+namespace ens {
+namespace test {
+
+/**
+ * The quadratic value function in one dimension, defined by
+ *
+ * \f[
+ * f(x) = x^2
+ * \f]
+ *
+ * This should optimize to f(x) = 0, at x = [0].
+ */
+class QuadraticFunction
+{
+ public:
+  //! Initialize the QuadraticFunction.
+  QuadraticFunction();
+
+  /**
+   * Shuffle the order of function visitation. This may be called by the
+   * optimizer.
+   */
+  void Shuffle();
+
+  //! Return 1 (the number of functions).
+  size_t NumFunctions() const { return 1; }
+
+  /**
+   * Evaluate a function for a particular batch-size.
+   *
+   * @param coordinates The function coordinates.
+   * @param begin The first function.
+   * @param batchSize Number of points to process.
+   */
+  template<typename MatType>
+  typename MatType::elem_type Evaluate(const MatType& coordinates,
+                                       const size_t begin,
+                                       const size_t batchSize) const;
+
+  /**
+   * Evaluate a function with the given coordinates.
+   *
+   * @param coordinates The function coordinates.
+   */
+  template<typename MatType>
+  typename MatType::elem_type Evaluate(const MatType& coordinates) const;
+
+  /**
+   * Evaluate the gradient of a function for a particular batch-size.
+   *
+   * @param coordinates The function coordinates.
+   * @param begin The first function.
+   * @param gradient The function gradient.
+   * @param batchSize Number of points to process.
+   */
+  template<typename MatType, typename GradType>
+  void Gradient(const MatType& coordinates,
+                const size_t begin,
+                GradType& gradient,
+                const size_t batchSize) const;
+
+  /**
+   * Evaluate the gradient of a function with the given coordinates.
+   *
+   * @param coordinates The function coordinates.
+   * @param gradient The function gradient.
+   */
+  template<typename MatType, typename GradType>
+  void Gradient(const MatType& coordinates, GradType& gradient);
+
+  // Note: GetInitialPoint(), GetFinalPoint(), and GetFinalObjective() are not
+  // required for using ensmallen to optimize this function! They are
+  // specifically used as a convenience just for ensmallen's testing
+  // infrastructure.
+
+  //! Get the starting point.
+  template<typename MatType>
+  MatType GetInitialPoint() const { return MatType("20.0"); }
+
+  //! Get the final point.
+  template<typename MatType>
+  MatType GetFinalPoint() const { return MatType("0.0"); }
+
+  //! Get the final objective.
+  double GetFinalObjective() const { return 0.0; }
+};
+
+} // namespace test
+} // namespace ens
+
+// Include implementation.
+#include "quadratic_function_impl.hpp"
+
+#endif // ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_HPP
diff --git a/inst/include/ensmallen_bits/problems/quadratic_function_impl.hpp b/inst/include/ensmallen_bits/problems/quadratic_function_impl.hpp
new file mode 100644
index 0000000..410a1ec
--- /dev/null
+++ b/inst/include/ensmallen_bits/problems/quadratic_function_impl.hpp
@@ -0,0 +1,61 @@
+/**
+ * @file quadratic_function_impl.hpp
+ * @author Ryan Curtin
+ *
+ * Implementation of QuadraticFunction, f(x) = x^2.
+ *
+ * ensmallen is free software; you may redistribute it and/or modify it under
+ * the terms of the 3-clause BSD license. You should have received a copy of
+ * the 3-clause BSD license along with ensmallen. If not, see
+ * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */
+#ifndef ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_IMPL_HPP
+#define ENSMALLEN_PROBLEMS_QUADRATIC_FUNCTION_IMPL_HPP
+
+// In case it hasn't been included yet.
+#include "quadratic_function.hpp"
+
+namespace ens {
+namespace test {
+
+inline QuadraticFunction::QuadraticFunction() { /* Nothing to do here */ }
+
+inline void QuadraticFunction::Shuffle() { /* Nothing to do here */ }
+
+template<typename MatType>
+typename MatType::elem_type QuadraticFunction::Evaluate(
+    const MatType& coordinates,
+    const size_t /* begin */,
+    const size_t /* batchSize */) const
+{
+  return coordinates[0] * coordinates[0];
+}
+
+template<typename MatType>
+typename MatType::elem_type QuadraticFunction::Evaluate(const MatType& coordinates)
+    const
+{
+  return Evaluate(coordinates, 0, NumFunctions());
+}
+
+template<typename MatType, typename GradType>
+inline void QuadraticFunction::Gradient(const MatType& coordinates,
+                                        const size_t /* begin */,
+                                        GradType& gradient,
+                                        const size_t /* batchSize */) const
+{
+  gradient.set_size(1, 1);
+  gradient(0, 0) = 2 * coordinates[0];
+}
+
+template<typename MatType, typename GradType>
+inline void QuadraticFunction::Gradient(const MatType& coordinates,
+                                        GradType& gradient)
+{
+  Gradient(coordinates, 0, gradient, 1);
+}
+
+} // namespace test
+} // namespace ens
+
+#endif
diff --git a/inst/include/ensmallen_bits/problems/rastrigin_function.hpp b/inst/include/ensmallen_bits/problems/rastrigin_function.hpp
index d207473..7d14bc6 100644
--- a/inst/include/ensmallen_bits/problems/rastrigin_function.hpp
+++ b/inst/include/ensmallen_bits/problems/rastrigin_function.hpp
@@ -104,17 +104,17 @@ class RastriginFunction
   // infrastructure.
 
   //! Get the starting point.
-  template
+  template
   MatType GetInitialPoint() const
   {
-    return arma::conv_to<MatType>::from(initialPoint);
+    return conv_to<MatType>::from(initialPoint);
   }
 
   //! Get the final point.
-  template
+  template
   MatType GetFinalPoint() const
   {
-    return arma::zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols);
+    return zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols);
   }
 
   //! Get the final objective.
@@ -125,7 +125,7 @@ class RastriginFunction
   size_t n;
 
   //! For shuffling.
-  arma::Row<size_t> visitationOrder;
+  arma::Col<size_t> visitationOrder;
 
   //! Initial starting point.
   arma::mat initialPoint;
diff --git a/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp b/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp
index 6824cc0..18d8767 100644
--- a/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp
+++ b/inst/include/ensmallen_bits/problems/rastrigin_function_impl.hpp
@@ -18,10 +18,10 @@
 namespace ens {
 namespace test {
 
-inline RastriginFunction::RastriginFunction(const size_t n) :
+inline RastriginFunction::RastriginFunction(
+    const size_t n) :
     n(n),
-    visitationOrder(arma::linspace<arma::Row<size_t> >(0, n - 1, n))
-
+    visitationOrder(linspace<arma::Col<size_t>>(0, n - 1, n))
 {
   initialPoint.set_size(n, 1);
   initialPoint.fill(-3);
@@ -29,12 +29,12 @@ inline RastriginFunction::RastriginFunction(const size_t n) :
 
 inline void RastriginFunction::Shuffle()
 {
-  visitationOrder = arma::shuffle(
-      arma::linspace<arma::Row<size_t> >(0, n - 1, n));
+  visitationOrder = shuffle(linspace<arma::Col<size_t>>(0, n - 1, n));
 }
 
 template<typename MatType>
-typename MatType::elem_type RastriginFunction::Evaluate(
+typename MatType::elem_type
+RastriginFunction::Evaluate(
     const MatType& coordinates,
     const size_t begin,
     const size_t batchSize) const
@@ -42,44 +42,48 @@ typename MatType::elem_type RastriginFunction::Evaluate(
   // Convenience typedef.
   typedef typename MatType::elem_type ElemType;
 
-  ElemType objective = 0.0;
+  ElemType objective = 0;
   for (size_t j = begin; j < begin + batchSize; ++j)
   {
     const size_t p = visitationOrder[j];
-    objective += std::pow(coordinates(p), 2) - 10.0 *
-        std::cos(2.0 * arma::datum::pi * coordinates(p));
+    objective += std::pow(coordinates(p), ElemType(2)) - 10 *
+        std::cos(2 * arma::Datum<ElemType>::pi * coordinates(p));
   }
-  objective += 10.0 * n;
+  objective += 10 * n;
 
   return objective;
 }
 
 template<typename MatType>
-typename MatType::elem_type RastriginFunction::Evaluate(
-    const MatType& coordinates) const
+typename MatType::elem_type
+RastriginFunction::Evaluate(const MatType& coordinates) const
 {
   return Evaluate(coordinates, 0, NumFunctions());
 }
 
 template<typename MatType, typename GradType>
-inline void RastriginFunction::Gradient(const MatType& coordinates,
-                                        const size_t begin,
-                                        GradType& gradient,
-                                        const size_t batchSize) const
+void RastriginFunction::Gradient(
+    const MatType& coordinates,
+    const size_t begin,
+    GradType& gradient,
+    const size_t batchSize) const
 {
+  typedef typename MatType::elem_type ElemType;
+
   gradient.zeros(n, 1);
 
   for (size_t j = begin; j < begin + batchSize; ++j)
   {
     const size_t p = visitationOrder[j];
-    gradient(p) += (10.0 * n) * (2 * (coordinates(p) + 10.0 * arma::datum::pi *
-        std::sin(2.0 * arma::datum::pi * coordinates(p))));
+    gradient(p) += (10 * n) * (2 * (coordinates(p) +
+        10 * arma::Datum<ElemType>::pi *
+        std::sin(2 * arma::Datum<ElemType>::pi * coordinates(p))));
   }
 }
 
 template<typename MatType, typename GradType>
-inline void RastriginFunction::Gradient(const MatType& coordinates,
-                                        GradType& gradient)
+inline void RastriginFunction::Gradient(
+    const MatType& coordinates, GradType& gradient)
 {
   Gradient(coordinates, 0, gradient, NumFunctions());
 }
diff --git a/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp b/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp
index 6bc8186..b9ee036 100644
--- a/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp
+++
b/inst/include/ensmallen_bits/problems/rosenbrock_function_impl.hpp
@@ -37,8 +37,8 @@ typename MatType::elem_type RosenbrockFunction::Evaluate(
   const ElemType x2 = coordinates(1);
 
   const ElemType objective =
-      /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, 2), 2) +
-      /* f2(x) */ std::pow(1 - x1, 2);
+      /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, ElemType(2)), ElemType(2)) +
+      /* f2(x) */ std::pow(1 - x1, ElemType(2));
 
   return objective;
 }
@@ -64,8 +64,8 @@ void RosenbrockFunction::Gradient(const MatType& coordinates,
   const ElemType x2 = coordinates(1);
 
   gradient.set_size(2, 1);
-  gradient(0) = -2 * (1 - x1) + 400 * (std::pow(x1, 3) - x2 * x1);
-  gradient(1) = 200 * (x2 - std::pow(x1, 2));
+  gradient(0) = -2 * (1 - x1) + 400 * (std::pow(x1, ElemType(3)) - x2 * x1);
+  gradient(1) = 200 * (x2 - std::pow(x1, ElemType(2)));
 }
 
 template<typename MatType, typename GradType>
diff --git a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp
index b42422a..0e57406 100644
--- a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp
+++ b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function.hpp
@@ -91,14 +91,14 @@ class RosenbrockWoodFunction
   template<typename MatType>
   const MatType GetInitialPoint() const
   {
-    return arma::conv_to<MatType>::from(initialPoint);
+    return conv_to<MatType>::from(initialPoint);
   }
 
   //! Get the final point.
   template<typename MatType>
   MatType GetFinalPoint() const
   {
-    return arma::ones<MatType>(initialPoint.n_rows, initialPoint.n_cols);
+    return ones<MatType>(initialPoint.n_rows, initialPoint.n_cols);
   }
 
   //! Get the final objective.
diff --git a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp index 071682e..5d3f095 100644 --- a/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/rosenbrock_wood_function_impl.hpp @@ -51,13 +51,10 @@ inline void RosenbrockWoodFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t /* batchSize */) const { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; - gradient.set_size(4, 2); - arma::Col grf(4); - arma::Col gwf(4); + MatType grf(4, 1); + MatType gwf(4, 1); rf.Gradient(coordinates.col(0), grf); wf.Gradient(coordinates.col(1), gwf); diff --git a/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp b/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp index 4c31974..2f89993 100644 --- a/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp +++ b/inst/include/ensmallen_bits/problems/schaffer_function_n1.hpp @@ -37,7 +37,9 @@ class SchafferFunctionN1 size_t numVariables; public: - //! Initialize the SchafferFunctionN1 + typedef typename MatType::elem_type ElemType; + + // Initialize the SchafferFunctionN1 object. SchafferFunctionN1() : numObjectives(2), numVariables(1) {/* Nothing to do here. 
*/} @@ -54,8 +56,8 @@ class SchafferFunctionN1 arma::Col objectives(numObjectives); - objectives(0) = std::pow(coords[0], 2); - objectives(1) = std::pow(coords[0] - 2, 2); + objectives(0) = std::pow(coords[0], ElemType(2)); + objectives(1) = std::pow(coords[0] - 2, ElemType(2)); return objectives; } @@ -71,17 +73,17 @@ class SchafferFunctionN1 struct ObjectiveA { - typename MatType::elem_type Evaluate(const MatType& coords) + ElemType Evaluate(const MatType& coords) { - return std::pow(coords[0], 2); + return std::pow(coords[0], ElemType(2)); } } objectiveA; struct ObjectiveB { - typename MatType::elem_type Evaluate(const MatType& coords) + ElemType Evaluate(const MatType& coords) { - return std::pow(coords[0] - 2, 2); + return std::pow(coords[0] - 2, ElemType(2)); } } objectiveB; @@ -91,7 +93,8 @@ class SchafferFunctionN1 return std::make_tuple(objectiveA, objectiveB); } }; + } // namespace test } // namespace ens -#endif \ No newline at end of file +#endif diff --git a/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp b/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp index 1e289e6..9d3a9c5 100644 --- a/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp +++ b/inst/include/ensmallen_bits/problems/schaffer_function_n2_impl.hpp @@ -35,9 +35,11 @@ typename MatType::elem_type SchafferFunctionN2::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = 0.5 + (std::pow(std::sin(std::pow(x1, 2) - - std::pow(x2, 2)), 2) - 0.5) / std::pow(1 + 0.001 * - (std::pow(x1, 2) + std::pow(x2, 2)), 2); + const ElemType objective = ElemType(0.5) + + (std::pow(std::sin(std::pow(x1, ElemType(2)) - + std::pow(x2, ElemType(2))), ElemType(2)) - ElemType(0.5)) / + std::pow(1 + ElemType(0.001) * (std::pow(x1, ElemType(2)) + + std::pow(x2, ElemType(2))), ElemType(2)); return objective; } @@ -67,11 +69,12 @@ inline void SchafferFunctionN2::Gradient(const MatType& coordinates, const 
ElemType x2Sq = x2 * x2; const ElemType sum1 = x1Sq - x2Sq; const ElemType sinSum1 = sin(sum1); - const ElemType sum2 = 0.001 * (x1Sq + x2Sq) + 1; + const ElemType sum2 = ElemType(0.001) * (x1Sq + x2Sq) + 1; const ElemType trigExpression = 4 * sinSum1 * cos(sum1); - const ElemType numerator1 = - 0.004 * (pow(sinSum1, 2) - 0.5); - const ElemType expr1 = numerator1 / pow(sum2, 3); - const ElemType expr2 = trigExpression / pow(sum2, 2); + const ElemType numerator1 = + ElemType(-0.004) * (pow(sinSum1, ElemType(2)) - 0.5); + const ElemType expr1 = numerator1 / pow(sum2, ElemType(3)); + const ElemType expr2 = trigExpression / pow(sum2, ElemType(2)); gradient.set_size(2, 1); gradient(0) = x1 * (expr1 + expr2); diff --git a/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp b/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp index 72ef1a6..6b935ca 100644 --- a/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp +++ b/inst/include/ensmallen_bits/problems/schaffer_function_n4_impl.hpp @@ -35,9 +35,11 @@ typename MatType::elem_type SchafferFunctionN4::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = 0.5 + (std::pow(std::cos(std::sin(std::abs( - std::pow(x1, 2) - std::pow(x2, 2)))), 2) - 0.5) / std::pow(1 + 0.001 * - (std::pow(x1, 2) + std::pow(x2, 2)), 2); + const ElemType objective = ElemType(0.5) + + (std::pow(std::cos(std::sin(std::abs(std::pow(x1, ElemType(2)) - + std::pow(x2, ElemType(2))))), ElemType(2)) - ElemType(0.5)) / + std::pow(1 + ElemType(0.001) * (std::pow(x1, ElemType(2)) + + std::pow(x2, ElemType(2))), ElemType(2)); return objective; } diff --git a/inst/include/ensmallen_bits/problems/schwefel_function.hpp b/inst/include/ensmallen_bits/problems/schwefel_function.hpp index c973bba..4491e4e 100644 --- a/inst/include/ensmallen_bits/problems/schwefel_function.hpp +++ b/inst/include/ensmallen_bits/problems/schwefel_function.hpp @@ -107,14 +107,14 @@ class 
SchwefelFunction template MatType GetInitialPoint() const { - return arma::conv_to::from(initialPoint); + return conv_to::from(initialPoint); } //! Get the final point. template MatType GetFinalPoint() const { - MatType result(initialPoint.n_rows, initialPoint.n_cols, arma::fill::none); + MatType result(initialPoint.n_rows, initialPoint.n_cols); result.fill(420.9687); return result; } diff --git a/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp b/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp index ac5e3d8..d8a36f5 100644 --- a/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/sgd_test_function_impl.hpp @@ -36,7 +36,9 @@ typename MatType::elem_type SGDTestFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t i = begin; i < begin + batchSize; i++) { @@ -47,12 +49,12 @@ typename MatType::elem_type SGDTestFunction::Evaluate( break; case 1: - objective += std::pow(coordinates[1], 2); + objective += std::pow(coordinates[1], ElemType(2)); break; case 2: - objective += std::pow(coordinates[2], 4) + \ - 3 * std::pow(coordinates[2], 2); + objective += std::pow(coordinates[2], ElemType(4)) + \ + 3 * std::pow(coordinates[2], ElemType(2)); break; } } @@ -66,6 +68,8 @@ void SGDTestFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t batchSize) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(3); for (size_t i = begin; i < begin + batchSize; ++i) @@ -84,7 +88,8 @@ void SGDTestFunction::Gradient(const MatType& coordinates, break; case 2: - gradient[2] += 4 * std::pow(coordinates[2], 3) + 6 * coordinates[2]; + gradient[2] += 4 * std::pow(coordinates[2], ElemType(3)) + + 6 * coordinates[2]; break; } } diff --git a/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp 
b/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp index 0e16307..748782a 100644 --- a/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp +++ b/inst/include/ensmallen_bits/problems/softmax_regression_function.hpp @@ -16,9 +16,12 @@ namespace ens { namespace test { +template class SoftmaxRegressionFunction { public: + typedef typename MatType::elem_type ElemType; + /** * Construct the Softmax Regression objective function with the given * parameters. @@ -30,14 +33,14 @@ class SoftmaxRegressionFunction * @param lambda L2-regularization constant. * @param fitIntercept Intercept term flag. */ - SoftmaxRegressionFunction(const arma::mat& data, + SoftmaxRegressionFunction(const MatType& data, const arma::Row& labels, const size_t numClasses, const double lambda = 0.0001, const bool fitIntercept = false); //! Initializes the parameters of the model to suitable values. - const arma::mat InitializeWeights(); + const MatType InitializeWeights(); /** * Shuffle the dataset. @@ -53,9 +56,9 @@ class SoftmaxRegressionFunction * @param fitIntercept If true, an intercept is fitted. * @return Initialized model weights. */ - const arma::mat InitializeWeights(const size_t featureSize, - const size_t numClasses, - const bool fitIntercept = false); + const MatType InitializeWeights(const size_t featureSize, + const size_t numClasses, + const bool fitIntercept = false); /** * Initialize Softmax Regression weights (trainable parameters) with the given @@ -66,7 +69,7 @@ class SoftmaxRegressionFunction * @param numClasses Number of classes for classification. * @param fitIntercept Intercept term flag. */ - void InitializeWeights(arma::mat &weights, + void InitializeWeights(MatType& weights, const size_t featureSize, const size_t numClasses, const bool fitIntercept = false); @@ -78,7 +81,7 @@ class SoftmaxRegressionFunction * @param groundTruth Pointer to arma::mat which stores the computed matrix. 
   */
  void GetGroundTruthMatrix(const arma::Row<size_t>& labels,
-                           arma::sp_mat& groundTruth);
+                           arma::SpMat<ElemType>& groundTruth);
 
  /**
   * Evaluate the probabilities matrix with the passed parameters.
@@ -91,8 +94,8 @@ class SoftmaxRegressionFunction
   * @param start Index of point to start at.
   * @param batchSize Number of points to calculate probabilities for.
   */
-  void GetProbabilitiesMatrix(const arma::mat& parameters,
-                              arma::mat& probabilities,
+  void GetProbabilitiesMatrix(const MatType& parameters,
+                              MatType& probabilities,
                               const size_t start,
                               const size_t batchSize) const;
@@ -105,7 +108,7 @@ class SoftmaxRegressionFunction
   *
   * @param parameters Current values of the model parameters.
   */
-  double Evaluate(const arma::mat& parameters) const;
+  ElemType Evaluate(const MatType& parameters) const;
 
  /**
   * Evaluate the objective function of the softmax regression model for a
@@ -118,9 +121,9 @@ class SoftmaxRegressionFunction
   * @param start First index of the data points to use.
   * @param batchSize Number of data points to evaluate objective for.
   */
-  double Evaluate(const arma::mat& parameters,
-                  const size_t start,
-                  const size_t batchSize = 1) const;
+  ElemType Evaluate(const MatType& parameters,
+                    const size_t start,
+                    const size_t batchSize = 1) const;
 
  /**
   * Evaluates the gradient values of the objective function given the current
@@ -131,7 +134,7 @@ class SoftmaxRegressionFunction
   * @param parameters Current values of the model parameters.
   * @param gradient Matrix where gradient values will be stored.
   */
-  void Gradient(const arma::mat& parameters, arma::mat& gradient) const;
+  void Gradient(const MatType& parameters, MatType& gradient) const;
 
  /**
   * Evaluate the gradient of the objective function given the current set of
@@ -144,9 +147,9 @@ class SoftmaxRegressionFunction
   * @param gradient Matrix to store gradient into.
   * @param batchSize Number of data points to evaluate gradient for.
*/ - void Gradient(const arma::mat& parameters, + void Gradient(const MatType& parameters, const size_t start, - arma::mat& gradient, + MatType& gradient, const size_t batchSize = 1) const; /** @@ -158,12 +161,12 @@ class SoftmaxRegressionFunction * gradient is to be computed. * @param gradient Out param for the gradient value. */ - void PartialGradient(const arma::mat& parameters, + void PartialGradient(const MatType& parameters, size_t j, - arma::sp_mat& gradient) const; + arma::SpMat& gradient) const; //! Return the initial point for the optimization. - const arma::mat& GetInitialPoint() const { return initialPoint; } + const MatType& GetInitialPoint() const { return initialPoint; } //! Gets the number of classes. size_t NumClasses() const { return numClasses; } @@ -184,11 +187,11 @@ class SoftmaxRegressionFunction private: //! Training data matrix. This is an alias until the data is shuffled. - arma::mat data; + MatType data; //! Label matrix for the provided data. - arma::sp_mat groundTruth; + arma::SpMat groundTruth; //! Initial parameter point. - arma::mat initialPoint; + MatType initialPoint; //! Number of classes. size_t numClasses; //! L2-regularization constant. 
diff --git a/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp b/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp index d781da1..e6860d7 100644 --- a/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/softmax_regression_function_impl.hpp @@ -18,14 +18,15 @@ namespace ens { namespace test { -inline SoftmaxRegressionFunction::SoftmaxRegressionFunction( - const arma::mat& data, +template +inline SoftmaxRegressionFunction::SoftmaxRegressionFunction( + const MatType& data, const arma::Row& labels, const size_t numClasses, const double lambda, const bool fitIntercept) : - data(arma::mat(const_cast(data).memptr(), data.n_rows, - data.n_cols, false, false)), + data(MatType(const_cast(data).memptr(), data.n_rows, data.n_cols, + false, false)), numClasses(numClasses), lambda(lambda), fitIntercept(fitIntercept) @@ -40,14 +41,15 @@ inline SoftmaxRegressionFunction::SoftmaxRegressionFunction( /** * Shuffle the data. */ -inline void SoftmaxRegressionFunction::Shuffle() +template +inline void SoftmaxRegressionFunction::Shuffle() { // Determine new ordering. arma::uvec ordering = arma::shuffle(arma::linspace(0, data.n_cols - 1, data.n_cols)); // Re-sort data. 
- arma::mat newData = data.cols(ordering); + MatType newData = data.cols(ordering); if (data.mem_state >= 1) data.reset(); data = std::move(newData); @@ -58,8 +60,8 @@ inline void SoftmaxRegressionFunction::Shuffle() reverseOrdering[ordering[i]] = i; arma::umat newLocations(2, groundTruth.n_nonzero); - arma::vec values(groundTruth.n_nonzero); - arma::sp_mat::const_iterator it = groundTruth.begin(); + arma::Col values(groundTruth.n_nonzero); + typename arma::SpMat::const_iterator it = groundTruth.begin(); size_t loc = 0; while (it != groundTruth.end()) { @@ -71,7 +73,7 @@ inline void SoftmaxRegressionFunction::Shuffle() ++loc; } - groundTruth = arma::sp_mat(newLocations, values, groundTruth.n_rows, + groundTruth = arma::SpMat(newLocations, values, groundTruth.n_rows, groundTruth.n_cols); } @@ -80,23 +82,26 @@ inline void SoftmaxRegressionFunction::Shuffle() * normal distribution. The weights cannot be initialized to zero, as that will * lead to each class output being the same. */ -inline const arma::mat SoftmaxRegressionFunction::InitializeWeights() +template +inline const MatType SoftmaxRegressionFunction::InitializeWeights() { return InitializeWeights(data.n_rows, numClasses, fitIntercept); } -inline const arma::mat SoftmaxRegressionFunction::InitializeWeights( +template +inline const MatType SoftmaxRegressionFunction::InitializeWeights( const size_t featureSize, const size_t numClasses, const bool fitIntercept) { - arma::mat parameters; - InitializeWeights(parameters, featureSize, numClasses, fitIntercept); - return parameters; + MatType parameters; + InitializeWeights(parameters, featureSize, numClasses, fitIntercept); + return parameters; } -inline void SoftmaxRegressionFunction::InitializeWeights( - arma::mat &weights, +template +inline void SoftmaxRegressionFunction::InitializeWeights( + MatType& weights, const size_t featureSize, const size_t numClasses, const bool fitIntercept) @@ -116,8 +121,9 @@ inline void SoftmaxRegressionFunction::InitializeWeights( * 
labels. The output is in the form of a matrix, which leads to simpler * calculations in the Evaluate() and Gradient() methods. */ -inline void SoftmaxRegressionFunction::GetGroundTruthMatrix( - const arma::Row<size_t>& labels, arma::sp_mat& groundTruth) +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::GetGroundTruthMatrix( + const arma::Row<size_t>& labels, arma::SpMat<ElemType>& groundTruth) { // Calculate the ground truth matrix according to the labels passed. The // ground truth matrix is a matrix of dimensions 'numClasses * numExamples', @@ -137,25 +143,26 @@ inline void SoftmaxRegressionFunction::GetGroundTruthMatrix( } // All entries are '1'. - arma::vec values; + arma::Col<ElemType> values; values.ones(labels.n_elem); // Calculate the matrix. - groundTruth = arma::sp_mat(rowPointers, colPointers, values, numClasses, - labels.n_elem); + groundTruth = arma::SpMat<ElemType>(rowPointers, colPointers, values, + numClasses, labels.n_elem); } /** * Evaluate the probabilities matrix. If fitIntercept flag is true, * it should consider the parameters.cols(0) intercept term. */ -inline void SoftmaxRegressionFunction::GetProbabilitiesMatrix( - const arma::mat& parameters, - arma::mat& probabilities, +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::GetProbabilitiesMatrix( + const MatType& parameters, + MatType& probabilities, const size_t start, const size_t batchSize) const { - arma::mat hypothesis; + MatType hypothesis; if (fitIntercept) { @@ -183,8 +190,9 @@ inline void SoftmaxRegressionFunction::GetProbabilitiesMatrix( /** * Evaluates the objective function given the parameters. */ -inline double SoftmaxRegressionFunction::Evaluate( - const arma::mat& parameters) const +template<typename MatType> +inline typename MatType::elem_type SoftmaxRegressionFunction<MatType>::Evaluate( + const MatType& parameters) const { // The objective function is the negative log likelihood of the model // calculated over all the training examples.
Mathematically it is as follows: @@ -202,11 +210,11 @@ inline double SoftmaxRegressionFunction::Evaluate( // The sum is calculated over all the classes. // x_i is the input vector for a particular training example. // theta_j is the parameter vector associated with a particular class. - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, 0, data.n_cols); // Calculate the log likelihood and regularization terms. - double logLikelihood, weightDecay, cost; + ElemType logLikelihood, weightDecay, cost; logLikelihood = arma::accu(groundTruth % arma::log(probabilities)) / data.n_cols; @@ -222,16 +230,17 @@ /** * Evaluate the objective function for the given points given the parameters. */ -inline double SoftmaxRegressionFunction::Evaluate( - const arma::mat& parameters, +template<typename MatType> +inline typename MatType::elem_type SoftmaxRegressionFunction<MatType>::Evaluate( + const MatType& parameters, const size_t start, const size_t batchSize) const { - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, start, batchSize); // Calculate the log likelihood and regularization terms. - double logLikelihood, weightDecay; + ElemType logLikelihood, weightDecay; logLikelihood = arma::accu(groundTruth.cols(start, start + batchSize - 1) % arma::log(probabilities)) / batchSize; @@ -243,8 +252,9 @@ /** * Calculates and stores the gradient values given a set of parameters. */ -inline void SoftmaxRegressionFunction::Gradient( - const arma::mat& parameters, arma::mat& gradient) const +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::Gradient( + const MatType& parameters, MatType& gradient) const { // Calculate the class probabilities for each training example.
The // probabilities for each of the classes are given by: @@ -252,7 +262,7 @@ inline void SoftmaxRegressionFunction::Gradient( // The sum is calculated over all the classes. // x_i is the input vector for a particular training example. // theta_j is the parameter vector associated with a particular class. - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, 0, data.n_cols); // Calculate the parameter gradients. @@ -261,13 +271,13 @@ inline void SoftmaxRegressionFunction::Gradient( { // Treating the intercept term parameters.col(0) seperately to avoid // the cost of building matrix [1; data]. - arma::mat inner = probabilities - groundTruth; + MatType inner = probabilities - groundTruth; gradient.col(0) = - inner * arma::ones(data.n_cols, 1) / data.n_cols + - lambda * parameters.col(0); + inner * arma::ones(data.n_cols, 1) / data.n_cols + + lambda * parameters.col(0); gradient.cols(1, parameters.n_cols - 1) = - inner * data.t() / data.n_cols + - lambda * parameters.cols(1, parameters.n_cols - 1); + inner * data.t() / data.n_cols + + lambda * parameters.cols(1, parameters.n_cols - 1); } else { @@ -276,23 +286,24 @@ inline void SoftmaxRegressionFunction::Gradient( } } -inline void SoftmaxRegressionFunction::Gradient( - const arma::mat& parameters, +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::Gradient( + const MatType& parameters, const size_t start, - arma::mat& gradient, + MatType& gradient, const size_t batchSize) const { - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, start, batchSize); // Calculate the parameter gradients.
gradient.set_size(parameters.n_rows, parameters.n_cols); if (fitIntercept) { - arma::mat inner = probabilities - groundTruth.cols(start, start + + MatType inner = probabilities - groundTruth.cols(start, start + batchSize - 1); gradient.col(0) = - inner * arma::ones(batchSize, 1) / batchSize + + inner * arma::ones(batchSize, 1) / batchSize + lambda * parameters.col(0); gradient.cols(1, parameters.n_cols - 1) = inner * data.cols(start, start + batchSize - 1).t() / batchSize + @@ -306,24 +317,25 @@ } } -inline void SoftmaxRegressionFunction::PartialGradient( - const arma::mat& parameters, +template<typename MatType> +inline void SoftmaxRegressionFunction<MatType>::PartialGradient( + const MatType& parameters, const size_t j, - arma::sp_mat& gradient) const + arma::SpMat<ElemType>& gradient) const { gradient.zeros(arma::size(parameters)); - arma::mat probabilities; + MatType probabilities; GetProbabilitiesMatrix(parameters, probabilities, 0, data.n_cols); // Calculate the required part of the gradient.
- arma::mat inner = probabilities - groundTruth; + MatType inner = probabilities - groundTruth; if (fitIntercept) { if (j == 0) { gradient.col(j) = - inner * arma::ones(data.n_cols, 1) / data.n_cols + + inner * arma::ones(data.n_cols, 1) / data.n_cols + lambda * parameters.col(0); } else diff --git a/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp b/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp index a737f6e..79bfa66 100644 --- a/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/sparse_test_function_impl.hpp @@ -31,11 +31,13 @@ inline typename MatType::elem_type SparseTestFunction::Evaluate( const size_t i, const size_t batchSize) const { - typename MatType::elem_type result = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType result = 0; for (size_t j = i; j < i + batchSize; ++j) { - result += coordinates[j] * coordinates[j] + bi[j] * coordinates[j] + - intercepts[j]; + result += coordinates[j] * coordinates[j] + + ElemType(bi[j]) * coordinates[j] + ElemType(intercepts[j]); } return result; @@ -46,11 +48,13 @@ template inline typename MatType::elem_type SparseTestFunction::Evaluate( const MatType& coordinates) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t i = 0; i < NumFunctions(); ++i) { - objective += coordinates[i] * coordinates[i] + bi[i] * coordinates[i] + - intercepts[i]; + objective += coordinates[i] * coordinates[i] + + ElemType(bi[i]) * coordinates[i] + ElemType(intercepts[i]); } return objective; @@ -65,7 +69,7 @@ inline void SparseTestFunction::Gradient(const MatType& coordinates, { gradient.zeros(arma::size(coordinates)); for (size_t j = i; j < i + batchSize; ++j) - gradient[j] = 2 * coordinates[j] + bi[j]; + gradient[j] = 2 * coordinates[j] + typename MatType::elem_type(bi[j]); } //! Evaluate the gradient of a feature function. 
@@ -75,7 +79,7 @@ inline void SparseTestFunction::PartialGradient(const MatType& coordinates, GradType& gradient) const { gradient.zeros(arma::size(coordinates)); - gradient[j] = 2 * coordinates[j] + bi[j]; + gradient[j] = 2 * coordinates[j] + typename MatType::elem_type(bi[j]); } } // namespace test diff --git a/inst/include/ensmallen_bits/problems/sphere_function.hpp b/inst/include/ensmallen_bits/problems/sphere_function.hpp index a5039e8..c08b548 100644 --- a/inst/include/ensmallen_bits/problems/sphere_function.hpp +++ b/inst/include/ensmallen_bits/problems/sphere_function.hpp @@ -108,14 +108,14 @@ class SphereFunction template<typename MatType> MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } //! Get the final point. template<typename MatType> MatType GetFinalPoint() const { - return arma::zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols); + return zeros<MatType>(initialPoint.n_rows, initialPoint.n_cols); } //! Get the final objective. diff --git a/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp b/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp index c83a2b0..783c7c5 100644 --- a/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/sphere_function_impl.hpp @@ -45,11 +45,13 @@ typename MatType::elem_type SphereFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + ElemType objective = 0; for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - objective += std::pow(coordinates(p), 2); + objective += std::pow(coordinates(p), ElemType(2)); } return objective; @@ -73,7 +75,7 @@ void SphereFunction::Gradient(const MatType& coordinates, for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - gradient(p) += 2.0 * coordinates[p]; + gradient(p) += 2 * coordinates[p]; } } diff --git
a/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp b/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp index 0009a3a..0c9c2db 100644 --- a/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp +++ b/inst/include/ensmallen_bits/problems/styblinski_tang_function.hpp @@ -109,7 +109,7 @@ class StyblinskiTangFunction template<typename MatType> MatType GetInitialPoint() const { - return arma::conv_to<MatType>::from(initialPoint); + return conv_to<MatType>::from(initialPoint); } //! Get the final point. @@ -118,7 +118,7 @@ { MatType result(initialPoint.n_rows, initialPoint.n_cols); for (size_t i = 0; i < result.n_elem; ++i) - result[i] = -2.903534; + result[i] = typename MatType::elem_type(-2.903534); return result; } diff --git a/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp b/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp index 671de35..aa043c8 100644 --- a/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/styblinski_tang_function_impl.hpp @@ -24,7 +24,10 @@ inline StyblinskiTangFunction::StyblinskiTangFunction(const size_t n) : { initialPoint.set_size(n, 1); - initialPoint.fill(-5); + // Manual reimplementation of fill() that also works for sparse types (for + // testing).
+ for (size_t i = 0; i < n; ++i) + initialPoint[i] = -5; } inline void StyblinskiTangFunction::Shuffle() @@ -39,12 +42,14 @@ typename MatType::elem_type StyblinskiTangFunction::Evaluate( const size_t begin, const size_t batchSize) const { - typename MatType::elem_type objective = 0.0; + typedef typename MatType::elem_type ElemType; + + typename MatType::elem_type objective = ElemType(0); for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - objective += std::pow(coordinates(p), 4) - 16 * - std::pow(coordinates(p), 2) + 5 * coordinates(p); + objective += std::pow(coordinates(p), ElemType(4)) - 16 * + std::pow(coordinates(p), ElemType(2)) + 5 * coordinates(p); } objective /= 2; @@ -64,13 +69,15 @@ void StyblinskiTangFunction::Gradient(const MatType& coordinates, GradType& gradient, const size_t batchSize) const { + typedef typename MatType::elem_type ElemType; + gradient.zeros(n, 1); for (size_t j = begin; j < begin + batchSize; ++j) { const size_t p = visitationOrder[j]; - gradient(p) += 0.5 * (4 * std::pow(coordinates(p), 3) - - 32.0 * coordinates(p) + 5.0); + gradient(p) += (4 * std::pow(coordinates(p), ElemType(3)) - + 32 * coordinates(p) + 5) / 2; } } diff --git a/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp b/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp index 5a1cc50..3ee9eee 100644 --- a/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/three_hump_camel_function_impl.hpp @@ -36,8 +36,9 @@ typename MatType::elem_type ThreeHumpCamelFunction::Evaluate( const ElemType x1 = coordinates(0); const ElemType x2 = coordinates(1); - const ElemType objective = (2 * std::pow(x1, 2)) - (1.05 * std::pow(x1, 4)) + - (std::pow(x1, 6) / 6) + (x1 * x2) + std::pow(x2, 2); + const ElemType objective = (2 * std::pow(x1, ElemType(2))) - + (ElemType(1.05) * std::pow(x1, ElemType(4))) + + (std::pow(x1, ElemType(6)) / 6) + (x1 * 
x2) + std::pow(x2, ElemType(2)); return objective; } @@ -62,7 +63,8 @@ inline void ThreeHumpCamelFunction::Gradient(const MatType& coordinates, const ElemType x2 = coordinates(1); gradient.set_size(2, 1); - gradient(0) = std::pow(x1, 5) - (4.2 * std::pow(x1, 3)) + (4 * x1) + x2; + gradient(0) = std::pow(x1, ElemType(5)) - + (ElemType(4.2) * std::pow(x1, ElemType(3))) + (4 * x1) + x2; gradient(1) = x1 + (2 * x2); } diff --git a/inst/include/ensmallen_bits/problems/wood_function_impl.hpp b/inst/include/ensmallen_bits/problems/wood_function_impl.hpp index 756b81f..7143493 100644 --- a/inst/include/ensmallen_bits/problems/wood_function_impl.hpp +++ b/inst/include/ensmallen_bits/problems/wood_function_impl.hpp @@ -39,12 +39,12 @@ typename MatType::elem_type WoodFunction::Evaluate( const ElemType x4 = coordinates(3); const ElemType objective = - /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, 2), 2) + - /* f2(x) */ std::pow(1 - x1, 2) + - /* f3(x) */ 90 * std::pow(x4 - std::pow(x3, 2), 2) + - /* f4(x) */ std::pow(1 - x3, 2) + - /* f5(x) */ 10 * std::pow(x2 + x4 - 2, 2) + - /* f6(x) */ (1.0 / 10.0) * std::pow(x2 - x4, 2); + /* f1(x) */ 100 * std::pow(x2 - std::pow(x1, ElemType(2)), ElemType(2)) + + /* f2(x) */ std::pow(1 - x1, ElemType(2)) + + /* f3(x) */ 90 * std::pow(x4 - std::pow(x3, ElemType(2)), ElemType(2)) + + /* f4(x) */ std::pow(1 - x3, ElemType(2)) + + /* f5(x) */ 10 * std::pow(x2 + x4 - 2, ElemType(2)) + + /* f6(x) */ ElemType(1.0 / 10.0) * std::pow(x2 - x4, ElemType(2)); return objective; } @@ -72,12 +72,12 @@ inline void WoodFunction::Gradient(const MatType& coordinates, const ElemType x4 = coordinates(3); gradient.set_size(4, 1); - gradient(0) = 400 * (std::pow(x1, 3) - x2 * x1) - 2 * (1 - x1); - gradient(1) = 200 * (x2 - std::pow(x1, 2)) + 20 * (x2 + x4 - 2) + - (1.0 / 5.0) * (x2 - x4); - gradient(2) = 360 * (std::pow(x3, 3) - x4 * x3) - 2 * (1 - x3); - gradient(3) = 180 * (x4 - std::pow(x3, 2)) + 20 * (x2 + x4 - 2) - - (1.0 / 5.0) * (x2 - x4); + gradient(0) = 
400 * (std::pow(x1, ElemType(3)) - x2 * x1) - 2 * (1 - x1); + gradient(1) = 200 * (x2 - std::pow(x1, ElemType(2))) + 20 * (x2 + x4 - 2) + + ElemType(1.0 / 5.0) * (x2 - x4); + gradient(2) = 360 * (std::pow(x3, ElemType(3)) - x4 * x3) - 2 * (1 - x3); + gradient(3) = 180 * (x4 - std::pow(x3, ElemType(2))) + 20 * (x2 + x4 - 2) - + ElemType(1.0 / 5.0) * (x2 - x4); } template diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp index ef8889c..bbc4c51 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt1_function.hpp @@ -48,110 +48,112 @@ namespace test { * * @tparam MatType Type of matrix to optimize. */ - template<typename MatType = arma::mat> - class ZDT1 +template<typename MatType = arma::mat> +class ZDT1 +{ + private: + size_t numParetoPoints {100}; + size_t numObjectives {2}; + size_t numVariables {30}; + + public: + //! Initialize the ZDT1 + ZDT1(size_t numParetoPoints = 100) : + numParetoPoints(numParetoPoints), + objectiveF1(*this), + objectiveF2(*this) + {/* Nothing to do here. */} + + /** + * Evaluate the objectives with the given coordinate. + * + * @param coords The function coordinates. + * @return arma::Col<typename MatType::elem_type> + */ + arma::Col<typename MatType::elem_type> Evaluate(const MatType& coords) { - private: - size_t numParetoPoints {100}; - size_t numObjectives {2}; - size_t numVariables {30}; - - public: - //! Initialize the ZDT1 - ZDT1(size_t numParetoPoints = 100) : - numParetoPoints(numParetoPoints), - objectiveF1(*this), - objectiveF2(*this) - {/* Nothing to do here. */} - - /** - * Evaluate the objectives with the given coordinate. - * - * @param coords The function coordinates. - * @return arma::Col<typename MatType::elem_type> - */ - arma::Col<typename MatType::elem_type> Evaluate(const MatType& coords) - { - // Convenience typedef.
+ typedef typename MatType::elem_type ElemType; - arma::Col<ElemType> objectives(numObjectives); - objectives(0) = coords[0]; - ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); - ElemType g = 1. + 9. * sum / (static_cast<ElemType>(numVariables) - 1.); - ElemType objectiveRatio = objectives(0) / g; - objectives(1) = g * (1. - std::sqrt(objectiveRatio)); + arma::Col<ElemType> objectives(numObjectives); + objectives(0) = coords[0]; - return objectives; - } + ElemType sum = accu(coords.submat(1, 0, numVariables - 1, 0)); + ElemType g = 1 + 9 * sum / (static_cast<ElemType>(numVariables) - 1.0); + ElemType objectiveRatio = objectives(0) / g; + objectives(1) = g * (1 - std::sqrt(objectiveRatio)); - //! Get the starting point. - MatType GetInitialPoint() - { - // Convenience typedef. - typedef typename MatType::elem_type ElemType; + return objectives; + } - return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros); - } + //! Get the starting point. + MatType GetInitialPoint() + { + // Convenience typedef. + typedef typename MatType::elem_type ElemType; + + return arma::Col<ElemType>(numVariables, 1, arma::fill::zeros); + } - struct ObjectiveF1 + struct ObjectiveF1 + { + ObjectiveF1(ZDT1& zdtClass) : zdtClass(zdtClass) + {/*Nothing to do here */} + + typename MatType::elem_type Evaluate(const MatType& coords) { - ObjectiveF1(ZDT1& zdtClass) : zdtClass(zdtClass) - {/*Nothing to do here */} + return coords[0]; + } - typename MatType::elem_type Evaluate(const MatType& coords) - { - return coords[0]; - } + ZDT1& zdtClass; + }; - ZDT1& zdtClass; - }; + struct ObjectiveF2 + { + ObjectiveF2(ZDT1& zdtClass) : zdtClass(zdtClass) + {/*Nothing to do here */} - struct ObjectiveF2 + typename MatType::elem_type Evaluate(const MatType& coords) { - ObjectiveF2(ZDT1& zdtClass) : zdtClass(zdtClass) - {/*Nothing to do here */} + // Convenience typedef.
- typedef typename MatType::elem_type ElemType; + size_t numVariables = zdtClass.numVariables; + ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); + ElemType g = 1 + 9 * sum / (static_cast<ElemType>(numVariables - 1)); + ElemType objectiveRatio = zdtClass.objectiveF1.Evaluate(coords) / g; - size_t numVariables = zdtClass.numVariables; - ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); - ElemType g = 1. + 9. * sum / (static_cast<ElemType>(numVariables - 1)); - ElemType objectiveRatio = zdtClass.objectiveF1.Evaluate(coords) / g; + return g * (1 - std::sqrt(objectiveRatio)); + } - return g * (1. - std::sqrt(objectiveRatio)); - } + ZDT1& zdtClass; + }; - ZDT1& zdtClass; - }; + //! Get objective functions. + std::tuple<ObjectiveF1, ObjectiveF2> GetObjectives() + { + return std::make_tuple(objectiveF1, objectiveF2); + } - //! Get objective functions. - std::tuple<ObjectiveF1, ObjectiveF2> GetObjectives() - { - return std::make_tuple(objectiveF1, objectiveF2); - } + //! Get the Reference Front. + //! Refer PR #273 Ipynb notebook to see the plot of Reference + //! Front. The implementation has been taken from pymoo. + arma::cube GetReferenceFront() + { + arma::cube front(2, 1, numParetoPoints); + arma::vec x = arma::linspace(0, 1, numParetoPoints); + arma::vec y = 1 - arma::sqrt(x); + for (size_t idx = 0; idx < numParetoPoints; ++idx) + front.slice(idx) = arma::vec{ x(idx), y(idx) }; - //! Get the Reference Front. - //! Refer PR #273 Ipynb notebook to see the plot of Reference - //! Front. The implementation has been taken from pymoo.
- arma::cube GetReferenceFront() - { - arma::cube front(2, 1, numParetoPoints); - arma::vec x = arma::linspace(0, 1, numParetoPoints); - arma::vec y = 1 - arma::sqrt(x); - for (size_t idx = 0; idx < numParetoPoints; ++idx) - front.slice(idx) = arma::vec{ x(idx), y(idx) }; + return front; + } - return front; - } + ObjectiveF1 objectiveF1; + ObjectiveF2 objectiveF2; +}; - ObjectiveF1 objectiveF1; - ObjectiveF2 objectiveF2; - }; - } //namespace test - } //namespace ens +} // namespace test +} // namespace ens #endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp index 440ce6c..a8ef492 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt2_function.hpp @@ -152,7 +152,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file +} //namespace test +} //namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp index b62406a..be0b2b6 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt3_function.hpp @@ -125,7 +125,7 @@ namespace test { typedef typename MatType::elem_type ElemType; size_t numVariables = zdtClass.numVariables; - ElemType sum = arma::accu(coords(arma::span(1, numVariables - 1), 0)); + ElemType sum = accu(coords.submat(1, 0, numVariables - 1, 0)); ElemType g = 1. + 9. 
* sum / (static_cast(numVariables - 1)); ElemType objectiveRatio = zdtClass.objectiveF1.Evaluate(coords) / g; @@ -182,7 +182,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file +} // namespace test +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp index fad2ba9..27b273a 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt4_function.hpp @@ -155,6 +155,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens + +} //namespace test +} //namespace ens + #endif diff --git a/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp b/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp index 68d2364..b404cff 100644 --- a/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp +++ b/inst/include/ensmallen_bits/problems/zdt/zdt6_function.hpp @@ -157,6 +157,8 @@ namespace test { ObjectiveF1 objectiveF1; ObjectiveF2 objectiveF2; }; - } //namespace test - } //namespace ens -#endif \ No newline at end of file + +} //namespace test +} //namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp b/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp index 519c515..7a09785 100644 --- a/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp +++ b/inst/include/ensmallen_bits/pso/init_policies/default_init.hpp @@ -12,6 +12,7 @@ */ #ifndef ENSMALLEN_PSO_INIT_POLICIES_DEFAULT_INIT_HPP #define ENSMALLEN_PSO_INIT_POLICIES_DEFAULT_INIT_HPP + #include namespace ens { @@ -65,26 +66,30 @@ class DefaultInit { // Convenience typedef. 
typedef typename MatType::elem_type ElemType; - typedef typename CubeType::elem_type CubeElemType; + + typedef typename ForwardType::umat UMatType; + typedef typename ForwardType::bmat BaseMatType; // Randomly initialize the particle positions. particlePositions.randu(iterate.n_rows, iterate.n_cols, numParticles); // Check if lowerBound is equal to upperBound. If equal, reinitialize. - arma::umat lbEquality = (lowerBound == upperBound); + UMatType lbEquality = (lowerBound == upperBound); if (lbEquality.n_rows == 1 && lbEquality(0, 0) == 1) { lowerBound.set_size(iterate.n_rows, iterate.n_cols); - lowerBound.fill(-1.0); + lowerBound.fill(-1); upperBound.set_size(iterate.n_rows, iterate.n_cols); - upperBound.fill(1.0); + upperBound.fill(1); } // Check if lowerBound and upperBound are vectors of a single dimension. else if (lbEquality.n_rows == 1 && lbEquality(0, 0) == 0) { - lowerBound = -lowerBound(0) * arma::ones(iterate.n_rows, iterate.n_cols); - upperBound = upperBound(0) * arma::ones(iterate.n_rows, iterate.n_cols); + BoundMatType ones = BoundMatType(iterate.n_rows, iterate.n_cols); + ones.fill(1); + lowerBound = -lowerBound(0) * ones; + upperBound = upperBound(0) * ones; } // Check the dimensions of lowerBound and upperBound. @@ -97,8 +102,8 @@ class DefaultInit for (size_t i = 0; i < numParticles; i++) { particlePositions.slice(i) = particlePositions.slice(i) % - arma::conv_to >::from(upperBound - lowerBound) - + arma::conv_to >::from(lowerBound); + conv_to::from(upperBound - lowerBound) + + conv_to::from(lowerBound); } // Randomly initialize particle velocities. 
@@ -114,7 +119,6 @@ class DefaultInit particleBestFitnesses.set_size(numParticles); particleBestFitnesses.fill(std::numeric_limits::max()); } - }; } // ens diff --git a/inst/include/ensmallen_bits/pso/pso.hpp b/inst/include/ensmallen_bits/pso/pso.hpp index cda0693..8f9411c 100644 --- a/inst/include/ensmallen_bits/pso/pso.hpp +++ b/inst/include/ensmallen_bits/pso/pso.hpp @@ -88,7 +88,7 @@ class PSOType * @param initPolicy Particle initialization policy. */ PSOType(const size_t numParticles = 64, - const arma::mat& lowerBound = arma::ones(1, 1), + const arma::mat& lowerBound = arma::zeros(1, 1), const arma::mat& upperBound = arma::ones(1, 1), const size_t maxIterations = 3000, const size_t horizonSize = 350, @@ -145,8 +145,8 @@ class PSOType VelocityUpdatePolicy(), const InitPolicy& initPolicy = InitPolicy()) : numParticles(numParticles), - lowerBound(lowerBound * arma::ones(1, 1)), - upperBound(upperBound * arma::ones(1, 1)), + lowerBound({ lowerBound }), + upperBound({ upperBound }), maxIterations(maxIterations), horizonSize(horizonSize), impTolerance(impTolerance), @@ -163,7 +163,7 @@ class PSOType * returned. * * @tparam ArbitraryFunctionType Type of the function to be optimized. - * @tparam MatType Type of matrix to optimize. + * @tparam InputMatType Type of matrix to optimize. * @tparam CallbackTypes Types of callback functions. * @param function Function to be optimized. * @param iterate Initial point (will be modified). @@ -171,11 +171,11 @@ class PSOType * @return Objective value of the final point. */ template - typename MatType::elem_type Optimize(ArbitraryFunctionType& function, - MatType& iterate, - CallbackTypes&&... callbacks); + typename InputMatType::elem_type Optimize(ArbitraryFunctionType& function, + InputMatType& iterate, + CallbackTypes&&... callbacks); //! Retrieve value of numParticles. size_t NumParticles() const { return numParticles; } @@ -259,6 +259,7 @@ class PSOType //! Velocity update policy used. 
VelocityUpdatePolicy velocityUpdatePolicy; + //! Particle initialization policy used. InitPolicy initPolicy; @@ -266,7 +267,7 @@ class PSOType Any instUpdatePolicy; }; -using LBestPSO = PSOType; +using LBestPSO = PSOType; } // ens #include "pso_impl.hpp" diff --git a/inst/include/ensmallen_bits/pso/pso_impl.hpp b/inst/include/ensmallen_bits/pso/pso_impl.hpp index ed8385a..4731e4b 100644 --- a/inst/include/ensmallen_bits/pso/pso_impl.hpp +++ b/inst/include/ensmallen_bits/pso/pso_impl.hpp @@ -36,16 +36,19 @@ namespace ens { template template -typename MatType::elem_type PSOType::Optimize( +typename InputMatType::elem_type PSOType< + VelocityUpdatePolicy, InitPolicy>::Optimize( ArbitraryFunctionType& function, - MatType& iterateIn, + InputMatType& iterateIn, CallbackTypes&&... callbacks) { // Convenience typedefs. - typedef typename MatType::elem_type ElemType; - typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename InputMatType::elem_type ElemType; + typedef typename ForwardType::bmat BaseMatType; + typedef typename ForwardType::bcol BaseColType; + typedef typename ForwardType::bcube BaseCubeType; // The update policy internally use a templated class so that // we can know MatType only when Optimize() is called. @@ -79,17 +82,18 @@ typename MatType::elem_type PSOType::Optimize( } // Initialize helper variables. - arma::Cube particlePositions; - arma::Cube particleVelocities; - arma::Col particleFitnesses; - arma::Col particleBestFitnesses; - arma::Cube particleBestPositions; + BaseCubeType particlePositions, particleVelocities, particleBestPositions; + BaseColType particleFitnesses, particleBestFitnesses; + + //! Useful temporaries for float-like comparisons. + BaseMatType castedlowerBound = conv_to::from(lowerBound); + BaseMatType castedupperBound = conv_to::from(upperBound); // Initialize particles using the init policy. 
initPolicy.Initialize(iterate, numParticles, - lowerBound, - upperBound, + castedlowerBound, + castedupperBound, particlePositions, particleVelocities, particleFitnesses, @@ -125,7 +129,8 @@ typename MatType::elem_type PSOType::Optimize( // in the PSO method. // The performanceHorizon will be updated with the best particle // in a FIFO manner. - for (size_t i = 0; (i < horizonSize) && !terminate; i++) + size_t iteration = 0; + for (size_t i = 0; (i < horizonSize) && !terminate; i++, iteration++) { // Calculate fitness and evaluate personal best. for (size_t j = 0; (j < numParticles) && !terminate; j++) @@ -167,15 +172,25 @@ typename MatType::elem_type PSOType::Optimize( // Append bestFitness to performanceHorizon. performanceHorizon.push(bestFitness); + + Info << "PSO: iteration " << iteration << ": objective " << bestFitness + << "." << std::endl; } // Run the remaining iterations of PSO. - for (size_t i = 0; (i < maxIterations - horizonSize) && !terminate; i++) + for (size_t i = 0; (i < maxIterations - horizonSize) && !terminate; i++, + iteration++) { // Check if there is any improvement over the horizon. // If there is no significant improvement, terminate. if (performanceHorizon.front() - performanceHorizon.back() < impTolerance) + { + Info << "PSO: improvement over horizon (" + << (performanceHorizon.front() - performanceHorizon.back()) + << ") below convergence tolerance (" << impTolerance + << "); optimization complete." << std::endl; break; + } // Calculate fitness and evaluate personal best. for (size_t j = 0; (j < numParticles) && !terminate; j++) @@ -217,6 +232,9 @@ typename MatType::elem_type PSOType::Optimize( performanceHorizon.pop(); // Push most recent bestFitness to performanceHorizon. performanceHorizon.push(bestFitness); + + Info << "PSO: iteration " << iteration << ": objective " << bestFitness + << "." << std::endl; } // Copy results back. 
diff --git a/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp b/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp index cbb45db..49da9c0 100644 --- a/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp +++ b/inst/include/ensmallen_bits/pso/update_policies/lbest_update.hpp @@ -12,6 +12,7 @@ */ #ifndef ENSMALLEN_PSO_UPDATE_POLICIES_LBEST_UPDATE_HPP #define ENSMALLEN_PSO_UPDATE_POLICIES_LBEST_UPDATE_HPP + #include namespace ens { @@ -63,118 +64,121 @@ class LBestUpdate * instantiated at the start of the optimization, and holds parameters * specific to an individual optimization. */ - template + template< + typename MatType, typename ColType = typename ForwardType::bcol> class Policy { - public: + public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. * * @param parent Instantiated parent class. */ - Policy(const LBestUpdate& /* parent */) : n(0) - { /* Do nothing. */ } - - /** - * The Initialize method is called by PSO Optimizer method before the - * start of the iteration process. It calculates the value of the - * constriction coefficent, initializes the local best indices of each - * particle to itself, and sets the shape of the r1 and r2 vectors. - * - * @param exploitationFactor Influence of personal best achieved. - * @param explorationFactor Influence of neighbouring particles. - * @param numParticles The number of particles in the swarm. - * @param iterate The user input, used for shaping intermediate vectors. - */ - void Initialize(const double exploitationFactor, - const double explorationFactor, - const size_t numParticles, - MatType& iterate) - { - // Copy values to aliases. 
- n = numParticles; - c1 = exploitationFactor; - c2 = explorationFactor; - - // Calculate the constriction factor - static double phi = c1 + c2; - assert(phi > 4.0 && "The sum of the exploitation and exploration " - "factors must be greater than 4."); - - chi = 2.0 / std::abs(2.0 - phi - std::sqrt((phi - 4.0) * phi)); - - // Initialize local best indices to self indices of particles. - localBestIndices = arma::linspace< - arma::Col >(0, n-1, n); - - // Set sizes r1 and r2. - r1.set_size(iterate.n_rows, iterate.n_cols); - r2.set_size(iterate.n_rows, iterate.n_cols); - } - - /** - * Update step for LBestPSO. Compares personal best of each particle with - * that of its neighbours, and sets the best of the 3 as the lobal best. - * This particle is then used for calculating the velocity for the update - * step. - * - * @param particlePositions The current coordinates of particles. - * @param particleVelocities The current velocities (will be modified). - * @param particleFitnesses The current fitness values or particles. - * @param particleBestPositions The personal best coordinates of particles. - * @param particleBestFitnesses The personal best fitness values of - * particles. - */ - void Update(arma::Cube& particlePositions, - arma::Cube& particleVelocities, - arma::Cube& particleBestPositions, - arma::Col& particleBestFitnesses) - { - // Velocity update logic. - for (size_t i = 0; i < n; i++) - { - localBestIndices(i) = - particleBestFitnesses(left(i)) < particleBestFitnesses(i) ? - left(i) : i; - localBestIndices(i) = - particleBestFitnesses(right(i)) < particleBestFitnesses(i) ? - right(i) : i; - } - - for (size_t i = 0; i < n; i++) - { - // Generate random numbers for current particle. 
- r1.randu(); - r2.randu(); - particleVelocities.slice(i) = chi * (particleVelocities.slice(i) + - c1 * r1 % (particleBestPositions.slice(i) - - particlePositions.slice(i)) + c2 * r2 % - (particleBestPositions.slice(localBestIndices(i)) - - particlePositions.slice(i))); - } - } - - private: - //! Number of particles. - size_t n; - - //! Exploitation factor. - typename MatType::elem_type c1; - - //! Exploration factor. - typename MatType::elem_type c2; - - //! Constriction factor chi. - typename MatType::elem_type chi; - - //! Vectors of random numbers. - MatType r1, r2; - - //! Indices of each particle's best neighbour. - arma::Col localBestIndices; - - // Helper functions for calculating neighbours. + Policy(const LBestUpdate& /* parent */) : n(0) + { /* Do nothing. */ } + + /** + * The Initialize method is called by the PSO Optimizer method before the + * start of the iteration process. It calculates the value of the + * constriction coefficient, initializes the local best indices of each + * particle to itself, and sets the shape of the r1 and r2 vectors. + * + * @param exploitationFactor Influence of personal best achieved. + * @param explorationFactor Influence of neighbouring particles. + * @param numParticles The number of particles in the swarm. + * @param iterate The user input, used for shaping intermediate vectors. + */ + void Initialize(const double exploitationFactor, + const double explorationFactor, + const size_t numParticles, + MatType& iterate) + { + // Copy values to aliases. + n = numParticles; + c1 = ElemType(exploitationFactor); + c2 = ElemType(explorationFactor); + + // Calculate the constriction factor. + const ElemType phi = c1 + c2; + assert(phi > 4 && "The sum of the exploitation and exploration " + "factors must be greater than 4."); + + chi = 2 / std::abs(2 - phi - std::sqrt((phi - 4) * phi)); + + // Initialize local best indices to self indices of particles. + localBestIndices = linspace(0, n - 1, n); + + // Set sizes r1 and r2. 
+ r1.set_size(iterate.n_rows, iterate.n_cols); + r2.set_size(iterate.n_rows, iterate.n_cols); + } + + /** + * Update step for LBestPSO. Compares personal best of each particle with + * that of its neighbours, and sets the best of the 3 as the local best. + * This particle is then used for calculating the velocity for the update + * step. + * + * @param particlePositions The current coordinates of particles. + * @param particleVelocities The current velocities (will be modified). + * @param particleFitnesses The current fitness values of particles. + * @param particleBestPositions The personal best coordinates of particles. + * @param particleBestFitnesses The personal best fitness values of + * particles. + */ + template + void Update(CubeType& particlePositions, + CubeType& particleVelocities, + CubeType& particleBestPositions, + VecType& particleBestFitnesses) + { + // Velocity update logic. + for (size_t i = 0; i < n; i++) + { + localBestIndices(i) = + particleBestFitnesses(left(i)) < particleBestFitnesses(i) ? + left(i) : i; + localBestIndices(i) = + particleBestFitnesses(right(i)) < particleBestFitnesses(i) ? + right(i) : i; + } + + for (size_t i = 0; i < n; i++) + { + // Generate random numbers for current particle. + r1.randu(); + r2.randu(); + particleVelocities.slice(i) = chi * (particleVelocities.slice(i) + + c1 * r1 % (particleBestPositions.slice(i) - + particlePositions.slice(i)) + c2 * r2 % + (particleBestPositions.slice(localBestIndices(i)) - + particlePositions.slice(i))); + } + } + + private: + // Number of particles. + size_t n; + + // Exploitation factor. + ElemType c1; + + // Exploration factor. + ElemType c2; + + // Constriction factor chi. + ElemType chi; + + // Vectors of random numbers. + MatType r1, r2; + + // Indices of each particle's best neighbour. + arma::uvec localBestIndices; + + // Helper functions for calculating neighbours. 
inline size_t left(size_t index) { return (index + n - 1) % n; } inline size_t right(size_t index) { return (index + 1) % n; } }; diff --git a/inst/include/ensmallen_bits/qhadam/qhadam.hpp b/inst/include/ensmallen_bits/qhadam/qhadam.hpp index e29d4f2..38426bd 100644 --- a/inst/include/ensmallen_bits/qhadam/qhadam.hpp +++ b/inst/include/ensmallen_bits/qhadam/qhadam.hpp @@ -27,10 +27,10 @@ namespace ens { * * @code * @inproceedings{ma2019qh, - * title={Quasi-hyperbolic momentum and Adam for deep learning}, - * author={Jerry Ma and Denis Yarats}, - * booktitle={International Conference on Learning Representations}, - * year={2019} + * title = {Quasi-hyperbolic momentum and Adam for deep learning}, + * author = {Jerry Ma and Denis Yarats}, + * booktitle = {International Conference on Learning Representations}, + * year = {2019} * } * @endcode * @@ -100,7 +100,7 @@ class QHAdam typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp b/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp index f408377..e0d0315 100644 --- a/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp +++ b/inst/include/ensmallen_bits/qhadam/qhadam_update.hpp @@ -94,6 +94,8 @@ class QHAdamUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -104,10 +106,19 @@ class QHAdamUpdate */ Policy(QHAdamUpdate& parent, const size_t rows, const size_t cols) : parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), + v1(ElemType(parent.v1)), + v2(ElemType(parent.v2)), iteration(0) { m.zeros(rows, cols); v.zeros(rows, cols); + + // Attempt to detect underflow. 
+ if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -125,35 +136,40 @@ class QHAdamUpdate ++iteration; // And update the iterate. - m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); GradType mDash = m / biasCorrection1; GradType vDash = v / biasCorrection2; // QHAdam recovers Adam when v2 = v1 = 1. - iterate -= stepSize * - ((((1 - parent.v1) * gradient) + parent.v1 * mDash) / - (arma::sqrt(((1 - parent.v2) * (gradient % gradient)) + - parent.v2 * vDash) + parent.epsilon)); + iterate -= ElemType(stepSize) * ((((1 - v1) * gradient) + v1 * mDash) / + (sqrt(((1 - v2) * square(gradient)) + v2 * vDash) + epsilon)); } private: - //! Instantiated parent object. + // Instantiated parent object. QHAdamUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; // The exponential moving average of squared gradient values. GradType v; + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + ElemType v1; + ElemType v2; + // The number of iterations. size_t iteration; }; diff --git a/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp b/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp index 9c3607a..53bb3e2 100644 --- a/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp +++ b/inst/include/ensmallen_bits/rmsprop/rmsprop.hpp @@ -109,7 +109,7 @@ class RMSProp typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp b/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp index c8507ba..e769c28 100644 --- a/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp +++ b/inst/include/ensmallen_bits/rmsprop/rmsprop_update.hpp @@ -76,6 +76,8 @@ class RMSPropUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -85,10 +87,16 @@ class RMSPropUpdate * @param cols Number of columns in the gradient matrix. */ Policy(RMSPropUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)), + alpha(ElemType(parent.alpha)) { // Leaky sum of squares of parameter gradient. meanSquaredGradient.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -102,10 +110,10 @@ class RMSPropUpdate const double stepSize, const GradType& gradient) { - meanSquaredGradient *= parent.alpha; - meanSquaredGradient += (1 - parent.alpha) * (gradient % gradient); - iterate -= stepSize * gradient / (arma::sqrt(meanSquaredGradient) + - parent.epsilon); + meanSquaredGradient *= alpha; + meanSquaredGradient += (1 - alpha) * (gradient % gradient); + iterate -= ElemType(stepSize) * gradient / (sqrt(meanSquaredGradient) + + epsilon); } private: @@ -113,6 +121,9 @@ class RMSPropUpdate GradType meanSquaredGradient; // Reference to instantiated parent object. RMSPropUpdate& parent; + // Parameters converted to the element type of the optimization. 
+ ElemType epsilon; + ElemType alpha; }; private: diff --git a/inst/include/ensmallen_bits/sa/exponential_schedule.hpp b/inst/include/ensmallen_bits/sa/exponential_schedule.hpp index ff9acf5..25baa20 100644 --- a/inst/include/ensmallen_bits/sa/exponential_schedule.hpp +++ b/inst/include/ensmallen_bits/sa/exponential_schedule.hpp @@ -46,11 +46,10 @@ class ExponentialSchedule * @param currentEnergy Current energy of system (not used). */ template - double NextTemperature( - const double currentTemperature, - const ElemType /* currentEnergy */) + ElemType NextTemperature( + const double currentTemperature, const ElemType /* currentEnergy */) { - return (1 - lambda) * currentTemperature; + return ElemType((1 - lambda) * currentTemperature); } //! Get the cooling speed, lambda. diff --git a/inst/include/ensmallen_bits/sa/sa_impl.hpp b/inst/include/ensmallen_bits/sa/sa_impl.hpp index a969d5c..e2d70a4 100644 --- a/inst/include/ensmallen_bits/sa/sa_impl.hpp +++ b/inst/include/ensmallen_bits/sa/sa_impl.hpp @@ -76,9 +76,9 @@ typename MatType::elem_type SA::Optimize( size_t idx = 0; size_t sweepCounter = 0; - BaseMatType accept(rows, cols, arma::fill::zeros); - BaseMatType moveSize(rows, cols, arma::fill::none); - moveSize.fill(initMoveCoef); + BaseMatType accept(rows, cols); + BaseMatType moveSize(rows, cols, GetFillType::none); + moveSize.fill(ElemType(initMoveCoef)); Callback::BeginOptimization(*this, function, iterate, callbacks...); @@ -158,7 +158,7 @@ bool SA::GenerateMove( // MoveControl() is derived for the Laplace distribution. // Sample from a Laplace distribution with scale parameter moveSize(idx). - const double unif = 2.0 * arma::randu() - 1.0; + const ElemType unif = 2 * arma::randu() - 1; const ElemType move = (unif < 0) ? 
(moveSize(idx) * std::log(1 + unif)) : (-moveSize(idx) * std::log(1 - unif)); @@ -219,17 +219,15 @@ inline void SA::MoveControl(const size_t nMoves, MatType& accept, MatType& moveSize) { - MatType target; - target.copy_size(accept); - target.fill(0.44); - moveSize = arma::log(moveSize); - moveSize += gain * (accept / (double) nMoves - target); - moveSize = arma::exp(moveSize); - - // To avoid the use of element-wise arma::min(), which is only available in - // Armadillo after v3.930, we use a for loop here instead. - for (size_t i = 0; i < accept.n_elem; ++i) - moveSize(i) = (moveSize(i) > maxMoveCoef) ? maxMoveCoef : moveSize(i); + typedef typename MatType::elem_type ElemType; + + MatType target(accept.n_rows, accept.n_cols, GetFillType::none); + target.fill(ElemType(0.44)); + + moveSize = log(moveSize); + moveSize += ElemType(gain) * (accept / (ElemType) nMoves - target); + moveSize = exp(moveSize); + moveSize.clamp(ElemType(-maxMoveCoef), ElemType(maxMoveCoef)); accept.zeros(); } diff --git a/inst/include/ensmallen_bits/sarah/sarah.hpp b/inst/include/ensmallen_bits/sarah/sarah.hpp index 074f462..e92aa95 100644 --- a/inst/include/ensmallen_bits/sarah/sarah.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah.hpp @@ -97,7 +97,7 @@ class SARAHType typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sarah/sarah_impl.hpp b/inst/include/ensmallen_bits/sarah/sarah_impl.hpp index 3b001cd..d7a7b80 100644 --- a/inst/include/ensmallen_bits/sarah/sarah_impl.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah_impl.hpp @@ -45,8 +45,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SARAHType::Optimize( SeparableFunctionType& functionIn, MatType& iterateIn, @@ -145,15 +145,15 @@ SARAHType::Optimize( f += effectiveBatchSize; } - v /= (double) numFunctions; + v /= (ElemType) numFunctions; if (terminate) break; // Update iterate with full gradient (v). - iterate -= stepSize * v; + iterate -= ElemType(stepSize) * v; - const ElemType vNorm = arma::norm(v); + const ElemType vNorm = norm(v); for (size_t f = 0, currentFunction = 0; f < innerIterations; /* incrementing done manually */) @@ -228,7 +228,8 @@ SARAHType::Optimize( for (size_t i = 0; i < numFunctions; i += batchSize) { const size_t effectiveBatchSize = std::min(batchSize, numFunctions - i); - const ElemType objective = function.Evaluate(iterate, i, effectiveBatchSize); + const ElemType objective = function.Evaluate(iterate, i, + effectiveBatchSize); overallObjective += objective; // The optimization is finished, so we don't need to care about the result diff --git a/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp b/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp index 2ddf64a..12e669c 100644 --- a/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah_plus_update.hpp @@ -52,10 +52,12 @@ class SARAHPlusUpdate const double stepSize, const double vNorm) { - v += (gradient - gradient0) / (double) batchSize; - iterate -= stepSize * v; + typedef 
typename MatType::elem_type ElemType; - if (arma::norm(v) <= gamma * vNorm) + v += (gradient - gradient0) / (ElemType) batchSize; + iterate -= ElemType(stepSize) * v; + + if (norm(v) <= ElemType(gamma * vNorm)) return true; return false; diff --git a/inst/include/ensmallen_bits/sarah/sarah_update.hpp b/inst/include/ensmallen_bits/sarah/sarah_update.hpp index 0c38ba4..0a63f02 100644 --- a/inst/include/ensmallen_bits/sarah/sarah_update.hpp +++ b/inst/include/ensmallen_bits/sarah/sarah_update.hpp @@ -41,8 +41,10 @@ class SARAHUpdate const double stepSize, const double /* vNorm */) { - v += (gradient - gradient0) / (double) batchSize; - iterate -= stepSize * v; + typedef typename MatType::elem_type ElemType; + + v += (gradient - gradient0) / (ElemType) batchSize; + iterate -= ElemType(stepSize) * v; return false; } }; diff --git a/inst/include/ensmallen_bits/sdp/lin_alg.hpp b/inst/include/ensmallen_bits/sdp/lin_alg.hpp index bfd70c7..6cdf1ac 100644 --- a/inst/include/ensmallen_bits/sdp/lin_alg.hpp +++ b/inst/include/ensmallen_bits/sdp/lin_alg.hpp @@ -92,7 +92,7 @@ inline void Smat(const MatAType& input, MatBType& output) MatBType iMat(input); const size_t n = static_cast - (ceil((-1. + sqrt(1. + 8. * iMat.n_elem))/2.)); + (ceil((-1. + std::sqrt(1. + 8. * iMat.n_elem))/2.)); output.zeros(n, n); diff --git a/inst/include/ensmallen_bits/sdp/lrsdp.hpp b/inst/include/ensmallen_bits/sdp/lrsdp.hpp index c918163..de85d83 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp.hpp @@ -75,6 +75,22 @@ class LRSDP typename MatType::elem_type Optimize(MatType& coordinates, CallbackTypes&&... callbacks); + /** + * Optimize the LRSDP and return the final objective value, using the given + * starting Lagrange multipliers and penalty parameter for the augmented + * Lagrangian inner optimizer. The given coordinates will be modified to + * contain the final solution, and the given lambda/sigma will be modified to + * contain the final values. 
+ * + * @param coordinates Starting coordinates for the optimization. + * @param lambda Initial Lagrange multipliers; overwritten with the final + * estimates. + * @param sigma Initial penalty parameter; overwritten with the final value. + * @param callbacks Callback functions. + */ + template + typename MatType::elem_type Optimize(MatType& coordinates, + VecType& lambda, + double& sigma, + CallbackTypes&&... callbacks); + //! Return the SDP that will be solved. const SDPType& SDP() const { return function.SDP(); } //! Modify the SDP that will be solved. diff --git a/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp b/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp index a0ac114..da6050d 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp_function.hpp @@ -101,8 +101,7 @@ class LRSDPFunction template MatType GetInitialPoint() const { - MatType result = arma::conv_to::from(initialPoint); - return result; + return conv_to::from(initialPoint); } //! Return the SDP object representing the problem. @@ -143,48 +142,48 @@ template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const; 
template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const; diff --git a/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp b/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp index 6cf2230..f30d70a 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp_function_impl.hpp @@ -109,10 +109,10 @@ void LRSDPFunction::GradientConstraint( "for arbitrary optimizers!"); } -//! Utility function for updating R*R^T matrix. -//! Note: Caching R*R^T provide significant computation optimization -//! by reducing redundant R*R^T calculations in case of functions are not used -//! updating coordinates matrix, hence leaving R*R^T unchanged. +// Utility function for updating the R*R^T matrix. +// Note: Caching R*R^T provides a significant computational optimization by +// avoiding redundant R*R^T calculations when functions do not update the +// coordinates matrix, leaving R*R^T unchanged. template void UpdateRRT(LRSDPFunction& function, MatType&& newrrt) @@ -120,15 +120,15 @@ void UpdateRRT(LRSDPFunction& function, function.template RRT() = std::move(newrrt); } -//! Utility function for calculating part of the objective when AugLagrangian is -//! used with an LRSDPFunction. +// Utility function for calculating part of the objective when AugLagrangian is +// used with an LRSDPFunction. 
template static inline void UpdateObjective(typename MatType::elem_type& objective, const MatType& rrt, const std::vector& ais, const VecType& bis, - const arma::vec& lambda, + const VecType& lambda, const size_t lambdaOffset, const double sigma) { @@ -144,15 +144,15 @@ UpdateObjective(typename MatType::elem_type& objective, } } -//! Utility function for calculating part of the gradient when AugLagrangian is -//! used with an LRSDPFunction. +// Utility function for calculating part of the gradient when AugLagrangian is +// used with an LRSDPFunction. template static inline void UpdateGradient(MatType& s, const MatType& rrt, const std::vector& ais, const VecType& bis, - const arma::vec& lambda, + const VecType& lambda, const size_t lambdaOffset, const double sigma) { @@ -167,11 +167,11 @@ UpdateGradient(MatType& s, } } -template +template static inline double EvaluateImpl(LRSDPFunction& function, const MatType& coordinates, - const arma::vec& lambda, + const VecType& lambda, const double sigma) { // We can calculate the entire objective in a smart way. 
@@ -220,11 +220,14 @@ EvaluateImpl(LRSDPFunction& function, return objective; } -template +template static inline void GradientImpl(const LRSDPFunction& function, const MatType& coordinates, - const arma::vec& lambda, + const VecType& lambda, const double sigma, GradType& gradient) { @@ -254,7 +257,7 @@ GradientImpl(const LRSDPFunction& function, template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -263,7 +266,7 @@ AugLagrangianFunction>>::Evaluate( template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::vec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -271,7 +274,7 @@ AugLagrangianFunction>>::Evaluate( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const { @@ -280,7 +283,7 @@ inline void AugLagrangianFunction>>::Gradient( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::vec>::Gradient( const MatType& coordinates, GradType& gradient) const { @@ -290,7 +293,7 @@ inline void AugLagrangianFunction>>::Gradient( template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::fvec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -299,7 +302,7 @@ AugLagrangianFunction>>::Evaluate( template<> template inline typename MatType::elem_type -AugLagrangianFunction>>::Evaluate( +AugLagrangianFunction>, arma::fvec>::Evaluate( const MatType& coordinates) const { return EvaluateImpl(function, coordinates, lambda, sigma); @@ -307,7 +310,7 @@ 
AugLagrangianFunction>>::Evaluate( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::fvec>::Gradient( const MatType& coordinates, GradType& gradient) const { @@ -316,7 +319,7 @@ inline void AugLagrangianFunction>>::Gradient( template<> template -inline void AugLagrangianFunction>>::Gradient( +inline void AugLagrangianFunction>, arma::fvec>::Gradient( const MatType& coordinates, GradType& gradient) const { diff --git a/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp b/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp index 85c5479..1fd0094 100644 --- a/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp +++ b/inst/include/ensmallen_bits/sdp/lrsdp_impl.hpp @@ -35,9 +35,28 @@ typename MatType::elem_type LRSDP::Optimize( function.RRTAny().template Set( new MatType(coordinates * coordinates.t())); - augLag.Sigma() = 10; augLag.MaxIterations() = maxIterations; - augLag.Optimize(function, coordinates, callbacks...); + typename ForwardType::bvec lambda(function.NumConstraints()); + double sigma = 10; + augLag.Optimize(function, coordinates, lambda, sigma, callbacks...); + + return function.Evaluate(coordinates); +} + +template +template +typename MatType::elem_type LRSDP::Optimize( + MatType& coordinates, + VecType& lambda, + double& sigma, + CallbackTypes&&... 
callbacks) +{ + function.RRTAny().Clean(); + function.RRTAny().template Set( + new MatType(coordinates * coordinates.t())); + + augLag.MaxIterations() = maxIterations; + augLag.Optimize(function, coordinates, lambda, sigma, callbacks...); return function.Evaluate(coordinates); } diff --git a/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp b/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp index d7b65ee..38d16b0 100644 --- a/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp +++ b/inst/include/ensmallen_bits/sdp/primal_dual_impl.hpp @@ -92,7 +92,7 @@ Alpha(const MatType& a, const MatType& dA, double tau, double& alpha) * * where A, H are symmetric matrices. * - * TODO(stephentu): Note this method current uses arma's builtin arma::syl + * TODO(stephentu): Note this method currently uses arma's builtin arma::sylvester * method, which is overkill for this situation. See Lemma 7.2 of [AHO98] for * how to solve this Lyapunov equation using an eigenvalue decomposition of A. * @@ -101,7 +101,7 @@ template static inline void SolveLyapunov(MatType& x, const AType& a, const BType& h) { - arma::syl(x, a, a, -h); + arma::sylvester(x, a, a, -h); } /** @@ -163,7 +163,7 @@ SolveKKTSystem(const SparseConstraintType& aSparse, } MatType subTerm(aSparse.n_cols, 1, arma::fill::zeros); - + if (aSparse.n_rows) { dySparse = dy(arma::span(0, aSparse.n_rows - 1), 0); @@ -483,8 +483,8 @@ typename MatType::elem_type PrimalDualSolver::Optimize( const double sparsePrimalInfeas = arma::norm(sdp.SparseB() - aSparse * sx, 2); const double densePrimalInfeas = arma::norm(sdp.DenseB() - aDense * sx, 2); - const double primalInfeas = sqrt(sparsePrimalInfeas * sparsePrimalInfeas + - densePrimalInfeas * densePrimalInfeas); + const double primalInfeas = std::sqrt(sparsePrimalInfeas * + sparsePrimalInfeas + densePrimalInfeas * densePrimalInfeas); primalObj = arma::dot(sdp.C(), coordinates); diff --git a/inst/include/ensmallen_bits/sgd/sgd.hpp b/inst/include/ensmallen_bits/sgd/sgd.hpp index 
3b8a92a..c81516f 100644 --- a/inst/include/ensmallen_bits/sgd/sgd.hpp +++ b/inst/include/ensmallen_bits/sgd/sgd.hpp @@ -124,7 +124,7 @@ class SGD typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sgd/sgd_impl.hpp b/inst/include/ensmallen_bits/sgd/sgd_impl.hpp index d34115b..a6b98f5 100644 --- a/inst/include/ensmallen_bits/sgd/sgd_impl.hpp +++ b/inst/include/ensmallen_bits/sgd/sgd_impl.hpp @@ -58,8 +58,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SGD::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -150,8 +150,14 @@ SGD::Optimize( gradient, callbacks...); // Use the update policy to take a step. + // TODO: remove old behavior in ensmallen 4.0.0. + #if defined(ENS_OLD_SEPARABLE_STEP_BEHAVIOR) instUpdatePolicy.As().Update(iterate, stepSize, gradient); + #else + instUpdatePolicy.As().Update(iterate, + (stepSize / effectiveBatchSize), gradient); + #endif terminate |= Callback::StepTaken(*this, f, iterate, callbacks...); @@ -194,9 +200,12 @@ SGD::Optimize( overallObjective, callbacks...); // Reset the counter variables. - lastObjective = overallObjective; - overallObjective = 0; - currentFunction = 0; + if (i != actualMaxIterations) + { + lastObjective = overallObjective; + overallObjective = 0; + currentFunction = 0; + } if (shuffle) // Determine order of visitation. 
f.Shuffle(); diff --git a/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp index 5c19e63..6d8555d 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/momentum_update.hpp @@ -84,6 +84,8 @@ class MomentumUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -94,9 +96,10 @@ class MomentumUpdate */ Policy(const MomentumUpdate& parent, const size_t rows, const size_t cols) : parent(parent), - velocity(arma::zeros(rows, cols)) + velocity(rows, cols), + momentum(ElemType(parent.momentum)) { - // Nothing to do. + // Nothing to do here. } /** @@ -112,7 +115,7 @@ class MomentumUpdate const double stepSize, const GradType& gradient) { - velocity = parent.momentum * velocity - stepSize * gradient; + velocity = momentum * velocity - ElemType(stepSize) * gradient; iterate += velocity; } @@ -121,6 +124,8 @@ class MomentumUpdate const MomentumUpdate& parent; // The velocity matrix. MatType velocity; + // The momentum, converted to the element type of the optimization. + ElemType momentum; }; private: diff --git a/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp index 540cb55..d1ecc3c 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/nesterov_momentum_update.hpp @@ -58,6 +58,8 @@ class NesterovMomentumUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. 
@@ -70,9 +72,10 @@ class NesterovMomentumUpdate const size_t rows, const size_t cols) : parent(parent), - velocity(arma::zeros(rows, cols)) + velocity(rows, cols), + momentum(ElemType(parent.momentum)) { - // Nothing to do. + // Nothing to do here. } /** @@ -89,9 +92,8 @@ class NesterovMomentumUpdate const double stepSize, const GradType& gradient) { - velocity = parent.momentum * velocity - stepSize * gradient; - - iterate += parent.momentum * velocity - stepSize * gradient; + velocity = momentum * velocity - ElemType(stepSize) * gradient; + iterate += momentum * velocity - ElemType(stepSize) * gradient; } private: @@ -99,6 +101,8 @@ class NesterovMomentumUpdate const NesterovMomentumUpdate& parent; // The velocity matrix. MatType velocity; + // The momentum, converted to the element type of the optimization. + ElemType momentum; }; private: diff --git a/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp index 788cd1f..c99df52 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/quasi_hyperbolic_update.hpp @@ -42,8 +42,7 @@ class QHUpdate */ QHUpdate(const double v = 0.7, const double momentum = 0.999) : - momentum(momentum), - v(v) + momentum(momentum), v(v) { // Nothing to do. } @@ -68,6 +67,8 @@ class QHUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -77,10 +78,12 @@ class QHUpdate * @param cols Number of columns in the gradient matrix. */ Policy(QHUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + velocity(rows, cols), + momentum(ElemType(parent.momentum)), + v(ElemType(parent.v)) { - // Initialize an empty velocity matrix. - velocity.zeros(rows, cols); + // Nothing to do here. 
} /** @@ -94,18 +97,22 @@ class QHUpdate const double stepSize, const GradType& gradient) { - velocity *= parent.momentum; - velocity += (1 - parent.momentum) * gradient; + velocity *= momentum; + velocity += (1 - momentum) * gradient; - iterate -= stepSize * ((1 - parent.v) * gradient + parent.v * velocity); + iterate -= ElemType(stepSize) * ((1 - v) * gradient + v * velocity); } private: - //! Instantiated parent object. + // Instantiated parent object. QHUpdate& parent; - //! The velocity matrix. + // The velocity matrix. GradType velocity; + + // Parameters converted to the element type of the optimization. + ElemType momentum; + ElemType v; }; private: diff --git a/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp b/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp index 41d75fe..8212f3a 100644 --- a/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp +++ b/inst/include/ensmallen_bits/sgd/update_policies/vanilla_update.hpp @@ -37,6 +37,8 @@ class VanillaUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. The vanilla update doesn't initialize anything. @@ -63,7 +65,7 @@ class VanillaUpdate const GradType& gradient) { // Perform the vanilla SGD update. - iterate -= stepSize * gradient; + iterate -= ElemType(stepSize) * gradient; } }; }; diff --git a/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp b/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp index 6591e9a..3776adb 100644 --- a/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp +++ b/inst/include/ensmallen_bits/sgdr/cyclical_decay.hpp @@ -129,7 +129,7 @@ class CyclicalDecay { // n_t = n_min^i + 0.5(n_max^i - n_min^i)(1 + cos(T_cur/T_i * pi)). 
stepSize = 0.5 * parent.constStepSize * - (1 + cos((parent.batchRestart / parent.epochBatches) + (1 + std::cos((parent.batchRestart / parent.epochBatches) * arma::datum::pi)); // Keep track of the number of batches since the last restart. diff --git a/inst/include/ensmallen_bits/sgdr/sgdr.hpp b/inst/include/ensmallen_bits/sgdr/sgdr.hpp index de2f6c2..30faaf2 100644 --- a/inst/include/ensmallen_bits/sgdr/sgdr.hpp +++ b/inst/include/ensmallen_bits/sgdr/sgdr.hpp @@ -103,7 +103,7 @@ class SGDR typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp b/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp index defbad8..13e840e 100644 --- a/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp +++ b/inst/include/ensmallen_bits/sgdr/sgdr_impl.hpp @@ -49,8 +49,8 @@ SGDR::SGDR( template template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SGDR::Optimize( SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp b/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp index a318c02..7f221b7 100644 --- a/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp +++ b/inst/include/ensmallen_bits/sgdr/snapshot_ensembles.hpp @@ -155,7 +155,7 @@ class SnapshotEnsembles { // n_t = n_min^i + 0.5(n_max^i - n_min^i)(1 + cos(T_cur/T_i * pi)). stepSize = 0.5 * parent.constStepSize * - (1 + cos((parent.batchRestart / parent.epochBatches) + (1 + std::cos((parent.batchRestart / parent.epochBatches) * arma::datum::pi)); // Keep track of the number of batches since the last restart. 
diff --git a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp index 0187584..4a30b67 100644 --- a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp +++ b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr.hpp @@ -120,7 +120,7 @@ class SnapshotSGDR typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, @@ -169,15 +169,37 @@ class SnapshotSGDR //! Modify whether or not the actual objective is calculated. bool& ExactObjective() { return optimizer.ExactObjective(); } - //! Get the snapshots. - std::vector Snapshots() const + // Get the snapshots. The template parameters must be the same as the last + // call to Optimize()! + template + std::vector Snapshots() const { - return optimizer.DecayPolicy().Snapshots(); + if (!optimizer.InstDecayPolicy().template Has< + SnapshotEnsembles::Policy>()) + { + throw std::runtime_error("SnapshotSGDR::Snapshots(): got unexpected type;" + " make sure to call with the same matrix type as the previous " + "optimization!"); + } + + return optimizer.InstDecayPolicy().template As< + SnapshotEnsembles::Policy>().Snapshots(); } - //! Modify the snapshots. - std::vector& Snapshots() + // Modify the snapshots. The template parameters must be the same as the last + // call to Optimize()! + template + std::vector& Snapshots() { - return optimizer.DecayPolicy().Snapshots(); + if (!optimizer.InstDecayPolicy().template Has< + SnapshotEnsembles::Policy>()) + { + throw std::runtime_error("SnapshotSGDR::Snapshots(): got unexpected type;" + " make sure to call with the same matrix type as the previous " + "optimization!"); + } + + return optimizer.InstDecayPolicy().template As< + SnapshotEnsembles::Policy>().Snapshots(); } //! Get whether or not to accumulate the snapshots. 
diff --git a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp index 38f39c1..fd13c53 100644 --- a/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp +++ b/inst/include/ensmallen_bits/sgdr/snapshot_sgdr_impl.hpp @@ -57,8 +57,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SnapshotSGDR::Optimize( SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/smorms3/smorms3.hpp b/inst/include/ensmallen_bits/smorms3/smorms3.hpp index e45f5ed..d603c96 100644 --- a/inst/include/ensmallen_bits/smorms3/smorms3.hpp +++ b/inst/include/ensmallen_bits/smorms3/smorms3.hpp @@ -92,14 +92,12 @@ class SMORMS3 typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, CallbackTypes&&... callbacks) { - // TODO: disallow sp_mat - return optimizer.Optimize(function, iterate, std::forward(callbacks)...); diff --git a/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp b/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp index 98f49b6..eb0520c 100644 --- a/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp +++ b/inst/include/ensmallen_bits/smorms3/smorms3_update.hpp @@ -37,15 +37,14 @@ class SMORMS3Update /** * Construct the SMORMS3 update policy with given epsilon parameter. * - * @param epsilon Value used to initialise the mean squared gradient - * parameter. + * @param epsilon Value used to avoid divisions by zero. */ SMORMS3Update(const double epsilon = 1e-16) : epsilon(epsilon) { /* Do nothing. */ } - //! Get the value used to initialise the mean squared gradient parameter. + // Get the value used to avoid divisions by zero. double Epsilon() const { return epsilon; } - //! 
Modify the value used to initialise the mean squared gradient parameter. + // Modify the value used to avoid divisions by zero. double& Epsilon() { return epsilon; } /** @@ -57,6 +56,8 @@ class SMORMS3Update class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -66,12 +67,17 @@ class SMORMS3Update * @param cols Number of columns in the gradient matrix. */ Policy(SMORMS3Update& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)) { // Initialise the parameters mem, g and g2. mem.ones(rows, cols); g.zeros(rows, cols); g2.zeros(rows, cols); + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -94,30 +100,30 @@ class SMORMS3Update g2 = (1 - r) % g2; g2 += r % (gradient % gradient); - MatType x = (g % g) / (g2 + parent.epsilon); - - x.transform( [stepSize](typename MatType::elem_type &v) - { return std::min(v, (typename MatType::elem_type) stepSize); } ); + MatType x = clamp((g % g) / (g2 + epsilon), ElemType(0), + ElemType(stepSize)); - iterate -= gradient % x / (arma::sqrt(g2) + parent.epsilon); + iterate -= gradient % x / (sqrt(g2) + epsilon); mem %= (1 - x); mem += 1; } private: - // Instantiated parent object. + //! Instantiated parent object. SMORMS3Update& parent; - // Memory parameter. + //! Memory parameter. MatType mem; - // Gradient estimate parameter. + //! Gradient estimate parameter. GradType g; - // Squared gradient estimate parameter. + //! Squared gradient estimate parameter. GradType g2; + // Epsilon value converted to the element type of the optimization. + ElemType epsilon; }; private: - //! The value used to initialise the mean squared gradient parameter. + // The value used to avoid divisions by zero. 
double epsilon; }; diff --git a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp index fad6f00..164f3c8 100644 --- a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp +++ b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd.hpp @@ -135,7 +135,7 @@ class SPALeRASGD typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp index e0cac7d..8706806 100644 --- a/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp +++ b/inst/include/ensmallen_bits/spalera_sgd/spalera_sgd_impl.hpp @@ -58,8 +58,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SPALeRASGD::Optimize( SeparableFunctionType& function, MatType& iterateIn, @@ -213,9 +213,12 @@ SPALeRASGD::Optimize( } // Reset the counter variables. - lastObjective = overallObjective; - overallObjective = 0; - currentFunction = 0; + if (i != actualMaxIterations) + { + lastObjective = overallObjective; + overallObjective = 0; + currentFunction = 0; + } terminate |= Callback::BeginEpoch(*this, f, iterate, epoch, overallObjective, callbacks...); diff --git a/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp b/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp index 7b8e7ac..ea39b1e 100644 --- a/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp +++ b/inst/include/ensmallen_bits/spalera_sgd/spalera_stepsize.hpp @@ -87,6 +87,8 @@ class SPALeRAStepsize class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. 
@@ -106,12 +108,19 @@ class SPALeRAStepsize mn(0), relaxedObjective(0), phCounter(0), - eveCounter(0) + eveCounter(0), + alpha(ElemType(parent.alpha)), + epsilon(ElemType(parent.epsilon)), + adaptRate(ElemType(parent.adaptRate)) { learningRates.ones(rows, cols); relaxedSums.zeros(rows, cols); parent.lambda = lambda; + + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -127,7 +136,7 @@ class SPALeRAStepsize * @return Stop or continue the learning process. */ bool Update(const double stepSize, - const typename MatType::elem_type objective, + const ElemType objective, const size_t batchSize, const size_t numFunctions, MatType& iterate, @@ -135,7 +144,7 @@ class SPALeRAStepsize { // The ratio of mini-batch size to training set size; needed for the // Page-Hinkley relaxed objective computations. - const double mbRatio = batchSize / (double) numFunctions; + const ElemType mbRatio = batchSize / (ElemType) numFunctions; // Page-Hinkley iteration, check if we have to reset the parameter and // adjust the step size. @@ -162,7 +171,7 @@ class SPALeRAStepsize mn = un; // If the condition is true we reset the parameter and update parameter. - if ((un - mn) > parent.lambda) + if ((un - mn) > ElemType(parent.lambda)) { // Backtracking, reset the parameter. iterate = previousIterate; @@ -172,7 +181,9 @@ class SPALeRAStepsize // Faster. learningRates /= 2; - if (arma::any(arma::vectorise(learningRates) <= 1e-15)) + constexpr const ElemType eps = + 10 * std::numeric_limits::epsilon(); + if (learningRates.min() <= eps) { // Stop because learning rate too low. 
return false; @@ -183,26 +194,26 @@ class SPALeRAStepsize } else { - const double paramMean = (parent.alpha / (2 - parent.alpha) * - (1 - std::pow(1 - parent.alpha, 2 * (eveCounter + 1)))) / + const ElemType paramMean = (alpha / (2 - alpha) * + (1 - std::pow(1 - alpha, ElemType(2 * (eveCounter + 1))))) / iterate.n_elem; - const double paramStd = (parent.alpha / std::sqrt(iterate.n_elem)) / - std::sqrt(iterate.n_elem); + const ElemType paramStd = + (alpha / std::sqrt(ElemType(iterate.n_elem))) / + std::sqrt(ElemType(iterate.n_elem)); - const typename MatType::elem_type normGradient = - std::sqrt(arma::accu(arma::pow(gradient, 2))); + const ElemType normGradient = std::sqrt(accu(square(gradient))); - relaxedSums *= (1 - parent.alpha); - if (normGradient > parent.epsilon) - relaxedSums += gradient * (parent.alpha / normGradient); + relaxedSums *= (1 - alpha); + if (normGradient > epsilon) + relaxedSums += gradient * (alpha / normGradient); - learningRates %= arma::exp((arma::pow(relaxedSums, 2) - paramMean) * - (parent.adaptRate / paramStd)); + learningRates %= exp((square(relaxedSums) - paramMean) * + (adaptRate / paramStd)); previousIterate = iterate; - iterate -= stepSize * (learningRates % gradient); + iterate -= ElemType(stepSize) * (learningRates % gradient); // Keep track of the the number of evaluations and Page-Hinkley steps. eveCounter++; @@ -216,25 +227,25 @@ class SPALeRAStepsize //! Instantiated parent object. SPALeRAStepsize& parent; - //! Page-Hinkley update parameter. - double mu0; + // Page-Hinkley update parameter. + ElemType mu0; - //! Page-Hinkley update parameter. - double un; + // Page-Hinkley update parameter. + ElemType un; - //! Page-Hinkley update parameter. - double mn; + // Page-Hinkley update parameter. + ElemType mn; - //! Page-Hinkley update parameter. - typename MatType::elem_type relaxedObjective; + // Page-Hinkley update parameter. + ElemType relaxedObjective; - //! Page-Hinkley step counter. + // Page-Hinkley step counter. 
size_t phCounter; - //! Evaluations step counter. + // Evaluations step counter. size_t eveCounter; - //! Locally-stored parameter wise learning rates. + // Locally-stored parameter wise learning rates. MatType learningRates; //! Locally-stored parameter wise sums. @@ -242,6 +253,11 @@ class SPALeRAStepsize //! Locally-stored previous parameter matrix (backtracking). MatType previousIterate; + + // Parameters converted to the element type of the optimization. + ElemType alpha; + ElemType epsilon; + ElemType adaptRate; }; private: diff --git a/inst/include/ensmallen_bits/spsa/spsa_impl.hpp b/inst/include/ensmallen_bits/spsa/spsa_impl.hpp index 4cb0102..9a0ebf5 100644 --- a/inst/include/ensmallen_bits/spsa/spsa_impl.hpp +++ b/inst/include/ensmallen_bits/spsa/spsa_impl.hpp @@ -45,7 +45,8 @@ typename MatType::elem_type SPSA::Optimize(ArbitraryFunctionType& function, { // Convenience typedefs. typedef typename MatType::elem_type ElemType; - typedef typename MatTypeTraits::BaseMatType BaseMatType; + typedef typename ForwardType::bmat BaseMatType; + typedef typename ForwardType::distr_param DistrParam; // Make sure that we have the methods that we need. traits::CheckArbitraryFunctionTypeAPI(); BaseMatType gradient(iterate.n_rows, iterate.n_cols); - arma::Mat spVector(iterate.n_rows, iterate.n_cols); + BaseMatType spVector(iterate.n_rows, iterate.n_cols); // To keep track of where we are and how things are going. ElemType overallObjective = 0; @@ -90,21 +91,20 @@ typename MatType::elem_type SPSA::Optimize(ArbitraryFunctionType& function, lastObjective = overallObjective; // Gain sequences. - const double akLocal = stepSize / std::pow(k + 1 + ak, alpha); - const double ck = evaluationStepSize / std::pow(k + 1, gamma); + const ElemType akLocal = ElemType(stepSize / std::pow(k + 1 + ak, alpha)); + const ElemType ck = ElemType(evaluationStepSize / std::pow(k + 1, gamma)); // Choose stochastic directions. 
- spVector = arma::conv_to>::from( - arma::randi(iterate.n_rows, iterate.n_cols, - arma::distr_param(0, 1))) * 2 - 1; + spVector = randi( + iterate.n_rows, iterate.n_cols, DistrParam(0, 1)) * 2 - 1; iterate += ck * spVector; - const double fPlus = function.Evaluate(iterate); + const ElemType fPlus = function.Evaluate(iterate); terminate |= Callback::Evaluate(*this, function, iterate, fPlus, callbacks...); iterate -= 2 * ck * spVector; - const double fMinus = function.Evaluate(iterate); + const ElemType fMinus = function.Evaluate(iterate); terminate |= Callback::Evaluate(*this, function, iterate, fMinus, callbacks...); diff --git a/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp b/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp index 49c437e..0937741 100644 --- a/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp +++ b/inst/include/ensmallen_bits/svrg/barzilai_borwein_decay.hpp @@ -70,11 +70,20 @@ class BarzilaiBorweinDecay class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. */ - Policy(BarzilaiBorweinDecay& parent) : parent(parent) { /* Do nothing. */ } + Policy(BarzilaiBorweinDecay& parent) : + parent(parent), + epsilon(ElemType(parent.epsilon)) + { + // Attempt to detect underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); + } /** * Barzilai-Borwein update step for SVRG. @@ -96,9 +105,9 @@ class BarzilaiBorweinDecay if (!fullGradient0.is_empty()) { // Step size selection based on Barzilai-Borwein (BB). 
- stepSize = std::pow(arma::norm(iterate - iterate0), 2.0) / - (arma::dot(iterate - iterate0, fullGradient - fullGradient0) + - parent.epsilon) / (double) numBatches; + stepSize = std::pow(norm(iterate - iterate0), ElemType(2)) / + (dot(iterate - iterate0, fullGradient - fullGradient0) + + epsilon) / (ElemType) numBatches; stepSize = std::min(stepSize, parent.maxStepSize); } @@ -107,11 +116,14 @@ class BarzilaiBorweinDecay } private: - //! Reference to instantiated parent object. + // Reference to instantiated parent object. BarzilaiBorweinDecay& parent; - //! Locally-stored full gradient. + // Locally-stored full gradient. GradType fullGradient0; + + // Copy of epsilon parameter casted to the optimization element type. + ElemType epsilon; }; //! The value used for numerical stability. diff --git a/inst/include/ensmallen_bits/svrg/svrg.hpp b/inst/include/ensmallen_bits/svrg/svrg.hpp index 0c121c7..d451fc2 100644 --- a/inst/include/ensmallen_bits/svrg/svrg.hpp +++ b/inst/include/ensmallen_bits/svrg/svrg.hpp @@ -143,7 +143,7 @@ class SVRGType typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/svrg/svrg_impl.hpp b/inst/include/ensmallen_bits/svrg/svrg_impl.hpp index 032bfd4..65685ff 100644 --- a/inst/include/ensmallen_bits/svrg/svrg_impl.hpp +++ b/inst/include/ensmallen_bits/svrg/svrg_impl.hpp @@ -55,8 +55,8 @@ template -typename std::enable_if::value, -typename MatType::elem_type>::type +typename std::enable_if::value, + typename MatType::elem_type>::type SVRGType::Optimize( SeparableFunctionType& functionIn, MatType& iterateIn, @@ -189,7 +189,7 @@ SVRGType::Optimize( f += effectiveBatchSize; } - fullGradient /= (double) numFunctions; + fullGradient /= (ElemType) numFunctions; if (terminate) break; diff --git a/inst/include/ensmallen_bits/svrg/svrg_update.hpp b/inst/include/ensmallen_bits/svrg/svrg_update.hpp index 09dacbe..7da343d 100644 --- a/inst/include/ensmallen_bits/svrg/svrg_update.hpp +++ b/inst/include/ensmallen_bits/svrg/svrg_update.hpp @@ -62,8 +62,8 @@ class SVRGUpdate const double stepSize) { // Perform the vanilla SVRG update. - iterate -= stepSize * (fullGradient + (gradient - gradient0) / - (double) batchSize); + iterate -= typename MatType::elem_type(stepSize) * + (fullGradient + (gradient - gradient0) / batchSize); } }; }; diff --git a/inst/include/ensmallen_bits/swats/swats.hpp b/inst/include/ensmallen_bits/swats/swats.hpp index 1230d19..0fe5ba9 100644 --- a/inst/include/ensmallen_bits/swats/swats.hpp +++ b/inst/include/ensmallen_bits/swats/swats.hpp @@ -98,7 +98,7 @@ class SWATS typename MatType, typename GradType, typename... 
CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/swats/swats_update.hpp b/inst/include/ensmallen_bits/swats/swats_update.hpp index 0a80a77..dbc3672 100644 --- a/inst/include/ensmallen_bits/swats/swats_update.hpp +++ b/inst/include/ensmallen_bits/swats/swats_update.hpp @@ -96,6 +96,8 @@ class SWATSUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -106,12 +108,21 @@ class SWATSUpdate */ Policy(SWATSUpdate& parent, const size_t rows, const size_t cols) : parent(parent), - iteration(0) + iteration(0), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)), + sgdRate(ElemType(parent.sgdRate)), + sgdLambda(ElemType(parent.sgdLambda)) { m.zeros(rows, cols); v.zeros(rows, cols); sgdV.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon != 0.0) + epsilon = 10 * std::numeric_limits::epsilon(); } /** @@ -132,35 +143,37 @@ class SWATSUpdate { // Note we reuse the exponential moving average parameter here instead // of introducing a new parameter (sgdV) as done in the paper. 
- v *= parent.beta1; + v *= beta1; v += gradient; - iterate -= (1 - parent.beta1) * parent.sgdRate * v; + iterate -= (1 - beta1) * sgdRate * v; return; } - m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - v *= parent.beta2; - v += (1 - parent.beta2) * (gradient % gradient); + v *= beta2; + v += (1 - beta2) * (gradient % gradient); - const double biasCorrection1 = 1.0 - std::pow(parent.beta1, iteration); - const double biasCorrection2 = 1.0 - std::pow(parent.beta2, iteration); + const ElemType biasCorrection1 = 1 - std::pow(beta1, ElemType(iteration)); + const ElemType biasCorrection2 = 1 - std::pow(beta2, ElemType(iteration)); - GradType delta = stepSize * m / biasCorrection1 / - (arma::sqrt(v / biasCorrection2) + parent.epsilon); + GradType delta = ElemType(stepSize) * m / biasCorrection1 / + (sqrt(v / biasCorrection2) + epsilon); iterate -= delta; - const double deltaGradient = arma::dot(delta, gradient); - if (deltaGradient != 0) + const ElemType deltaGradient = dot(delta, gradient); + if (deltaGradient != ElemType(0)) { - const double rate = arma::dot(delta, delta) / deltaGradient; - parent.sgdLambda = parent.beta2 * parent.sgdLambda + - (1 - parent.beta2) * rate; - parent.sgdRate = parent.sgdLambda / biasCorrection2; + const ElemType rate = dot(delta, delta) / deltaGradient; + sgdLambda = beta2 * sgdLambda + (1 - beta2) * rate; + sgdRate = sgdLambda / biasCorrection2; - if (std::abs(parent.sgdRate - rate) < parent.epsilon && iteration > 1) + parent.sgdLambda = (double) sgdLambda; + parent.sgdRate = (double) sgdRate; + + if (std::abs(sgdRate - rate) < epsilon && iteration > 1) { parent.phaseSGD = true; v.zeros(); @@ -169,39 +182,46 @@ class SWATSUpdate } private: - //! Reference to instantiated parent object. + // Reference to instantiated parent object. SWATSUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; - //! 
The exponential moving average of squared gradient values (Adam). + // The exponential moving average of squared gradient values (Adam). GradType v; - //! The exponential moving average of squared gradient values (SGD). + // The exponential moving average of squared gradient values (SGD). GradType sgdV; - //! The number of iterations. + // The number of iterations. size_t iteration; + + // Parameters casted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; + ElemType sgdRate; + ElemType sgdLambda; }; private: - //! The epsilon value used to initialise the squared gradient parameter. + // The epsilon value used to initialise the squared gradient parameter. double epsilon; - //! The smoothing parameter. + // The smoothing parameter. double beta1; - //! The second moment coefficient. + // The second moment coefficient. double beta2; - //! Wether to use the SGD or Adam update rule. + // Whether to use the SGD or Adam update rule. bool phaseSGD; - //! SGD scaling parameter. + // SGD scaling parameter. double sgdRate; - //! SGD learning rate. + // SGD learning rate. double sgdLambda; }; diff --git a/inst/include/ensmallen_bits/utility/detect_callbacks.hpp b/inst/include/ensmallen_bits/utility/detect_callbacks.hpp new file mode 100644 index 0000000..1d87562 --- /dev/null +++ b/inst/include/ensmallen_bits/utility/detect_callbacks.hpp @@ -0,0 +1,41 @@ +/** + * @file ensmallen_bits/utility/detect_callbacks.hpp + * @author Ryan Curtin + * + * This provides the IsAllNonMatrix utility struct, meant to be used with SFINAE + * to ensure that template arguments are only non-Armadillo classes. (This does + * not actually check that callback functions are implemented!) + * + * mlpack is free software; you may redistribute it and/or modify it under the + * terms of the 3-clause BSD license. You should have received a copy of the + * 3-clause BSD license along with mlpack. 
If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENS_CORE_UTIL_DETECT_CALLBACKS_HPP +#define ENS_CORE_UTIL_DETECT_CALLBACKS_HPP + +namespace ens { + +template +struct IsAllNonMatrix; + +template +struct IsAllNonMatrix +{ + constexpr static bool tIsClass = std::is_class::type>::type>::value; + + constexpr static bool value = + tIsClass && !IsMatrixType::value && + IsAllNonMatrix::value; +}; + +template<> +struct IsAllNonMatrix<> +{ + constexpr static bool value = true; +}; + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/utility/arma_traits.hpp b/inst/include/ensmallen_bits/utility/function_traits.hpp similarity index 58% rename from inst/include/ensmallen_bits/utility/arma_traits.hpp rename to inst/include/ensmallen_bits/utility/function_traits.hpp index 0555ea5..4c94d23 100644 --- a/inst/include/ensmallen_bits/utility/arma_traits.hpp +++ b/inst/include/ensmallen_bits/utility/function_traits.hpp @@ -18,6 +18,35 @@ namespace ens { // Structs have public members by default (that's why they are chosen over // classes). +template struct IsArmaType; +template struct IsCootType; + +/** + * If value == true, then MatType is a matrix type matching the Armadillo API + * that is supported by ensmallen. + */ +template +struct IsMatrixType +{ + const static bool value = IsArmaType::value || + IsCootType::value; +}; + +/** + * If value == true, then MatType is an Armadillo sparse matrix. + */ +template +struct IsSparseMatrixType +{ + const static bool value = false; +}; + +template +struct IsSparseMatrixType> +{ + const static bool value = true; +}; + /** * If value == true, then MatType is some sort of Armadillo vector or subview. * You might use this struct like this: @@ -40,58 +69,48 @@ struct IsArmaType const static bool value = false; }; -// Commenting out the first template per case, because -// Visual Studio doesn't like this instantiaion pattern (error C2910). 
-// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { @@ -110,34 +129,104 @@ struct IsArmaType > const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; -// template<> template struct IsArmaType > { const static bool value = true; }; +/** + * If value == true, then MatType is some sort of Bandicoot vector or subview. + * You might use this struct like this: + * + * @code + * // Only accepts VecTypes that are actually Bandicoot vector types. + * template + * void Function(const MatType& argumentA, + * typename std::enable_if_t::value>* = 0); + * @endcode + * + * The use of the enable_if_t object allows the compiler to instantiate + * Function() only if VecType is one of the Bandicoot vector types. It has a + * default argument because it isn't meant to be used in either the function + * call or the function body. 
+ */ +template +struct IsCootType +{ + const static bool value = false; +}; + +#ifdef ENS_HAVE_COOT + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +template +struct IsCootType > +{ + const static bool value = true; +}; + +#endif + template struct tuple_element; diff --git a/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp b/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp index d33236d..e627689 100644 --- a/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp +++ b/inst/include/ensmallen_bits/utility/indicators/epsilon.hpp @@ -20,11 +20,11 @@ namespace ens { /** * The epsilon indicator is one of the binary quality indicators that was proposed by - * Zitzler et. al.. The indicator originally calculates a weak dominance relation - * between two approximation sets. It returns "epsilon" which is the factor by which - * the given approximation set is worse than the reference front with respect to + * Zitzler et. al.. The indicator originally calculates a weak dominance relation + * between two approximation sets. It returns "epsilon" which is the factor by which + * the given approximation set is worse than the reference front with respect to * all the objectives. - * + * * \f[ I_{\epsilon}(A,B) = \max_{z^2 \in B} \ * \min_{z^1 \in A} \ * \max_{1 \leq i \leq n} \ \frac{z^1_i}{z^2_i}\ @@ -43,49 +43,50 @@ namespace ens { * } * @endcode */ - class Epsilon - { - public: - /** - * Default constructor does nothing, but is required to satisfy the Indicator - * policy. 
- */ - Epsilon() { } +class Epsilon +{ + public: + /** + * Default constructor does nothing, but is required to satisfy the Indicator + * policy. + */ + Epsilon() { } - /** - * Find the epsilon value of the front with respect to the given reference - * front. - * - * @tparam CubeType The cube data type of front. - * @param front The given approximation front. - * @param referenceFront The given reference front. - * @return The epsilon value of the front. - */ - template<typename CubeType> - static typename CubeType::elem_type Evaluate(const CubeType& front, - const CubeType& referenceFront) + /** + * Find the epsilon value of the front with respect to the given reference + * front. + * + * @tparam CubeType The cube data type of front. + * @param front The given approximation front. + * @param referenceFront The given reference front. + * @return The epsilon value of the front. + */ + template<typename CubeType> + static typename CubeType::elem_type Evaluate(const CubeType& front, + const CubeType& referenceFront) + { + // Convenience typedefs. + typedef typename CubeType::elem_type ElemType; + ElemType eps = 0; + for (size_t i = 0; i < referenceFront.n_slices; i++) { - // Convenience typedefs. - typedef typename CubeType::elem_type ElemType; - ElemType eps = 0; - for (size_t i = 0; i < referenceFront.n_slices; i++) + ElemType epsjMin = std::numeric_limits<ElemType>::max(); + for (size_t j = 0; j < front.n_slices; j++) { - ElemType epsjMin = std::numeric_limits<ElemType>::max(); - for (size_t j = 0; j < front.n_slices; j++) - { - arma::Mat<ElemType> frontRatio = front.slice(j) / referenceFront.slice(i); - frontRatio.replace(arma::datum::inf, -1.); // Handle zero division case. - ElemType epsj = frontRatio.max(); - if (epsj < epsjMin) - epsjMin = epsj; - } - if (epsjMin > eps) - eps = epsjMin; + arma::Mat<ElemType> frontRatio = front.slice(j) / + referenceFront.slice(i); + frontRatio.replace(arma::datum::inf, -1.); // Handle zero division case.
+ ElemType epsj = frontRatio.max(); + if (epsj < epsjMin) + epsjMin = epsj; } - - return eps; + if (epsjMin > eps) + eps = epsjMin; } - }; + + return eps; + } +}; } // namespace ens diff --git a/inst/include/ensmallen_bits/utility/indicators/igd.hpp b/inst/include/ensmallen_bits/utility/indicators/igd.hpp index 3adf1d7..3c309b2 100644 --- a/inst/include/ensmallen_bits/utility/indicators/igd.hpp +++ b/inst/include/ensmallen_bits/utility/indicators/igd.hpp @@ -18,8 +18,8 @@ namespace ens { /** * The inverted generational distance( IGD) is a metric for assessing the quality * of approximations to the Pareto front obtained by multi-objective optimization - * algorithms.The IGD indicator returns the average distance from each point in - * the reference front to the nearest point to it's solution. + * algorithms. The IGD indicator returns the average distance from each point in + * the reference front to the nearest point to its solution. * * \f[ d(z,a) = \sqrt{\sum_{i = 1}^{n}(a_i - z_i)^2 \ } \ * \f] @@ -28,67 +28,73 @@ namespace ens { * * @code * @inproceedings{coello2004study, - * title={A study of the parallelization of a coevolutionary multi-objective evolutionary algorithm}, - * author={Coello Coello, Carlos A and Reyes Sierra, Margarita}, - * booktitle={MICAI 2004: Advances in Artificial Intelligence: Third Mexican International Conference on Artificial Intelligence, Mexico City, Mexico, April 26-30, 2004. Proceedings 3}, - * pages={688--697}, - * year={2004}, - * organization={Springer} + * title = {A study of the parallelization of a coevolutionary + * multi-objective evolutionary algorithm}, + * author = {Coello Coello, Carlos A and Reyes Sierra, Margarita}, + * booktitle = {MICAI 2004: Advances in Artificial Intelligence: Third + * Mexican International Conference on Artificial + * Intelligence, Mexico City, Mexico, April 26-30, + * 2004.
Proceedings 3}, + * pages = {688--697}, + * year = {2004}, + * organization = {Springer} * } * @endcode */ - class IGD +class IGD +{ + public: + /** + * Default constructor does nothing, but is required to satisfy the Indicator + * policy. + */ + IGD() { /* Nothing to do here. */ } + + /** + * Find the IGD value of the front with respect to the given reference + * front. + * + * @tparam CubeType The cube data type of front. + * @param front The given approximation front. + * @param referenceFront The given reference front. + * @param p The power constant in the distance formula. + * @return The IGD value of the front. + */ + template + static typename CubeType::elem_type Evaluate(const CubeType& front, + const CubeType& referenceFront, + double p) { - public: - /** - * Default constructor does nothing, but is required to satisfy the Indicator - * policy. - */ - IGD() { } + // Convenience typedefs. + typedef typename CubeType::elem_type ElemType; - /** - * Find the IGD value of the front with respect to the given reference - * front. - * - * @tparam CubeType The cube data type of front. - * @param front The given approximation front. - * @param referenceFront The given reference front. - * @param p The power constant in the distance formula. - * @return The IGD value of the front. - */ - template - static typename CubeType::elem_type Evaluate(const CubeType& front, - const CubeType& referenceFront, - double p) + ElemType igd = 0; + for (size_t i = 0; i < referenceFront.n_slices; i++) { - // Convenience typedefs. 
- typedef typename CubeType::elem_type ElemType; - ElemType igd = 0; - for (size_t i = 0; i < referenceFront.n_slices; i++) + ElemType min = std::numeric_limits<ElemType>::max(); + for (size_t j = 0; j < front.n_slices; j++) { - ElemType min = std::numeric_limits<ElemType>::max(); - for (size_t j = 0; j < front.n_slices; j++) + ElemType dist = 0; + for (size_t k = 0; k < front.slice(j).n_rows; k++) { - ElemType dist = 0; - for (size_t k = 0; k < front.slice(j).n_rows; k++) - { - ElemType z = referenceFront(k, 0, i); - ElemType a = front(k, 0, j); - // Assuming minimization of all objectives. - //! IGD does not clip negative differences to 0 - dist += std::pow(a - z, 2); - } - dist = std::sqrt(dist); - if (dist < min) - min = dist; + ElemType z = referenceFront(k, 0, i); + ElemType a = front(k, 0, j); + // Assuming minimization of all objectives. + // IGD does not clip negative differences to 0. + dist += std::pow(a - z, 2); } - igd += std::pow(min,p); + dist = std::sqrt(dist); + if (dist < min) + min = dist; } - igd /= referenceFront.n_slices; - igd = std::pow(igd, 1.0 / p); - return igd; + igd += std::pow(min, p); } - }; + igd /= referenceFront.n_slices; + igd = std::pow(igd, 1.0 / p); + + return igd; + } +}; } // namespace ens diff --git a/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp b/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp index d8f674e..9ec887d 100644 --- a/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp +++ b/inst/include/ensmallen_bits/utility/indicators/igd_plus.hpp @@ -39,55 +39,56 @@ namespace ens { * } * @endcode */ - class IGDPlus +class IGDPlus +{ + public: + /** + * Default constructor does nothing, but is required to satisfy the Indicator + * policy. + */ + IGDPlus() { /* Nothing to do here. */ } + + /** + * Find the IGD+ value of the front with respect to the given reference + * front. + * + * @tparam CubeType The cube data type of front. + * @param front The given approximation front.
+ * @param referenceFront The given reference front. + * @return The IGD value of the front. + */ + template + static typename CubeType::elem_type Evaluate(const CubeType& front, + const CubeType& referenceFront) { - public: - /** - * Default constructor does nothing, but is required to satisfy the Indicator - * policy. - */ - IGDPlus() { } + // Convenience typedefs. + typedef typename CubeType::elem_type ElemType; - /** - * Find the IGD+ value of the front with respect to the given reference - * front. - * - * @tparam CubeType The cube data type of front. - * @param front The given approximation front. - * @param referenceFront The given reference front. - * @return The IGD value of the front. - */ - template - static typename CubeType::elem_type Evaluate(const CubeType& front, - const CubeType& referenceFront) + ElemType igd = 0; + for (size_t i = 0; i < referenceFront.n_slices; i++) { - // Convenience typedefs. - typedef typename CubeType::elem_type ElemType; - ElemType igd = 0; - for (size_t i = 0; i < referenceFront.n_slices; i++) + ElemType min = std::numeric_limits::max(); + for (size_t j = 0; j < front.n_slices; j++) { - ElemType min = std::numeric_limits::max(); - for (size_t j = 0; j < front.n_slices; j++) + ElemType dist = 0; + for (size_t k = 0; k < front.slice(j).n_rows; k++) { - ElemType dist = 0; - for (size_t k = 0; k < front.slice(j).n_rows; k++) - { - ElemType z = referenceFront(k, 0, i); - ElemType a = front(k, 0, j); - // Assuming minimization of all objectives. - dist += std::pow(std::max(a - z, 0), 2); - } - dist = std::sqrt(dist); - if (dist < min) - min = dist; + ElemType z = referenceFront(k, 0, i); + ElemType a = front(k, 0, j); + // Assuming minimization of all objectives. 
+ dist += std::pow(std::max<ElemType>(a - z, 0), 2); } - igd += min; + dist = std::sqrt(dist); + if (dist < min) + min = dist; } - igd /= referenceFront.n_slices; - - return igd; + igd += min; } - }; + igd /= referenceFront.n_slices; + + return igd; + } +}; } // namespace ens diff --git a/inst/include/ensmallen_bits/utility/proxies.hpp b/inst/include/ensmallen_bits/utility/proxies.hpp new file mode 100644 index 0000000..418f3a6 --- /dev/null +++ b/inst/include/ensmallen_bits/utility/proxies.hpp @@ -0,0 +1,164 @@ +/** + * @file proxies.hpp + * @author Marcus Edel + * + * Simple proxies that, based on the data type, forward to `coot` or `arma`. + * + * ensmallen is free software; you may redistribute it and/or modify it under + * the terms of the 3-clause BSD license. You should have received a copy of + * the 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information. + */ +#ifndef ENSMALLEN_UTILITY_PROXIES_HPP +#define ENSMALLEN_UTILITY_PROXIES_HPP + +#include "function_traits.hpp" + +namespace ens { + +template<typename ElemType, bool UseCoot> +struct ForwardTypeHelper; + +/** + * Helper struct that, based on the data type `MatType`, forwards to the + * corresponding `coot` or `arma` types. For example: + * If `MatType` is an `arma::mat`, then `ForwardType<MatType>::bmat` + * will be an `arma::Mat<double>`. + * If `MatType` is a `coot::mat`, then `ForwardType<MatType>::bmat` + * will be a `coot::Mat<double>`. + * + * This allows for writing generic code that can work with both `coot` and + * `arma` types without needing to know which library is being used at compile + * time. + */ +template<typename MatType> +struct ForwardType : public ForwardTypeHelper<typename MatType::elem_type, IsCootType<MatType>::value> { }; + +// Internal helper class that sets the typedefs to Armadillo types if Bandicoot +// is not available or not in use. +template<typename ElemType> +struct ForwardTypeHelper<ElemType, false> +{ + // `uword` is a typedef for an unsigned integer type; it is used for matrix + // indices as well as all internal counters and loops.
+ typedef arma::uword uword; + + // `vec` is a typedef for column vectors (dense matrices with one column). + typedef arma::vec vec; + + // `bvec` (base vector) is a typedef for a vector type, in comparison to + // `vec`, `bvec` uses the given element type `ElemType`. + typedef arma::Col<ElemType> bvec; + + // `bcol` (base col) is a typedef for a column vector type, in comparison to + // `col`, `bcol` uses the given element type `ElemType`. + typedef arma::Col<ElemType> bcol; + + // `brow` (base row) is a typedef for a row vector type, in comparison to + // `row`, `brow` uses the given element type `ElemType`. + typedef arma::Row<ElemType> brow; + + // `mat` is a typedef for dense matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef arma::mat mat; + + // `bmat` (base matrix) is a typedef for a matrix type, in comparison to + // `mat`, `bmat` uses the given element type `ElemType`. + typedef arma::Mat<ElemType> bmat; + + // `cube` is a typedef for 3D matrices (cubes), with elements stored in + // column-major ordering (ie. column by column, then page by page). + typedef arma::cube cube; + + // `bcube` (base cube) is a typedef for a cube type, in comparison to `cube`, + // `bcube` uses the given element type `ElemType`. + typedef arma::Cube<ElemType> bcube; + + // `umat` is a typedef for unsigned integer matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef arma::umat umat; + + // `uvec` is a typedef for unsigned integer vectors (dense matrices with one + // column). + typedef arma::uvec uvec; + + // `ucolvec` is a typedef for unsigned integer column vectors (dense matrices + // with one column). + typedef arma::ucolvec ucolvec; + + // `urowvec` is a typedef for unsigned integer row vectors (dense matrices + // with one row). + typedef arma::urowvec urowvec; + + // `distr_param` is a typedef for the distribution parameters used in + // random number generation.
+ typedef arma::distr_param distr_param; +}; + +// Internal helper class that sets the typedefs to Bandicoot types if Bandicoot +// is available and in use. +#ifdef ENS_HAVE_COOT +template<typename ElemType> +struct ForwardTypeHelper<ElemType, true> +{ + // `uword` is a typedef for an unsigned integer type; it is used for matrix + // indices as well as all internal counters and loops. + typedef coot::uword uword; + + // `vec` is a typedef for column vectors (dense matrices with one column). + typedef coot::vec vec; + + // `bvec` (base vector) is a typedef for a vector type, in comparison to + // `vec`, `bvec` uses the given element type `ElemType`. + typedef coot::Col<ElemType> bvec; + + // `bcol` (base col) is a typedef for a column vector type, in comparison to + // `col`, `bcol` uses the given element type `ElemType`. + typedef coot::Col<ElemType> bcol; + + // `brow` (base row) is a typedef for a row vector type, in comparison to + // `row`, `brow` uses the given element type `ElemType`. + typedef coot::Row<ElemType> brow; + + // `mat` is a typedef for dense matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef coot::mat mat; + + // `bmat` (base matrix) is a typedef for a matrix type, in comparison to + // `mat`, `bmat` uses the given element type `ElemType`. + typedef coot::Mat<ElemType> bmat; + + // `cube` is a typedef for 3D matrices (cubes), with elements stored in + // column-major ordering (ie. column by column, then page by page). + typedef coot::cube cube; + + // `bcube` (base cube) is a typedef for a cube type, in comparison to `cube`, + // `bcube` uses the given element type `ElemType`. + typedef coot::Cube<ElemType> bcube; + + // `umat` is a typedef for unsigned integer matrices, with elements stored in + // column-major ordering (ie. column by column). + typedef coot::umat umat; + + // `uvec` is a typedef for unsigned integer vectors (dense matrices with one + // column).
+ typedef coot::uvec uvec; + + // `ucolvec` is a typedef for unsigned integer column vectors (dense matrices + // with one column). + typedef coot::ucolvec ucolvec; + + // `urowvec` is a typedef for unsigned integer row vectors (dense matrices + // with one row). + typedef coot::urowvec urowvec; + + // `distr_param` is a typedef for the distribution parameters used in + // random number generation. + typedef coot::distr_param distr_param; +}; +#endif + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/utility/using.hpp b/inst/include/ensmallen_bits/utility/using.hpp new file mode 100644 index 0000000..6f11f20 --- /dev/null +++ b/inst/include/ensmallen_bits/utility/using.hpp @@ -0,0 +1,174 @@ +/** + * @file ensmallen_bits/utility/using.hpp + * @author Omar Shrit + * @author Ryan Curtin + * @author Conrad Sanderson + * + * This is a set of `using` statements to mitigate any possible risks or + * conflicts with local functions. The compiler is supposed to prioritise the + * following functions during name lookup. This is to be considered a + * replacement for the ADL solution that we had deployed earlier. + * + * ensmallen is free software; you may redistribute it and/or modify it under the + * terms of the 3-clause BSD license. You should have received a copy of the + * 3-clause BSD license along with ensmallen. If not, see + * http://www.opensource.org/licenses/BSD-3-Clause for more information.
+ */ +#ifndef ENS_CORE_UTIL_USING_HPP +#define ENS_CORE_UTIL_USING_HPP + +#include "function_traits.hpp" + +namespace ens { + +#ifdef ENS_HAVE_COOT + +/* using for bandicoot namespace*/ +using coot::abs; +using coot::accu; +using coot::chol; +using coot::clamp; +using coot::conv_to; +using coot::cos; +using coot::dot; +using coot::exp; +using coot::join_cols; +using coot::join_rows; +using coot::linspace; +using coot::log; +using coot::max; +using coot::mean; +using coot::min; +using coot::norm; +using coot::normalise; +using coot::ones; +using coot::pow; +using coot::randi; +using coot::randn; +using coot::randu; +using coot::regspace; +using coot::repmat; +using coot::shuffle; +using coot::sign; +using coot::size; +using coot::sort; +using coot::sort_index; +using coot::sqrt; +using coot::square; +using coot::sum; +using coot::trans; +using coot::vectorise; +using coot::zeros; + +#endif + +/* using for armadillo namespace */ +using arma::abs; +using arma::accu; +using arma::chol; +using arma::clamp; + +// If Bandicoot is used, using arma::conv_to is already +// part of including bandicoot. 
+#ifndef ENS_HAVE_COOT +using arma::conv_to; +#endif + +using arma::cos; +using arma::dot; +using arma::exp; +using arma::join_cols; +using arma::join_rows; +using arma::linspace; +using arma::log; +using arma::max; +using arma::mean; +using arma::min; +using arma::norm; +using arma::normalise; +using arma::ones; +using arma::pow; +using arma::randi; +using arma::randn; +using arma::randu; +using arma::regspace; +using arma::repmat; +using arma::shuffle; +using arma::sign; +using arma::size; +using arma::sort; +using arma::sort_index; +using arma::sqrt; +using arma::square; +using arma::sum; +using arma::trans; +using arma::vectorise; +using arma::zeros; + +template +struct GetFillTypeInternal +{ + // Default empty implementation +}; + +template +struct GetFillType : public GetFillTypeInternal::value, IsCootType::value> { }; + +// By default, assume that we are using an Armadillo object. +template +struct GetFillTypeInternal +{ + static constexpr const decltype(arma::fill::none)& none = arma::fill::none; + static constexpr const decltype(arma::fill::zeros)& zeros = arma::fill::zeros; + static constexpr const decltype(arma::fill::ones)& ones = arma::fill::ones; + static constexpr const decltype(arma::fill::randu)& randu = arma::fill::randu; + static constexpr const decltype(arma::fill::randn)& randn = arma::fill::randn; + static constexpr const decltype(arma::fill::eye)& eye = arma::fill::eye; +}; + +template +struct GetProxyTypeInternal +{ + // Default empty implementation +}; + +template +struct GetProxyType : public GetProxyTypeInternal::value, IsCootType::value> { }; + +// By default, assume that we are using an Armadillo object. +template +struct GetProxyTypeInternal +{ + using span = arma::span; + static constexpr const decltype(arma::span::all)& all = arma::span::all; +}; + +#ifdef ENS_HAVE_COOT +// If the matrix type is a Bandicoot type, use Bandicoot fill objects instead. 
+template< + typename MatType> +struct GetFillTypeInternal +{ + static constexpr const decltype(coot::fill::none)& none = coot::fill::none; + static constexpr const decltype(coot::fill::zeros)& zeros = coot::fill::zeros; + static constexpr const decltype(coot::fill::ones)& ones = coot::fill::ones; + static constexpr const decltype(coot::fill::randu)& randu = coot::fill::randu; + static constexpr const decltype(coot::fill::randn)& randn = coot::fill::randn; + static constexpr const decltype(coot::fill::eye)& eye = coot::fill::eye; +}; + +// If the matrix type is a Bandicoot type, use Bandicoot types instead. +template +struct GetProxyTypeInternal +{ + using span = coot::span; + static constexpr const decltype(coot::span::all)& all = coot::span::all; +}; + +#endif + +} // namespace ens + +#endif diff --git a/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp b/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp index 9b31a3f..89f2aa9 100644 --- a/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp +++ b/inst/include/ensmallen_bits/wn_grad/wn_grad.hpp @@ -87,7 +87,7 @@ class WNGrad typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, diff --git a/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp b/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp index 502058c..0306fe8 100644 --- a/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp +++ b/inst/include/ensmallen_bits/wn_grad/wn_grad_update.hpp @@ -56,6 +56,8 @@ class WNGradUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This is called by the optimizer method before the start of the iteration * update process. @@ -67,7 +69,8 @@ class WNGradUpdate Policy(WNGradUpdate& parent, const size_t /* rows */, const size_t /* cols */) : - parent(parent) + parent(parent), + b(ElemType(parent.b)) { /* Nothing to do. 
*/ } @@ -83,18 +86,21 @@ class WNGradUpdate const double stepSize, const GradType& gradient) { - parent.b += std::pow(stepSize, 2.0) / parent.b * - std::pow(arma::norm(gradient), 2); - iterate -= stepSize * gradient / parent.b; + b += std::pow(ElemType(stepSize), ElemType(2)) / b * + std::pow(norm(gradient), ElemType(2)); + parent.b = (double) b; + iterate -= ElemType(stepSize) * gradient / b; } private: - //! Reference to the instantiated parent object. + // Reference to the instantiated parent object. WNGradUpdate& parent; + // Learning rate adjustment using the element type of the optimization. + ElemType b; }; private: - //! Learning rate adjustment. + // Learning rate adjustment. double b; }; diff --git a/inst/include/ensmallen_bits/yogi/yogi.hpp b/inst/include/ensmallen_bits/yogi/yogi.hpp index 4529d24..8f7fbc9 100644 --- a/inst/include/ensmallen_bits/yogi/yogi.hpp +++ b/inst/include/ensmallen_bits/yogi/yogi.hpp @@ -1,6 +1,6 @@ /** * @file yogi.hpp - * @author Marcus Edel + * @author Marcus Edel * * Class wrapper for the Yogi update Policy. Yogi is based on Adam with more * fine grained effective learning rate control. @@ -42,7 +42,7 @@ namespace ens { * see the documentation on function types included with this distribution or * on the ensmallen website. */ -class Yogi +class Yogi { public: /** @@ -100,15 +100,15 @@ class Yogi typename MatType, typename GradType, typename... CallbackTypes> - typename std::enable_if::value, + typename std::enable_if::value, typename MatType::elem_type>::type Optimize(SeparableFunctionType& function, MatType& iterate, CallbackTypes&&... callbacks) { - return optimizer.Optimize(function, iterate, - std::forward(callbacks)...); + return optimizer.template Optimize< + SeparableFunctionType, MatType, GradType, CallbackTypes...>( + function, iterate, std::forward(callbacks)...); } //! Forward the MatType as GradType. @@ -176,7 +176,7 @@ class Yogi //! are reset before Optimize call. 
bool& ResetPolicy() { return optimizer.ResetPolicy(); } - private: + private: //! The Stochastic Gradient Descent object with Yogi policy. SGD<YogiUpdate> optimizer; }; diff --git a/inst/include/ensmallen_bits/yogi/yogi_update.hpp b/inst/include/ensmallen_bits/yogi/yogi_update.hpp index cdba28d..1c3693f 100644 --- a/inst/include/ensmallen_bits/yogi/yogi_update.hpp +++ b/inst/include/ensmallen_bits/yogi/yogi_update.hpp @@ -45,8 +45,6 @@ class YogiUpdate * parameter. * @param beta1 The smoothing parameter. * @param beta2 The second moment coefficient. - * @param v1 The first quasi-hyperbolic term. - * @param v1 The second quasi-hyperbolic term. */ YogiUpdate(const double epsilon = 1e-8, const double beta1 = 0.9, @@ -83,6 +81,8 @@ class YogiUpdate class Policy { public: + typedef typename MatType::elem_type ElemType; + /** * This constructor is called by the SGD Optimize() method before the start * of the iteration update process. @@ -92,10 +92,17 @@ * @param cols Number of columns in the gradient matrix. */ Policy(YogiUpdate& parent, const size_t rows, const size_t cols) : - parent(parent) + parent(parent), + epsilon(ElemType(parent.epsilon)), + beta1(ElemType(parent.beta1)), + beta2(ElemType(parent.beta2)) { m.zeros(rows, cols); v.zeros(rows, cols); + + // Attempt to catch underflow. + if (epsilon == ElemType(0) && parent.epsilon == 0.0) + epsilon = 10 * std::numeric_limits<ElemType>::epsilon(); } /** @@ -109,25 +116,30 @@ const double stepSize, const GradType& gradient) { - m *= parent.beta1; - m += (1 - parent.beta1) * gradient; + m *= beta1; + m += (1 - beta1) * gradient; - const MatType gSquared = arma::square(gradient); - v -= (1 - parent.beta2) * arma::sign(v - gSquared) % gSquared; + const MatType gSquared = square(gradient); + v -= (1 - beta2) * sign(v - gSquared) % gSquared; // Now update the iterate. - iterate -= stepSize * m / (arma::sqrt(v) + parent.epsilon); + iterate -= ElemType(stepSize) * m / (sqrt(v) + epsilon); } private: - //!
Instantiated parent object. + // Instantiated parent object. YogiUpdate& parent; - //! The exponential moving average of gradient values. + // The exponential moving average of gradient values. GradType m; // The exponential moving average of squared gradient values. GradType v; + + // Parameters converted to the element type of the optimization. + ElemType epsilon; + ElemType beta1; + ElemType beta2; }; private: diff --git a/tools/HISTORYold.md b/tools/HISTORYold.md index 5b33034..c9e759b 100644 --- a/tools/HISTORYold.md +++ b/tools/HISTORYold.md @@ -1,3 +1,79 @@ +### ensmallen 3.10.0: "Unexpected Rain" +###### 2025-09-25 + * SGD-like optimizers now all divide the step size by the batch size so that + step sizes don't need to be tuned in addition to batch sizes. If you require + behavior from ensmallen 2, define the `ENS_OLD_SEPARABLE_STEP_BEHAVIOR` macro + before including `ensmallen.hpp` + ([#431](https://github.com/mlpack/ensmallen/pull/431)). + + * Remove deprecated `ParetoFront()` and `ParetoSet()` from multi-objective + optimizers ([#435](https://github.com/mlpack/ensmallen/pull/435)). Instead, + pass objects to the `Optimize()` function; see the documentation for each + multi-objective optimizer for more details. A typical transition will change + code like: + + ```c++ + optimizer.Optimize(objectives, coordinates); + arma::cube paretoFront = optimizer.ParetoFront(); + arma::cube paretoSet = optimizer.ParetoSet(); + ``` + + to instead gather the Pareto front and set in the call: + + ```c++ + arma::cube paretoFront, paretoSet; + optimizer.Optimize(objectives, coordinates, paretoFront, paretoSet); + ``` + + * Remove deprecated constructor for Active CMA-ES that takes `lowerBound` and + `upperBound` ([#435](https://github.com/mlpack/ensmallen/pull/435)). + Instead, pass an instantiated `BoundaryBoxConstraint` to the constructor. 
A + typical transition will change code like: + + ```c++ + ActiveCMAES opt(lambda, + lowerBound, upperBound, ...); + ``` + + into + + ```c++ + ActiveCMAES opt(lambda, + BoundaryBoxConstraint(lowerBound, upperBound), ...); + ``` + * Add proximal gradient optimizers for L1-constrained and other related + problems: `FBS`, `FISTA`, and `FASTA` + ([#427](https://github.com/mlpack/ensmallen/pull/427)). See the + documentation for more details. + + * The `Lambda()` and `Sigma()` functions of the `AugLagrangian` optimizer, + which could be used to retrieve the Lagrange multipliers and penalty + parameter after optimization, are now deprecated + ([#439](https://github.com/mlpack/ensmallen/pull/439)). Instead, pass a + vector and a double to the `Optimize()` function directly: + + ```c++ + augLag.Optimize(function, coordinates, lambda, sigma) + ``` + + and these will be filled with the final Lagrange multiplier estimates and + penalty parameters. + +### ensmallen 2.22.2: "E-Bike Excitement" +###### 2025-04-30 + * Fix include statement in `tests/de_test.cpp` + ([#419](https://github.com/mlpack/ensmallen/pull/419)). + + * Fix `exactObjective` output for SGD-like optimizers when the number of + iterations is an even number of epochs + ([#417](https://github.com/mlpack/ensmallen/pull/417)). + + * Increase tolerance in `demon_sgd_test.cpp` + ([#420](https://github.com/mlpack/ensmallen/pull/420)). + + * Set cmake version range to 3.5...4.0 + ([#422](https://github.com/mlpack/ensmallen/pull/422)). + ### ensmallen 2.22.1: "E-Bike Excitement" ###### 2024-12-02 * Remove unused variables to fix compiler warnings