Update docstring

lhnguyen-vn · lhnguyen-vn · commit 88ac22c1ae36 · 2021-07-18T00:28:07.000-04:00
diff --git a/src/strategies/basic.jl b/src/strategies/basic.jl
@@ -14,14 +14,14 @@ generated by referencing each member's and the swarm's best models so far.
 
 A single one-dimensional range or vector of one-dimensional ranges can be
 specified. `ParamRange` objects are constructed using the `range` method. If not
-paired with a prior, then one is fitted, as follows:
+paired with a prior, then one is fitted and truncated if bounded, as follows:
 
-| Range Types             | Default Distribution |
-|:----------------------- |:-------------------- |
-| `NominalRange`          | `Dirichlet`          |
-| Bounded `NumericRange`  | `Uniform`            |
-| Positive `NumericRange` | `Gamma`              |
-| Other `NumericRange`    | `Normal`             |
+| Range Types             | Default Distribution                        |
+|:----------------------- |:------------------------------------------- |
+| `NominalRange`          | `Dirichlet([1, 1, ..., 1])`                 |
+| Bounded `NumericRange`  | `Uniform(lower, upper)`                     |
+| Positive `NumericRange` | `Gamma(α=(origin/unit)^2, θ=unit^2/origin)` |
+| Other `NumericRange`    | `Normal(origin, unit)`                      |
 
 Specifically, in `ParticleSwarm`, the `range` field of a `TunedModel` instance
 can be:
@@ -68,21 +68,42 @@ each swarm particle. Velocity is initiated to be zeros, and in each iteration,
 every particle's position is updated to approach its personal best and the
 swarm's best models so far with the equations:
 
-\$vₖ₊₁ = w⋅vₖ + c₁⋅rand()⋅(pbest - x) + c₂⋅rand()⋅(gbest - x)\$
+\$vₖ₊₁ = w⋅vₖ + c₁⋅rand()⋅(pbest - xₖ) + c₂⋅rand()⋅(gbest - xₖ)\$
 
 \$xₖ₊₁ = xₖ + vₖ₊₁\$
 
 New models are then generated for evaluation by mutating the fields of a deep
 copy of `model`. If the corresponding range has a specified `scale` function,
-then the transformation is applied before the hyperparameter is returned. For
-integer `NumericRange`s, the hyperparameter is rounded; and for `NominalRange`s,
-the hyperparameter is sampled from the specified values with the probability
-weights given by each particle.
-
-Personal and social best models are then updated for the swarm. In order to
-replicate both the probability weights and the sampled value for `NominalRange`s
-of the best models, the weights of unselected values are shifted to the selected
-one by the `prob_shift` factor.
+then the transformation is applied before the hyperparameter is returned. If
+`scale` is a symbol (e.g., `:log`), it is ignored.
+
+### Discrete Hyperparameter Handling
+
+Since particle swarm is an optimization method for continuous problems, integer
+and nominal hyperparameters require special handling: they are converted to
+continuous values, and transformed back to their original domains at each step
+for evaluation.
+
+For integer `NumericRange`s, a continuous distribution is fitted to generate
+initial values for the swarm. They are then rounded when each particle is mapped
+to the corresponding candidate model.
+
+`NominalRange`s, on the other hand, are represented as categorical
+distributions over their values. Hence, we use Dirichlet prior distributions
+to initialize a probability vector for each particle, defaulting to the uniform
+distribution `Dirichlet([1, 1, ..., 1])`. The same velocity and position updates
+apply, but probability values are further clamped to the range [0, 1] and
+normalized to sum to 1. When a better model is found, we replicate both its
+probability vector and sampled value by shifting unchosen categories' weights
+towards the selected one for pbest and gbest models:
+
+\$pᵢ = (1 - prob_shift) * pᵢ\$
+
+\$pₛ = pₛ + prob_shift\$
+
+where pₛ is the probability of the sampled hyperparameter value and pᵢ that of
+each unselected value. For more information, refer to "A New Discrete Particle
+Swarm Optimization Algorithm" by Strasser, Goodman, Sheppard, and Butcher.
 """
 mutable struct ParticleSwarm{R<:AbstractRNG} <: AbstractParticleSwarm
     n_particles::Int