Commit f441b44

Differential entropy estimators: various fixes (#211)
* Update docs and conform to `Entropy` API
* Update tests
* Update changelog
* Conform to `Entropy` api. Fix scaling for order statistic estimators.
* Be explicit about default base 2 logarithm.
* Test default `Shannon(; base = 2)`
* Remove `base` from docstring
* Just compare all differential entropy estimators at once
* Just compare all differential entropy estimators at once
* Don't mention estimator that is not merged yet
1 parent c77d9ae commit f441b44

File tree: 20 files changed (+450 −292 lines)

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,8 @@ The API for Entropies.jl has been completely overhauled. Major changes are:
 - Common generic interfaces `entropy`, `entropy_normalized` and `maximum` (maximum entropy) that dispatches on different types of entropies (e.g `Renyi()` `Shannon()`, `Tsallis()`).
 - Convenience functions for common entropies, such as permutation entropy and dispersion entropy.
 - No more deprecation warnings for using the old keyword `α` for Renyi entropy.
+- The `base` of the entropy is now a field of the `Entropy` type, not the estimator.
+  You'll now have to do `entropy(Shannon(; base = 2), est, x)`.
 - An entirely new section of entropy-like complexity measures, such as the reverse dispersion entropy.
 - Many new estimators, such as `SpatialPermutation` and `PowerSpectrum`.
 - Check the online documentation for a comprehensive overview of the changes.
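
To make the changelog entry above concrete, here is a minimal sketch of the new call signature; the data and estimator choice below are illustrative placeholders, not taken from the package docs.

```julia
using Entropies
using DynamicalSystemsBase   # provides `Dataset`, as in the docs examples

x = Dataset(randn(10_000, 2))                   # any multivariate sample (illustrative)
est = Kraskov(; k = 3, w = 0)                   # estimators no longer carry a `base`
h_bits = entropy(Shannon(; base = 2), est, x)   # `base` is now a field of the entropy type
h_nats = entropy(Shannon(; base = MathConstants.e), est, x)
```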

docs/src/entropies.md

Lines changed: 1 addition & 3 deletions
@@ -53,9 +53,7 @@ rely on estimating some density functional.
 
 Each [`EntropyEstimator`](@ref)s uses a specialized technique to approximating relevant
 densities/integrals, and is often tailored to one or a few types of generalized entropy.
-For example, [`Kraskov`](@ref) estimates the [`Shannon`](@ref) entropy, while
-[`LeonenkoProzantoSavani`](@ref) estimates [`Shannon`](@ref), [`Renyi`](@ref), and
-[`Tsallis`](@ref) entropies.
+For example, [`Kraskov`](@ref) estimates the [`Shannon`](@ref) entropy.
 
 | Estimator | Principle | Input data | [`Shannon`](@ref) | [`Renyi`](@ref) | [`Tsallis`](@ref) | [`Kaniadakis`](@ref) | [`Curado`](@ref) | [`StretchedExponential`](@ref) |
 | ---------------------------- | ----------------- | ---------- | :---------------: | :-------------: | :---------------: | :------------------: | :--------------: | :----------------------------: |
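
A hedged sketch of what the compatibility table above encodes, assuming the `Renyi(; q, base)` keyword constructor: `Kraskov` accepts a `Shannon` target, while requesting a Rényi entropy with `q ≠ 1` throws an `ArgumentError`, per the `entropy(e::Renyi, est::Kraskov, ...)` method changed later in this commit.

```julia
using Entropies
using DynamicalSystemsBase   # provides `Dataset`

x = Dataset(randn(5_000, 2))
est = Kraskov(; k = 3, w = 0)
entropy(Shannon(), est, x)       # supported: Kraskov targets the Shannon entropy
entropy(Renyi(; q = 2), est, x)  # throws ArgumentError: only q = 1 is implemented
```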

docs/src/examples.md

Lines changed: 62 additions & 75 deletions
@@ -25,119 +25,105 @@ ax.zticklabelsvisible = false
 fig
 ```
 
-## Differential entropy: nearest neighbors estimators
+## Differential entropy: estimator comparison
 
-Here, we reproduce Figure 1 in Charzyńska & Gambin (2016)[^Charzyńska2016]. Their example
-demonstrates how the [`Kraskov`](@ref) and [`KozachenkoLeonenko`](@ref) nearest neighbor
-based estimators converge towards the true entropy value for increasing time series length.
-We extend their example with [`Zhu`](@ref) and [`ZhuSingh`](@ref) estimators, which are also
-based on nearest neighbor searches.
+Here, we compare how the nearest neighbor differential entropy estimators
+([`Kraskov`](@ref), [`KozachenkoLeonenko`](@ref), [`Zhu`](@ref) and [`ZhuSingh`](@ref))
+converge towards the true entropy value for increasing time series length.
 
-Input data are from a uniform 1D distribution ``U(0, 1)``, for which the true entropy is
-`ln(1 - 0) = 0`).
+Entropies.jl also provides entropy estimators based on
+[order statistics](https://en.wikipedia.org/wiki/Order_statistic). These estimators
+are only defined for scalar-valued vectors, in this example, so we compute these
+estimates separately, and add these estimators ([`Vasicek`](@ref), [`Ebrahimi`](@ref),
+[`AlizadehArghami`](@ref) and [`Correa`](@ref)) to the comparison.
+
+Input data are from a normal 1D distribution ``\mathcal{N}(0, 1)``, for which the true
+entropy is `0.5*log(2π) + 0.5` nats when using natural logarithms.
 
 ```@example MAIN
 using Entropies
 using DynamicalSystemsBase, CairoMakie, Statistics
-using Distributions: Uniform, Normal
+nreps = 30
+Ns = [100:100:500; 1000:1000:10000]
+e = Shannon(; base = MathConstants.e)
 
-# Define estimators
-base = MathConstants.e # shouldn't really matter here, because the target entropy is 0.
+# --------------------------
+# kNN estimators
+# --------------------------
 w = 0 # Theiler window of 0 (only exclude the point itself during neighbor searches)
-estimators = [
+knn_estimators = [
     # with k = 1, Kraskov is virtually identical to
     # Kozachenko-Leonenko, so pick a higher number of neighbors for Kraskov
     Kraskov(; k = 3, w),
     KozachenkoLeonenko(; w),
     Zhu(; k = 3, w),
     ZhuSingh(; k = 3, w),
 ]
-labels = ["KozachenkoLeonenko", "Kraskov", "Zhu", "ZhuSingh"]
 
 # Test each estimator `nreps` times over time series of varying length.
-nreps = 50
-Ns = [100:100:500; 1000:1000:10000]
-
-Hs_uniform = [[zeros(nreps) for N in Ns] for e in estimators]
-for (i, e) in enumerate(estimators)
+Hs_uniform_knn = [[zeros(nreps) for N in Ns] for e in knn_estimators]
+for (i, est) in enumerate(knn_estimators)
     for j = 1:nreps
-        pts = rand(Uniform(0, 1), maximum(Ns)) |> Dataset
+        pts = randn(maximum(Ns)) |> Dataset
         for (k, N) in enumerate(Ns)
-            Hs_uniform[i][k][j] = entropy(e, pts[1:N])
+            Hs_uniform_knn[i][k][j] = entropy(e, est, pts[1:N])
         end
     end
 end
 
-fig = Figure(resolution = (600, length(estimators) * 200))
-for (i, e) in enumerate(estimators)
-    Hs = Hs_uniform[i]
-    ax = Axis(fig[i,1]; ylabel = "h (nats)")
-    lines!(ax, Ns, mean.(Hs); color = Cycled(i), label = labels[i])
-    band!(ax, Ns, mean.(Hs) .+ std.(Hs), mean.(Hs) .- std.(Hs);
-        color = (Main.COLORS[i], 0.5))
-    ylims!(-0.25, 0.25)
-    axislegend()
-end
-
-fig
-```
-
-## Differential entropy: order statistics estimators
-
-Entropies.jl also provides entropy estimators based on
-[order statistics](https://en.wikipedia.org/wiki/Order_statistic). These estimators
-are only defined for scalar-valued vectors, so we pass the data as `Vector{<:Real}`s instead
-of `Dataset`s, as we did for the nearest-neighbor estimators above.
+# --------------------------
+# Order statistic estimators
+# --------------------------
 
-Here, we show how the [`Vasicek`](@ref), [`Ebrahimi`](@ref), [`AlizadehArghami`](@ref)
-and [`Correa`](@ref) direct [`Shannon`](@ref) entropy estimators, with increasing sample size,
-approach zero for samples from a uniform distribution on `[0, 1]`. The true entropy value in
-nats for this distribution is `ln(1 - 0) = 0`.
-
-```@example MAIN
-using Entropies
-using Statistics
-using Distributions: Uniform
-using CairoMakie
-
-# Define estimators
-base = MathConstants.e # shouldn't really matter here, because the target entropy is 0.
-# just provide types here, they are instantiated inside the loop
-estimators = [Vasicek, Ebrahimi, AlizadehArghami, Correa]
-labels = ["Vasicek", "Ebrahimi", "AlizadehArghami", "Correa"]
-
-# Test each estimator `nreps` times over time series of varying length.
-Ns = [100:100:500; 1000:1000:10000]
-nreps = 30
-
-Hs_uniform = [[zeros(nreps) for N in Ns] for e in estimators]
-for (i, e) in enumerate(estimators)
+# Just provide types here, they are instantiated inside the loop
+estimators_os = [Vasicek, Ebrahimi, AlizadehArghami, Correa]
+Hs_uniform_os = [[zeros(nreps) for N in Ns] for e in estimators_os]
+for (i, est_os) in enumerate(estimators_os)
     for j = 1:nreps
-        pts = rand(Uniform(0, 1), maximum(Ns)) # raw timeseries, not a `Dataset`
+        pts = randn(maximum(Ns)) # raw timeseries, not a `Dataset`
         for (k, N) in enumerate(Ns)
            m = floor(Int, N / 100) # Scale `m` to timeseries length
-            est = e(; m, base) # Instantiate estimator with current `m`
-            Hs_uniform[i][k][j] = entropy(est, pts[1:N])
+            est = est_os(; m) # Instantiate estimator with current `m`
+            Hs_uniform_os[i][k][j] = entropy(e, est, pts[1:N])
        end
    end
 end
 
-fig = Figure(resolution = (600, length(estimators) * 200))
-for (i, e) in enumerate(estimators)
-    Hs = Hs_uniform[i]
+# -------------
+# Plot results
+# -------------
+fig = Figure(resolution = (700, 8 * 200))
+labels_knn = ["KozachenkoLeonenko", "Kraskov", "Zhu", "ZhuSingh"]
+labels_os = ["Vasicek", "Ebrahimi", "AlizadehArghami", "Correa"]
+
+for (i, e) in enumerate(knn_estimators)
+    Hs = Hs_uniform_knn[i]
     ax = Axis(fig[i,1]; ylabel = "h (nats)")
-    lines!(ax, Ns, mean.(Hs); color = Cycled(i), label = labels[i])
-    band!(ax, Ns, mean.(Hs) .+ std.(Hs), mean.(Hs) .- std.(Hs);
-        color = (Main.COLORS[i], 0.5))
-    ylims!(-0.25, 0.25)
+    lines!(ax, Ns, mean.(Hs); color = Cycled(i), label = labels_knn[i])
+    band!(ax, Ns, mean.(Hs) .+ std.(Hs), mean.(Hs) .- std.(Hs); alpha = 0.5,
+        color = (Main.COLORS[i], 0.5))
+    hlines!(ax, [(0.5*log(2π) + 0.5)], color = :black, lw = 5, linestyle = :dash)
+
+    ylims!(1.2, 1.6)
+    axislegend()
+end
+
+for (i, e) in enumerate(estimators_os)
+    Hs = Hs_uniform_os[i]
+    ax = Axis(fig[i + length(knn_estimators),1]; ylabel = "h (nats)")
+    lines!(ax, Ns, mean.(Hs); color = Cycled(i), label = labels_os[i])
+    band!(ax, Ns, mean.(Hs) .+ std.(Hs), mean.(Hs) .- std.(Hs), alpha = 0.5,
+        color = (Main.COLORS[i], 0.5))
+    hlines!(ax, [(0.5*log(2π) + 0.5)], color = :black, lw = 5, linestyle = :dash)
+    ylims!(1.2, 1.6)
     axislegend()
 end
 
 fig
 ```
 
-As for the nearest neighbor estimators, both estimators also approach the
-true entropy value for this example, but is negatively biased for small sample sizes.
+All estimators approach the true differential entropy, but those based on order statistics
+are negatively biased for small sample sizes.
 
 ## Discrete entropy: permutation entropy
 
@@ -315,6 +301,7 @@ using Entropies
 using DynamicalSystemsBase
 using Random
 using CairoMakie
+using Distributions: Normal
 
 n = 1000
 ts = 1:n
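
The revised example compares every estimate against the closed-form value `0.5*log(2π) + 0.5` nats. As a small stand-alone check using the same API (a sketch, not part of the docs), a single order-statistics estimate on a raw vector looks like this:

```julia
using Entropies

h_true = 0.5 * log(2π) + 0.5              # ≈ 1.4189 nats for N(0, 1)
x = randn(10_000)                         # order-statistic estimators take raw vectors
m = floor(Int, length(x) / 100)           # scale `m` to the sample size, as in the example
h_est = entropy(Shannon(; base = MathConstants.e), Vasicek(; m), x)
```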

src/entropies/estimators/nearest_neighbors/KozachenkoLeonenko.jl

Lines changed: 22 additions & 9 deletions
@@ -2,36 +2,49 @@ export KozachenkoLeonenko
 
 """
     KozachenkoLeonenko <: EntropyEstimator
-    KozachenkoLeonenko(; k::Int = 1, w::Int = 1, base = 2)
+    KozachenkoLeonenko(; k::Int = 1, w::Int = 1)
 
 The `KozachenkoLeonenko` estimator computes the [`Shannon`](@ref) differential
-[`entropy`](@ref) of `x` (a multi-dimensional `Dataset`) to the given `base`, based on
-nearest neighbor searches using the method from Kozachenko & Leonenko
-(1987)[^KozachenkoLeonenko1987], as described in Charzyńska and Gambin[^Charzyńska2016].
+[`entropy`](@ref) of `x` (a multi-dimensional `Dataset`).
+
+## Description
+
+Assume we have samples ``\\{\\bf{x}_1, \\bf{x}_2, \\ldots, \\bf{x}_N \\}`` from a
+continuous random variable ``X \\in \\mathbb{R}^d`` with support ``\\mathcal{X}`` and
+density function``f : \\mathbb{R}^d \\to \\mathbb{R}``. `KozachenkoLeonenko` estimates
+the [Shannon](@ref) differential entropy
+
+```math
+H(X) = \\int_{\\mathcal{X}} f(x) \\log f(x) dx = \\mathbb{E}[-\\log(f(X))]
+```
+
+using the nearest neighbor method from Kozachenko &
+Leonenko (1987)[^KozachenkoLeonenko1987], as described in Charzyńska and
+Gambin[^Charzyńska2016].
 
 `w` is the Theiler window, which determines if temporal neighbors are excluded
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).
 
 In contrast to [`Kraskov`](@ref), this estimator uses only the *closest* neighbor.
 
-See also: [`entropy`](@ref).
+
+See also: [`entropy`](@ref), [`Kraskov`](@ref), [`EntropyEstimator`](@ref).
 
 [^Charzyńska2016]: Charzyńska, A., & Gambin, A. (2016). Improvement of the k-NN entropy
     estimator with applications in systems biology. Entropy, 18(1), 13.
 [^KozachenkoLeonenko1987]: Kozachenko, L. F., & Leonenko, N. N. (1987). Sample estimate of
     the entropy of a random vector. Problemy Peredachi Informatsii, 23(2), 9-16.
 """
-@Base.kwdef struct KozachenkoLeonenko{B} <: EntropyEstimator
+@Base.kwdef struct KozachenkoLeonenko <: EntropyEstimator
     w::Int = 1
-    base::B = 2
 end
 
 function entropy(e::Renyi, est::KozachenkoLeonenko, x::AbstractDataset{D, T}) where {D, T}
    e.q == 1 || throw(ArgumentError(
        "Renyi entropy with q = $(e.q) not implemented for $(typeof(est)) estimator"
    ))
-    (; w, base) = est
+    (; w) = est
 
    N = length(x)
    ρs = maximum_neighbor_distances(x, w, 1)
@@ -40,5 +53,5 @@ function entropy(e::Renyi, est::KozachenkoLeonenko, x::AbstractDataset{D, T}) wh
        log(MathConstants.e, ball_volume(D)) +
        MathConstants.eulergamma +
        log(MathConstants.e, N - 1)
-    return h / log(base, MathConstants.e) # Convert to target unit
+    return h / log(e.base, MathConstants.e) # Convert to target unit
 end
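
A usage sketch for the updated estimator (data and parameters are illustrative). Per the comment in the comparison example above, `Kraskov` with `k = 1` is expected to behave almost identically to `KozachenkoLeonenko`, which always uses the single closest neighbor:

```julia
using Entropies
using DynamicalSystemsBase   # provides `Dataset`

x = Dataset(randn(10_000, 2))
e = Shannon(; base = MathConstants.e)
h_kl = entropy(e, KozachenkoLeonenko(; w = 0), x)
h_k1 = entropy(e, Kraskov(; k = 1, w = 0), x)  # expected to be close to h_kl
```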

src/entropies/estimators/nearest_neighbors/Kraskov.jl

Lines changed: 17 additions & 7 deletions
@@ -2,38 +2,48 @@ export Kraskov
 
 """
    Kraskov <: EntropyEstimator
-    Kraskov(; k::Int = 1, w::Int = 1, base = 2)
+    Kraskov(; k::Int = 1, w::Int = 1)
 
 The `Kraskov` estimator computes the [`Shannon`](@ref) differential [`entropy`](@ref) of `x`
-(a multi-dimensional `Dataset`) to the given `base`, using the `k`-th nearest neighbor
+(a multi-dimensional `Dataset`) using the `k`-th nearest neighbor
 searches method from [^Kraskov2004].
 
 `w` is the Theiler window, which determines if temporal neighbors are excluded
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).
 
-See also: [`entropy`](@ref), [`KozachenkoLeonenko`](@ref).
+## Description
+
+Assume we have samples ``\\{\\bf{x}_1, \\bf{x}_2, \\ldots, \\bf{x}_N \\}`` from a
+continuous random variable ``X \\in \\mathbb{R}^d`` with support ``\\mathcal{X}`` and
+density function``f : \\mathbb{R}^d \\to \\mathbb{R}``. `Kraskov` estimates the
+[Shannon](@ref) differential entropy
+
+```math
+H(X) = \\int_{\\mathcal{X}} f(x) \\log f(x) dx = \\mathbb{E}[-\\log(f(X))].
+```
+
+See also: [`entropy`](@ref), [`KozachenkoLeonenko`](@ref), [`EntropyEstimator`](@ref).
 
 [^Kraskov2004]:
    Kraskov, A., Stögbauer, H., & Grassberger, P. (2004).
    Estimating mutual information. Physical review E, 69(6), 066138.
 """
-Base.@kwdef struct Kraskov{B} <: EntropyEstimator
+Base.@kwdef struct Kraskov <: EntropyEstimator
    k::Int = 1
    w::Int = 1
-    base::B = 2
 end
 
 function entropy(e::Renyi, est::Kraskov, x::AbstractDataset{D, T}) where {D, T}
    e.q == 1 || throw(ArgumentError(
        "Renyi entropy with q = $(e.q) not implemented for $(typeof(est)) estimator"
    ))
-    (; k, w, base) = est
+    (; k, w) = est
    N = length(x)
    ρs = maximum_neighbor_distances(x, w, k)
    # The estimated entropy has "unit" [nats]
    h = -digamma(k) + digamma(N) +
        log(MathConstants.e, ball_volume(D)) +
        D/N*sum(log.(MathConstants.e, ρs))
-    return h / log(base, MathConstants.e) # Convert to target unit
+    return h / log(e.base, MathConstants.e) # Convert to target unit
 end
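
The expectation form ``\mathbb{E}[-\log f(X)]`` in the new docstring can be sanity-checked numerically. A minimal sketch, independent of the package, for a standard normal (whose differential entropy is `0.5*log(2π) + 0.5` nats):

```julia
using Statistics

f(x) = exp(-x^2 / 2) / sqrt(2π)   # density of N(0, 1)
xs = randn(10^6)
h_mc = mean(-log.(f.(xs)))        # Monte Carlo estimate of E[-log f(X)]
h_true = 0.5 * log(2π) + 0.5      # closed form, ≈ 1.4189 nats
```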

src/entropies/estimators/nearest_neighbors/Zhu.jl

Lines changed: 18 additions & 9 deletions
@@ -4,18 +4,27 @@ export Zhu
     Zhu <: EntropyEstimator
     Zhu(k = 1, w = 0)
 
-The `Zhu` estimator (Zhu et al., 2015)[^Zhu2015] computes the [`Shannon`](@ref)
-differential [`entropy`](@ref) of `x` (a multi-dimensional `Dataset`), by
-approximating probabilities within hyperrectangles surrounding each point `xᵢ ∈ x` using
-using `k` nearest neighbor searches.
+The `Zhu` estimator (Zhu et al., 2015)[^Zhu2015] is an extension to
+[`KozachenkoLeonenko`](@ref), and computes the [`Shannon`](@ref)
+differential [`entropy`](@ref) of `x` (a multi-dimensional `Dataset`).
 
-`w` is the Theiler window, which determines if temporal neighbors are excluded
-during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
-when searching for neighbours).
+## Description
 
-This estimator is an extension to [`KozachenkoLeonenko`](@ref).
+Assume we have samples ``\\{\\bf{x}_1, \\bf{x}_2, \\ldots, \\bf{x}_N \\}`` from a
+continuous random variable ``X \\in \\mathbb{R}^d`` with support ``\\mathcal{X}`` and
+density function``f : \\mathbb{R}^d \\to \\mathbb{R}``. `Zhu` estimates the [Shannon](@ref)
+differential entropy
 
-See also: [`entropy`](@ref).
+```math
+H(X) = \\int_{\\mathcal{X}} f(x) \\log f(x) dx = \\mathbb{E}[-\\log(f(X))]
+```
+
+by approximating densities within hyperrectangles surrounding each point `xᵢ ∈ x` using
+using `k` nearest neighbor searches. `w` is the Theiler window, which determines if
+temporal neighbors are excluded during neighbor searches (defaults to `0`, meaning that
+only the point itself is excluded when searching for neighbours).
+
+See also: [`entropy`](@ref), [`KozachenkoLeonenko`](@ref), [`EntropyEstimator`](@ref).
 
 [^Zhu2015]:
     Zhu, J., Bellanger, J. J., Shu, H., & Le Bouquin Jeannès, R. (2015). Contribution to

src/entropies/estimators/nearest_neighbors/ZhuSingh.jl

Lines changed: 11 additions & 0 deletions
@@ -12,6 +12,17 @@ export ZhuSingh
 The `ZhuSingh` estimator (Zhu et al., 2015)[^Zhu2015] computes the [`Shannon`](@ref)
 differential [`entropy`](@ref) of `x` (a multi-dimensional `Dataset`).
 
+## Description
+
+Assume we have samples ``\\{\\bf{x}_1, \\bf{x}_2, \\ldots, \\bf{x}_N \\}`` from a
+continuous random variable ``X \\in \\mathbb{R}^d`` with support ``\\mathcal{X}`` and
+density function``f : \\mathbb{R}^d \\to \\mathbb{R}``. `ZhuSingh` estimates the
+[Shannon](@ref) differential entropy
+
+```math
+H(X) = \\int_{\\mathcal{X}} f(x) \\log f(x) dx = \\mathbb{E}[-\\log(f(X))].
+```
+
 Like [`Zhu`](@ref), this estimator approximates probabilities within hyperrectangles
 surrounding each point `xᵢ ∈ x` using using `k` nearest neighbor searches. However,
 it also considers the number of neighbors falling on the borders of these hyperrectangles.
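
A hedged usage sketch contrasting the two estimators from Zhu et al. (2015): both approximate densities with hyperrectangles built from `k` nearest neighbors, but `ZhuSingh` additionally counts neighbors falling on the hyperrectangle borders. Data and parameters below are illustrative only:

```julia
using Entropies
using DynamicalSystemsBase   # provides `Dataset`

x = Dataset(randn(5_000, 3))
e = Shannon(; base = MathConstants.e)
h_zhu      = entropy(e, Zhu(; k = 3, w = 0), x)
h_zhusingh = entropy(e, ZhuSingh(; k = 3, w = 0), x)
```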
