
Commit 78f555c

kahaaga and Datseris authored
Dispersion and reverse dispersion probability estimators (#96)
* Dispersion and reverse dispersion probability estimators
* Fix tests
* ReverseDispersion should be a complexity measure, plus addressing some comments
* Export reverse_dispersion
* Reverse dispersion
* Remove reference to `ReverseDispersion`
* Update docstring
* Add analytical tests
* Improve tests and add doc examples
* Fix tests
* Better docs and doctests
* Remove file
* Fix tests
* Update src/complexity_measures/reverse_dispersion_entropy.jl

Co-authored-by: George Datseris <[email protected]>

* Addressing review comments (#99)
* Address review comments
* Throw error
* Remove non-used field
* Fix tests
* Merge docs and move individual methods to their respective files
* Tsallis reduces to Shannon entropy for q -> 1.
* Normalized entropy API, including utility functions
* Analytical test cases for Tsallis
* Include `k` in `maxentropy_tsallis`

Co-authored-by: George Datseris <[email protected]>
1 parent: 2aa89a9

26 files changed: +660 −125 lines

docs/src/complexity_measures.md

Lines changed: 8 additions & 1 deletion

@@ -1,7 +1,14 @@
-# Complexity measures
+# [Complexity measures](@id complexity_measures)

 ## Sample entropy

 ## Approximate entropy

+## Reverse dispersion entropy
+
+```@docs
+reverse_dispersion
+distance_to_whitenoise
+```
+
 ## Disequilibrium

docs/src/entropies.md

Lines changed: 12 additions & 1 deletion

@@ -13,10 +13,21 @@ entropy_tsallis
 ```

 ## Shannon entropy (convenience)
+
 ```@docs
 entropy_shannon
 ```

+## Normalization
+
+The generic [`entropy_normalized`](@ref) normalizes any entropy value to the entropy of a
+uniform distribution. We also provide [maximum entropy](@ref maximum_entropy) functions
+that are useful for manual normalization.
+
+```@docs
+entropy_normalized
+```
+
 ## Indirect entropies
 Here we list functions which compute Shannon entropies via alternate means, without explicitly computing some probability distributions and then using the Shannon formula.


@@ -33,4 +44,4 @@ entropy_permutation
 entropy_spatial_permutation
 entropy_wavelet
 entropy_dispersion
-```
+```
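
To make the idea behind this section concrete: a normalized entropy divides the computed entropy by the entropy of a uniform distribution over the same number of outcomes. The sketch below does this by hand in plain Julia; it deliberately does not call `entropy_normalized`, whose exact signature is not shown in this diff, and it assumes the natural logarithm as the base.

```julia
# Minimal sketch: normalizing a Shannon entropy by hand (natural log assumed).
p = [0.1, 0.2, 0.3, 0.4]          # some probability distribution over 4 outcomes
H = -sum(x -> x * log(x), p)      # Shannon entropy of p
H_norm = H / log(length(p))       # maximum Shannon entropy over N outcomes is log(N),
                                  # so H_norm lies in [0, 1] and equals 1 for a uniform p
```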

docs/src/examples.md

Lines changed: 57 additions & 0 deletions

@@ -145,3 +145,60 @@ for a in (ax, ay, az); axislegend(a); end
 for a in (ax, ay); hidexdecorations!(a; grid=false); end
 fig
 ```
+
+## [Dispersion and reverse dispersion entropy](@id dispersion_examples)
+
+Here we reproduce parts of figure 3 in Li et al. (2019), computing reverse and regular dispersion entropy for a time series consisting of normally distributed noise with a single spike in the middle of the signal. We compute the entropies over a range of subsets of the data, using a sliding window of 70 data points, stepping the window 10 time steps at a time.
+
+Note: the results here are not exactly the same as in the original paper, because Li et
+al. (2019) base their examples on randomly generated numbers and do not provide code that
+specifies random number seeds.
+
+```@example
+using Entropies, DynamicalSystems, Random, CairoMakie, Distributions
+
+n = 1000
+ts = 1:n
+x = [i == n ÷ 2 ? 50.0 : 0.0 for i in ts]
+rng = Random.default_rng()
+s = rand(rng, Normal(0, 1), n)
+y = x .+ s
+
+ws = 70
+windows = [t:t+ws for t in 1:10:n-ws]
+rdes = zeros(length(windows))
+des = zeros(length(windows))
+pes = zeros(length(windows))
+
+m, c = 2, 6
+est_de = Dispersion(symbolization = GaussianSymbolization(c), m = m, τ = 1)
+
+for (i, window) in enumerate(windows)
+    rdes[i] = reverse_dispersion(y[window], est_de; normalize = true)
+    des[i] = entropy_renyi_norm(y[window], est_de)
+end
+
+fig = Figure()
+
+a1 = Axis(fig[1,1]; xlabel = "Time step", ylabel = "Value")
+lines!(a1, ts, y)
+display(fig)
+
+a2 = Axis(fig[2, 1]; xlabel = "Time step", ylabel = "Value")
+p_rde = scatterlines!([first(w) for w in windows], rdes,
+    label = "Reverse dispersion entropy",
+    color = :black,
+    markercolor = :black, marker = '●')
+p_de = scatterlines!([first(w) for w in windows], des,
+    label = "Dispersion entropy",
+    color = :red,
+    markercolor = :red, marker = 'x', markersize = 20)
+
+axislegend(position = :rc)
+ylims!(0, max(maximum(pes), 1))
+fig
+```
+
+[^Rostaghi2016]: Rostaghi, M., & Azami, H. (2016). Dispersion entropy: A measure for time-series analysis. IEEE Signal Processing Letters, 23(5), 610-614.
+[^Li2019]: Li, Y., Gao, X., & Wang, L. (2019). Reverse dispersion entropy: a new
+    complexity measure for sensor signal. Sensors, 19(23), 5203.

docs/src/index.md

Lines changed: 1 addition & 1 deletion

@@ -42,7 +42,7 @@ Thus, any of the implemented [probabilities estimators](@ref estimators) can be


 ### Complexity measures
-Other complexity measures, which strictly speaking don't compute entropies, and may or may not explicitly compute probability distributions, appear in the [Complexity measures](@ref) section.
+Other complexity measures, which strictly speaking don't compute entropies, and may or may not explicitly compute probability distributions, appear in the [Complexity measures](@ref complexity_measures) section.


 ## Input data

docs/src/probabilities.md

Lines changed: 6 additions & 0 deletions

@@ -24,6 +24,12 @@ SymbolicPermutation
 SpatialSymbolicPermutation
 ```

+## Dispersion (symbolic)
+
+```@docs
+Dispersion
+```
+
 ## Visitation frequency (binning)

 ```@docs

docs/src/utils.md

Lines changed: 14 additions & 0 deletions

@@ -26,3 +26,17 @@ OrdinalPattern
 ```@docs
 Entropies.encode_motif
 ```
+
+### Normalization
+
+```@docs
+alphabet_length
+```
+
+#### [Maximum entropy](@id maximum_entropy)
+
+```@docs
+maxentropy_tsallis
+maxentropy_renyi
+maxentropy_shannon
+```
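
For orientation, these are the standard maximum-entropy values attained by a uniform distribution over `N` outcomes, which is presumably what the `maxentropy_*` utilities above return; their exact signatures are not shown in this diff. Here `b` is the logarithm base and `k` is the Tsallis constant mentioned in the commit message ("Include `k` in `maxentropy_tsallis`"):

```math
H_{S}^{\max} = \log_b N, \qquad
H_{R}^{\max}(q) = \log_b N \;\; (\text{for any order } q), \qquad
S_{T}^{\max}(q, k) = k \, \frac{N^{1-q} - 1}{1 - q} \;\longrightarrow\; k \ln N \;\; (q \to 1).
```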

src/Entropies.jl

Lines changed: 2 additions & 0 deletions

@@ -16,6 +16,8 @@ include("symbolization/symbolize.jl")
 include("probabilities.jl")
 include("probabilities_estimators/probabilities_estimators.jl")
 include("entropies/entropies.jl")
+include("complexity_measures/complexity_measures.jl")
+
 include("deprecations.jl")


src/complexity_measures/complexity_measures.jl

Lines changed: 1 addition & 0 deletions

@@ -0,0 +1 @@
+include("reverse_dispersion_entropy.jl")

src/complexity_measures/reverse_dispersion_entropy.jl

Lines changed: 76 additions & 0 deletions

@@ -0,0 +1,76 @@
+export reverse_dispersion
+export distance_to_whitenoise
+
+# Note: this is not an entropy estimator, so we don't use the entropy_xxx_norm interface
+# for normalization, even though we rely on `alphabet_length`.
+"""
+    distance_to_whitenoise(p::Probabilities, estimator::Dispersion; normalize = false)
+
+Compute the distance of the probability distribution `p` from a uniform distribution,
+given the parameters of `estimator` (which must be known beforehand).
+
+If `normalize == true`, then normalize the value to the interval `[0, 1]` by using the
+parameters of `estimator`.
+
+Used to compute reverse dispersion entropy ([`reverse_dispersion`](@ref);
+Li et al., 2019[^Li2019]).
+
+[^Li2019]: Li, Y., Gao, X., & Wang, L. (2019). Reverse dispersion entropy: a new
+    complexity measure for sensor signal. Sensors, 19(23), 5203.
+"""
+function distance_to_whitenoise(p::Probabilities, est::Dispersion; normalize = false)
+    # We can safely skip non-occurring symbols, because they don't contribute
+    # to the sum in eq. 3 in Li et al. (2019).
+    Hrde = sum(abs2, p) - (1 / alphabet_length(est))
+
+    if normalize
+        return Hrde / (1 - (1 / alphabet_length(est)))
+    else
+        return Hrde
+    end
+end
+
+# Note again: this is a *complexity measure*, not an entropy estimator, so we don't use
+# the entropy_xxx_norm interface for normalization, even though we rely on `alphabet_length`.
+"""
+    reverse_dispersion(x::AbstractVector{T}, est::Dispersion = Dispersion();
+        normalize = true) where T <: Real
+
+Compute the reverse dispersion entropy complexity measure (Li et al., 2019)[^Li2019].
+
+## Description
+
+Li et al. (2019)[^Li2019] define the reverse dispersion entropy as
+
+```math
+H_{rde} = \\sum_{i = 1}^{c^m} \\left(p_i - \\dfrac{1}{{c^m}} \\right)^2 =
+\\left( \\sum_{i=1}^{c^m} p_i^2 \\right) - \\dfrac{1}{c^{m}},
+```
+where the probabilities ``p_i`` are obtained precisely as for the [`Dispersion`](@ref)
+probability estimator. Relative frequencies of dispersion patterns are computed using the
+given `symbolization` scheme, which defaults to symbolization using the normal cumulative
+distribution function (NCDF), as implemented by [`GaussianSymbolization`](@ref), using
+embedding dimension `m` and embedding delay `τ`.
+Recommended parameter values[^Li2019] are `m ∈ [2, 3]`, `τ = 1` for the embedding, and
+`c ∈ [3, 4, …, 8]` categories for the Gaussian mapping.
+
+If `normalize == true`, then the reverse dispersion entropy is normalized to `[0, 1]`.
+
+The minimum value of ``H_{rde}`` is zero and occurs precisely when the dispersion
+pattern distribution is flat, that is, when all ``p_i`` are equal to ``1/c^m``.
+Because ``H_{rde} \\geq 0``, ``H_{rde}`` can therefore be said to be a measure of how far
+the dispersion pattern probability distribution is from white noise.
+
+[^Li2019]: Li, Y., Gao, X., & Wang, L. (2019). Reverse dispersion entropy: a new
+    complexity measure for sensor signal. Sensors, 19(23), 5203.
+"""
+function reverse_dispersion(x::AbstractVector{T}, est::Dispersion = Dispersion();
+        normalize = true) where T <: Real

+    p = probabilities(x, est)
+
+    # The following step combines distance information with the probabilities, so
+    # from here on, it is not possible to use `renyi_entropy` or similar methods, because
+    # we're not dealing with probabilities anymore.
+    Hrde = distance_to_whitenoise(p, est, normalize = normalize)
+end
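
As a quick sanity check of the new functions, here is a hedged usage sketch built only from calls that appear in this diff (`Dispersion`, `GaussianSymbolization`, `probabilities`, `alphabet_length`, `reverse_dispersion`); keyword names follow the docstrings above, and the white-noise expectation follows from the definition of ``H_{rde}``.

```julia
using Entropies

x = randn(1000)   # white noise: its dispersion pattern distribution should be near-uniform
est = Dispersion(symbolization = GaussianSymbolization(6), m = 2, τ = 1)

# Normalized reverse dispersion entropy; expected to be close to 0 for white noise.
h = reverse_dispersion(x, est; normalize = true)

# Equivalent manual computation of the unnormalized measure (eq. 3 in Li et al., 2019).
p = probabilities(x, est)
h_manual = sum(abs2, p) - 1 / alphabet_length(est)
```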

src/entropies/convenience_definitions.jl

Lines changed: 16 additions & 2 deletions

@@ -53,6 +53,20 @@ function entropy_wavelet(x; wavelet = Wavelets.WT.Daubechies{12}(), base = MathC
     entropy_renyi(x, est; base, q = 1)
 end

-function entropy_dispersion(args...)
+"""
+    entropy_dispersion(x; m = 2, τ = 1, s = GaussianSymbolization(3),
+        base = MathConstants.e)
+
+Compute the dispersion entropy. This function is just a convenience call to:
+```julia
+est = Dispersion(m = m, τ = τ, s = s)
+entropy_shannon(x, est; base)
+```
+See [`Dispersion`](@ref) for more info.
+"""
+function entropy_dispersion(x; m = 2, τ = 1, s = GaussianSymbolization(3),
+        base = MathConstants.e)

-end
+    est = Dispersion(m = m, τ = τ, s = s)
+    entropy_renyi(x, est; base, q = 1)
+end
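
A hedged usage sketch of this convenience function, assuming the keyword names given in its docstring (`m`, `τ`, `s`, `base`); internally it just builds a `Dispersion` estimator and computes the Shannon (order-1 Rényi) entropy of the resulting dispersion pattern probabilities.

```julia
using Entropies

x = randn(500)
# Dispersion entropy of x with embedding dimension 2, delay 1, and 3 Gaussian categories.
h = entropy_dispersion(x; m = 2, τ = 1, s = GaussianSymbolization(3))
```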
