Commit cbb3319

Datseris and kahaaga authored
Review of codebase and docs - Probabilities and Encodings- Datseris (#213)
* update probabilities table
* CountOccurrences works with `Any` input
* better terminology header
* simpler headers in probabilities
* Add encodings page
* simplify SymbolicPermutation docstring
* reference complexity measures
* correct dosctring to reference isrand
* more organized tests for symbolic permutat
* full rewrite of `SymbolicPermutation` and proper `encode` for Ordinal.
* type optimization in making the embedding
* remove entropy!
* simplifi probabilities! even more
* move fasthist to encoding folder
* complete unification of symbolic perm methods
* docstring for weighted fversion
* add docstring to amplkitude aware
* delete ALL other files
* fix all symbolic permutation tests
* fix all permutation tests (and one file only)
* clarify source code of encode Gaussian
* better docstring for GaussEncod
* simplify docstring of Dispersion
* more tests for naivekernel
* Zhu -> Correa
* shorter docstring for spatial permutation
* port spatial permutation example to Examples
* re-write SpatialSymb to have encoding as field. All tests pass.
* better display of exampels in decode
* better doc for ordinal encoding
* Some typos/nitpickery
* Probabilities can't compute. Computations are done with probabilities as *input*
* Don't duplicate `SpatialDispersion`
* Clarify docstrings a bit
* Typo
* Cross-reference spatial estimators

Co-authored-by: Kristian Haaga <[email protected]>
1 parent 3de9294 commit cbb3319

37 files changed

+725 -1033 lines changed

docs/Project.toml

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
 ChaosTools = "608a59af-f2a3-5ad4-90b4-758bdf3122a7"
 CoordinateTransformations = "150eb455-5306-5404-9cee-2592286d6298"
-DelayEmbeddings = "5732040d-69e3-5649-938a-b6b4f237613f"
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 DocumenterTools = "35a29f4d-8980-5a13-9543-d66fff28ecb8"

docs/make.jl

Lines changed: 2 additions & 1 deletion
@@ -2,10 +2,10 @@ cd(@__DIR__)
 using Pkg
 CI = get(ENV, "CI", nothing) == "true" || get(ENV, "GITHUB_TOKEN", nothing) !== nothing
 using Entropies
-using DelayEmbeddings
 using Documenter
 using DocumenterTools: Themes
 using CairoMakie
+using Entropies.DelayEmbeddings
 import Entropies.Wavelets
 
 # %% JuliaDynamics theme
@@ -35,6 +35,7 @@ ENV["JULIA_DEBUG"] = "Documenter"
 ENTROPIES_PAGES = [
 "index.md",
 "probabilities.md",
+"encodings.md",
 "entropies.md",
 "complexity.md",
 "multiscale.md",

docs/src/devdocs.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ Good practices in developing a code base apply in every Pull Request. The [Good
 5. If suitable, the estimator may be able to operate based on [`Encoding`]s. If so, it is preferred to implement an `Encoding` subtype and extend the methods [`encode`](@ref) and [`decode`](@ref). This will allow your probabilities estimator to be used with a larger span of entropy and complexity methods without additional effort.
 6. Implement dispatch for [`probabilities_and_outcomes`](@ref) and your probabilities estimator type.
 7. Implement dispatch for [`outcome_space`](@ref) and your probabilities estimator type.
-8. Add your probabilities estimator type to the list in the docstring of [`ProbabilitiyEstimator`](@ref), and if you also made an encoding, add it to the [`Encoding`](@ref) docstring.
+8. Add your probabilities estimator type to the table list in the documentation page of probabilities. If you made an encoding, also add it to corresponding table in the encodings section.
 
 ### Optional steps
 You may extend any of the following functions if there are potential performance benefits in doing so:

docs/src/encodings.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Encodings
+
+## Encoding API
+
+Some probability estimators first "encode" input data into an intermediate representation indexed by the positive integers. This intermediate representation is called an "encoding" and its API is defined by the following:
+
+```@docs
+Encoding
+encode
+decode
+```
+
+## Available encodings
+
+```@docs
+OrdinalPatternEncoding
+GaussianCDFEncoding
+RectangularBinEncoding
+```
+
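
Reviewer note on the new encodings page: the idea of an "intermediate representation indexed by the positive integers" can be illustrated with a minimal sketch for ordinal patterns. The helper names `encode_pattern` and `decode_pattern` below are hypothetical and use Base Julia only; they are not the `encode`/`decode` methods added in this commit, and the integer assigned to each pattern here (via a Lehmer code) is an assumption that may differ from what `OrdinalPatternEncoding` produces.

```julia
# Hypothetical helper names; this is NOT the Entropies.jl implementation.

# Map a vector of m distinct values to an integer in 1:factorial(m)
# by computing the Lehmer code of its sorting permutation.
function encode_pattern(x::AbstractVector)
    perm = sortperm(x)            # ordinal pattern of x
    m = length(perm)
    code = 0
    for i in 1:m
        # number of later entries that are smaller than perm[i]
        smaller = count(j -> perm[j] < perm[i], (i + 1):m)
        code += smaller * factorial(m - i)
    end
    return code + 1               # shift so that outcomes start at 1
end

# Invert the mapping: recover the permutation of length m from its integer code.
function decode_pattern(code::Integer, m::Integer)
    remaining = collect(1:m)      # values not yet placed, in increasing order
    perm = Int[]
    c = code - 1
    for i in 1:m
        idx, c = divrem(c, factorial(m - i))
        push!(perm, popat!(remaining, idx + 1))
    end
    return perm
end
```

With this sketch, `decode_pattern(encode_pattern(x), length(x))` returns `sortperm(x)` for any vector `x` of distinct values, which is the round-trip property one would expect from an encoding in the sense described on the page.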

docs/src/examples.md

Lines changed: 36 additions & 6 deletions
@@ -233,7 +233,7 @@ fig
 
 ### Kaniadakis entropy
 
-Here, we show how [`Kaniadakis`](@ref) entropy changes as function of the parameter `a` for
+Here, we show how [`Kaniadakis`](@ref) entropy changes as function of the parameter `a` for
 a range of two-element probability distributions given by
 `Probabilities([p, 1 - p] for p in 1:0.0:0.01:1.0)`.
 
@@ -370,11 +370,41 @@ end
 You see that while the direct entropy values of the chaotic and noisy signals change massively with `N` but they are almost the same for the normalized version.
 For the regular signals, the entropy decreases nevertheless because the noise contribution of the Fourier computation becomes less significant.
 
+## Spatiotemporal permutation entropy
+
+Usage of a [``SpatialSymbolicPermutation`](@ref) estimator is straightforward.
+Here we get the spatial permutation entropy of a 2D array (e.g., an image):
+
+```@example MAIN
+using Entropies
+x = rand(50, 50) # some image
+stencil = [1 1; 0 1] # or one of the other ways of specifying stencils
+est = SpatialSymbolicPermutation(stencil, x)
+h = entropy(est, x)
+```
+
+To apply this to timeseries of spatial data, simply loop over the call, e.g.:
+
+```@example MAIN
+data = [rand(50, 50) for i in 1:10] # e.g., evolution of a 2D field of a PDE
+est = SpatialSymbolicPermutation(stencil, first(data))
+h_vs_t = map(d -> entropy(est, d), data)
+```
+
+Computing any other generalized spatiotemporal permutation entropy is trivial, e.g. with [`Renyi`](@ref):
+
+```@example MAIN
+x = reshape(repeat(1:5, 500) .+ 0.1*rand(500*5), 50, 50)
+est = SpatialSymbolicPermutation(stencil, x)
+entropy(Renyi(q = 2), est, x)
+```
+
+
 ## Spatial discrete entropy: Fabio
 
 Let's see how the normalized permutation and dispersion entropies increase for an image that gets progressively more noise added to it.
 
-```@example
+```@example MAIN
 using Entropies
 using Distributions
 using CairoMakie
@@ -386,11 +416,11 @@ rot = warp(img, recenter(RotMatrix(-3pi/2), center(img));)
 original = Float32.(rot)
 noise_levels = collect(0.0:0.25:1.0) .* std(original) * 5 # % of 1 standard deviation
 
-noisy_imgs = [i == 1 ? original : original .+ rand(Uniform(0, nL), size(original))
+noisy_imgs = [i == 1 ? original : original .+ rand(Uniform(0, nL), size(original))
 for (i, nL) in enumerate(noise_levels)]
 
 # a 2x2 stencil (i.e. dispersion/permutation patterns of length 4)
-stencil = ((2, 2), (1, 1))
+stencil = ((2, 2), (1, 1))
 
 est_disp = SpatialDispersion(stencil, original; c = 5, periodic = false)
 est_perm = SpatialSymbolicPermutation(stencil, original; periodic = false)
@@ -399,8 +429,8 @@ hs_perm = [entropy_normalized(est_perm, img) for img in noisy_imgs]
 
 # Plot the results
 fig = Figure(size = (800, 1000))
-ax = Axis(fig[1, 1:length(noise_levels)],
-xlabel = "Noise level",
+ax = Axis(fig[1, 1:length(noise_levels)],
+xlabel = "Noise level",
 ylabel = "Normalized entropy")
 scatterlines!(ax, noise_levels, hs_disp, label = "Dispersion")
 scatterlines!(ax, noise_levels, hs_perm, label = "Permutation")
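
Reviewer note on the spatial permutation examples in this file: for readers unfamiliar with the technique, the following is a rough sketch of what a spatial permutation entropy over a full 2x2 window might compute, written in Base Julia only. The function name `spatial_perm_entropy_2x2` is hypothetical, and the sketch assumes non-periodic boundaries and a base-2 logarithm; it is not the code behind `SpatialSymbolicPermutation` and is not guaranteed to reproduce its values.

```julia
# Hypothetical helper, Base Julia only; NOT the package's implementation.
# Slides a 2x2 window over `x` (no periodic wrapping), records the ordinal
# pattern of the 4 values in each window, and returns the Shannon entropy
# (base 2) of the observed pattern frequencies.
function spatial_perm_entropy_2x2(x::AbstractMatrix)
    counts = Dict{Vector{Int}, Int}()
    nrows, ncols = size(x)
    for j in 1:(ncols - 1), i in 1:(nrows - 1)
        window = vec(x[i:i+1, j:j+1])   # the 4 values, in column-major order
        pattern = sortperm(window)      # ordinal pattern of the window
        counts[pattern] = get(counts, pattern, 0) + 1
    end
    total = sum(values(counts))
    return -sum((c / total) * log2(c / total) for c in values(counts))
end

h_noise = spatial_perm_entropy_2x2(rand(50, 50))                       # close to log2(24)
h_ramp = spatial_perm_entropy_2x2([i + 10j for i in 1:5, j in 1:5])    # single pattern only
```

A strictly increasing ramp produces the same ordinal pattern in every window, so its entropy is zero, while uncorrelated noise spreads over up to `factorial(4) = 24` patterns, which bounds the entropy by `log2(24)`.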

docs/src/index.md

Lines changed: 19 additions & 16 deletions
@@ -8,23 +8,22 @@ Entropies
 You are reading the development version of the documentation of Entropies.jl,
 that will become version 2.0.
 
-## API & terminology
+## Terminology
 
 !!! note
 The documentation here follows (loosely) chapter 5 of
 [Nonlinear Dynamics](https://link.springer.com/book/10.1007/978-3-030-91032-7),
 Datseris & Parlitz, Springer 2022.
 
 In the literature, the term "entropy" is used (and abused) in multiple contexts.
-The API and documentation of Entropies.jl aim to clarify some aspects of its usage, and
-to provide a simple way to obtain probabilities, entropies, or other complexity measures.
+The API and documentation of Entropies.jl aim to clarify some aspects of its usage, and to provide a simple way to obtain probabilities, entropies, or other complexity measures.
 
 ### Probabilities
 
 Entropies and other complexity measures are typically computed based on _probability distributions_.
-These are obtained from [Input data for Entropies.jl](@ref) in a plethora of different ways.
-The central API function that returns a probability distribution (in fact, just a vector of probabilities) is [`probabilities`](@ref), which takes in a subtype of [`ProbabilitiesEstimator`](@ref) to specify how the probabilities are computed.
-All estimators available in Entropies.jl can be found in the [estimators page](@ref probabilities_estimators).
+These can be obtained from input data in a plethora of different ways.
+The central API function that returns a probability distribution (or more precisely a probability mass function) is [`probabilities`](@ref), which takes in a subtype of [`ProbabilitiesEstimator`](@ref) to specify how the probabilities are computed.
+All available estimators can be found in the [estimators page](@ref probabilities_estimators).
 
 ### Entropies
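
Reviewer note on the reworded "Probabilities" paragraph: the distinction between obtaining a probability mass function from data and then computing an entropy from it can be sketched in a few lines of Base Julia. The names `count_probabilities` and `shannon_entropy` are hypothetical and chosen for illustration; they stand in for the simplest case (counting unique elements, as `CountOccurrences` is described in the updated table) and are not the package's functions.

```julia
# Hypothetical helpers, Base Julia only; NOT the package's code.

# Relative frequency of each unique element of `x`. Works for any element
# type that can serve as a dictionary key.
function count_probabilities(x)
    counts = Dict{eltype(x), Int}()
    for xi in x
        counts[xi] = get(counts, xi, 0) + 1
    end
    n = length(x)
    return Dict(k => v / n for (k, v) in counts)
end

# Shannon entropy (base 2) of a probability mass function,
# skipping zero-probability outcomes.
shannon_entropy(p) = -sum(q * log2(q) for q in p if q > 0)

pmf = count_probabilities(["a", "b", "a", "a"])   # "a" => 0.75, "b" => 0.25
h = shannon_entropy(values(pmf))
```

The two-step structure mirrors the terminology in the paragraph above: the first function plays the role of a probabilities estimator, and the second only ever sees probabilities as input.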

@@ -40,24 +39,28 @@ Thus, any of the implemented [probabilities estimators](@ref probabilities_estim
 
 These names are commonplace, and so in Entropies.jl we provide convenience functions like [`entropy_wavelet`](@ref). However, it should be noted that these functions really aren't anything more than 2-lines-of-code wrappers that call [`entropy`](@ref) with the appropriate [`ProbabilitiesEstimator`](@ref).
 
-In addition to `ProbabilitiesEstimators`, we also provide [`EntropyEstimator`](@ref)s,
-which compute entropies via alternate means, without explicitly computing some
+In addition to `ProbabilitiesEstimators`, we also provide [`EntropyEstimator`](@ref)s,
+which compute entropies via alternate means, without explicitly computing some
 probability distribution. Differential/continuous entropy, for example, is computed
-using a dedicated [`EntropyEstimator`](@ref). For example, the [`Kraskov`](@ref)
-estimator computes Shannon differential entropy via a nearest neighbor algorithm, while
-the [`Zhu`](@ref) estimator computes Shannon differential entropy using order statistics.
+using a dedicated [`EntropyEstimator`](@ref). For example, the [`Kraskov`](@ref)
+estimator computes Shannon differential entropy via a nearest neighbor algorithm, while
+the [`Correa`](@ref) estimator computes Shannon differential entropy using order statistics.
 
 ### Other complexity measures
 
-Other complexity measures, which strictly speaking don't compute entropies, and may or may
-not explicitly compute probability distributions, are found in
-[Complexity.jl](https://github.com/JuliaDynamics/Complexity.jl) package. This includes
-measures like sample entropy and approximate entropy.
+Other complexity measures, which strictly speaking don't compute entropies, and may or may not explicitly compute probability distributions, are found in
+[Complexity measures](@ref) page.
+This includes measures like sample entropy and approximate entropy.
 
 ## [Input data for Entropies.jl](@id input_data)
 
-The input data type typically depend on the probability estimator chosen. In general though, the standard DynamicalSystems.jl approach is taken and as such we have three types of input data:
+The input data type typically depend on the probability estimator chosen.
+In general though, the standard DynamicalSystems.jl approach is taken and as such we have three types of input data:
 
 - _Timeseries_, which are `AbstractVector{<:Real}`, used in e.g. with [`WaveletOverlap`](@ref).
 - _Multi-dimensional timeseries, or datasets, or state space sets_, which are [`Dataset`](@ref), used e.g. with [`NaiveKernel`](@ref).
 - _Spatial data_, which are higher dimensional standard `Array`s, used e.g. with [`SpatialSymbolicPermutation`](@ref).
+
+```@docs
+Dataset
+```

docs/src/probabilities.md

Lines changed: 33 additions & 16 deletions
@@ -1,4 +1,4 @@
-# [Probabilities](@id probabilities_estimators)
+# Probabilities
 
 ## Probabilities API
 
@@ -8,67 +8,77 @@ The probabilities API is defined by
 - [`probabilities`](@ref)
 - [`probabilities_and_outcomes`](@ref)
 
+and related functions that you will find in the following documentation blocks:
+
+### Probabilitities
+
 ```@docs
 ProbabilitiesEstimator
 probabilities
 probabilities!
 Probabilities
+```
+
+### Outcomes
+
+```@docs
 probabilities_and_outcomes
 outcomes
 outcome_space
 total_outcomes
 missing_outcomes
 ```
 
-## Overview
+## [Overview of probabilities estimators](@id probabilities_estimators)
 
-Any of the following estimators can be used with [`probabilities`](@ref).
+Any of the following estimators can be used with [`probabilities`](@ref)
+(in the column "input data" it is assumed that the `eltype` of the input is `<: Real`).
 
 | Estimator | Principle | Input data |
-| ------------------------------------------- | --------------------------- | ------------------- |
-| [`CountOccurrences`](@ref) | Frequencies | `Vector`, `Dataset` |
+|:--------------------------------------------|:----------------------------|:--------------------|
+| [`CountOccurrences`](@ref) | Count of unique elements | `Any` |
 | [`ValueHistogram`](@ref) | Binning (histogram) | `Vector`, `Dataset` |
 | [`TransferOperator`](@ref) | Binning (transfer operator) | `Vector`, `Dataset` |
 | [`NaiveKernel`](@ref) | Kernel density estimation | `Dataset` |
-| [`SymbolicPermutation`](@ref) | Ordinal patterns | `Vector` |
-| [`SymbolicWeightedPermutation`](@ref) | Ordinal patterns | `Vector` |
-| [`SymbolicAmplitudeAwarePermutation`](@ref) | Ordinal patterns | `Vector` |
+| [`SymbolicPermutation`](@ref) | Ordinal patterns | `Vector`, `Dataset` |
+| [`SymbolicWeightedPermutation`](@ref) | Ordinal patterns | `Vector`, `Dataset` |
+| [`SymbolicAmplitudeAwarePermutation`](@ref) | Ordinal patterns | `Vector`, `Dataset` |
+| [`SpatialSymbolicPermutation`](@ref) | Ordinal patterns in space | `Array` |
 | [`Dispersion`](@ref) | Dispersion patterns | `Vector` |
+| [`SpatialDispersion`](@ref) | Dispersion patterns in space | `Array` |
 | [`Diversity`](@ref) | Cosine similarity | `Vector` |
 | [`WaveletOverlap`](@ref) | Wavelet transform | `Vector` |
-| [`PowerSpectrum`](@ref) | Fourier spectra | `Vector`, `Dataset` |
+| [`PowerSpectrum`](@ref) | Fourier transform | `Vector` |
 
-## Count occurrences (counting)
+## Count occurrences
 
 ```@docs
 CountOccurrences
 ```
 
-## Visitation frequency (histograms)
+## Histograms
 
 ```@docs
 ValueHistogram
 RectangularBinning
 FixedRectangularBinning
 ```
 
-## Permutation (symbolic)
+## Symbolic permutations
 
 ```@docs
 SymbolicPermutation
 SymbolicWeightedPermutation
 SymbolicAmplitudeAwarePermutation
-SpatialSymbolicPermutation
 ```
 
-## Dispersion (symbolic)
+## Dispersion patterns
 
 ```@docs
 Dispersion
-SpatialDispersion
 ```
 
-## Transfer operator (binning)
+## Transfer operator
 
 ```@docs
 TransferOperator
@@ -100,3 +110,10 @@ PowerSpectrum
 ```@docs
 Diversity
 ```
+
+## Spatial estimators
+
+```@docs
+SpatialSymbolicPermutation
+SpatialDispersion
+```

src/Entropies.jl

Lines changed: 2 additions & 2 deletions
@@ -24,10 +24,10 @@ include("complexity.jl")
 include("multiscale.jl")
 
 # Library implementations (files include other files)
+include("encoding/all_encodings.jl") # other structs depend on these
 include("probabilities_estimators/probabilities_estimators.jl")
 include("entropies/entropies.jl")
-include("encoding/all_encodings.jl")
-include("complexity/complexity_measures.jl") # relies on encodings, so include after
+include("complexity/complexity_measures.jl")
 include("deprecations.jl")
 

src/encoding/all_encodings.jl

Lines changed: 2 additions & 0 deletions
@@ -1,2 +1,4 @@
+include("fasthist.jl")
+include("rectangular_binning.jl")
 include("gaussian_cdf.jl")
 include("ordinal_pattern.jl")
