diff --git a/CHANGELOG.md b/CHANGELOG.md index 6dbdd303b..7701e5f89 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,7 +2,8 @@ Changelog is kept with respect to version 0.11 of Entropies.jl. -## 1.3 +## main +* New probability estimator `SpatialSymbolicPermutation` suitable for computing spatial permutation entropies * Introduce Tsallis entropy. ## 1.2 diff --git a/src/core.jl b/src/core.jl index 6c9850ae3..ddf66da7b 100644 --- a/src/core.jl +++ b/src/core.jl @@ -6,7 +6,7 @@ export probabilities, probabilities! export genentropy, genentropy! export Dataset, dimension -const Vector_or_Dataset = Union{AbstractVector, AbstractDataset} +const Array_or_Dataset = Union{<:AbstractArray, <:AbstractDataset} """ Probabilities(x) → p @@ -49,7 +49,7 @@ abstract type ProbabilitiesEstimator end const ProbEst = ProbabilitiesEstimator # shorthand """ - probabilities(x::Vector_or_Dataset, est::ProbabilitiesEstimator) → p::Probabilities + probabilities(x::Array_or_Dataset, est::ProbabilitiesEstimator) → p::Probabilities Calculate probabilities representing `x` based on the provided estimator and return them as a [`Probabilities`](@ref) container (`Vector`-like). @@ -58,7 +58,7 @@ documentation of the individual estimators for more. The configuration options are always given as arguments to the chosen estimator. - probabilities(x::Vector_or_Dataset, ε::AbstractFloat) → p::Probabilities + probabilities(x::Array_or_Dataset, ε::AbstractFloat) → p::Probabilities Convenience syntax which provides probabilities for `x` based on rectangular binning (i.e. performing a histogram). In short, the state space is divided into boxes of length @@ -71,11 +71,11 @@ This allows computation of probabilities (histograms) of high-dimensional datasets and with small box sizes `ε` without memory overflow and with maximum performance. To obtain the bin information along with `p`, use [`binhist`](@ref). - probabilities(x::Vector_or_Dataset, n::Integer) → p::Probabilities + probabilities(x::Array_or_Dataset, n::Integer) → p::Probabilities Same as the above method, but now each dimension of the data is binned into `n::Int` equal sized bins instead of bins of length `ε::AbstractFloat`. - probabilities(x::Vector_or_Dataset) → p::Probabilities + probabilities(x::Array_or_Dataset) → p::Probabilities Directly count probabilities from the elements of `x` without any discretization, binning, or other processing (mostly useful when `x` contains categorical or integer data). """ @@ -97,7 +97,7 @@ Compute the generalized order-`q` entropy of some probabilities returned by the [`probabilities`](@ref) function. Alternatively, compute entropy from pre-computed `Probabilities`. - genentropy(x::Vector_or_Dataset, est; q = 1.0, base) + genentropy(x::Array_or_Dataset, est; q = 1.0, base) A convenience syntax, which calls first `probabilities(x, est)` and then calculates the entropy of the result (and thus `est` can be a @@ -149,10 +149,10 @@ function genentropy(prob::Probabilities; q = 1.0, α = nothing, base = MathConst end end -genentropy(x::AbstractArray{<:Real}) = +genentropy(::AbstractArray{<:Real}) = error("For single-argument input, do `genentropy(Probabilities(x))` instead.") -function genentropy(x::Vector_or_Dataset, est; q = 1.0, α = nothing, base = MathConstants.e) +function genentropy(x::Array_or_Dataset, est; q = 1.0, α = nothing, base = MathConstants.e) if α ≠ nothing @warn "Keyword `α` is deprecated in favor of `q`." q = α diff --git a/src/symbolic/spatial_permutation.jl b/src/symbolic/spatial_permutation.jl new file mode 100644 index 000000000..fa6ca1e96 --- /dev/null +++ b/src/symbolic/spatial_permutation.jl @@ -0,0 +1,117 @@ +export SpatialSymbolicPermutation + +""" + SpatialSymbolicPermutation(stencil, x, periodic = true) +A symbolic, permutation-based probabilities/entropy estimator for spatiotemporal systems. +The data are a high-dimensional array `x`, such as 2D [^Ribeiro2012] or 3D [^Schlemmer2018]. +This approach is also known as _Spatiotemporal Permutation Entropy_. +`x` is given because we need to know its size for optimization and bound checking. + +A _stencil_ defines what local area around each pixel to +consider, and compute the ordinal pattern within the stencil. Stencils are given as +vectors of `CartesianIndex` which encode the _offsets_ of the pixes to include in the +stencil, with respect to the current pixel. For example +```julia +data = [rand(50, 50) for _ in 1:50] +x = data[1] # first "time slice" of a spatial system evolution +stencil = CartesianIndex.([(0,1), (1,1), (1,0)]) +est = SpatialSymbolicPermutation(stencil, x) +``` +Here the stencil creates a 2x2 square extending to the bottom and right of the pixel +(directions here correspond to the way Julia prints matrices by default). +Notice that no offset (meaning the pixel itself) is always included automatically. +The length of the stencil decides the order of the permutation entropy, and the ordering +within the stencil dictates the order that pixels are compared with. +The pixel without any offset is always first in the order. + +After having defined `est`, one calculates the permutation entropy of ordinal patterns +by calling [`genentropy`](@ref) with `est`, and with the array data. +To apply this to timeseries of spatial data, simply loop over the call, e.g.: +```julia +entropy = genentropy(x, est) +entropy_vs_time = genentropy.(data, est) # broadcasting with `.` +``` + +The argument `periodic` decides whether the stencil should wrap around at the end of the +array. If `periodic = false`, pixels whose stencil exceeds the array bounds are skipped. + +[^Ribeiro2012]: + Ribeiro et al. (2012). Complexity-entropy causality plane as a complexity measure + for two-dimensional patterns. https://doi.org/10.1371/journal.pone.0040689 + +[^Schlemmer2018]: + Schlemmer et al. (2018). Spatiotemporal Permutation Entropy as a Measure for + Complexity of Cardiac Arrhythmia. https://doi.org/10.3389/fphy.2018.00039 +""" +struct SpatialSymbolicPermutation{D,P,V} <: ProbabilitiesEstimator + stencil::Vector{CartesianIndex{D}} + viewer::Vector{CartesianIndex{D}} + arraysize::Dims{D} + valid::V +end +function SpatialSymbolicPermutation( + stencil::Vector{CartesianIndex{D}}, x::AbstractArray, p::Bool = true + ) where {D} + # Ensure that no offset is part of the stencil + stencil = pushfirst!(copy(stencil), CartesianIndex{D}(zeros(Int, D)...)) + arraysize = size(x) + @assert length(arraysize) == D "Indices and input array must match dimensionality!" + # Store valid indices for later iteration + if p + valid = CartesianIndices(x) + else + # collect maximum offsets in each dimension for limiting ranges + maxoffsets = [maximum(s[i] for s in stencil) for i in 1:D] + # Safety check + minoffsets = [min(0, minimum(s[i] for s in stencil)) for i in 1:D] + ranges = Iterators.product( + [(1-minoffsets[i]):(arraysize[i]-maxoffsets[i]) for i in 1:D]... + ) + valid = Base.Generator(idxs -> CartesianIndex{D}(idxs), ranges) + end + SpatialSymbolicPermutation{D, p, typeof(valid)}(stencil, copy(stencil), arraysize, valid) +end + +# This source code is a modification of the code of Agents.jl that finds neighbors +# in grid-like spaces. It's the code of `nearby_positions` in `grid_general.jl`. +function pixels_in_stencil(pixel, spatperm::SpatialSymbolicPermutation{D,false}) where {D} + @inbounds for i in eachindex(spatperm.stencil) + spatperm.viewer[i] = spatperm.stencil[i] + pixel + end + return spatperm.viewer +end + +function pixels_in_stencil(pixel, spatperm::SpatialSymbolicPermutation{D,true}) where {D} + @inbounds for i in eachindex(spatperm.stencil) + # It's annoying that we have to change to tuple and then to CartesianIndex + # because iteration over cartesian indices is not allowed. But oh well. + spatperm.viewer[i] = CartesianIndex{D}( + mod1.(Tuple(spatperm.stencil[i] + pixel), spatperm.arraysize) + ) + end + return spatperm.viewer +end + +function Entropies.probabilities(x, est::SpatialSymbolicPermutation) + # TODO: This can be literally a call to `symbolize` and then + # calling probabilities on it. Should do once the `symbolize` refactoring is done. + s = zeros(Int, length(est.valid)) + probabilities!(s, x, est) +end + +function Entropies.probabilities!(s, x, est::SpatialSymbolicPermutation) + m = length(est.stencil) + for (i, pixel) in enumerate(est.valid) + pixels = pixels_in_stencil(pixel, est) + s[i] = Entropies.encode_motif(view(x, pixels), m) + end + @show s + return probabilities(s) +end + +# Pretty printing +function Base.show(io::IO, est::SpatialSymbolicPermutation{D}) where {D} + print(io, "Spatial permutation estimator for $D-dimensional data. Stencil:") + print(io, "\n") + print(io, est.stencil) +end \ No newline at end of file diff --git a/src/symbolic/symbolic.jl b/src/symbolic/symbolic.jl index 99be6dbf1..dec62c7aa 100644 --- a/src/symbolic/symbolic.jl +++ b/src/symbolic/symbolic.jl @@ -10,6 +10,7 @@ include("utils.jl") include("SymbolicPermutation.jl") include("SymbolicWeightedPermutation.jl") include("SymbolicAmplitudeAware.jl") +include("spatial_permutation.jl") """ permentropy(x; τ = 1, m = 3, base = MathConstants.e) diff --git a/test/runtests.jl b/test/runtests.jl index 850b7a072..99c320ede 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -382,3 +382,5 @@ end @assert round(tsallisentropy(p, q = -1/2, k = 1), digits = 2) ≈ 6.79 end end + +include("spatial_permutation_tests.jl") \ No newline at end of file diff --git a/test/spatial_permutation_tests.jl b/test/spatial_permutation_tests.jl new file mode 100644 index 000000000..ee49f08ea --- /dev/null +++ b/test/spatial_permutation_tests.jl @@ -0,0 +1,49 @@ +using Entropies, Test + +@testset "Spatial Permutation Entropy" begin + x = [1 2 1; 8 3 4; 6 7 5] + # Re-create the Ribeiro et al, 2012 using stencil + # (you get 4 symbols in a 3x3 matrix. For this matrix, the upper left + # and bottom right are the same symbol. So three probabilities in the end). + stencil = CartesianIndex.([(1,0), (0,1), (1,1)]) + est = SpatialSymbolicPermutation(stencil, x, false) + + # Generic tests + ps = probabilities(x, est) + @test ps isa Probabilities + @test length(ps) == 3 + @test sort(ps) == [0.25, 0.25, 0.5] + + h = genentropy(x, est, base = 2) + @test h == 1.5 + + # In fact, doesn't matter how we order the stencil, + # the symbols will always be equal in top-left and bottom-right + stencil = CartesianIndex.([(1,0), (1,1), (0,1)]) + est = SpatialSymbolicPermutation(stencil, x, false) + @test genentropy(x, est, base = 2) == 1.5 + + # But for sanity, let's ensure we get a different number + # for a different stencil + stencil = CartesianIndex.([(1,0), (2,0)]) + est = SpatialSymbolicPermutation(stencil, x, false) + ps = sort(probabilities(x, est)) + @test ps[1] == 1/3 + @test ps[2] == 2/3 + + # TODO: Symbolize tests once its part of the API. + + # Also test that it works for arbitrarily high-dimensional arrays + stencil = CartesianIndex.([(0,1,0), (0,0,1), (1,0,0)]) + z = reshape(1:125, (5,5,5)) + est = SpatialSymbolicPermutation(stencil, z, false) + # Analytically the total stencils are of length 4*4*4 = 64 + # but all of them given the same probabilities because of the layout + ps = probabilities(z, est) + @test ps == [1] + # if we shuffle, we get random stuff + using Random + w = shuffle!(Random.MersenneTwister(42), collect(z)) + ps = probabilities(w, est) + @test length(ps) > 1 +end \ No newline at end of file