Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file added docs/src/assets/dist_peaks.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/src/assets/prom_peaks.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions docs/src/contents.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,5 +13,6 @@ Pages = ["periodograms.md",
"convolutions.md",
"lpc.md",
"index.md",
"findpeaks.md",
]
```
51 changes: 51 additions & 0 deletions docs/src/findpeaks.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# `Findpeaks`

`findpeaks` function is inspired by MATLAB's findpeaks function.

The function allows you to get positions of local maxima of one-dimensional data.
Since real-world data are often noisy, there is usually a large number of local maxima that are of no interest.

`findpeaks` offers filtering based on 4 parameters (can be combined):
* peak height
* distance between peaks
* peak [prominence](https://en.wikipedia.org/wiki/Topographic_prominence)
* threshold - minimal difference between neighboring data points

```@docs
findpeaks
```

# Example

```julia
using Findpeaks

using DelimitedFiles

using Plots
default(legend=false)


data = readdlm("test/data/findpeaks_example_spectrum.txt")
x = data[:, 1]
y = data[:, 2]

peaks = findpeaks(y, x, min_prom=1000.)

plot(x, y, title="Prominent peaks")
scatter!(x[peaks], y[peaks])
```

![](assets/prom_peaks.png)

```julia
sep = 0.2

peaks = findpeaks(y, x, min_dist=sep)

plot(x, y, title="Peaks at least $sep units apart")
scatter!(x[peaks], y[peaks])
```

![](assets/dist_peaks.png)

3 changes: 2 additions & 1 deletion src/DSP.jl
Original file line number Diff line number Diff line change
Expand Up @@ -15,9 +15,10 @@ include("periodograms.jl")
include("Filters/Filters.jl")
include("lpc.jl")
include("estimation.jl")
include("findpeaks.jl")

using Reexport
@reexport using .Util, .Windows, .Periodograms, .Filters, .LPC, .Unwrap, .Estimation
@reexport using .Util, .Windows, .Periodograms, .Filters, .LPC, .Unwrap, .Estimation, .Findpeaks

include("deprecated.jl")
end
131 changes: 131 additions & 0 deletions src/findpeaks.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
module Findpeaks

export findpeaks

"""
`findpeaks(y::Array{T},
x::Array{S}=collect(1:length(y))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange things happen if x is, for example, all zeros. For many of the default filtering values below, I assume using the default value means "do not filter". However, in this (corner) case filtering is still done. I think using nothing as a default value for keyword parameters to indicate that no filtering should be done would make sense.

;min_height::T=minimum(y), min_prom::T=minimum(y),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default in the documentation for min_prom does not match the actual default.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I noticed just after submitting the PR. The correct version should be in the issue above.

min_dist::S=0, threshold::T=0 ) where {T<:Real,S}`\n
Returns indices of local maxima (sorted from highest peaks to lowest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking over this PR, it seems like you're returning the peak indices sorted by their value because of the way you filter to satisfy min_dist. I personally think it makes more sense to return the indices in the order that they occur, which is also what Matlab does.

in 1D array of real numbers. Similar to MATLAB's findpeaks().\n
Copy link
Member

@galenlynch galenlynch Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way plateaus are treated by this function differs from both Matlab or SciPy. If the input signal contains a number of adjacent indices with the same value that is flanked with lower values, then each of the indices in the "plateau" will be considered a peak. E.g. findpeaks([0, 1, 1, 1, 1, 1, 0]) returns [2, 3, 4, 5, 6]. In contrast, Matlab would return the first index of the plateau, and SciPy would return the middle index (rounded down if an even number of points). Of these behaviors, I think SciPy's makes the most sense. I think the current behavior should be changed to match SciPy's.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning the middle index also seems to me to be more reasonable than the first index.
But doing so loses information about the fact that there is a plateau there and, depending on the nature of data, technically not a peak. Also, for non-equally spaced x, the "middle" may not be so much a "middle" anymore.

Looking at scipy's find_peaks I also noticed they have a plateau_size parameter and a corresponding output.
This would be nice to have - then returning the middle index is OK, since there is also output about the size.

If you'll agree I could add this.

*Arguments*:\n
`y` -- data\n
*Optional*:\n
`x` -- x-data\n
*Keyword*:\n
`min_height` -- minimal peak height\n
Copy link
Member

@galenlynch galenlynch Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "minimal" ambiguous here. What happens if the value is on the threshold? It looks like peaks that are exactly the same as any of these minimal criteria are excluded from the results. The name "min_height" implies, at least to me, that it would be the minimum value of the corresponding property of the result set, and should therefore be inclusive (i.e. values that exactly match the threshold should be included in the results).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what do you suggest here?

`min_prom` -- minimal peak prominence\n
`min_dist` -- minimal peak distance (keeping highest peaks)\n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if the peaks are the same size?

`threshold` -- minimal difference (absolute value) between
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by absolute value here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not matter whether the value is negative or positive:

Suggested change
`threshold` -- minimal difference (absolute value) between
`threshold` -- minimal absolute value of difference between

... but perhaps there is a better (single) word for it

peak and neighboring points\n
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some information should be given about the behavior of findpeaks if the input array contains NaNs. Right now it seems like no peaks are returned in that case. SciPy's find_peaks has a disclaimer that the function will not behave as intended when operating on NaNs, which I think is sufficient.

function findpeaks(
y :: AbstractVector{T},
x :: AbstractVector{S} = collect(1:length(y))
;
min_height :: T = minimum(y),
min_prom :: T = zero(y[1]),
min_dist :: S = zero(x[1]),
threshold :: T = zero(y[1]),
Copy link
Member

@galenlynch galenlynch Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Each of these lines will throw an error if y is empty
  • The type signature is pretty restrictive. For example, if y is a Vector{Float64}, you cannot set min_height, min_prom, or threshold to 0 or 1. Does it really need to be that restrictive?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the first point, if we don't want to throw on empty data, this might be better:

function findpeaks(
                   y :: AbstractVector{T},
                   x :: AbstractVector{S} = collect(1:length(y))
                   ;
                   kwargs...
                  ) where {T <: Real, S}

    if isempty(y)
        return empty(x)
    end

    min_height = get(kwargs, :min_height, minimum(y))
    min_prom   = get(kwargs, :min_prom,   zero(y[1]))
    min_dist   = get(kwargs, :min_dist,   zero(x[1]))
    threshold  = get(kwargs, :threshold,  zero(y[1]))

    dy = diff(y)
...

) where {T <: Real, S}

dy = diff(y)

peaks = in_threshold(dy, threshold)

yP = y[peaks]
peaks = with_prominence(y, peaks, min_prom)

#minimal height refinement
peaks = peaks[y[peaks] .> min_height]
yP = y[peaks]

peaks = with_distance(peaks, x, y, min_dist)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x and y should be checked to be the same length.


peaks
end

function in_threshold(
dy :: AbstractVector{T},
threshold :: T,
) where {T <: Real}

peaks = 1:length(dy) |> collect

k = 0
for i = 2:length(dy)
if dy[i] <= -threshold && dy[i-1] >= threshold
k += 1
peaks[k] = i
end
end
peaks[1:k]
end

function with_prominence(
y :: AbstractVector{T},
peaks :: AbstractVector{Int},
min_prom::T,
) where {T <: Real}

#minimal prominence refinement
peaks[prominence(y, peaks) .> min_prom]
end


function prominence(y::AbstractVector{T}, peaks::AbstractVector{Int}) where {T <: Real}
yP = y[peaks]
proms = zero(yP)

for (i, p) in enumerate(peaks)
lP, rP = 1, length(y)
for j = (i-1):-1:1
if yP[j] > yP[i]
lP = peaks[j]
break
end
end
ml = minimum(y[lP:p])
for j = (i+1):length(yP)
if yP[j] > yP[i]
rP = peaks[j]
break
end
end
mr = minimum(y[p:rP])
ref = max(mr,ml)
proms[i] = yP[i] - ref
end

proms
end

"""
Select only peaks that are further apart than `min_dist`
"""
function with_distance(
peaks :: AbstractVector{Int},
x :: AbstractVector{S},
y :: AbstractVector{T},
min_dist::S,
) where {T <: Real, S}

peaks2del = zeros(Bool, length(peaks))
inds = sortperm(y[peaks], rev=true)
permute!(peaks, inds)
for i = 1:length(peaks)
for j = 1:(i-1)
if abs(x[peaks[i]] - x[peaks[j]]) <= min_dist
if !peaks2del[j]
peaks2del[i] = true
end
end
end
end

peaks[.!peaks2del]
end


end # module
Loading