-
Notifications
You must be signed in to change notification settings - Fork 115
add findpeaks #369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
add findpeaks #369
Changes from 1 commit
02c7940
7827da3
1b3a9df
8de60ee
84678d5
f6ded5b
3aaa850
fc628f4
9aa5e8a
917a98a
e042b49
842c090
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,5 +13,6 @@ Pages = ["periodograms.md", | |
"convolutions.md", | ||
"lpc.md", | ||
"index.md", | ||
"findpeaks.md", | ||
] | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# `Findpeaks` | ||
|
||
`findpeaks` function is inspired by MATLAB's findpeaks function. | ||
|
||
The function allows you to get positions of local maxima of one-dimensional data. | ||
Since real-world data are often noisy, there is usually a large number of local maxima that are of no interest. | ||
|
||
`findpeaks` offers filtering based on 4 parameters (can be combined): | ||
* peak height | ||
* distance between peaks | ||
* peak [prominence](https://en.wikipedia.org/wiki/Topographic_prominence) | ||
* threshold - minimal difference between neighboring data points | ||
|
||
```@docs | ||
findpeaks | ||
``` | ||
|
||
# Example | ||
|
||
```julia | ||
using Findpeaks | ||
|
||
using DelimitedFiles | ||
|
||
using Plots | ||
default(legend=false) | ||
|
||
|
||
data = readdlm("test/data/findpeaks_example_spectrum.txt") | ||
x = data[:, 1] | ||
y = data[:, 2] | ||
|
||
peaks = findpeaks(y, x, min_prom=1000.) | ||
|
||
plot(x, y, title="Prominent peaks") | ||
scatter!(x[peaks], y[peaks]) | ||
``` | ||
|
||
 | ||
|
||
```julia | ||
sep = 0.2 | ||
|
||
peaks = findpeaks(y, x, min_dist=sep) | ||
|
||
plot(x, y, title="Peaks at least $sep units apart") | ||
scatter!(x[peaks], y[peaks]) | ||
``` | ||
|
||
 | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,131 @@ | ||||||
module Findpeaks | ||||||
|
||||||
export findpeaks | ||||||
|
||||||
""" | ||||||
`findpeaks(y::Array{T}, | ||||||
x::Array{S}=collect(1:length(y)) | ||||||
;min_height::T=minimum(y), min_prom::T=minimum(y), | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the default in the documentation for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, I noticed just after submitting the PR. The correct version should be in the issue above. |
||||||
min_dist::S=0, threshold::T=0 ) where {T<:Real,S}`\n | ||||||
Returns indices of local maxima (sorted from highest peaks to lowest) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After looking over this PR, it seems like you're returning the peak indices sorted by their value because of the way you filter to satisfy |
||||||
in 1D array of real numbers. Similar to MATLAB's findpeaks().\n | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The way plateaus are treated by this function differs from both Matlab or SciPy. If the input signal contains a number of adjacent indices with the same value that is flanked with lower values, then each of the indices in the "plateau" will be considered a peak. E.g. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Returning the middle index also seems to me to be more reasonable than the first index. Looking at scipy's find_peaks I also noticed they have a If you'll agree I could add this. |
||||||
*Arguments*:\n | ||||||
`y` -- data\n | ||||||
*Optional*:\n | ||||||
`x` -- x-data\n | ||||||
*Keyword*:\n | ||||||
`min_height` -- minimal peak height\n | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I find "minimal" ambiguous here. What happens if the value is on the threshold? It looks like peaks that are exactly the same as any of these minimal criteria are excluded from the results. The name "min_height" implies, at least to me, that it would be the minimum value of the corresponding property of the result set, and should therefore be inclusive (i.e. values that exactly match the threshold should be included in the results). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So what do you suggest here? |
||||||
`min_prom` -- minimal peak prominence\n | ||||||
`min_dist` -- minimal peak distance (keeping highest peaks)\n | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And if the peaks are the same size? |
||||||
`threshold` -- minimal difference (absolute value) between | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do you mean by absolute value here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It does not matter whether the value is negative or positive:
Suggested change
... but perhaps there is a better (single) word for it |
||||||
peak and neighboring points\n | ||||||
""" | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some information should be given about the behavior of |
||||||
function findpeaks( | ||||||
y :: AbstractVector{T}, | ||||||
x :: AbstractVector{S} = collect(1:length(y)) | ||||||
; | ||||||
min_height :: T = minimum(y), | ||||||
min_prom :: T = zero(y[1]), | ||||||
min_dist :: S = zero(x[1]), | ||||||
threshold :: T = zero(y[1]), | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As for the first point, if we don't want to throw on empty data, this might be better: function findpeaks(
y :: AbstractVector{T},
x :: AbstractVector{S} = collect(1:length(y))
;
kwargs...
) where {T <: Real, S}
if isempty(y)
return empty(x)
end
min_height = get(kwargs, :min_height, minimum(y))
min_prom = get(kwargs, :min_prom, zero(y[1]))
min_dist = get(kwargs, :min_dist, zero(x[1]))
threshold = get(kwargs, :threshold, zero(y[1]))
dy = diff(y)
... |
||||||
) where {T <: Real, S} | ||||||
|
||||||
dy = diff(y) | ||||||
|
||||||
peaks = in_threshold(dy, threshold) | ||||||
|
||||||
yP = y[peaks] | ||||||
peaks = with_prominence(y, peaks, min_prom) | ||||||
|
||||||
#minimal height refinement | ||||||
peaks = peaks[y[peaks] .> min_height] | ||||||
yP = y[peaks] | ||||||
|
||||||
peaks = with_distance(peaks, x, y, min_dist) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||||||
|
||||||
peaks | ||||||
end | ||||||
|
||||||
function in_threshold( | ||||||
dy :: AbstractVector{T}, | ||||||
threshold :: T, | ||||||
) where {T <: Real} | ||||||
|
||||||
peaks = 1:length(dy) |> collect | ||||||
|
||||||
k = 0 | ||||||
for i = 2:length(dy) | ||||||
if dy[i] <= -threshold && dy[i-1] >= threshold | ||||||
k += 1 | ||||||
peaks[k] = i | ||||||
end | ||||||
end | ||||||
peaks[1:k] | ||||||
end | ||||||
|
||||||
function with_prominence( | ||||||
y :: AbstractVector{T}, | ||||||
peaks :: AbstractVector{Int}, | ||||||
min_prom::T, | ||||||
) where {T <: Real} | ||||||
|
||||||
#minimal prominence refinement | ||||||
peaks[prominence(y, peaks) .> min_prom] | ||||||
end | ||||||
|
||||||
|
||||||
function prominence(y::AbstractVector{T}, peaks::AbstractVector{Int}) where {T <: Real} | ||||||
yP = y[peaks] | ||||||
proms = zero(yP) | ||||||
|
||||||
for (i, p) in enumerate(peaks) | ||||||
lP, rP = 1, length(y) | ||||||
for j = (i-1):-1:1 | ||||||
if yP[j] > yP[i] | ||||||
lP = peaks[j] | ||||||
break | ||||||
end | ||||||
end | ||||||
ml = minimum(y[lP:p]) | ||||||
for j = (i+1):length(yP) | ||||||
if yP[j] > yP[i] | ||||||
rP = peaks[j] | ||||||
break | ||||||
end | ||||||
end | ||||||
mr = minimum(y[p:rP]) | ||||||
ref = max(mr,ml) | ||||||
proms[i] = yP[i] - ref | ||||||
end | ||||||
|
||||||
proms | ||||||
end | ||||||
|
||||||
""" | ||||||
Select only peaks that are further apart than `min_dist` | ||||||
""" | ||||||
function with_distance( | ||||||
peaks :: AbstractVector{Int}, | ||||||
x :: AbstractVector{S}, | ||||||
y :: AbstractVector{T}, | ||||||
min_dist::S, | ||||||
) where {T <: Real, S} | ||||||
|
||||||
peaks2del = zeros(Bool, length(peaks)) | ||||||
inds = sortperm(y[peaks], rev=true) | ||||||
permute!(peaks, inds) | ||||||
for i = 1:length(peaks) | ||||||
for j = 1:(i-1) | ||||||
if abs(x[peaks[i]] - x[peaks[j]]) <= min_dist | ||||||
if !peaks2del[j] | ||||||
peaks2del[i] = true | ||||||
end | ||||||
end | ||||||
end | ||||||
end | ||||||
|
||||||
peaks[.!peaks2del] | ||||||
end | ||||||
|
||||||
|
||||||
end # module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange things happen if
x
is, for example, all zeros. For many of the default filtering values below, I assume using the default value means "do not filter". However, in this (corner) case filtering is still done. I think usingnothing
as a default value for keyword parameters to indicate that no filtering should be done would make sense.