Closed
Labels: Enhancement, Groupby, Needs Discussion (requires discussion from core team before further action), Window (rolling, ewma, expanding)
Description
Is your feature request related to a problem?
I would like to count the values that fall into each sliding window, where the windows are defined over the values of a column rather than over rows. For example:
import pandas as pd
df = pd.DataFrame({"group": ["A", "B"] * 5, "distance": [0.1, 0.3, 1.7, 3.2, 2.6, 3.7, 8.3, 0.6, 5.9, 6.3]})
df.sort_values(["group", "distance"])
Out[5]:
  group  distance
0     A       0.1
2     A       1.7
4     A       2.6
8     A       5.9
6     A       8.3
1     B       0.3
7     B       0.6
3     B       3.2
5     B       3.7
9     B       6.3
# possible API:
df.groupby("group").sliding(on="distance", window_size=3.0, step_size=2.0, min=0, max=10).count()
                 count
group distance
A     [0, 3)         3
      [2, 5)         1
      [4, 7)         1
      [6, 9)         1
      [8, 11)        1
B     [0, 3)         2
      [2, 5)         2
      [4, 7)         1
      [6, 9)         1
      [8, 11)        0
Describe the solution you'd like
There should be a sliding-window method that slices by the values of a column instead of by row position:
def sliding(
    self,
    on: str,  # column whose values define the windows
    window_size: numeric,
    step_size: numeric = None,  # defaults to window_size
    min: numeric = None,  # defaults to the minimum value in the group
    max: numeric = None,  # defaults to the maximum value in the group
):
    pass
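To make the requested semantics concrete, here is a minimal sketch of the counting behaviour using only existing pandas and NumPy calls. The helper name `sliding_count` and its parameters `min_value` and `max_value` are hypothetical and not part of pandas. The sketch assumes half-open windows `[left, left + window_size)` and requires explicit bounds, so the per-group defaults proposed above are not handled.

```python
import numpy as np
import pandas as pd


def sliding_count(df, by, on, window_size, step_size, min_value, max_value):
    """Hypothetical helper: count values of `on` in [left, left + window_size) per group."""
    # Left edges of the windows; the same windows are used for every group.
    lefts = np.arange(min_value, max_value, step_size)
    rights = lefts + window_size
    rows = []
    for key, values in df.groupby(by)[on]:
        v = np.sort(values.to_numpy())
        # Number of values strictly below each edge; the difference is the
        # number of values with left <= value < right.
        start = np.searchsorted(v, lefts, side="left")
        stop = np.searchsorted(v, rights, side="left")
        for left, right, n in zip(lefts, rights, stop - start):
            rows.append({by: key, "left": left, "right": right, "count": int(n)})
    return pd.DataFrame(rows)


df = pd.DataFrame({
    "group": ["A", "B"] * 5,
    "distance": [0.1, 0.3, 1.7, 3.2, 2.6, 3.7, 8.3, 0.6, 5.9, 6.3],
})
result = sliding_count(df, "group", "distance",
                       window_size=3.0, step_size=2.0, min_value=0, max_value=10)
print(result)
```

The result is a long-form DataFrame with one row per group and window, rather than the interval-indexed output suggested in the proposal. Sorting each group once and using `np.searchsorted` keeps the cost per group at roughly O(n log n + w log n) for n values and w windows.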
API breaking implications
None that I am aware of. This could perhaps be integrated into rolling.
Describe alternatives you've considered
- Create an IntervalArray:
import numpy as np
intervals = pd.arrays.IntervalArray.from_arrays(np.arange(10), np.arange(10) + 2)
intervals
Out[53]:
<IntervalArray>
[(0, 2], (1, 3], (2, 4], (3, 5], (4, 6], (5, 7], (6, 8], (7, 9], (8, 10], (9, 11]]
- Join on intervals: No idea how to do this...
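One possible way to approach the join on overlapping intervals is a broadcast membership test, sketched below. This is an illustration under assumptions, not an established pandas recipe: it uses the same right-closed intervals as above, and the names `inside`, `joined` and `counts` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "B"] * 5,
    "distance": [0.1, 0.3, 1.7, 3.2, 2.6, 3.7, 8.3, 0.6, 5.9, 6.3],
})
# Overlapping intervals (0, 2], (1, 3], ..., (9, 11]; closed="right" is the default.
intervals = pd.IntervalIndex.from_arrays(np.arange(10), np.arange(10) + 2)
lefts = intervals.left.to_numpy()
rights = intervals.right.to_numpy()

values = df["distance"].to_numpy()
# inside[i, j] is True when left_j < values[i] <= right_j.
inside = (values[:, None] > lefts) & (values[:, None] <= rights)
row_idx, interval_idx = np.nonzero(inside)

# One output row per (value, containing interval) pair, i.e. the "join".
joined = pd.DataFrame({
    "group": df["group"].to_numpy()[row_idx],
    "distance": values[row_idx],
    "left": lefts[interval_idx],
    "right": rights[interval_idx],
})
counts = joined.groupby(["group", "left", "right"]).size()
print(counts)
```

The boolean matrix has one cell per (row, interval) pair, so memory grows with the product of the two sizes; this is only reasonable for moderately sized inputs. Intervals that contain no value for a group do not appear in `counts`.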
Additional background
This question also appeared on StackOverflow:
https://stackoverflow.com/questions/43538064/how-to-do-sliding-window-by-value-interval-on-non-time-index-in-pandas