Fix WindowAggregatingExtractor to handle timing jitter robustly
With timing noise (e.g., frames at [0.0001, 1, 2, 3, 4, 5]), a 5-second
window would incorrectly include 6 frames instead of 5 due to the
inclusive lower bound in label-based slicing.
Solution:
- Estimate frame period from median interval between frames
- Shift cutoff by +0.5 × median_interval to place window boundary
between frame slots, avoiding extra frames from timing jitter
- Clamp cutoff to latest_time for narrow windows (duration < median_interval)
- Continue using inclusive label-based slicing: data[time, cutoff:]
This automatically adapts to different frame rates and handles both
timing jitter and narrow windows correctly.
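
The cutoff computation described above can be sketched roughly as follows. This is a minimal illustration on a plain NumPy array of sorted timestamps, not the actual WindowAggregatingExtractor code; the function names `naive_window` and `select_window` are hypothetical, and a boolean mask stands in for the inclusive label-based slice.

```python
import numpy as np


def naive_window(times, duration):
    """Inclusive lower bound at latest - duration (the problematic behaviour)."""
    times = np.asarray(times, dtype=float)
    return times[times >= times[-1] - duration]


def select_window(times, duration):
    """Hypothetical sketch: shift the cutoff by half the median frame interval."""
    times = np.asarray(times, dtype=float)
    latest = times[-1]
    cutoff = latest - duration
    if times.size > 1:
        # Estimate the frame period from the median spacing between frames.
        median_interval = float(np.median(np.diff(times)))
        # Place the window boundary halfway between two frame slots.
        cutoff += 0.5 * median_interval
    # For windows narrower than one frame period, keep at least the latest frame.
    cutoff = min(cutoff, latest)
    # Inclusive lower bound, mirroring the label-based slice.
    return times[times >= cutoff]


jittered = [0.0001, 1, 2, 3, 4, 5]
print(len(naive_window(jittered, 5.0)))   # 6
print(len(select_window(jittered, 5.0)))  # 5
```

Because the shift is derived from the data, the same code should give 5 frames for a 5-second window at 1 Hz and 50 frames at 10 Hz, without a hard-coded frame count.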
Add comprehensive tests for timing jitter scenarios:
- test_handles_timing_jitter_at_window_start
- test_handles_timing_jitter_at_window_end
- test_consistent_frame_count_with_perfect_timing
Original prompt: Please think about a conceptual problem in WindowAggregatingExtractor:
- Data arrives in regular (but of course noisy) intervals, say once per second.
- User requests 5-second sliding window.
- Current extraction code will then often return too many frames.
Example:
Frames at [0.0001, 1, 2, 3, 4, 5]
=> we get 6 frames instead of 5. (I think even with 0.0 it is wrong, since the label-based indexing used to extract windowed_data is inclusive on the left).
The problem is that we can't simply reduce to, say, 4 seconds, or 5 frames, since frame rates can vary a lot. Can you think of a more stable approach?
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>