add polars rolling statistics benchmark #1
Thanks. I see rust-polars has a naive `rolling_mean` derived from `rolling_apply_`, so it does not scale well with rolling window size because it computes each window from scratch at every step. I know the randomForest loss function uses a trick to compute rolling sums where the value entering the window is added to a running sum and the value exiting is subtracted, so the entire window does not have to be recalculated. The mean is then just the running sum divided by the window size. For `rolling_median` the slow part is sorting every window before finding the middle value, which does not scale well with window size. Maybe some variation of a max-heap could be efficient.
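The running-sum trick described above can be sketched in Rust roughly as follows. This is only an illustrative sketch (the `rolling_mean` name and signature are my assumptions), not the randomForest or data.table code:

```rust
/// Running-sum rolling mean: the value entering the window is added to
/// the sum and the value leaving is subtracted, so each step costs O(1)
/// instead of O(window). Hypothetical sketch, ignoring missing values.
fn rolling_mean(values: &[f64], window: usize) -> Vec<f64> {
    assert!(window > 0 && window <= values.len());
    let mut out = Vec::with_capacity(values.len() - window + 1);
    // Sum of the first full window, computed once.
    let mut sum: f64 = values[..window].iter().sum();
    out.push(sum / window as f64);
    for i in window..values.len() {
        // Slide the window: add the entering value, subtract the exiting one.
        sum += values[i] - values[i - window];
        out.push(sum / window as f64);
    }
    out
}
```

Note that repeated add/subtract can accumulate floating-point error for long series; production implementations often compensate for this.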
---
Thanks for the heads-up @jangorecki. The performance gap of `rolling_median()` between

---
Wow, if polars is recomputing the window for each observation then those numbers are actually very low. A max-heap is one of the two proper ways. A low-hanging fruit, short of reimplementing everything, is to ensure you are doing partial ordering rather than full ordering, as you only need the middle value. Implementing quickselect (partial ordering) instead of shell sort (full ordering) for data.table reduced timings tremendously, by 2 to 10 times! It is used when `algo="exact"` to handle NAs. Another "proper way" is what is now used by default in data.table (`algo="fast"`). You can read more about it in
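The partial-ordering idea can be sketched in Rust with `select_nth_unstable_by`, which does a quickselect-style partial sort. This is only an illustrative sketch (the function name and odd-window restriction are my assumptions), not data.table's C implementation:

```rust
/// Rolling median via partial ordering: `select_nth_unstable_by` only
/// places the middle element correctly, instead of fully sorting each
/// window. Hypothetical sketch for odd window sizes; an even window
/// would need the two middle values averaged, and NaNs are simply given
/// a deterministic order by `total_cmp` rather than handled specially.
fn rolling_median(values: &[f64], window: usize) -> Vec<f64> {
    assert!(window % 2 == 1 && window <= values.len());
    let mid = window / 2;
    values
        .windows(window)
        .map(|w| {
            let mut buf = w.to_vec();
            // Quickselect: O(window) on average vs O(window log window) for a full sort.
            let (_, m, _) = buf.select_nth_unstable_by(mid, |a, b| a.total_cmp(b));
            *m
        })
        .collect()
}
```

This still copies and selects per window, so it does not scale as well as a heap-based approach; it only removes the full-sort constant.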
---
I wrote a crude rolling mean function in Rust for r-polars. The implementation does not handle missing values correctly yet. It is quite fast: it takes ~500us to roll over
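For the missing-value case mentioned above, one possible approach is to combine the running-sum trick with a NaN counter per window. This is a hypothetical sketch (the `rolling_mean_skipnan` name and skip-NaN semantics are my assumptions), not the actual r-polars code:

```rust
/// NaN-aware running-sum rolling mean: track how many NaNs are inside
/// the window and divide by the count of valid values, emitting NaN
/// when the window contains no valid values. Hypothetical sketch.
fn rolling_mean_skipnan(values: &[f64], window: usize) -> Vec<f64> {
    assert!(window > 0 && window <= values.len());
    let mut sum = 0.0;
    let mut nan_count = 0usize;
    let mut out = Vec::with_capacity(values.len() - window + 1);
    for (i, &v) in values.iter().enumerate() {
        // Value entering the window.
        if v.is_nan() { nan_count += 1 } else { sum += v }
        // Value leaving the window.
        if i >= window {
            let old = values[i - window];
            if old.is_nan() { nan_count -= 1 } else { sum -= old }
        }
        // Emit once the first full window is reached.
        if i + 1 >= window {
            let valid = window - nan_count;
            out.push(if valid == 0 { f64::NAN } else { sum / valid as f64 });
        }
    }
    out
}
```

Whether missing values should be skipped (as here) or should poison the whole window is a semantic choice; polars and data.table expose it as an option.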
---
Nice,

---
Oh no, that was too good to be true 🤣 This is a more fair comparison, and the two rolls are similar in speed; dt is slightly faster.

```r
> bench::mark(
+   width_e2 = fast_roll_mean_f64(s, width = 1E2),
+   width_e4 = fast_roll_mean_f64(s, width = 1E4),
+   width_e6 = fast_roll_mean_f64(s, width = 1E6),
+   dt_w_e2 = data.table::frollmean(x, n = 1E2),
+   dt_w_e4 = data.table::frollmean(x, n = 1E4),
+   dt_w_e6 = data.table::frollmean(x, n = 1E6),
+   check = FALSE
+ )
# A tibble: 6 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory     time           gc
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>     <list>         <list>
1 width_e2       1.1s     1.1s     0.911      280B    0         1     0       1.1s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
2 width_e4      1.14s    1.14s     0.878      280B    0         1     0      1.14s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
3 width_e6      1.15s    1.15s     0.872      280B    0         1     0      1.15s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
4 dt_w_e2       1.08s    1.08s     0.921     763MB    0.921     1     1      1.08s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
5 dt_w_e4    959.47ms 959.47ms     1.04      763MB    0         1     0   959.47ms <NULL> <Rprofmem> <bench_tm [1]> <tibble>
6 dt_w_e6       1.05s    1.05s     0.950     763MB    0.950     1     1      1.05s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
Warning message:
Some expressions had a GC in every iteration; so filtering is disabled.
```
---
Sorry for not addressing all your suggestions for benchmarking. I think we might include a roll function in r-polars, and we can do a more fair comparison later. The complete way to do it is to add it directly into rust-polars and support any datatype. Writing that PR would take a lot of time for me, and I only work a little on r-polars these days.

---
Adding polars to rollbench, a mini rolling-functions benchmark.

FYI @etiennebacher @sorhawell @eitsupi in case there is something to optimize. AFAIK in the next release there will be `$rolling()`, which could be applied to the quadruple rolling computation that is now made by calling the rolling function twice. polars is not yet very competitive in this field, but it is still often faster than pandas. I understand rolling statistics functions are still marked as experimental and have yet to be optimized for performance. Looking forward to future improvements.