
Conversation

@jangorecki
Owner

Adding polars to rollbench, a mini rolling functions benchmark.

FYI @etiennebacher @sorhawell @eitsupi in case there is something to optimize. AFAIK the next release will have $rolling(), which could be applied to the quadruple rolling computation that is currently made by calling the rolling function twice.

polars is not yet very competitive in this field, but it is still often faster than pandas. I understand the rolling statistics functions are still marked as experimental and yet to be optimized for performance. Looking forward to future improvements.

@jangorecki jangorecki merged commit 2a41d70 into master Nov 21, 2023
@jangorecki jangorecki deleted the polars branch November 21, 2023 19:00
@sorhawell

Thanks. I see rust-polars has a naive rolling_mean derived from rolling_apply_, so it computes each window from scratch at every step and does not scale well with window size.

I know the randomForest loss function uses a trick to compute rolling sums, where the value entering the window is added to a running sum and the value exiting is subtracted. Then the entire window does not have to be recalculated. The mean is just the running sum divided by the window size.
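The running-sum trick above can be sketched like this (a hypothetical standalone sketch, not the polars or data.table implementation; note a plain running sum can accumulate floating-point error over long series):

```rust
/// Rolling mean via a running sum: each step adds the entering value and
/// subtracts the exiting one, so the cost per step is O(1) instead of
/// O(window). Positions before the first full window are NaN.
fn roll_mean(x: &[f64], width: usize) -> Vec<f64> {
    let mut out = vec![f64::NAN; x.len()];
    let mut sum = 0.0;
    for i in 0..x.len() {
        sum += x[i]; // value entering the window
        if i >= width {
            sum -= x[i - width]; // value exiting the window
        }
        if i + 1 >= width {
            out[i] = sum / width as f64; // mean = running sum / window size
        }
    }
    out
}
```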

For rolling_median the slow part is sorting every window before finding the middle value, and that does not scale well with window size. Maybe some variation of a max-heap could be efficient.
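As a simpler alternative to a heap, one incremental variant keeps the window sorted and updates it with binary search instead of re-sorting from scratch. This is a hypothetical sketch on integer data (to sidestep f64 not being Ord), not any library's actual algorithm; Vec insert/remove still costs O(w) per step, where a heap-based structure would reach O(log w):

```rust
/// Rolling median keeping the current window in a sorted Vec.
/// Each step: binary-search insert of the new value, binary-search
/// removal of the expired one, then read the middle element(s).
fn roll_median_sorted(x: &[i64], width: usize) -> Vec<f64> {
    let mut sorted: Vec<i64> = Vec::with_capacity(width + 1);
    let mut out = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let pos = sorted.binary_search(&x[i]).unwrap_or_else(|p| p);
        sorted.insert(pos, x[i]);
        if sorted.len() > width {
            // the expired value is guaranteed to be present in the window
            let old = x[i - width];
            let p = sorted.binary_search(&old).unwrap();
            sorted.remove(p);
        }
        if sorted.len() == width {
            let mid = width / 2;
            out.push(if width % 2 == 1 {
                sorted[mid] as f64
            } else {
                (sorted[mid - 1] + sorted[mid]) as f64 / 2.0
            });
        } else {
            out.push(f64::NAN); // window not yet full
        }
    }
    out
}
```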

@etiennebacher

Thanks for the heads-up @jangorecki, the performance gap of rolling_median() between data.table and polars is surprising. I've reported it here: pola-rs/polars#12609

@jangorecki
Owner Author

jangorecki commented Nov 21, 2023

Wow, if polars is recomputing the window for each observation, then those numbers are actually very low.

A max-heap is one of the two proper ways.

A low-hanging fruit, short of reimplementing everything, is to ensure you are doing a partial ordering rather than a full sort, since you only need the middle value. Implementing quickselect (partial ordering) instead of shell sort (full ordering) in data.table reduced timings tremendously, by 2 to 10 times! It is used with algo="exact" to handle NAs.
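The partial-ordering idea for a single window can be sketched with Rust's standard quickselect, `select_nth_unstable_by`, which places only the middle element in its sorted position (O(n) on average versus O(n log n) for a full sort). A hypothetical sketch, assuming a non-empty, NaN-free window:

```rust
/// Median of one window via quickselect (partial ordering) rather than
/// a full sort. The comparator unwrap assumes no NaN in the window.
fn window_median(window: &[f64]) -> f64 {
    let mut buf = window.to_vec();
    let n = buf.len();
    let mid = n / 2;
    // Partition so buf[mid] holds the (mid+1)-th smallest value;
    // elements before it are <= it, elements after are >= it.
    let (_, m, _) =
        buf.select_nth_unstable_by(mid, |a, b| a.partial_cmp(b).unwrap());
    let hi = *m;
    if n % 2 == 1 {
        hi
    } else {
        // Even window: the lower middle is the max of the left partition.
        let lo = buf[..mid].iter().cloned().fold(f64::MIN, f64::max);
        (lo + hi) / 2.0
    }
}
```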

Another "proper way" is what is now used in data.table (the default algo="fast"). You can read more about it in ?frollmedian (on the rollmedian branch) or in Rdatatable/data.table#5692.

@sorhawell

I wrote a crude rolling mean function in rust for r-polars. The implementation does not handle missing values correctly yet. It is quite a bit faster: it takes ~500us to roll over 1E8 values with widths 1E2 and 1E4, versus about 1s for data.table. Surprisingly, width 1E6 takes 5000us; not sure why that is, but still fast enough.

> x = rnorm(1E8)
> s <- pl$Series(x)
> bench::mark(
+   width_e2 = fast_roll_mean_f64(s,  width = 1E2),
+   width_e4 = fast_roll_mean_f64(s, width = 1E4),
+   width_e6 = fast_roll_mean_f64(s, width = 1E6),
+   dt_w_e4 = data.table::frollmean(x, n = 1E4),
+   check = FALSE
+ )
# A tibble: 4 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory     time           gc      
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>     <list>         <list>  
1 width_e2   154.91µs 186.66µs  3256.         280B        0  1628     0   499.94ms <NULL> <Rprofmem> <bench_tm>     <tibble>
2 width_e4   672.42µs 766.95µs  1234.         280B        0   618     0   500.74ms <NULL> <Rprofmem> <bench_tm>     <tibble>
3 width_e6     5.52ms   5.98ms   160.         280B        0    80     0   500.02ms <NULL> <Rprofmem> <bench_tm>     <tibble>
4 dt_w_e4       1.52s    1.52s     0.658     763MB        0     1     0      1.52s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
> 

@jangorecki
Owner Author

jangorecki commented Nov 26, 2023

Nice,

  • :: adds some overhead, which can be significant in very fast computations, so loading the library should be preferred. Especially on the first call into a namespace, much more has to happen than just this single function call.
  • It seems the data.table expression ran only once while the remaining ones ran 80-1600 times. It would be useful to include the max statistic rather than only min and median, because the first call possibly carries the overhead mentioned above. Then we could at least compare max-to-max rather than max-to-median, as it is now.

@sorhawell

sorhawell commented Nov 26, 2023

oh no that was too good to be true 🤣 this is a fairer comparison. The two rolls are similar in speed; dt is slightly faster.

> bench::mark(
+   width_e2 = fast_roll_mean_f64(s,  width = 1E2),
+   width_e4 = fast_roll_mean_f64(s, width = 1E4),
+   width_e6 = fast_roll_mean_f64(s, width = 1E6),
+   dt_w_e2 = data.table::frollmean(x, n = 1E2),
+   dt_w_e4 = data.table::frollmean(x, n = 1E4),
+   dt_w_e6 = data.table::frollmean(x, n = 1E6),
+   check = FALSE
+ )
# A tibble: 6 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory     time           gc      
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>     <list>         <list>  
1 width_e2       1.1s     1.1s     0.911      280B    0         1     0       1.1s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
2 width_e4      1.14s    1.14s     0.878      280B    0         1     0      1.14s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
3 width_e6      1.15s    1.15s     0.872      280B    0         1     0      1.15s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
4 dt_w_e2       1.08s    1.08s     0.921     763MB    0.921     1     1      1.08s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
5 dt_w_e4    959.47ms 959.47ms     1.04      763MB    0         1     0   959.47ms <NULL> <Rprofmem> <bench_tm [1]> <tibble>
6 dt_w_e6       1.05s    1.05s     0.950     763MB    0.950     1     1      1.05s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
Warning message:
Some expressions had a GC in every iteration; so filtering is disabled. 

@sorhawell

sry for not addressing all your benchmarking suggestions. I think we might include a roll function in r-polars, and we can do a fairer comparison later.

The complete way to do it is to add it directly into rust-polars with support for any datatype. Writing that PR would take me a lot of time, and I only work a little on r-polars these days.
