enh: add (optional) order_by to rolling_*, cum_*, first, ... #3363

@MarcoGorelli

Description

Currently, we don't allow

nw.col('price').cum_sum()

in the lazy case, requiring instead

nw.col('price').cum_sum().over(order_by='date')
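For context, here is a minimal pure-Python sketch of what an ordered cumulative sum computes, and why an explicit ordering matters when the backend guarantees no row order. The helper `cum_sum_ordered` and the sample rows are hypothetical, and the sketch assumes results are reported in each row's original position; it does not use Narwhals itself.

```python
def cum_sum_ordered(rows, value_key, order_key):
    """Hypothetical helper: cumulative sum of `value_key`, accumulated in
    `order_key` order, with each result placed at its row's original position."""
    # Indices of the rows, sorted by the ordering column.
    order = sorted(range(len(rows)), key=lambda i: rows[i][order_key])
    out = [None] * len(rows)
    running = 0
    for i in order:
        running += rows[i][value_key]
        out[i] = running
    return out


# Illustrative data, deliberately not sorted by date.
rows = [
    {"date": "2024-01-03", "price": 5},
    {"date": "2024-01-01", "price": 1},
    {"date": "2024-01-02", "price": 2},
]

result = cum_sum_ordered(rows, "price", "date")
print(result)  # [8, 1, 3]
```

Without the `date` ordering, the answer would depend on whatever order the rows happened to arrive in, which is the ambiguity the lazy case has to rule out.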

We could also allow

nw.col('price').cum_sum(order_by='date')

with the following restriction:

  • You may not write nw.col('price').cum_sum(order_by='date').over('asset', order_by='date'): specify order_by in either over or cum_sum, but not both.
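The restriction could be enforced with a check along these lines. This is only a sketch: `resolve_order_by` is a hypothetical helper, not an existing Narwhals function, and the error message is illustrative.

```python
from __future__ import annotations


def resolve_order_by(
    expr_order_by: str | None, over_order_by: str | None
) -> str | None:
    """Hypothetical helper: return the single ordering column, or raise
    if `order_by` was given both on the expression and in `over`."""
    if expr_order_by is not None and over_order_by is not None:
        raise ValueError(
            "Specify `order_by` in either the expression or `over`, not both."
        )
    # At most one of the two is set; return it (or None if neither is).
    return expr_order_by if expr_order_by is not None else over_order_by


# cum_sum(order_by='date') with a plain over('asset'):
print(resolve_order_by("date", None))  # date
# cum_sum().over('asset', order_by='date'):
print(resolve_order_by(None, "date"))  # date
```

Raising eagerly, at expression-construction time, would keep the two spellings interchangeable while ruling out conflicting orderings.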

What this enables:

  • We can add order_by to first, so you can then do df.group_by('a').agg(nw.col('b').first(order_by='c'))
  • I think it's conceptually easier to teach nw.col('price').cum_sum(order_by='date') than nw.col('price').cum_sum().over(order_by='date')
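To make the intended meaning of first(order_by='c') inside a group-by concrete, here is a pure-Python sketch of the semantics: for each group, take the value of b from the row with the smallest c. The helper `first_ordered` and the sample data are hypothetical, and this illustrates the proposed behaviour rather than any existing implementation.

```python
from collections import defaultdict


def first_ordered(rows, group_key, value_key, order_key):
    """Hypothetical helper: per group, the `value_key` of the row that
    comes first when the group is ordered by `order_key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        group: min(members, key=lambda r: r[order_key])[value_key]
        for group, members in groups.items()
    }


# Illustrative data: within each group, rows are not sorted by 'c'.
rows = [
    {"a": "x", "b": 10, "c": 3},
    {"a": "x", "b": 20, "c": 1},
    {"a": "y", "b": 30, "c": 2},
    {"a": "y", "b": 40, "c": 5},
]

result = first_ordered(rows, "a", "b", "c")
print(result)  # {'x': 20, 'y': 30}
```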

I'm just trying this out, but I think the impact on complexity is relatively small.

Issues:

  • Slight divergence from the Polars API, but I'm OK with it; I think it still feels in the Polars spirit.

Labels: enhancement
