Skip to content

Conversation

@dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Jan 2, 2026

Description

The size (2K+ LOC!) of (https://github.com/narwhals-dev/narwhals/blob/8d6101cdc56fd9d7d89e8a4665afbfdee8b6e9ec/narwhals/_plan/arrow/functions.py) really got out of hand and was long-overdue a cleanup.

The bad

  • Difficult to remember everything that's there
    • Not having docs made this part worse, while only really making the LOC appear less-bad
  • Having all this typing in one module degrades LSP performance
  • Unclear what dependencies there are between functions
  • Shadowing builtins gets more complicated the larger a module is

The (new) good

  • New modules have an __all__
  • Most exports have at-least a 1-line docstring
  • LSP is pretty fast now that you'd need to have ~20 modules open to have same number of symbols 😅
  • No modules have intra-(sub)package inline dependencies
    • Quite nice seeing all the relationships up top now
  • Shadowing (any, all, min, max, str, list) is a non-issue
  • list and str functions can freely use the same names

Related issues

@dangotbanned dangotbanned added internal pyarrow Issue is related to pyarrow backend labels Jan 2, 2026
dangotbanned and others added 9 commits January 2, 2026 17:06
- Was avoiding using `dtypes` before as *at some point* the `dtypes` module was imported into `functions`.
  - No longer an issue here though
- Found a couple more uses for `dtype_native` too
That was a lot more involved than I'd hoped for!

Kinda forced my hand on addressing some circular imports from #3384 (comment)
@dangotbanned dangotbanned added the documentation Improvements or additions to documentation label Jan 3, 2026
dangotbanned added a commit that referenced this pull request Jan 3, 2026
In #3384, I started tweaking the impl but realized I had only covered it indirectly

Merging ahead of that so I can be sure I don't break it
- Not planning to do this *everywhere*
- in `_vector` I want everything except `hist`
  - They're all good for composing other functions
  - So if we can skip re-wrapping as a `ChunkedArray`, then I think we should
Quite a lot of inconsistency, but inheriting from polars so idk
Will stick with the name for now
Makes a bit more sense without `drop_nulls` - and that didn't need to exist in the first place
@dangotbanned dangotbanned marked this pull request as ready for review January 3, 2026 22:24
@dangotbanned dangotbanned merged commit 6dd74fe into oh-nodes Jan 3, 2026
27 of 35 checks passed
@dangotbanned dangotbanned deleted the expr-ir/arrow-functions-split branch January 3, 2026 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation internal pyarrow Issue is related to pyarrow backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants