Skip to content

Conversation

@dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Apr 10, 2025

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

  • horizontal_concat
  • vertical_concat
  • diagonal_concat

dangotbanned added a commit that referenced this pull request Apr 10, 2025
@dangotbanned dangotbanned added the pandas-like Issue is related to pandas-like backends label Apr 10, 2025
@dangotbanned dangotbanned marked this pull request as ready for review April 10, 2025 18:39
@dangotbanned dangotbanned requested a review from FBruzzesi April 10, 2025 18:39
Copy link
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dangotbanned πŸš€ I like this version a lot!
I might have spotted something in the type annotation?!

return concat(dfs, axis=0, copy=False)
return concat(dfs, axis=0)

def _concat_horizontal(self, dfs: Sequence[NDFrameT], /) -> NDFrameT:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the output type always be pd.DataFrame regardless of input in dfs?

Suggested change
def _concat_horizontal(self, dfs: Sequence[NDFrameT], /) -> NDFrameT:
def _concat_horizontal(self, dfs: Sequence[NDFrameT], /) -> pd.DataFrame:

Here a snippet:

import pandas as pd

s1 = pd.Series([1, 2, 3])
s2 = pd.Series(["a", "b", "c"])

type(pd.concat([s1], axis=1))
# pandas.core.frame.DataFrame

type(pd.concat([s1, s2], axis=1))
# pandas.core.frame.DataFrame

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted!

So I added that type for horizontal, which gets used with a Series as well (IIRC in group_by maybe?).

vertical used some property that is only available on DataFrame, so I had to make that one narrower.

diagonal (I think) should work with either type, but is currently only used for DataFrame.

Replying from my phone, hope that all makes sense πŸ˜…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see what you mean now!

Copy link
Member Author

@dangotbanned dangotbanned Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2368 (comment)
@FBruzzesi I think you'd also need to change the type of dfs to be:

Sequence[pd.DataFrame] | Sequence[pd.Series[Any]]

If you kept the TypeVar in pyright will tell you off πŸ˜‰

Edit: I forgot about this new thing I learned πŸ€¦β€β™‚οΈ #2283 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FBruzzesi quite happy with this now (b6d0711)

Thanks for the pandas 🧠

image

image

image

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks both

Comment on lines +241 to +245
if self._implementation.is_pandas() and self._backend_version < (3,):
if self._backend_version < (1,):
return self._concat(dfs, axis=VERTICAL, copy=False, sort=False)
return self._concat(dfs, axis=VERTICAL, copy=False)
return self._concat(dfs, axis=VERTICAL)
Copy link
Member Author

@dangotbanned dangotbanned Apr 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli should _concat_vertical be matching this?

I just copied it over, but they have a subtle difference:

if self._implementation.is_pandas() and self._backend_version < (3,):
return self._concat(dfs, axis=VERTICAL, copy=False)
return self._concat(dfs, axis=VERTICAL)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably fine for now, perhaps let's make an issue and revisit?

@dangotbanned dangotbanned merged commit caabc0e into main Apr 12, 2025
28 of 29 checks passed
@dangotbanned dangotbanned deleted the refac-pandas-concat branch April 12, 2025 14:02
dangotbanned added a commit that referenced this pull request Apr 12, 2025
dangotbanned added a commit that referenced this pull request Apr 14, 2025
* refactor: refactor: Simplify `ArrowNamespace.concat`

Related #2368

* chore(typing): Ignore stub issues

Fixed in zen-xu/pyarrow-stubs#203

* cov

https://github.com/narwhals-dev/narwhals/actions/runs/14422362924/job/40446526457?pr=2381

* refactor: Implement `EagerNamespace.concat`

* refactor: Avoid redundant unique check

Has no coverage, would be caught later with a nicer error anyway w/ https://github.com/narwhals-dev/narwhals/blob/41871f775589359c563150526d03935449709d7d/narwhals/utils.py#L1466-L1475

* refactor: Consistently use `chain.from_iterable`

* chore: Remove notes

One became a type, the other is moving to github
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal pandas-like Issue is related to pandas-like backends

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants