Member

@dangotbanned dangotbanned commented Aug 10, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Creates a common parent protocol for Compliant{Expr,Series}, spec-ing their shared parts

The diff isn't huge here, but it shrinks is_close and will allow us to implement more features in a similar way

The only tweak was giving `unique` a default, though `pyarrow` and `pandas` don't use it - only `polars` does
This was a bit more involved since
- 4 rhs binary ops and `is_between` aren't defined at this level
- polars was missing even more
Eventually we can just require that the two `_with_binary*` methods are defined, and then avoid repeating each dunder in every backend
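
A minimal sketch of the idea in the last point, not the actual narwhals code: the parent protocol defines each dunder once in terms of two hooks, and a backend only supplies the hooks. The names `ColumnLike`, `ListColumn`, `_with_binary_right`, and the hook signatures are assumptions made for illustration; the PR only refers to "the two `_with_binary*`".

```python
from __future__ import annotations

import operator
from typing import Any, Callable, Protocol


class ColumnLike(Protocol):
    """Hypothetical parent protocol; a stand-in, not the real definition."""

    # The only two hooks a backend must implement (assumed names and signatures).
    def _with_binary(self, op: Callable[[Any, Any], Any], other: Any) -> ColumnLike: ...
    def _with_binary_right(self, op: Callable[[Any, Any], Any], other: Any) -> ColumnLike: ...

    # Dunders are written once here and inherited by every explicit subclass.
    def __add__(self, other: Any) -> ColumnLike:
        return self._with_binary(operator.add, other)

    def __radd__(self, other: Any) -> ColumnLike:
        return self._with_binary_right(operator.add, other)

    def __sub__(self, other: Any) -> ColumnLike:
        return self._with_binary(operator.sub, other)

    def __rsub__(self, other: Any) -> ColumnLike:
        return self._with_binary_right(operator.sub, other)


class ListColumn(ColumnLike):
    """Toy backend over a plain list, used only to exercise the defaults."""

    def __init__(self, values: list[Any]) -> None:
        self.values = values

    def _with_binary(self, op: Callable[[Any, Any], Any], other: Any) -> ListColumn:
        # Column is the left operand: op(value, other)
        return ListColumn([op(v, other) for v in self.values])

    def _with_binary_right(self, op: Callable[[Any, Any], Any], other: Any) -> ListColumn:
        # Column is the right operand: op(other, value)
        return ListColumn([op(other, v) for v in self.values])


col = ListColumn([1, 2, 3])
added = col + 1          # dispatches through __add__ -> _with_binary
reflected = 10 - col     # int.__sub__ fails, so __rsub__ -> _with_binary_right
```

In this sketch `ListColumn` never spells out a dunder itself, which is the repetition the description hopes to remove from each backend.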
@dangotbanned dangotbanned marked this pull request as ready for review August 10, 2025 18:03
@dangotbanned dangotbanned requested a review from FBruzzesi August 10, 2025 18:03
Member

@FBruzzesi FBruzzesi left a comment

Thanks @dangotbanned this looks promising 🎉 I left a couple of minor comments

Member

Really nice one! 🎉

@dangotbanned dangotbanned marked this pull request as draft August 11, 2025 11:21
Comment on lines 95 to 107
def is_between(
    self, lower_bound: Self, upper_bound: Self, closed: ClosedInterval
) -> Self:
    if closed == "left":
        return (self >= lower_bound) & (self < upper_bound)
    if closed == "right":
        return (self > lower_bound) & (self <= upper_bound)
    if closed == "none":
        return (self > lower_bound) & (self < upper_bound)
    return (self >= lower_bound) & (self <= upper_bound)

def is_duplicated(self) -> Self:
    return ~self.is_unique()
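
As a rough illustration of how a default like the `is_between` above can be inherited, here is a self-contained sketch with a toy list-backed class. `ColumnLike` and `ListColumn` are hypothetical names, the `ClosedInterval` alias is assumed to cover the four values the branches imply, and the bounds are plain scalars rather than `Self` to keep the example small.

```python
from __future__ import annotations

import operator
from typing import Any, Callable, Literal, Protocol

# Assumed alias: the four values handled by the branches in the quoted code.
ClosedInterval = Literal["left", "right", "none", "both"]


class ColumnLike(Protocol):
    """Hypothetical parent protocol holding the shared default."""

    def __ge__(self, other: Any) -> ColumnLike: ...
    def __gt__(self, other: Any) -> ColumnLike: ...
    def __le__(self, other: Any) -> ColumnLike: ...
    def __lt__(self, other: Any) -> ColumnLike: ...
    def __and__(self, other: Any) -> ColumnLike: ...

    def is_between(
        self, lower_bound: Any, upper_bound: Any, closed: ClosedInterval
    ) -> ColumnLike:
        if closed == "left":
            return (self >= lower_bound) & (self < upper_bound)
        if closed == "right":
            return (self > lower_bound) & (self <= upper_bound)
        if closed == "none":
            return (self > lower_bound) & (self < upper_bound)
        return (self >= lower_bound) & (self <= upper_bound)


class ListColumn(ColumnLike):
    """Toy backend: element-wise comparisons over a plain list."""

    def __init__(self, values: list[Any]) -> None:
        self.values = values

    def _compare(self, op: Callable[[Any, Any], bool], other: Any) -> ListColumn:
        return ListColumn([op(v, other) for v in self.values])

    def __ge__(self, other: Any) -> ListColumn:
        return self._compare(operator.ge, other)

    def __gt__(self, other: Any) -> ListColumn:
        return self._compare(operator.gt, other)

    def __le__(self, other: Any) -> ListColumn:
        return self._compare(operator.le, other)

    def __lt__(self, other: Any) -> ListColumn:
        return self._compare(operator.lt, other)

    def __and__(self, other: ListColumn) -> ListColumn:
        # Element-wise logical AND of two boolean masks
        return ListColumn([a and b for a, b in zip(self.values, other.values)])


col = ListColumn([1, 2, 3, 4])
left = col.is_between(2, 4, "left").values
right = col.is_between(2, 4, "right").values
neither = col.is_between(2, 4, "none").values
both = col.is_between(2, 4, "both").values
```

The toy class defines only the comparison and `&` operators; `is_between` comes from the parent.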
Member Author

A nice side effect is these are now fixed in the docs (related #2858)

Member Author

We could fix Series.rename + Series.shape in the same way

Member

@FBruzzesi FBruzzesi Aug 12, 2025

I think it would be a really nice side effect. I know we aren't too invested in API completeness at the moment, but these changes are low-hanging fruit and might simplify the logic there considerably, without the need to handle additional special cases

Member Author

> We could fix Series.rename + Series.shape in the same way

Thinking back, I'm not super concerned about these two for now

@dangotbanned dangotbanned marked this pull request as ready for review August 11, 2025 11:55
Member

@FBruzzesi FBruzzesi left a comment

Thanks @dangotbanned - I am pretty much on board with these changes. I will wait for @MarcoGorelli's approval on this one though!

Just for context, are you considering this PR a requirement for #2962, or vice versa, or neither 😂?

@dangotbanned
Member Author

> Thanks @dangotbanned - I am pretty much on board with these changes. I will wait for @MarcoGorelli's approval on this one though!
> Just for context, are you considering this PR a requirement for #2962, or vice versa, or neither 😂?

Well we've done it this way round now regardless 😄

(refactor: Move is_close to CompliantColumn)

@MarcoGorelli
Member

> Thanks @dangotbanned - I am pretty much on board with these changes. I will wait for @MarcoGorelli's approval on this one though!

from a look this seems good, i'm running out of time for the day but if you're happy with it feel free to ship it, thanks both!

Member

@FBruzzesi FBruzzesi left a comment

Thanks again for this refactor @dangotbanned 🙏🏼

@dangotbanned dangotbanned merged commit 411e6a2 into main Aug 13, 2025
30 of 31 checks passed
@dangotbanned dangotbanned deleted the compliant-column branch August 13, 2025 19:36