SNOW-1524760: Fix Series.isin behavior#3973
Conversation
sfc-gh-helmeleegy
left a comment
There was a problem hiding this comment.
It seems that some query counts need to be fixed. But otherwise looks good to me. Thanks!
Done! We might actually be able to eliminate the joins with some changes to aggregation logic, and I'll look into it in a follow-up PR. |
|
|
||
|
|
||
| @sql_count_checker(query_count=3) | ||
| @sql_count_checker(query_count=3, join_count=2) |
There was a problem hiding this comment.
I'm a little surprised this changed; because you didn't need to change the join counts on other isin tests.
There was a problem hiding this comment.
It looks like the query generated in this test got significantly more complicated because of the added ARRAY_AGG operation, but I believe the previous version was incorrect for a lot of cases. The actual query text of the other isin tests have joins as well, but I guess they're not being parsed correctly by the SQL counter.
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-1524760
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
This PR fixes the behavior of
Series.isin(other_series), which ignores indices instead of joining on row/column labels. It also adds a fast path forSeries.isin(dataframe), which should always return false at every index.Per @sfc-gh-mvashishtha's investigation in the linked ticket: