
SNOW-2435290: Add support for groupby.agg/min/max/count/sum/mean/median/std/var in faster pandas#3908

Merged
sfc-gh-helmeleegy merged 6 commits into main from helmeleegy-SNOW-2435290
Oct 18, 2025

@sfc-gh-helmeleegy
Contributor

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-2435290

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
    • If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
  3. Please describe how your code solves the related issue.

    Add support for groupby.agg/min/max/count/sum/mean/median/std/var in faster pandas.
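For readers unfamiliar with the aggregations listed above, here is a minimal sketch of what each one computes per group. It is written against plain pandas rather than Snowpark pandas, and it does not show how faster pandas is enabled. The sample data and the `key`/`value` column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b", "b"],
        "value": [1, 3, 2, 4, 6],
    }
)

grouped = df.groupby("key")["value"]

# Each aggregation named in the PR title, computed one at a time.
names = ["min", "max", "count", "sum", "mean", "median", "std", "var"]
results = {name: getattr(grouped, name)() for name in names}

# groupby.agg computes several aggregations in a single call.
summary = grouped.agg(names)
print(summary)
```

Note that `std` and `var` in pandas default to the sample estimators (`ddof=1`), which matters when comparing results against a SQL backend.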

@sfc-gh-helmeleegy requested a review from a team as a code owner October 16, 2025 22:47
@sfc-gh-helmeleegy added the NO-PANDAS-CHANGEDOC-UPDATES label (This PR does not update Snowpark pandas docs) Oct 16, 2025
1 2
2 3
dtype: int64
dtype: int8
Contributor

Why did data type change here?

Contributor Author

@sfc-gh-helmeleegy Oct 17, 2025

I'm not 100% sure, but the SQL queries look a bit different in faster pandas compared to the original Snowpark pandas. The new order of SQL operations may have caused the server-side type system to infer a slightly different final type.

Contributor

Are you sure this is not related to the snowflake-connector upgrade? This feels like an arrow change.

Contributor

Contributor Author

I see. In this case, let's fix this in a separate PR since the root cause is not related to this PR.
Here is the separate fix PR: #3911
@sfc-gh-jkew @sfc-gh-joshi @sfc-gh-nkrishna
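
To make the dtype question in this thread concrete, here is a small sketch in plain pandas of two Series that hold the same values as the snippet under review but differ only in dtype. It is an illustration of why a strict comparison would flag the change, not a reproduction of the actual test or of the root cause discussed above.

```python
import pandas as pd
from pandas.testing import assert_series_equal

# Same index and values as the snippet under review; only the dtype differs.
expected = pd.Series([2, 3], index=[1, 2], dtype="int64")
actual = pd.Series([2, 3], index=[1, 2], dtype="int8")

# The values themselves compare equal element-wise.
assert (expected == actual).all()

# A strict comparison fails because the dtypes differ.
try:
    assert_series_equal(expected, actual)
    strict_match = True
except AssertionError:
    strict_match = False

# Relaxing the dtype check lets the comparison pass.
assert_series_equal(expected, actual, check_dtype=False)
print(strict_match)
```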

],
)
@sql_count_checker(query_count=6)
def test_groupby_agg(session, func):
Contributor

It seems like a lot of the faster pandas tests have pretty similar setup/assert/teardown code. Could you move some of that logic into a shared helper or fixture in the future?
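
One possible shape for such a shared helper is sketched below. `run_and_compare` and `candidate_factory` are hypothetical names that do not come from this PR or the Snowpark test suite, and the stand-in factory simply copies the native frame so the sketch runs with plain pandas alone.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def run_and_compare(operation, native_df, candidate_factory):
    """Apply `operation` to a native frame and to a candidate frame, then compare.

    `candidate_factory` is a hypothetical hook that builds the frame under test
    from the native one; real tests would substitute their own setup here.
    """
    expected = operation(native_df)
    actual = operation(candidate_factory(native_df))
    assert_frame_equal(actual, expected)
    return actual


# Illustrative data; a stand-in factory that just copies the native frame.
native = pd.DataFrame({"key": ["a", "a", "b"], "value": [1, 2, 3]})

for func in ["min", "max", "sum"]:
    result = run_and_compare(
        lambda df, f=func: df.groupby("key").agg(f),
        native,
        lambda df: df.copy(),
    )
```

In a pytest suite, a helper like this could be exposed through a fixture so that each test only supplies the operation and the input data.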

@sfc-gh-helmeleegy merged commit aad530a into main Oct 18, 2025
29 of 30 checks passed
@sfc-gh-helmeleegy deleted the helmeleegy-SNOW-2435290 branch October 18, 2025 04:58
@github-actions bot locked and limited conversation to collaborators Oct 18, 2025

Labels

NO-PANDAS-CHANGEDOC-UPDATES (This PR does not update Snowpark pandas docs)
snowpark-pandas

5 participants