Add support for running pandas queries with cudf.pandas enabled #148

Merged
ritchie46 merged 4 commits into pola-rs:main from vyasr:feat/pandas_queries on Sep 18, 2025

Conversation

@vyasr
Contributor

@vyasr vyasr commented Apr 23, 2025

This PR makes it possible to run the pandas queries with GPU acceleration analogous to the support for the Polars GPU engine. To support this, cudf is added to the requirements list (which means we should also be able to run the Polars GPU engine benchmarks with the virtual environment now).
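
For readers unfamiliar with cudf.pandas, the following is a minimal sketch of how the accelerator is typically switched on before pandas is imported. It is not the code added in this PR; the helper name `enable_cudf_pandas` is hypothetical, and the fallback to CPU pandas is an assumption made so the snippet runs on machines without cudf or a GPU.

```python
import importlib.util


def enable_cudf_pandas() -> bool:
    """Try to turn on cudf.pandas; return True only if it was installed.

    Hypothetical helper for illustration, not part of the benchmark repo.
    """
    # cudf is an optional dependency: skip quietly when it is not installed.
    if importlib.util.find_spec("cudf") is None:
        return False
    try:
        import cudf.pandas

        # install() must run before the first `import pandas` so that
        # later pandas imports are routed through the GPU-backed proxy.
        cudf.pandas.install()
    except Exception:
        # Assumption: any failure here (for example, no usable GPU)
        # means we fall back to regular CPU pandas.
        return False
    return True
```

Calling `enable_cudf_pandas()` at the top of a script, before pandas is imported, is one way to opt in. cudf also documents a command-line form, `python -m cudf.pandas script.py`, which needs no changes to the script itself.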

@ritchie46
Member

This needs a rebase.

@vyasr vyasr force-pushed the feat/pandas_queries branch 2 times, most recently from d8baf65 to 306e6fb on April 28, 2025 19:52
@vyasr
Contributor Author

vyasr commented Apr 28, 2025

Done. However, the last release of cudf has an upper bound on the supported Polars version, which bumps us back to 1.25 here. I don't know if that is compatible with the polars cloud bits that you recently added. If you prefer, I can revert the changes adding cudf to the environment, and we can keep relying on the *-no-env variants of the Makefile targets for the GPU benchmarks for now and revisit adding cudf to the environment at a later date.

Note that when we first added GPU benchmarks to this repo, cudf was not yet available on PyPI, only on NVIDIA's pip index, so there was an even stronger reason not to add it to the environment here. Now that we can get cudf from PyPI it is feasible to do this, with the main issue being whether the upper bounds that we impose for stability reasons are prohibitive for your use cases in this repo. Ideally we'd be able to relax that bound eventually, but I don't think we're quite comfortable enough to do that yet.
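
To make the constraint concrete, here is a rough sketch of the check an upper bound implies. The helper names are hypothetical, and the exact form of cudf's version specifier is not stated in this thread, so the comparison on major.minor is an assumption; only the 1.25 and 1.28 figures come from the discussion.

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.25.2' into (1, 25, 2); assumes purely numeric segments."""
    return tuple(int(part) for part in text.split("."))


def allowed_by_upper_bound(installed: str, max_series: str) -> bool:
    """Return True if `installed` is in the `max_series` series or older.

    Only major.minor is compared, so any patch release within the
    newest supported series is accepted (an assumption, see above).
    """
    return parse_version(installed)[:2] <= parse_version(max_series)[:2]


if __name__ == "__main__":
    # A Polars 1.28 install is rejected while the bound sits at 1.25.
    print(allowed_by_upper_bound("1.28.0", "1.25"))
    # A patch release inside the 1.25 series is still accepted.
    print(allowed_by_upper_bound("1.25.2", "1.25"))
```

In practice the resolver (pip or uv) enforces this when cudf is listed in the requirements, which is why adding it to the environment pins Polars for the whole repo.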

@vyasr
Contributor Author

vyasr commented May 24, 2025

The 25.06 release of cudf will support Polars 1.28, so perhaps the best option here is to wait for that release so that we don't have to change the supported Polars version here.

@vyasr vyasr force-pushed the feat/pandas_queries branch from 306e6fb to 7fd4cca on June 16, 2025 16:32
@vyasr
Contributor Author

vyasr commented Sep 15, 2025

@ritchie46 The open question on this PR is that cudf-polars currently places an upper bound on polars, so if cudf-polars is part of the environment, it caps the version of polars we can have installed until the next release. If that is OK, I can update this PR with latest main. Otherwise I can simplify this PR by removing the requirements changes and the run-pandas-gpu target (I'll just leave the *-no-env targets).

@ritchie46
Member

Otherwise I can simplify this PR by removing the requirements changes and the run-pandas-gpu target (I'll just leave the *-no-env targets).

I think I'd prefer that. Can you also rebase? I think that should satisfy mypy.

@vyasr vyasr force-pushed the feat/pandas_queries branch from 7fd4cca to 0a6c208 on September 17, 2025 21:48
@vyasr vyasr mentioned this pull request Sep 17, 2025
@vyasr
Contributor Author

vyasr commented Sep 18, 2025

I think I'd prefer that. Can you also rebase? I think that should satisfy mypy.

Both done. Unfortunately still seeing mypy errors. I opened #173 to resolve the outstanding issues with CI.

@ritchie46 ritchie46 merged commit 5492c9d into pola-rs:main Sep 18, 2025
2 of 3 checks passed
@vyasr vyasr deleted the feat/pandas_queries branch September 18, 2025 14:02