Open
Description
Hi there,
Just wondering if there's scope for a to_cudf-style method so that users can read Parquet files directly into GPU memory as a cudf DataFrame, without first loading them into a pandas DataFrame on the CPU. This would use the cudf.read_parquet function.
Happy to submit a Pull Request for this, but I'd like to discuss the implementation first: should it be a dedicated to_cudf method, or a keyword such as engine="cudf"? The keyword could be confusing, since cudf.read_parquet has its own engine argument that, like pandas, also accepts "pyarrow".
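To make the two options concrete, here is a rough sketch of how they could sit side by side. Everything except pandas.read_parquet and cudf.read_parquet is hypothetical: ParquetSource, resolve_backend and the _BACKENDS table are placeholder names for illustration, not this project's actual API.

```python
import importlib

# Hypothetical mapping from a user-facing engine name to the module that
# provides read_parquet. These names are illustrative only.
_BACKENDS = {
    "pandas": "pandas",  # CPU DataFrame
    "cudf": "cudf",      # GPU DataFrame
}


def resolve_backend(engine="pandas"):
    """Return the name of the module whose read_parquet handles `engine`."""
    try:
        return _BACKENDS[engine]
    except KeyError:
        raise ValueError(
            f"unknown engine {engine!r}; expected one of {sorted(_BACKENDS)}"
        ) from None


class ParquetSource:
    """Hypothetical stand-in for the project's Parquet reader class."""

    def __init__(self, path):
        self.path = path

    def read(self, engine="pandas", **kwargs):
        # Keyword option: one entry point, backend chosen by `engine`.
        # The backend is imported lazily, so cudf is only needed when requested.
        module = importlib.import_module(resolve_backend(engine))
        return module.read_parquet(self.path, **kwargs)

    def to_cudf(self, **kwargs):
        # Dedicated-method option: kwargs (e.g. columns=[...]) are passed
        # straight through to cudf.read_parquet.
        return self.read(engine="cudf", **kwargs)
```

With a dedicated to_cudf method, the engine keyword stays free to be forwarded to cudf.read_parquet itself, which sidesteps the naming clash mentioned above.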
One issue, though, is that cudf cannot read multi-file Parquet folders yet (see rapidsai/cudf#1688), only single binary Parquet files. This might get implemented in a future cudf release (v0.16?).
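Until then, a possible workaround is to read each file in the folder separately and concatenate the results on the GPU. This is only a sketch: list_parquet_files and read_parquet_folder_to_cudf are hypothetical helper names, and it assumes every file shares the same schema. It also does not reconstruct any partition columns that are encoded only in directory names.

```python
from pathlib import Path


def list_parquet_files(folder):
    """Return the *.parquet files under `folder`, sorted for a stable row order."""
    return sorted(Path(folder).rglob("*.parquet"))


def read_parquet_folder_to_cudf(folder, **kwargs):
    """Read each file onto the GPU, then concatenate into one cudf DataFrame."""
    import cudf  # imported lazily so the helper above works without a GPU

    files = list_parquet_files(folder)
    if not files:
        raise FileNotFoundError(f"no .parquet files found under {folder}")
    # kwargs (e.g. columns=[...]) are forwarded to cudf.read_parquet.
    frames = [cudf.read_parquet(str(f), **kwargs) for f in files]
    return cudf.concat(frames, ignore_index=True)
```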