Closed
Labels
Enhancement; IO Parquet (parquet, feather); Needs Discussion (requires discussion from core team before further action)
Description
Problem description
The `pandas.read_parquet` and `pandas.DataFrame.to_parquet` methods defer operation to either `pyarrow` (first priority) or `fastparquet`. The issue is that these libraries have slightly different interfaces:
For reading:
- `pyarrow` accepts `IOBase` and not `bytes`
- `fastparquet` accepts `bytes` and not `IOBase`
Pandas should support one or both and do the conversion automatically.
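To make the requested conversion concrete, here is a minimal sketch of the kind of normalization pandas could apply before handing the input to an engine. The helper names `as_file_like` and `as_bytes` are hypothetical and are not part of pandas, `pyarrow`, or `fastparquet`; the sketch uses only the standard library.

```python
import io


def as_file_like(source):
    """Return a binary file-like object for an engine that expects IOBase.

    Hypothetical helper: raw bytes are wrapped in BytesIO, streams pass through.
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        return io.BytesIO(source)
    if isinstance(source, io.IOBase):
        return source
    raise TypeError(f"unsupported source type: {type(source).__name__}")


def as_bytes(source):
    """Return raw bytes for an engine that expects bytes.

    Hypothetical helper: streams are read from their current position to the end.
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        return bytes(source)
    if isinstance(source, io.IOBase):
        return source.read()
    raise TypeError(f"unsupported source type: {type(source).__name__}")


raw = b"example payload"
stream = io.BytesIO(raw)

wrapped = as_file_like(raw)      # bytes -> BytesIO
same_stream = as_file_like(stream)  # stream is returned unchanged
round_trip = as_bytes(stream)    # stream -> bytes
```

With something like this in place, the dispatch code could call whichever helper matches the selected engine, so callers would not need to know which input type each backend prefers.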
For writing:
- `pyarrow` accepts `IOBase`
- `fastparquet` doesn't really support writing to an ephemeral buffer because the stream is closed when using the `open_with` argument (see "FR: Accept a file-like object in addition to a path in `fastparquet.write`", dask/fastparquet#408)
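The closed-stream problem can be illustrated without either library. In the sketch below, `fake_writer` is a hypothetical stand-in for a writer that obtains its output via an `open_with(path, mode)` callable and closes it when done; it is not the actual `fastparquet` code. `NonClosingBytesIO` is likewise a hypothetical workaround, shown only to make the failure mode and one possible fix visible.

```python
import io


class NonClosingBytesIO(io.BytesIO):
    """In-memory buffer whose close() is a no-op, so its contents
    survive a writer that closes the stream when it finishes."""

    def close(self):
        # Deliberately ignore close requests from the writer.
        pass

    def release(self):
        # Really close the buffer once the caller has read the data.
        super().close()


def fake_writer(path, open_with):
    """Hypothetical writer: opens the target via open_with, writes,
    and closes the stream on exit from the with-block."""
    with open_with(path, "wb") as f:
        f.write(b"PAR1")


# A plain BytesIO is closed by the writer, so the data is lost.
plain = io.BytesIO()
fake_writer("ignored", lambda path, mode: plain)
plain_closed = plain.closed

# The non-closing buffer stays open and can still be read afterwards.
keep = NonClosingBytesIO()
fake_writer("ignored", lambda path, mode: keep)
data = keep.getvalue()
keep.release()
```

Whether pandas should carry such a wrapper itself, or wait for the upstream feature request, is part of what this issue asks the core team to decide.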