@roelentless
Contributor

Tables with null columns, once encoded, failed to load in PyArrow/Polars with "ran out of field metadata", in both the stream and file formats.

The bug, for example: a schema defined 3 columns, but the record batch only wrote metadata for 2 of them because null columns were skipped. PyArrow expects the counts to match.

Fixed by writing metadata for all columns, including null ones (they just don't get any actual data buffers).
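To illustrate the mismatch, here is a minimal, self-contained sketch in plain Python. It is not the library's code or PyArrow's reader; `Column`, `batch_metadata`, and `read_nodes` are hypothetical names, and the model is simplified to one metadata entry (length, null count) per column plus a buffer count.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    type: str        # only "null" and "int32" are modeled in this sketch
    length: int
    null_count: int


# Buffers per type in this simplified model: a null column carries no
# buffers, an int32 column carries a validity bitmap and a data buffer.
BUFFER_COUNT = {"null": 0, "int32": 2}


def batch_metadata(columns, skip_null_columns):
    """Return (field nodes, total buffer count) for a record batch."""
    nodes = []
    buffers = 0
    for col in columns:
        if skip_null_columns and col.type == "null":
            continue  # old behavior: no metadata entry at all for null columns
        nodes.append((col.length, col.null_count))
        buffers += BUFFER_COUNT[col.type]  # adds 0 for null columns
    return nodes, buffers


def read_nodes(columns, nodes):
    """Consume exactly one field node per schema column, as a reader would."""
    remaining = iter(nodes)
    result = {}
    for col in columns:
        try:
            result[col.name] = next(remaining)
        except StopIteration:
            raise ValueError("ran out of field metadata") from None
    return result


schema = [
    Column("a", "int32", 3, 0),
    Column("b", "null", 3, 3),
    Column("c", "int32", 3, 0),
]

# Fixed behavior: three nodes for three columns, buffers unchanged.
fixed_nodes, fixed_buffers = batch_metadata(schema, skip_null_columns=False)
decoded = read_nodes(schema, fixed_nodes)

# Old behavior: only two nodes, so the reader fails on the third column.
buggy_nodes, buggy_buffers = batch_metadata(schema, skip_null_columns=True)
```

In this model both variants emit the same number of buffers; only the node count differs, which is why the fix does not add any data for null columns.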

Also added test data files (generated by PyArrow) and encode/decode tests that use them.

Also: awesome library to get going, thanks for kicking off a better JS lib for arrow.

@roelentless
Contributor Author

@jheer can you have a look? Thanks!

@jheer merged commit 7b9b5eb into uwdata:main on Oct 2, 2025
2 checks passed
@jheer
Member

jheer commented Oct 2, 2025

Looks great. Thanks @roelentless!
