@HeartLinked
Contributor

Introduces a new method, FileScanTask::ToArrow(), which returns a standard ArrowArrayStream struct (the Arrow C stream interface). This lets any consumer that understands the Arrow C stream interface read data from an iceberg-cpp scan task directly.

  • FileScanTask::ToArrow() is added. It takes the projected schema, filter, and FileIO as arguments to create and configure a file-format-specific reader (ParquetReader, etc.).

  • A new factory function, MakeArrowArrayStream, is introduced. It takes an internal C++ Reader instance and wraps it in an ArrowArrayStream, correctly managing state and resource lifecycle via the C interface callbacks (get_schema, get_next, release).
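
As a rough illustration of the callback wiring described above, here is a minimal sketch and not the code from this PR. The three struct layouts follow the Arrow C data and C stream interface specifications. Everything else is an assumption made for the example: the Reader interface and its int return codes, StreamState, the toy NullBatchReader, the CountRows consumer, and the exact signature given to MakeArrowArrayStream.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Struct layouts as given in the Arrow C data / C stream interface specs.
struct ArrowSchema {
  const char *format, *name, *metadata;
  int64_t flags, n_children;
  ArrowSchema** children;
  ArrowSchema* dictionary;
  void (*release)(ArrowSchema*);
  void* private_data;
};
struct ArrowArray {
  int64_t length, null_count, offset, n_buffers, n_children;
  const void** buffers;
  ArrowArray** children;
  ArrowArray* dictionary;
  void (*release)(ArrowArray*);
  void* private_data;
};
struct ArrowArrayStream {
  int (*get_schema)(ArrowArrayStream*, ArrowSchema* out);
  int (*get_next)(ArrowArrayStream*, ArrowArray* out);
  const char* (*get_last_error)(ArrowArrayStream*);
  void (*release)(ArrowArrayStream*);
  void* private_data;
};

// Hypothetical reader interface (0 = success); Next() leaves out->release null at end.
struct Reader {
  virtual ~Reader() = default;
  virtual int Schema(ArrowSchema* out) = 0;
  virtual int Next(ArrowArray* out) = 0;
};

// State owned by the stream; freed exactly once in Release().
struct StreamState { std::unique_ptr<Reader> reader; std::string last_error; };
StreamState* State(ArrowArrayStream* s) { return static_cast<StreamState*>(s->private_data); }

int GetSchema(ArrowArrayStream* s, ArrowSchema* out) {
  int rc = State(s)->reader->Schema(out);
  if (rc != 0) State(s)->last_error = "failed to export schema";
  return rc;
}
int GetNext(ArrowArrayStream* s, ArrowArray* out) {
  out->release = nullptr;  // a released array signals end of stream
  int rc = State(s)->reader->Next(out);
  if (rc != 0) State(s)->last_error = "failed to read next batch";
  return rc;
}
const char* GetLastError(ArrowArrayStream* s) {
  return State(s)->last_error.empty() ? nullptr : State(s)->last_error.c_str();
}
void Release(ArrowArrayStream* s) {
  if (s->release == nullptr) return;  // already released
  delete State(s);                    // destroys the reader as well
  s->private_data = nullptr;
  s->release = nullptr;               // marks the stream as released
}

// Assumed signature: takes ownership of the reader and fills in `out`.
void MakeArrowArrayStream(std::unique_ptr<Reader> reader, ArrowArrayStream* out) {
  assert(reader != nullptr && out != nullptr);
  out->get_schema = GetSchema;
  out->get_next = GetNext;
  out->get_last_error = GetLastError;
  out->release = Release;
  out->private_data = new StreamState{std::move(reader), std::string()};
}

// Toy reader for illustration only: yields `batches` null-type arrays of 3 rows.
struct NullBatchReader : Reader {
  int remaining;
  explicit NullBatchReader(int batches) : remaining(batches) {}
  int Schema(ArrowSchema* out) override {
    *out = ArrowSchema{};
    out->format = "n";  // Arrow format string for the null type
    out->release = [](ArrowSchema* s) { s->release = nullptr; };
    return 0;
  }
  int Next(ArrowArray* out) override {
    if (remaining == 0) return 0;  // end of stream: out stays released
    --remaining;
    *out = ArrowArray{};
    out->length = out->null_count = 3;
    out->release = [](ArrowArray* a) { a->release = nullptr; };
    return 0;
  }
};

// Consumer side: drain the stream and return the total row count.
int64_t CountRows(ArrowArrayStream* stream) {
  int64_t rows = 0;
  ArrowArray batch;
  while (stream->get_next(stream, &batch) == 0 && batch.release != nullptr) {
    rows += batch.length;
    batch.release(&batch);  // the consumer releases each batch it receives
  }
  return rows;
}
```

In this sketch the stream owns the reader through private_data, so a consumer only needs to call release once to free everything. The consumer still has to release each schema and array it receives, as the C interface requires.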

@HeartLinked HeartLinked marked this pull request as ready for review September 5, 2025 10:22
@wgtmac
Member

wgtmac commented Sep 9, 2025

Please also change the title to feat: add data from FileScanTask as Arrow C Stream

@HeartLinked HeartLinked changed the title feat: export scan tasks as Arrow C ABI streams feat: add data from FileScanTask as Arrow C Stream Sep 9, 2025
@wgtmac
Member

wgtmac commented Sep 9, 2025

Please also change the title to feat: add data from FileScanTask as Arrow C Stream

My bad. There is a typo. It should be feat: read data from FileScanTask as Arrow C Stream

@HeartLinked HeartLinked changed the title feat: add data from FileScanTask as Arrow C Stream feat: read data from FileScanTask as Arrow C Stream Sep 9, 2025
@Fokko Fokko merged commit 9b69e45 into apache:main Sep 15, 2025
7 checks passed