Skip to content

Conversation

@HuaHuaY
Copy link
Contributor

@HuaHuaY HuaHuaY commented Aug 28, 2025

Add parquet writer factory and basic parquet writer without metrics.

Copilot AI review requested due to automatic review settings August 28, 2025 03:35
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a basic ParquetWriter class for writing ArrowArray data to Parquet files. The implementation follows the Writer interface pattern and uses Apache Arrow/Parquet libraries for the underlying functionality.

  • Adds ParquetWriter class with PIMPL pattern for encapsulation
  • Implements core writer operations: Open, Write, Close with Arrow/Parquet integration
  • Provides placeholder implementations for metrics, length, and split_offsets methods

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/iceberg/parquet/parquet_writer.h Header file defining the ParquetWriter class interface with PIMPL pattern
src/iceberg/parquet/parquet_writer.cc Implementation of ParquetWriter with Arrow/Parquet integration and factory registration
src/iceberg/CMakeLists.txt Adds parquet_writer.cc to the build configuration

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@HuaHuaY HuaHuaY force-pushed the basic_parquet_writer branch 2 times, most recently from 602b681 to 978dfcf Compare September 1, 2025 06:23
@HuaHuaY HuaHuaY changed the title feat: implement basic parquet writer feat: implement basic parquet writer and add roundtrip tests Sep 1, 2025
@HuaHuaY HuaHuaY force-pushed the basic_parquet_writer branch 2 times, most recently from 3039e80 to 90039fb Compare September 1, 2025 11:14
@HuaHuaY HuaHuaY force-pushed the basic_parquet_writer branch from 1f9507a to 02a05cc Compare September 2, 2025 02:30
Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some nits.

@zeroshade zeroshade merged commit 1002270 into apache:main Sep 3, 2025
7 checks passed
@HuaHuaY HuaHuaY deleted the basic_parquet_writer branch September 4, 2025 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants