lovisaberggren (Collaborator) commented Jan 28, 2025

Proposed changes

The file previously created by IPA metrics collection was not a valid Parquet file; it was an Arrow file. This PR adds parquet-wasm, which converts an in-memory Arrow table to Parquet before it is written to file. The file is compressed with gzip.

Tested locally and confirmed that both the locally created file and the file dumped to S3 are readable as Parquet.
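
For readers who want a quick way to tell the two formats apart, here is a minimal sketch of a magic-byte check. It is not the verification used in this PR. It relies only on the published format markers: a Parquet file starts and ends with `PAR1`, and an Arrow IPC file starts with `ARROW1`. The helper names `detectFileKind`, `ascii`, and `toBytes` are hypothetical. The check is a sanity test only, since it does not parse the Parquet footer, and an Arrow IPC stream (as opposed to an IPC file) carries no such marker and would be reported as `unknown`.

```typescript
// Magic markers defined by the Parquet and Arrow IPC file formats.
const PARQUET_MAGIC = "PAR1";
const ARROW_MAGIC = "ARROW1";

type FileKind = "parquet" | "arrow" | "unknown";

// Decode bytes[start, end) as ASCII without any runtime-specific APIs.
function ascii(bytes: Uint8Array, start: number, end: number): string {
  let out = "";
  for (let i = start; i < end; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// Hypothetical helper: classify a buffer by its leading and trailing markers.
function detectFileKind(bytes: Uint8Array): FileKind {
  const n = bytes.length;
  // Smallest possible Parquet layout: magic (4) + footer length (4) + magic (4).
  if (
    n >= 12 &&
    ascii(bytes, 0, 4) === PARQUET_MAGIC &&
    ascii(bytes, n - 4, n) === PARQUET_MAGIC
  ) {
    return "parquet";
  }
  if (n >= 6 && ascii(bytes, 0, 6) === ARROW_MAGIC) {
    return "arrow";
  }
  return "unknown";
}

// Synthetic buffers for illustration only; these are not real files.
const toBytes = (s: string): Uint8Array =>
  Uint8Array.from(s.split(""), (c) => c.charCodeAt(0));

const parquetLike = toBytes("PAR1" + "\0\0\0\0" + "PAR1");
const arrowLike = toBytes("ARROW1" + "\0\0");

console.log(detectFileKind(parquetLike));
console.log(detectFileKind(arrowLike));
```

In practice the bytes would come from reading the generated file from disk or from the S3 object. A full check would still open the file with a Parquet reader, as described above.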

Jira ticket: CLOUDP-297242

lovisaberggren marked this pull request as ready for review January 28, 2025 16:21
lovisaberggren requested a review from a team as a code owner January 28, 2025 16:22

yelizhenden-mdb (Collaborator) left a comment

LGTM ❤️

lovisaberggren merged commit 4853a2d into main Jan 28, 2025
13 checks passed
lovisaberggren deleted the CLOUDP-297242 branch January 28, 2025 16:49