snowflake: use new parquet-go parallelized row group construction #3940

Merged
rockwotj merged 3 commits into redpanda-data:main from rockwotj:snowpar
Jan 29, 2026

@rockwotj
Contributor

@rockwotj rockwotj commented Jan 29, 2026

Based on parquet-go/parquet-go#339

| Test | Old build_µs | New build_µs | Speedup | Old rows/sec | New rows/sec | Speedup |
|---|---|---|---|---|---|---|
| 1K_rows_1_worker | 7,109 | 2,418 | 2.9x | 142K | 500K | 3.5x |
| 1K_rows_2_workers | 2,826 | 1,448 | 2.0x | 500K | 1,000K | 2.0x |
| 1K_rows_4_workers | 3,134 | 1,134 | 2.8x | 333K | 1,000K | 3.0x |
| 10K_rows_1_worker | 31,581 | 20,969 | 1.5x | 323K | 500K | 1.5x |
| 10K_rows_2_workers | 18,549 | 14,387 | 1.3x | 556K | 714K | 1.3x |
| 10K_rows_4_workers | 15,873 | 8,178 | 1.9x | 667K | 1,250K | 1.9x |
| 10K_rows_8_workers | 23,504 | 8,325 | 2.8x | 435K | 1,250K | 2.9x |
| 50K_rows_1_worker | 103,469 | 103,873 | ~1.0x | 485K | 485K | ~1.0x |
| 50K_rows_2_workers | 71,121 | 62,706 | 1.1x | 704K | 806K | 1.1x |
| 50K_rows_4_workers | 57,291 | 38,316 | 1.5x | 877K | 1,316K | 1.5x |
| 50K_rows_8_workers | 51,514 | 16,406 | 3.1x | 980K | 3,125K | 3.2x |
| 100K_rows_1_worker | 208,778 | 203,142 | ~1.0x | 481K | 493K | ~1.0x |
| 100K_rows_4_workers | 101,990 | 75,440 | 1.4x | 990K | 1,333K | 1.3x |
| 100K_rows_8_workers | 76,764 | 45,790 | 1.7x | 1,316K | 2,222K | 1.7x |
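
For readers checking the table, the build-time speedup column appears to be the old build time divided by the new one, rounded to one decimal place. The sketch below reproduces that arithmetic for two rows; the `speedup` helper is hypothetical and is not part of this PR or of parquet-go.

```go
package main

import (
	"fmt"
	"math"
)

// speedup is a hypothetical helper (not from the PR): it returns how many
// times faster the new build is than the old one, rounded to one decimal
// place to match the precision used in the table.
func speedup(oldMicros, newMicros float64) float64 {
	return math.Round(oldMicros/newMicros*10) / 10
}

func main() {
	// build_µs values copied from the benchmark table above.
	fmt.Printf("1K_rows_1_worker: %.1fx\n", speedup(7109, 2418))
	fmt.Printf("50K_rows_8_workers: %.1fx\n", speedup(51514, 16406))
}
```

The same ratio applied to the rows/sec columns (new divided by old) gives the second speedup column, which is why the two speedup figures track each other closely in most rows.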

@mmatczuk
Collaborator

Cool

@rockwotj
Contributor Author

Pushed a fix for the parquet-go changes that came with upgrading the library.

@rockwotj rockwotj merged commit 27e03ce into redpanda-data:main Jan 29, 2026
5 checks passed
@rockwotj rockwotj deleted the snowpar branch January 29, 2026 15:39