Commit b9a3b9f
Record sort order when writing Parquet with WITH ORDER (apache#19595)
## Which issue does this PR close?
Part of apache#19433
## Rationale for this change
When writing data to a table created with `CREATE EXTERNAL TABLE ...
WITH ORDER`, the sorting columns should be recorded in the Parquet
file's row group metadata. This allows downstream readers to know the
data is sorted and potentially skip sorting operations.
## What changes are included in this PR?
- Add `sort_expr_to_sorting_column()` and
`lex_ordering_to_sorting_columns()` functions in `metadata.rs` to
convert DataFusion ordering to Parquet `SortingColumn`
- Add `sorting_columns` field to `ParquetSink` with
`with_sorting_columns()` builder method
- Update `create_writer_physical_plan()` to pass order requirements to
`ParquetSink`
- Update `create_writer_props()` to set sorting columns on
`WriterProperties`
- Add test verifying `sorting_columns` metadata is written correctly
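
The conversion in the first bullet can be pictured with a small, std-only sketch. `SortTarget`, `SortKey`, and `SortingColumnSketch` are hypothetical stand-ins, not DataFusion or Parquet types; the last one only mirrors the three fields of Parquet's `SortingColumn` (`column_idx`, `descending`, `nulls_first`). The function names follow the PR, but the bodies are an assumption. In particular, rejecting sort expressions that are not plain column references is a choice made for this illustration, since the description does not say how the merged code treats them.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for what a sort expression refers to.
#[derive(Debug, Clone, PartialEq)]
pub enum SortTarget {
    /// A plain reference to a column by its index in the schema.
    Column(usize),
    /// Any other expression (for example `a + 1`), which has no column index.
    Other(String),
}

/// Hypothetical stand-in for one entry of a table's declared ordering.
#[derive(Debug, Clone, PartialEq)]
pub struct SortKey {
    pub target: SortTarget,
    pub descending: bool,
    pub nulls_first: bool,
}

/// Stand-in mirroring the fields of Parquet's `SortingColumn`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SortingColumnSketch {
    pub column_idx: i32,
    pub descending: bool,
    pub nulls_first: bool,
}

/// Convert one sort key. Assumption: only plain column references can be recorded.
pub fn sort_expr_to_sorting_column(key: &SortKey) -> Result<SortingColumnSketch, String> {
    match &key.target {
        SortTarget::Column(idx) => {
            // Parquet stores the column index as a 32-bit signed integer.
            let column_idx = i32::try_from(*idx)
                .map_err(|_| format!("column index {} does not fit in i32", idx))?;
            Ok(SortingColumnSketch {
                column_idx,
                descending: key.descending,
                nulls_first: key.nulls_first,
            })
        }
        SortTarget::Other(expr) => {
            Err(format!("cannot record non-column sort expression: {}", expr))
        }
    }
}

/// Convert a whole lexicographic ordering, stopping at the first error.
pub fn lex_ordering_to_sorting_columns(
    ordering: &[SortKey],
) -> Result<Vec<SortingColumnSketch>, String> {
    ordering.iter().map(sort_expr_to_sorting_column).collect()
}
```

The resulting list keeps the order of the declared sort keys, which matters because the ordering is lexicographic: the first entry is the primary sort key.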
## Are these changes tested?
Yes, added `test_create_table_with_order_writes_sorting_columns` that:
1. Creates an external table with `WITH ORDER (a ASC NULLS FIRST, b DESC
NULLS LAST)`
2. Inserts data
3. Reads the Parquet file and verifies the `sorting_columns` metadata
matches the expected order
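
The check in step 3 can be sketched without the Parquet reader. The snippet below is an illustration, not the test added in the PR: `SortingColumnSketch` is a hypothetical stand-in for Parquet's `SortingColumn`, each row group's metadata is modeled as an `Option<Vec<...>>` (absent when no ordering was recorded), and the column indexes assume `a` is column 0 and `b` is column 1 in the table schema.

```rust
/// Stand-in mirroring the fields of Parquet's `SortingColumn`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SortingColumnSketch {
    pub column_idx: i32,
    pub descending: bool,
    pub nulls_first: bool,
}

/// Expected metadata for `WITH ORDER (a ASC NULLS FIRST, b DESC NULLS LAST)`.
/// Assumption: `a` is column 0 and `b` is column 1.
pub fn expected_sorting_columns() -> Vec<SortingColumnSketch> {
    vec![
        // a ASC NULLS FIRST
        SortingColumnSketch { column_idx: 0, descending: false, nulls_first: true },
        // b DESC NULLS LAST
        SortingColumnSketch { column_idx: 1, descending: true, nulls_first: false },
    ]
}

/// Returns true only if there is at least one row group and every row group
/// records exactly the expected sorting columns, in the same order.
pub fn all_row_groups_match(
    row_groups: &[Option<Vec<SortingColumnSketch>>],
    expected: &[SortingColumnSketch],
) -> bool {
    !row_groups.is_empty()
        && row_groups.iter().all(|rg| rg.as_deref() == Some(expected))
}
```

Comparing the full list rather than individual fields also catches a swapped key order, which would describe a different lexicographic ordering.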
## Are there any user-facing changes?
No user-facing API changes. Parquet files written via `INSERT INTO` or
`COPY` for tables with `WITH ORDER` will now contain `sorting_columns`
metadata in each row group's metadata.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 0cf45ca commit b9a3b9f
File tree: 4 files changed, +185 -4 lines changed
- datafusion
  - core/tests/parquet
  - datasource-parquet/src