Commit fd77ec6
[SPARK-53291][SQL] Fix nullability for value column
### What changes were proposed in this pull request?
For shredded Variant, we currently always set the `value` column to be nullable. But when there is no corresponding `typed_value`, and the value doesn't represent an object field (where null implies missing from the object), the `value` is never null, and we can set the column to be required.
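The rule described above can be sketched as a small predicate. This is only an illustrative model of the decision, not the code changed in this commit: `ValueNullabilitySketch`, `ShreddedGroup`, `hasTypedValue`, `isObjectField`, and `valueIsNullable` are hypothetical names introduced here.

```scala
// Hypothetical sketch of the nullability rule from the commit message.
// None of these names come from the Spark code base.
object ValueNullabilitySketch {

  /** Simplified description of one shredded group containing a `value` column. */
  final case class ShreddedGroup(hasTypedValue: Boolean, isObjectField: Boolean)

  /**
   * `value` has to stay nullable when a null can carry meaning:
   *  - a sibling `typed_value` exists, so `value` may be null for shredded data, or
   *  - the group is an object field, where null implies the field is missing.
   * In every other case `value` is never null and can be declared required.
   */
  def valueIsNullable(g: ShreddedGroup): Boolean =
    g.hasTypedValue || g.isObjectField

  def main(args: Array[String]): Unit = {
    val cases = Seq(
      "no typed_value, not an object field" ->
        ShreddedGroup(hasTypedValue = false, isObjectField = false),
      "typed_value present, not an object field" ->
        ShreddedGroup(hasTypedValue = true, isObjectField = false),
      "no typed_value, object field" ->
        ShreddedGroup(hasTypedValue = false, isObjectField = true)
    )
    // Print the Parquet repetition each case would get under this model.
    cases.foreach { case (label, g) =>
      val repetition = if (valueIsNullable(g)) "optional" else "required"
      println(s"$label: $repetition value")
    }
  }
}
```

Under this model only the first case yields a required `value`. That is the situation the commit message describes as previously being written as nullable.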
### Why are the changes needed?
This shouldn't affect results as read by Spark, but the unnecessarily nullable column may make the Parquet file marginally larger. The [spec](https://github.com/apache/parquet-format/blob/master/VariantShredding.md) wording also indicates that `value` must be required in these situations, so a strict reader could reject the schema as it is currently produced.
### Does this PR introduce _any_ user-facing change?
The Parquet schema written for shredded Variant may change slightly: `value` is now required in the cases described above.
### How was this patch tested?
Unit test extended to cover this case.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52043 from cashmand/fix_nullability.
Authored-by: cashmand <david.cashman@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 923d70f commit fd77ec6
2 files changed: +48 -7 lines changed
- sql/core/src
  - main/scala/org/apache/spark/sql/execution/datasources/parquet
  - test/scala/org/apache/spark/sql
Lines changed: 9 additions & 5 deletions
(Diff body not captured. Hunks cover new lines 473-487 and 491-507: original lines 476, 482, 492, 498, and 500 removed; new lines 476-478, 484, 494, 500-502, and 504 added.)
Lines changed: 39 additions & 2 deletions
(Diff body not captured. Original lines 91-92 removed; new lines 80-85, 95-103, 106-110, 206-207, 233-237, 258-260, and 285-293 added.)