Commit b5840e1
[SPARK-52930][CONNECT] Use DataType.Array/Map for Array/Map Literals
### What changes were proposed in this pull request?
This PR begins the transition to `DataType.Array` and `DataType.Map` for array and map literals in the Spark Connect codebase.
While the Spark Connect server supports both the new and the old data type fields, this change sets the new fields only in `ColumnNodeToProtoConverter` for the Spark Connect Scala client. All other components (e.g., ML, Python) still use the old data type fields, because literal values appear not only in requests but also in responses, which makes compatibility hard to maintain: clients on older versions may not recognize the new fields in a response. Deprecating the old fields and moving to the new ones therefore requires a gradual migration.
The key changes include:
**Protocol Buffer Updates:**
- Modified `expressions.proto` to add new `data_type` fields for `Array` and `Map` messages
- Deprecated existing `element_type`, `key_type`, and `value_type` fields in favor of the unified `data_type` approach
- Updated generated protocol buffer files (`expressions_pb2.py`, `expressions_pb2.pyi`) to reflect these changes
**Core Implementation Changes:**
- Enhanced `LiteralValueProtoConverter.scala` with new internal method `toLiteralProtoBuilderInternal` that accepts `ToLiteralProtoOptions`
- Updated `LiteralExpressionProtoConverter.scala` to support inference of array and map data types
- Modified `columnNodeSupport.scala` to use the new `toLiteralProtoBuilderWithOptions` method with `useDeprecatedDataTypeFields` set to `false`
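As a rough illustration of the dual-path behavior described above, the sketch below models an array literal as a plain Python dict. It is not the Spark Connect implementation: the functions `to_array_literal` and `read_array_type` are hypothetical, the dict layout only mimics the proto field names mentioned in this PR, and the fallback nullability used for the deprecated field is an assumption made for the example.

```python
from typing import Any, Dict, List, Tuple


def to_array_literal(
    elements: List[Any],
    element_type: str,
    contains_null: bool,
    use_deprecated_data_type_fields: bool = True,
) -> Dict[str, Any]:
    """Hypothetical writer: fills either the deprecated field or the unified one."""
    msg: Dict[str, Any] = {"elements": list(elements)}
    if use_deprecated_data_type_fields:
        # Old layout: only the element type is recorded.
        msg["element_type"] = element_type
    else:
        # New layout: the full array type, including nullability, in one field.
        msg["data_type"] = {
            "array": {"element_type": element_type, "contains_null": contains_null}
        }
    return msg


def read_array_type(msg: Dict[str, Any]) -> Tuple[str, bool]:
    """Hypothetical reader that accepts both layouts, preferring the new field."""
    if "data_type" in msg:
        arr = msg["data_type"]["array"]
        return arr["element_type"], arr["contains_null"]
    # Assumption for this sketch: with the old field, nullability is unknown,
    # so the reader falls back to "nullable".
    return msg["element_type"], True


new_style = to_array_literal([1, 2], "integer", False, use_deprecated_data_type_fields=False)
old_style = to_array_literal([1, 2], "integer", False)
print(read_array_type(new_style))
print(read_array_type(old_style))
```

A reader that accepts both layouts lets writers switch one at a time, which is the kind of gradual migration the description calls for.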
### Why are the changes needed?
The changes are needed to improve Spark's data type handling for array and map literals:
- **Nullability of Array/Map literals is now included in `DataType.Array`/`DataType.Map`**: Nullability information is captured and handled within the data type structure itself.
- **Better support for type inference by keeping all type information in one field**: With all type information consolidated in a single field, data types for complex data structures are easier to infer.
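The nullability point can be made concrete with a small, self-contained model. The classes below are hypothetical stand-ins, not the generated protobuf classes; they only assume that the deprecated layout stores an element type next to the values, while the unified layout stores a full array type that carries a `contains_null` flag.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ArrayType:
    """Stand-in for an array data type that records element nullability."""
    element_type: str
    contains_null: bool


@dataclass(frozen=True)
class LegacyArrayLiteral:
    """Stand-in for the deprecated layout: element type only."""
    element_type: str
    elements: Tuple[int, ...]


@dataclass(frozen=True)
class ArrayLiteral:
    """Stand-in for the unified layout: one field with the whole type."""
    data_type: ArrayType
    elements: Tuple[int, ...]


def encode_legacy(array_type: ArrayType, elements: Iterable[int]) -> LegacyArrayLiteral:
    # contains_null has nowhere to go in this layout, so it is dropped.
    return LegacyArrayLiteral(array_type.element_type, tuple(elements))


def encode_unified(array_type: ArrayType, elements: Iterable[int]) -> ArrayLiteral:
    return ArrayLiteral(array_type, tuple(elements))


nullable = ArrayType("integer", contains_null=True)
non_nullable = ArrayType("integer", contains_null=False)

# The legacy encodings are indistinguishable; the unified ones are not.
print(encode_legacy(nullable, [1, 2, 3]) == encode_legacy(non_nullable, [1, 2, 3]))
print(encode_unified(nullable, [1, 2, 3]) == encode_unified(non_nullable, [1, 2, 3]))
```

The same reasoning would apply to map literals, where value nullability is the information at stake.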
### Does this PR introduce _any_ user-facing change?
Yes. Previously, the nullability of arrays and map values created with `typedlit` was not preserved (which I believe was a bug). It is now preserved. See the changes in `ClientE2ETestSuite` for details.
### How was this patch tested?
`build/sbt "connect/testOnly *LiteralExpressionProtoConverterSuite"`
`build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite -- -z SPARK-52930"`
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor 1.4.5
Closes #51653 from heyihong/SPARK-52930.
Authored-by: Yihong He <heyihong.cn@gmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
1 parent 79a0ca7 commit b5840e1
File tree
14 files changed (+1102, -570 lines changed)
- python/pyspark/sql/connect/proto
- sql/connect
  - client/jvm/src/test/scala/org/apache/spark/sql/connect
  - common/src
    - main
      - protobuf/spark/connect
      - scala/org/apache/spark/sql/connect
        - common
    - test/resources/query-tests/queries
  - server/src
    - main/scala/org/apache/spark/sql/connect/planner
    - test/scala/org/apache/spark/sql/connect/planner