Commit d52b8e2
### What changes were proposed in this pull request?
When creating a `DataFrame` from Python using `spark.createDataFrame`, infer the type of any `VariantVal` objects as `VariantType`. This is implemented by adding a case mapping `VariantVal` to `VariantType` in the `pyspark.sql.types._infer_type` function.
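As a rough illustration of what "adding a case" means here, the sketch below uses a stand-in `VariantVal` class and a hypothetical, heavily simplified `infer_type_name` function. Neither is PySpark's actual code; the real `_infer_type` returns `DataType` instances and handles many more cases. The point is only that the `VariantVal` check is an explicit branch in the type dispatch.

```python
class VariantVal:
    """Stand-in for pyspark.sql.types.VariantVal (illustration only)."""

    def __init__(self, value: bytes, metadata: bytes):
        self.value = value
        self.metadata = metadata


def infer_type_name(obj) -> str:
    """Hypothetical, simplified analogue of a type-inference dispatcher."""
    if obj is None:
        return "void"
    # The kind of case this change adds: recognise VariantVal explicitly,
    # before any generic handling of arbitrary Python objects.
    if isinstance(obj, VariantVal):
        return "variant"
    if isinstance(obj, bool):  # bool must be tested before int
        return "boolean"
    if isinstance(obj, int):
        return "bigint"
    if isinstance(obj, (bytes, bytearray)):
        return "binary"
    if isinstance(obj, str):
        return "string"
    raise TypeError(f"not supported type: {type(obj)}")
```

With this branch in place, `infer_type_name(VariantVal(b"\x0c\x01", b"\x01\x00\x00"))` yields `"variant"`; the byte strings are arbitrary placeholder payloads, not a claim about the variant encoding.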
### Why are the changes needed?
Currently, when a `DataFrame` is created from data that includes locally instantiated `VariantVal` objects in Python, their type is inferred as `struct<metadata:binary,value:binary>` rather than `VariantType`. This causes unintended behavior when creating a `DataFrame` locally and in situations such as `df.rdd.map(...).toDF`, which calls `createDataFrame` under the hood. The bug occurs only when the schema of the `DataFrame` is not passed explicitly.
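One plausible way to see where `struct<metadata:binary,value:binary>` comes from is sketched below. It assumes that, without a dedicated case, inference falls back to treating an arbitrary object's attributes as struct fields sorted by name, which is consistent with the field order reported above. The `VariantVal` class and the `infer_struct_from_object` helper are stand-ins written for this illustration, not PySpark code.

```python
class VariantVal:
    """Stand-in for pyspark.sql.types.VariantVal (illustration only)."""

    def __init__(self, value: bytes, metadata: bytes):
        self.value = value
        self.metadata = metadata


def simple_type_name(obj) -> str:
    """Map a few Python values to type names; deliberately incomplete."""
    if isinstance(obj, (bytes, bytearray)):
        return "binary"
    if isinstance(obj, str):
        return "string"
    raise TypeError(f"not supported type: {type(obj)}")


def infer_struct_from_object(obj) -> str:
    """Assumed generic fallback: one struct field per attribute, sorted by name."""
    fields = sorted(vars(obj).items())
    body = ",".join(f"{name}:{simple_type_name(val)}" for name, val in fields)
    return f"struct<{body}>"


v = VariantVal(value=b"\x0c\x01", metadata=b"\x01\x00\x00")  # placeholder bytes
print(infer_struct_from_object(v))  # struct<metadata:binary,value:binary>
```

Under that assumption, the object is described by its internal layout (two binary fields) instead of being recognised as a variant, which is the mismatch this change removes.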
### Does this PR introduce _any_ user-facing change?
Yes, fixes the bug described above.
### How was this patch tested?
Added a test in `python/pyspark/sql/tests/test_types.py` that checks that the inferred type is `VariantType` and that the `VariantVal` has the correct `value` and `metadata` after inference.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #51065 from austinrwarner/SPARK-52355.
Authored-by: Austin Warner <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
1 parent: e9a822b
2 files changed: 21 additions, 0 deletions. The first file gains 19 lines (new lines 480-498); the second gains 2 lines (new lines 2310-2311).