Commit c863717
[SPARK-53358] Improve arrow Python UDTF output type mismatch error message
### What changes were proposed in this pull request?
This PR updates the error message raised when the output type of an Arrow Python UDTF does not match the required type, making it more user-friendly.
### Why are the changes needed?
Improve error message to make it more actionable.
Before this change:
```
pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_ARROW_TYPE_CONVERSION_ERROR] Cannot convert the output value of the input '[
0
]' with type 'struct<x:int>' to the specified return type of the column: 'struct<x: int32>'. Please check if the data types match and try again.
```
After this change:
```
PyArrow UDTF must return an iterator of pyarrow.Table or pyarrow.RecordBatch objects.
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing unit tests.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #52103 from allisonwang-db/spark-53358-arrow-udtf-err-msg.
Authored-by: Allison Wang <allison.wang@databricks.com>
Signed-off-by: Allison Wang <allison.wang@databricks.com>
1 parent dab3464 commit c863717
2 files changed, +2 -6 lines changed