[SPARK-30091][SQL][PYTHON] Document mergeSchema option directly in the PySpark Parquet APIs
### What changes were proposed in this pull request?
This change properly documents the `mergeSchema` option directly in the Python APIs for reading Parquet data.
### Why are the changes needed?
The docstring for `DataFrameReader.parquet()` mentions `mergeSchema`, but the method signature doesn't expose it as a parameter. It seems like a simple oversight.
Before this PR, you'd have to do this to use `mergeSchema`:
```python
spark.read.option('mergeSchema', True).parquet('test-parquet').show()
```
After this PR, you can use the option as (I believe) it was intended to be used:
```python
spark.read.parquet('test-parquet', mergeSchema=True).show()
```
### Does this PR introduce any user-facing change?
Yes, this PR changes the signatures of `DataFrameReader.parquet()` and `DataStreamReader.parquet()` to match their docstrings.
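As a rough illustration of what "match their docstrings" means here, the sketch below compares the parameters named in a docstring against the ones a function actually accepts. `parquet_before`, `parquet_after`, and `params_missing_from_signature` are hypothetical stand-ins written for this example; they are not the PySpark methods and not part of this patch.

```python
import inspect
import re


def parquet_before(*paths):
    """Hypothetical stand-in for the old signature.

    :param paths: one or more input paths
    :param mergeSchema: whether to merge schemas collected from all files
    """
    return paths


def parquet_after(*paths, mergeSchema=None):
    """Hypothetical stand-in for the new signature.

    :param paths: one or more input paths
    :param mergeSchema: whether to merge schemas collected from all files
    """
    return paths, mergeSchema


def params_missing_from_signature(func):
    # Names documented with ":param name:" in the docstring.
    documented = set(re.findall(r":param (\w+):", inspect.getdoc(func) or ""))
    # Names the function actually accepts.
    accepted = set(inspect.signature(func).parameters)
    return documented - accepted


missing_before = params_missing_from_signature(parquet_before)
missing_after = params_missing_from_signature(parquet_after)
print(missing_before, missing_after)
```

In this toy model the old signature leaves `mergeSchema` documented but not accepted, while the new one leaves nothing missing.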
### How was this patch tested?
Testing the `mergeSchema` option directly seems to be left to the Scala side of the codebase. I tested my change manually to confirm the API works.
I also confirmed that a session-level `spark.sql.parquet.mergeSchema` setting is not overridden when `mergeSchema` is left at its default in the call to `parquet()`:
```
>>> spark.conf.set('spark.sql.parquet.mergeSchema', True)
>>> spark.range(3).write.parquet('test-parquet/id')
>>> spark.range(3).withColumnRenamed('id', 'name').write.parquet('test-parquet/name')
>>> spark.read.option('recursiveFileLookup', True).parquet('test-parquet').show()
+----+----+
|  id|name|
+----+----+
|null|   1|
|null|   2|
|null|   0|
|   1|null|
|   2|null|
|   0|null|
+----+----+
>>> spark.read.option('recursiveFileLookup', True).parquet('test-parquet', mergeSchema=False).show()
+----+
|  id|
+----+
|null|
|null|
|null|
|   1|
|   2|
|   0|
+----+
```
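The precedence shown in the session above can be modeled in plain Python. This is only a sketch, assuming the keyword defaults to `None` and that a `None` value is never written into the reader's options. `SketchSession` and `SketchReader` are hypothetical classes made up for illustration, not PySpark's implementation.

```python
class SketchSession:
    """Hypothetical stand-in for a session holding configuration values."""

    def __init__(self):
        self.conf = {}


class SketchReader:
    """Hypothetical stand-in for a reader with per-read options."""

    def __init__(self, session):
        self._session = session
        self._options = {}

    def option(self, key, value):
        self._options[key] = value
        return self

    def parquet(self, path, mergeSchema=None):
        # Assumption: only an explicitly passed value becomes a read option,
        # so the None default leaves the session setting in effect.
        if mergeSchema is not None:
            self.option("mergeSchema", mergeSchema)
        return self._effective_merge_schema()

    def _effective_merge_schema(self):
        # A per-read option wins; otherwise fall back to the session config.
        if "mergeSchema" in self._options:
            return bool(self._options["mergeSchema"])
        return bool(self._session.conf.get("spark.sql.parquet.mergeSchema", False))


session = SketchSession()
session.conf["spark.sql.parquet.mergeSchema"] = True

default_result = SketchReader(session).parquet("test-parquet")
explicit_result = SketchReader(session).parquet("test-parquet", mergeSchema=False)
print(default_result, explicit_result)
```

With the session setting enabled, the default call keeps schema merging on, and only the explicit `mergeSchema=False` turns it off, which mirrors the two `show()` outputs above.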
Closes apache#26730 from nchammas/parquet-merge-schema.
Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>