Commit 778b386
David Cournapeau
BUG: fix read_parquet with dataset=True when the first partition is empty.
When reading a set of parquet files with dataset=True, if the first
partition is empty the current logic for dtype inference will fail. It
will raise exceptions as follows:
```
pyarrow.lib.ArrowTypeError: Unable to merge: Field col0 has incompatible
types: dictionary<values=null, indices=int32, ordered=0> vs
dictionary<values=string, indices=int32, ordered=0>
```
To fix this, we filter out empty table(s) before merging them into one
parquet file.
1 parent 635f6d5 commit 778b386
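The filtering step described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the commit's actual code: `drop_empty_tables`, `HasNumRows`, and `FakeTable` are hypothetical names, and keeping the first table when every partition is empty is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, TypeVar


class HasNumRows(Protocol):
    """Anything exposing a row count, e.g. a pyarrow.Table."""

    @property
    def num_rows(self) -> int: ...


T = TypeVar("T", bound=HasNumRows)


def drop_empty_tables(tables: Sequence[T]) -> List[T]:
    """Return only the tables that contain at least one row.

    If every table is empty, keep the first one so the caller still
    has a schema to work with (an assumption of this sketch).
    """
    non_empty = [t for t in tables if t.num_rows > 0]
    return non_empty if non_empty else list(tables[:1])


@dataclass
class FakeTable:
    """Hypothetical stand-in for a table read from one partition."""

    name: str
    num_rows: int


# The first partition is empty, mirroring the failing case in the message.
partitions = [FakeTable("part-0", 0), FakeTable("part-1", 3), FakeTable("part-2", 0)]
kept = drop_empty_tables(partitions)
print([t.name for t in kept])
```

With real `pyarrow.Table` objects, which also expose `num_rows`, the kept tables could then be passed to `pyarrow.concat_tables`, so that a column whose type was inferred from an empty partition never takes part in the merge.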
2 files changed, +28 -0 lines changed
First file: 1 line added (new line 311).
Second file: 27 lines added (new lines 488-514).