Commit ef718a7
### Rationale for this change
In Python, a `pyarrow.Schema` was previously not hashable when it had `metadata` set:
```
>>> import pyarrow
>>> schema = pyarrow.schema([], metadata={b"1": b"1"})
>>> hash(schema)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/types.pxi", line 2921, in pyarrow.lib.Schema.__hash__
TypeError: unhashable type: 'dict'
```
This happened because the metadata, which is a `dict`, was passed to `hash()` as-is, and a `dict` is not hashable.
### What changes are included in this PR?
Slightly change how the hash is computed for `Schema`: the metadata `dict` is converted to a `frozenset` of its (key, value) tuples before hashing.
For reference, this is faster than computing the hash of a sorted tuple of the (key, value) tuples (https://stackoverflow.com/a/6014481/10070873).
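
The idea can be sketched in plain Python, without pyarrow. This is only an illustration under stated assumptions, not the code from this commit: `metadata_hash_key` and `schema_like_hash` are hypothetical helpers, and the way the fields and metadata are combined into one tuple is assumed for the example.

```python
def metadata_hash_key(metadata):
    """Return a hashable, order-independent stand-in for a metadata dict."""
    if metadata is None:
        return None
    # frozenset is hashable as long as every (key, value) pair is hashable.
    return frozenset(metadata.items())


def schema_like_hash(fields, metadata):
    # Hypothetical stand-in for Schema.__hash__ (assumed shape, not the real code):
    # combine the fields with the converted metadata.
    return hash((tuple(fields), metadata_hash_key(metadata)))


a = {b"k1": b"v1", b"k2": b"v2"}
b = {b"k2": b"v2", b"k1": b"v1"}  # same pairs, different insertion order

# Hashing a tuple that contains the raw dict fails with TypeError.
try:
    hash((("f0",), a))
    raw_dict_hashable = True
except TypeError:
    raw_dict_hashable = False

# With the frozenset conversion, equal metadata gives equal hashes.
same_hash = schema_like_hash(["f0"], a) == schema_like_hash(["f0"], b)
```

Because a `frozenset` ignores ordering, no sort is needed to make the hash independent of insertion order, which is where the speed advantage over the sorted-tuple approach comes from.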
### Are these changes tested?
Yes.
### Are there any user-facing changes?
No, other than `Schema` now being hashable when `metadata` is set.
* GitHub Issue: #47602
Lead-authored-by: Jonas Dedden <[email protected]>
Co-authored-by: Alenka Frim <[email protected]>
Signed-off-by: Raúl Cumplido <[email protected]>
1 parent cde3f6a commit ef718a7
2 files changed, +24 -1 lines changed

* First file: 22 lines added (new lines 485-506).
* Second file: line 2921 replaced by two lines (new lines 2921-2922).