GH-45457: [Python] Add pyarrow.ArrayStatistics #45550
@github-actions crossbow submit -g python
@pitrou @jorisvandenbossche Could you take a look at this?
I'll merge this in a few days if nobody objects.
python/pyarrow/array.pxi
For the record, I've opened a Cython feature request to make this more automatic.
Thanks! I've added a comment that refers to the issue.
python/pyarrow/array.pxi
uint64_t isn't handled below, should the docstring or the code be fixed?
Oh... The code was wrong... I've added the uint64_t case.
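To illustrate why a missing uint64_t case matters, here is a minimal pure-Python sketch. It is not the Cython code in array.pxi: the function name `convert_stat_value` and the string kind tags are hypothetical, and the set of kinds other than uint64_t is an assumption made for illustration.

```python
# Hypothetical stand-in for dispatching on a tagged statistics value.
# The real bindings dispatch on a C++ value; the string tags here are
# an assumption used only to keep the sketch runnable in plain Python.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
UINT64_MAX = 2**64 - 1


def convert_stat_value(kind, value):
    """Convert a (kind, value) pair to a plain Python object."""
    if kind == "bool":
        return bool(value)
    if kind == "int64_t":
        if not INT64_MIN <= value <= INT64_MAX:
            raise OverflowError("value does not fit in int64_t")
        return int(value)
    if kind == "uint64_t":
        # Without this branch, unsigned values would fall through to
        # the TypeError below even though the docstring lists them.
        if not 0 <= value <= UINT64_MAX:
            raise OverflowError("value does not fit in uint64_t")
        return int(value)
    if kind == "double":
        return float(value)
    raise TypeError(f"unsupported statistics value kind: {kind!r}")
```

A value such as `2**64 - 1` is valid for the uint64_t kind but overflows the int64_t kind, which is why the two cases cannot share a branch.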
python/pyarrow/array.pxi
except * means it could raise Python exceptions, but it doesn't here, so perhaps you can remove that annotation (though it's not really a problem either).
Thanks! I didn't know much about except in Cython...
These are the bindings for `arrow::ArrayStatistics`. You can get the statistics via `pyarrow.Array.statistics()`.
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
assert statistics.min == -1
assert statistics.is_min_exact
assert statistics.max == 3
assert statistics.is_max_exact
Can we have a test for repr(statistics) to make sure that the string representation works?
It's a good idea. I've added it.
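For readers following along, the shape of such a repr check can be sketched without pyarrow. The class `FakeArrayStatistics` and the helper `compute_fake_statistics` below are hypothetical stand-ins, not the real `pyarrow.ArrayStatistics`; the field names mirror the assertions quoted above, and the input values are chosen only so that min is -1 and max is 3.

```python
from dataclasses import dataclass
from typing import Optional, Union

Value = Union[bool, int, float, str, None]


@dataclass(frozen=True)
class FakeArrayStatistics:
    """Hypothetical stand-in; the generated repr lists every field."""
    null_count: Optional[int] = None
    min: Value = None
    is_min_exact: bool = False
    max: Value = None
    is_max_exact: bool = False


def compute_fake_statistics(values):
    """Compute exact min/max and the null count of a Python list."""
    non_null = [v for v in values if v is not None]
    return FakeArrayStatistics(
        null_count=len(values) - len(non_null),
        min=min(non_null) if non_null else None,
        is_min_exact=bool(non_null),
        max=max(non_null) if non_null else None,
        is_max_exact=bool(non_null),
    )


def test_repr():
    statistics = compute_fake_statistics([1, None, -1, 3])
    text = repr(statistics)
    # Check for the fields rather than the full string so the test
    # survives cosmetic changes to the representation.
    assert "min=-1" in text
    assert "max=3" in text
    assert "null_count=1" in text


test_repr()
```

Asserting on substrings instead of the exact string is a design choice for this sketch; the test actually added in the PR may compare the representation differently.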
@github-actions crossbow submit -g python
Revision: e3a20b5 Submitted crossbow builds: ursacomputing/crossbow @ actions-747dbaddf2
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 631fa0a. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 12 possible false positives for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
Apache Arrow C++ can attach statistics read from Apache Parquet data to arrow::Array. If we have bindings for this feature in Python, Python users can also use the attached statistics.
What changes are included in this PR?
pyarrow.ArrayStatistics
pyarrow.Array.statistics()
Are these changes tested?
Yes.
Are there any user-facing changes?
Yes.
arrow::ArrayStatistics bindings #45457