Skip to content

Conversation

@morotti
Copy link
Contributor

@morotti morotti commented Mar 25, 2025

Hello,

This is a simple PR to expose the crc32 function from the lzma library.
The code and tests are very similar to the crc in binascii and zlib. I don't think it's too controversial.

I came across this while looking at compression libraries for #91349
crc32 is always available from the lzma library (xz-utils/liblzma.so). I think it should be exposed.
there are more hashing functions available (namely crc64) but they need compilation/runtime checks to verify whether they are available.

Regards.


📚 Documentation preview 📚: https://cpython-previews--131721.org.readthedocs.build/

@morotti morotti force-pushed the rmorotti-lzma-crc32 branch from 1b298bd to ad43096 Compare April 17, 2025 13:32
@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

sorry for delay,
@merwok @AA-Turner I started over from the main branch and made all the adjustments.

all the docstring/comments were copy pasted from the zlib function. I guess the old function had an old style that is no longer wanted.

for the return type in the docstring, we can settle with "positive integer"

https://docs.python.org/3/library/zlib.html#zlib.crc32
the docstring on zlib.crc32 says "The result is an unsigned 32-bit integer." but there is no concept of unsigned or 32 bits for python int.

Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

@AA-Turner , I merged all your suggestions.

for the versionadded: next, I thought changes on main would ship in 3.14? I am too late for 3.14 release?

@AA-Turner
Copy link
Member

'next' is updated by the Release Manager for each version, it's a convenience tool that we use internally. We can still get this in for 3.14, yes.

A

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

regarding performance, don't mind that for now, I will come back later.

in principle

  • zlib is abandoned and the slowest.
  • liblzma is 2 or 3 times faster.
  • zlib-ng is two orders of magnitude faster. it's using hardware instructions.

liblzma had code to start using hardware crc instructions soon ... and I see they just made a new stable release few days ago https://github.com/tukaani-project/xz/releases
I will check what's better and come back with more PR to use the best available crc function.

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

it appears the Ubuntu (free-threading) / build and test (ubuntu-24.04-arm is flaky

@AA-Turner
Copy link
Member

I'll re-run it.

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

build passed, ready to merge

@AA-Turner
Copy link
Member

Do you want to include your man.com email in the commit? Currently GH lists it as a co-authored-by, but the commit itself will be under your GMail address.

@morotti
Copy link
Contributor Author

morotti commented Apr 18, 2025

Please don't mind the email.
My work and my personal emails can both show up depending how the commits and PRs are opened. It doesn't matter which one shows up. My organization signed the CLA for the cpython interpreter, I have permissions to contribute.

@morotti
Copy link
Contributor Author

morotti commented May 12, 2025

ping, can this be merged?

@morotti morotti force-pushed the rmorotti-lzma-crc32 branch from 61094d8 to 0603528 Compare June 25, 2025 14:09
@morotti
Copy link
Contributor Author

morotti commented Jun 25, 2025

@merwok @AA-Turner are you able to review?

I've rebased on main and redid the chance, since it's been 3 months and main had changed.

@morotti
Copy link
Contributor Author

morotti commented Jun 27, 2025

@gpshead you've replied multiple times to my questions on improving compression code with "PR welcome". maybe you would be interested in reviewing this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants