
zipfile detection of when to write a zip64 header should be made accurate #113931

@gpshead

Bug report

Proposal:

Today our zipfile module's internal implementation uses a heuristic dance, split between zipfile.ZipFile._open_to_write() and zipfile._ZipWriteFile.close(), to determine when a zip64 header is likely to be required.

This seems rather silly. By the time zipfile._ZipWriteFile.close() is called, we know the real uncompressed and compressed data sizes and can deterministically decide at that point, instead of relying on the existing heuristic of "if the expected input file_size * 1.05 > ZIP64_LIMIT" used within _open_to_write() today.
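To make the difference concrete, here is a minimal sketch that contrasts the two decisions. The helper names `heuristic_needs_zip64` and `exact_needs_zip64` are hypothetical and do not exist in zipfile; the `ZIP64_LIMIT` value is assumed to mirror the module's constant.

```python
# Assumed to mirror zipfile.ZIP64_LIMIT; defined locally for illustration.
ZIP64_LIMIT = (1 << 31) - 1


def heuristic_needs_zip64(expected_file_size):
    """Hypothetical helper: the up-front guess made before any data is written."""
    return expected_file_size * 1.05 > ZIP64_LIMIT


def exact_needs_zip64(file_size, compress_size):
    """Hypothetical helper: the decision possible once the real sizes are known."""
    return file_size > ZIP64_LIMIT or compress_size > ZIP64_LIMIT


# Borderline case: the guess asks for zip64 although neither size exceeds the limit.
borderline = 2_100_000_000
guess_borderline = heuristic_needs_zip64(borderline)
exact_borderline = exact_needs_zip64(borderline, 1_000_000)

# Unknown size up front (reported as 0), but 3 GiB actually written.
guess_unknown = heuristic_needs_zip64(0)
exact_unknown = exact_needs_zip64(3 * 1024**3, 3 * 1024**3)
```

In the borderline case the guess and the exact check disagree in one direction (zip64 requested but not needed), and in the unknown-size case they disagree in the other (zip64 needed but not requested).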

The only time we should ever raise an exception regarding zip64 being required is if the API user has explicitly forbidden zip64's use.
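A rough sketch of that rule, assuming the decision is made once the real sizes are known. The function `decide_zip64` and its `allow_zip64` parameter are hypothetical names for illustration, not the module's internals, and the choice of `zipfile.LargeZipFile` as the exception is likewise an assumption rather than something this proposal specifies.

```python
import zipfile

# Assumed to mirror zipfile.ZIP64_LIMIT; defined locally for illustration.
ZIP64_LIMIT = (1 << 31) - 1


def decide_zip64(file_size, compress_size, allow_zip64=True):
    """Hypothetical helper: return True if a zip64 header is required.

    Raises only when zip64 is required and the caller has forbidden it.
    """
    required = file_size > ZIP64_LIMIT or compress_size > ZIP64_LIMIT
    if required and not allow_zip64:
        raise zipfile.LargeZipFile(
            "entry requires ZIP64 extensions, but they are disabled"
        )
    return required
```

With this shape, small entries never trigger an error regardless of the flag, and large entries only fail when the caller opted out of zip64.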

I wouldn't backport this change to a stable release, as it will alter the exact output produced in some circumstances (zip64 headers will no longer be added unnecessarily in borderline cases where they were not needed). Still, it is fair to consider it as much a bug fix that removes an odd internal implementation wart as a feature.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Metadata

Assignees

No one assigned

    Labels

    stdlib: Standard Library Python modules in the Lib/ directory
    triaged: The issue has been accepted as valid by a triager.
    type-bug: An unexpected behavior, bug, or error
    type-feature: A feature request or enhancement

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
