Error -3 while decompressing data: invalid stored block lengths  #128258

@ankasani

Description

Bug report

Bug description:

The code below works fine on Python 3.11 but fails on 3.12 and 3.13 with the error shown in the title.

import zipfile
import asyncio

async def process_file(text_file_name: str, zip_file: zipfile.ZipFile):
    try:
        with zip_file.open(text_file_name, mode='r') as text_file:
            try:
                content = await asyncio.to_thread(text_file.read)
                lines = content.decode('utf-8').splitlines()
            except UnicodeDecodeError as e:
                print(f"Error decoding file {text_file_name}: {e}")
                return None
            except Exception as e:
                print(f"Error reading file {text_file_name}: {e}")
                return None
            if not lines:
                return None
            # Process lines here
            return lines
    except Exception as e:
        print(f"Error opening file {text_file_name}: {e}")
        return None

async def main():
    temp_file_path = 'Tests.zip'
    zip_file = zipfile.ZipFile(temp_file_path, 'r')
    tasks = [process_file(text_file_name, zip_file) for text_file_name in zip_file.namelist()]
    await asyncio.gather(*tasks)

asyncio.run(main())

I also tried opening the ZIP file inside the process_file function, but that led to higher memory usage. Each task opens its own instance of the ZIP file, so multiple instances can be loaded into memory at the same time, especially if the ZIP contains many files. I'm looking for a solution that minimizes memory usage while still allowing fast, concurrent execution. Does anyone have any suggestions?
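For reference, here is a minimal sketch of the per-task variant described above, with a cap on how many archives are open at once. It is not a confirmed fix for the 3.12/3.13 behavior, only an illustration under stated assumptions: the helper names read_member and read_all and the max_open parameter are hypothetical, and every member is assumed to be UTF-8 text.

```python
import asyncio
import zipfile


def read_member(zip_path: str, member_name: str) -> list[str]:
    # Runs in a worker thread. Each call opens its own ZipFile,
    # so no file handle is shared between threads.
    with zipfile.ZipFile(zip_path, "r") as zf:
        with zf.open(member_name, mode="r") as member:
            return member.read().decode("utf-8").splitlines()


async def process_file(zip_path: str, member_name: str,
                       limiter: asyncio.Semaphore) -> list[str]:
    # The semaphore caps how many archives are open at the same time,
    # which bounds the extra memory used by the per-task instances.
    async with limiter:
        return await asyncio.to_thread(read_member, zip_path, member_name)


async def read_all(zip_path: str, max_open: int = 4) -> dict[str, list[str]]:
    # max_open is an assumed tuning knob, not a value from the report.
    limiter = asyncio.Semaphore(max_open)
    # Open the archive once, only to list its members, then close it.
    with zipfile.ZipFile(zip_path, "r") as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]
    results = await asyncio.gather(
        *(process_file(zip_path, name, limiter) for name in names)
    )
    return dict(zip(names, results))
```

Calling asyncio.run(read_all('Tests.zip')) would return a mapping from member name to its lines. Lowering max_open trades throughput for a smaller peak memory footprint.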

CPython versions tested on:

3.11, 3.12, 3.13

Operating systems tested on:

Linux, Windows

Metadata

    Labels

    3.12 (only security fixes), 3.13 (bugs and security fixes), 3.14 (bugs and security fixes), stdlib (Standard Library Python modules in the Lib/ directory), type-bug (An unexpected behavior, bug, or error)

    Projects

    Status

    Done