
zstdfile: ensure we do not read more than size / IO_BLOCK_SIZE #217

Merged

Khatskevich merged 1 commit into main from joelynch/memory-usage-zstd on Dec 23, 2025

Conversation

@joelynch
Contributor

In the previous implementation, _ZtsdFileReader.read could produce output of arbitrary size. This can cause memory spikes while decompressing a file. Instead, we should use a ZstdDecompressor.stream_reader, which decompresses incrementally into a fixed-size output buffer.
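For illustration, here is a minimal sketch of the difference, written against the python-zstandard API directly; the data, sizes, and variable names are illustrative and are not rohmu's internals:

```python
import io
import zstandard

# Highly compressible input: a few hundred bytes compressed,
# 16 MiB once decompressed.
data = b"\x00" * (16 * 1024 * 1024)
compressed = zstandard.ZstdCompressor().compress(data)

# Old pattern (decompressobj): one call can materialize an arbitrarily
# large output buffer, since decompress() returns everything available.
dobj = zstandard.ZstdDecompressor().decompressobj()
huge = dobj.decompress(compressed)  # ~16 MiB returned in a single call

# New pattern (stream_reader): decompression is incremental, and each
# read() returns at most the requested number of bytes.
reader = zstandard.ZstdDecompressor().stream_reader(io.BytesIO(compressed))
chunk = reader.read(1024 * 1024)  # never more than 1 MiB per call
assert len(chunk) <= 1024 * 1024
```

With stream_reader(), compressed input is pulled on demand, so peak memory is proportional to the requested read size rather than to the compression ratio of the input.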

About this change - What it does

Resolves: #xxxxx

Why this way


Copilot AI left a comment


Pull request overview

This PR addresses memory spikes during zstd decompression by refactoring the _ZtsdFileReader class to use ZstdDecompressor.stream_reader() instead of decompressobj(). The new approach enables incremental decompression with a bounded output buffer, ensuring that read() operations respect the requested size parameter and do not produce arbitrarily large outputs.

  • Replaced decompressobj() with stream_reader() for controlled, incremental decompression
  • Modified the read() method to honor the size parameter and limit output to the requested number of bytes (see the sketch after this list)
  • Added comprehensive test coverage for compression and decompression with size-bounded reads
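The sketch below shows what a size-honoring read() can look like. It is a simplified stand-in, not rohmu's actual _ZtsdFileReader; the class name, the IO_BLOCK_SIZE value, and the choice to cap unbounded reads at one block are assumptions:

```python
import io
import zstandard

IO_BLOCK_SIZE = 1024 * 1024  # assumed value; the real constant lives in rohmu

class BoundedZstdReader(io.RawIOBase):
    """Simplified stand-in for the reader: delegate to stream_reader()
    so that read(size) never returns more than `size` bytes."""

    def __init__(self, source):
        self._reader = zstandard.ZstdDecompressor().stream_reader(source)

    def readable(self) -> bool:
        return True

    def read(self, size: int = -1) -> bytes:
        if size is None or size < 0:
            # Cap unbounded reads at one block rather than decompressing
            # the whole stream into memory in a single call.
            size = IO_BLOCK_SIZE
        return self._reader.read(size)
```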

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| rohmu/zstdfile.py | Refactored _ZtsdFileReader to use stream_reader() for bounded decompression and updated the read() method to respect the size parameter |
| test/test_zstdfile.py | Added a test case validating compression/decompression with size-limited reads to verify the fix |
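A sketch of what such a size-bounded round-trip test could look like, written against python-zstandard directly rather than rohmu's actual test helpers (the test name, data, and sizes are illustrative):

```python
import io
import zstandard

def test_read_does_not_exceed_requested_size():
    # Repetitive data compresses extremely well, which is exactly the
    # case where the old code could balloon in memory.
    data = b"a" * (8 * 1024 * 1024)
    compressed = zstandard.ZstdCompressor().compress(data)

    reader = zstandard.ZstdDecompressor().stream_reader(io.BytesIO(compressed))
    out = bytearray()
    while chunk := reader.read(64 * 1024):
        assert len(chunk) <= 64 * 1024  # every read is size-bounded
        out.extend(chunk)
    assert bytes(out) == data  # the round-trip is lossless
```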


@joelynch joelynch force-pushed the joelynch/memory-usage-zstd branch from c8f5835 to b80c669 on December 22, 2025 11:25
In the previous implementation, _ZtsdFileReader.read could produce output of arbitrary size. This can cause memory spikes while decompressing a file. Instead, we should use a ZstdDecompressor.stream_reader, which decompresses incrementally into a fixed-size output buffer.
@joelynch joelynch force-pushed the joelynch/memory-usage-zstd branch from b80c669 to 5406382 on December 22, 2025 11:27
@joelynch joelynch requested a review from a team December 22, 2025 11:41
@Khatskevich Khatskevich merged commit 0d87620 into main Dec 23, 2025
7 checks passed
@Khatskevich Khatskevich deleted the joelynch/memory-usage-zstd branch December 23, 2025 10:24