-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
GH-128131: Completely support random read access of uncompressed unencrypted files in ZipFile #128143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-128131: Completely support random read access of uncompressed unencrypted files in ZipFile #128143
Changes from 8 commits
f18239d
f90340f
5d23be6
1a64610
57cb51c
4d9cea4
3d37f31
67f05de
25f0a7e
f2f2374
a2c9037
6a55aad
1239005
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -3447,6 +3447,62 @@ def test_too_short(self): | |||||||||||||
| self.assertEqual( | ||||||||||||||
| b"zzz", zipfile._Extra.strip(b"zzz", (self.ZIP64_EXTRA,))) | ||||||||||||||
|
|
||||||||||||||
| class StoredZipExtFileRandomReadTest(unittest.TestCase): | ||||||||||||||
| def test_random_read(self): | ||||||||||||||
|
||||||||||||||
| from _pyio import BytesIO | ||||||||||||||
| class StatIO(BytesIO): | ||||||||||||||
| def __init__(self): | ||||||||||||||
| super().__init__() | ||||||||||||||
| self.bytes_read = 0 | ||||||||||||||
|
|
||||||||||||||
| def read(self, size=-1): | ||||||||||||||
| bs = super().read(size) | ||||||||||||||
| self.bytes_read += len(bs) | ||||||||||||||
| return bs | ||||||||||||||
5ec1cff marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||||||
|
|
||||||||||||||
| sio = StatIO() | ||||||||||||||
| # 20000 bytes | ||||||||||||||
| txt = b'0123456789' * 2000 | ||||||||||||||
|
|
||||||||||||||
| # seek length must be greater than ZipExtFile.MIN_READ_SIZE (4096) | ||||||||||||||
5ec1cff marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||||||
| min_size = zipfile.ZipExtFile.MIN_READ_SIZE | ||||||||||||||
|
||||||||||||||
| self.assertGreaterEqual(min_size, 100) | ||||||||||||||
5ec1cff marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||||||
|
|
||||||||||||||
| with zipfile.ZipFile(sio, "w", compression=zipfile.ZIP_STORED) as zipf: | ||||||||||||||
| zipf.writestr("foo.txt", txt) | ||||||||||||||
|
|
||||||||||||||
| # check random seek and read on a file | ||||||||||||||
| with zipfile.ZipFile(sio, "r") as zipf: | ||||||||||||||
| with zipf.open("foo.txt", "r") as fp: | ||||||||||||||
| # forward seek | ||||||||||||||
| old_count = sio.bytes_read | ||||||||||||||
| fp.seek(10002, os.SEEK_CUR) | ||||||||||||||
| self.assertEqual(fp.tell(), 10002) | ||||||||||||||
| self.assertEqual(fp._left, fp._compress_left) | ||||||||||||||
| arr = fp.read(100) | ||||||||||||||
| self.assertEqual(fp.tell(), 10102) | ||||||||||||||
| self.assertEqual(arr, txt[10002:10102]) | ||||||||||||||
|
||||||||||||||
| arr = fp.read(100) | |
| self.assertEqual(fp.tell(), 10102) | |
| self.assertEqual(arr, txt[10002:10102]) | |
| arr = fp.read(read_length) | |
| self.assertEqual(fp.tell(), forward_seek + read_length) | |
| self.assertEqual(arr, txt[forward_seek:forward_seek + read_length]) |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a better name for d that indicates its meaning, or just inline it if it's only needed once and its meaning is inconsequential.
5ec1cff marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
picnixz marked this conversation as resolved.
Show resolved
Hide resolved
picnixz marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| fp.seek(-5003, os.SEEK_CUR) | |
| fp.seek(-backward_seek, os.SEEK_CUR) |
5ec1cff marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
5ec1cff marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it should be a separate test (or two). In fact, I'm not even sure I understand why this private flag is even relevant to the issue at hand. If it's not a separate test with a separate justification, can you explain why it's related to the issue at hand? If these are private attributes, what is the public effect that's being validated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is to ensure that the eof flag is correctly updated after seeking to the end of the file and then seeking back.
5ec1cff marked this conversation as resolved.
Show resolved
Hide resolved
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| Completely support random access of uncompressed unencrypted read-only | ||
| zip files obtained by :meth:`ZipFile.open <zipfile.ZipFile.open>`. |
Uh oh!
There was an error while loading. Please reload this page.