Commit faa1e89

FIX: Fix egregious memory usage while hashing
1 parent 1ad0128 commit faa1e89

2 files changed: 10 additions, 1 deletion

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,10 @@
 # Changelog
 
+## 0.43.1 - TBD
+
+#### Bug fixes
+- Fixed an issue where validating the checksum of a batch file loaded the entire file into memory
+
 ## 0.43.0 - 2024-10-09
 
 This release drops support for Python 3.8 which has reached end-of-life.

databento/historical/api/batch.py

Lines changed: 5 additions & 1 deletion
@@ -431,7 +431,11 @@ def _download_batch_file(
         hash_algo, _, hash_hex = batch_download_file.hash_str.partition(":")
 
         if hash_algo == "sha256":
-            output_hash = hashlib.sha256(output_path.read_bytes())
+            output_hash = hashlib.new(hash_algo)
+            with open(output_path, "rb") as fd:
+                while chunk := fd.read(32_000_000):
+                    output_hash.update(chunk)
+
             if output_hash.hexdigest() != hash_hex:
                 warn_msg = f"Downloaded file failed checksum validation: {output_path.name}"
                 logger.warning(warn_msg)
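
For reference, the chunked-hashing pattern in this change can be sketched as a standalone helper. This is a minimal illustration and not code from the repository: the names `file_hexdigest` and `verify_checksum` are hypothetical, and the `"<algo>:<hex>"` checksum format is assumed from the `partition(":")` call in the diff. Feeding the file to the hash object in fixed-size chunks should keep peak memory near the chunk size rather than the file size.

```python
import hashlib
from pathlib import Path


def file_hexdigest(path: Path, algo: str = "sha256", chunk_size: int = 32_000_000) -> str:
    """Hash a file incrementally (hypothetical helper, not part of the library)."""
    digest = hashlib.new(algo)
    with open(path, "rb") as fd:
        # fd.read() returns b"" at end of file, which ends the loop.
        while chunk := fd.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, hash_str: str) -> bool:
    """Compare a file against a checksum string assumed to look like 'sha256:<hex>'."""
    hash_algo, _, hash_hex = hash_str.partition(":")
    if hash_algo != "sha256":
        # Mirror the diff, which only validates sha256 checksums.
        return False
    return file_hexdigest(path, hash_algo) == hash_hex
```

Because incremental `update()` calls are equivalent to hashing the concatenated bytes, the digest should match the one produced by the earlier `read_bytes()` approach.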

0 commit comments
