
Conversation

@kyungeonchoi kyungeonchoi commented Dec 30, 2025

Story

  • When downloading a large number of files (~3000) from the MinIO/S3 backend, a small number of files (~10) are downloaded with an incorrect size, mostly near the end of the run. The size mismatch is small (less than 1% of the file size) and can be either smaller or larger than the expected size.
  • This happens only with downloads; signed URLs work fine.
  • Tried tuning the options of the MinioAdapter class (e.g. the TransferConfig for aioboto3 and the concurrency setting (_file_transfer_sem) in the download_file function), but nothing fixed the issue.
  • On the other hand, the standalone download using the CLI command servicex transforms download <request-id> works fine.

Updates

  • Download files using a .part suffix and rename them to the final filename only after validating the file size.
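For reference, a minimal sketch of the validate-then-rename step described above (illustrative only, assuming the fix's behavior; `finalize_download` is a hypothetical helper name, not the actual servicex code):

```python
import os
from pathlib import Path


def finalize_download(part_path: Path, expected_size: int) -> Path:
    """Validate the size of the '.part' temp file, then promote it to
    its final filename. A mismatched partial file is discarded so a
    later retry starts clean."""
    local_size = part_path.stat().st_size
    if local_size != expected_size:
        part_path.unlink(missing_ok=True)  # discard the bad partial file
        raise RuntimeError(
            f"size mismatch: local {local_size}, remote {expected_size}"
        )
    final_path = part_path.with_suffix("")  # strip the trailing ".part"
    # os.replace is atomic when source and destination share a filesystem,
    # so readers never observe a half-written final file.
    os.replace(part_path, final_path)
    return final_path
```

The point of the `.part` suffix is that consumers watching the output directory never pick up a file until its size has been verified.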

@kyungeonchoi kyungeonchoi added the bug Something isn't working label Dec 30, 2025

codecov bot commented Dec 30, 2025

Codecov Report

❌ Patch coverage is 60.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.92%. Comparing base (3e9ab61) to head (c796005).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| servicex/minio_adapter.py | 66.66% | 2 Missing ⚠️ |
| servicex/query_core.py | 50.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #697      +/-   ##
==========================================
- Coverage   98.27%   93.92%   -4.35%     
==========================================
  Files          29       27       -2     
  Lines        2085     2074      -11     
==========================================
- Hits         2049     1948     -101     
- Misses         36      126      +90     
| Flag | Coverage Δ |
|---|---|
| unittests | 93.92% <60.00%> (-4.35%) ⬇️ |

Flags with carried forward coverage won't be shown.

# Now just wait until all of our tasks complete
await asyncio.gather(*download_tasks)
MAX_INFLIGHT = 100
if len(download_tasks) >= MAX_INFLIGHT:
Collaborator
don't we already control this via semaphore?

Contributor Author
The number of concurrent downloads is controlled by a semaphore in minio_adapter.py, but these new lines control the number of concurrent asyncio tasks. It may be okay to await thousands of tasks while only a small number are actively downloading. In any case, this doesn't fix the problem.
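To illustrate the distinction being discussed, here is a sketch of the batched-gather pattern the diff hints at (function name and structure are illustrative, not the actual servicex code): even with a per-download semaphore, every task object would otherwise exist at once, whereas batching bounds the number of live tasks.

```python
import asyncio

MAX_INFLIGHT = 100  # cap on simultaneously scheduled asyncio tasks


async def gather_in_batches(coros, batch_size=MAX_INFLIGHT):
    """Await coroutines in fixed-size batches so that at most
    `batch_size` tasks exist at any one time, rather than creating
    thousands of task objects up front."""
    results = []
    batch = []
    for coro in coros:
        batch.append(coro)
        if len(batch) >= batch_size:
            results.extend(await asyncio.gather(*batch))
            batch = []
    if batch:  # flush the final, possibly short, batch
        results.extend(await asyncio.gather(*batch))
    return results
```

A semaphore limits how many tasks run their critical section concurrently; this pattern additionally limits how many tasks are scheduled at all.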

localsize = path.stat().st_size

# Ensure filesystem flush visibility
await asyncio.sleep(0.05)
Collaborator
this is a magic number and might not work in general. Would it be enough to download to the .part file, rename, then stat the new file?
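The suggestion above could be sketched like this (`promote_and_stat` is a hypothetical helper, not servicex code): rename first, then stat the final file, so no fixed sleep is needed at all.

```python
import os
from pathlib import Path


def promote_and_stat(part_path: Path) -> int:
    """Rename 'name.ext.part' to 'name.ext', then return the size of
    the final file. os.replace is atomic on the same filesystem, so
    the stat afterwards sees the fully written file without relying
    on an arbitrary sleep for flush visibility."""
    final_path = part_path.with_suffix("")  # drop the trailing ".part"
    os.replace(part_path, final_path)
    return final_path.stat().st_size
```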

@kyungeonchoi (Contributor Author)

Here is an example of a size difference that leads to a download failure:

Download of root:::c114.af.uchicago.edu:1094::https:::dcgftp.usatlas.bnl.gov:443:pnfs:usatlas.bnl.gov:BNLT0D1:rucio:data17_13TeV:61:5b:DAOD_BPHY28.44180100._000018.pool.root.1 failed: local size - 290668105, remote size - 290671373
