Skip to content

Conversation

@yuanjieding-db
Copy link
Collaborator

What changes are proposed in this pull request?

Provide the readers and reviewers with the information they need to understand
this PR in a comprehensive manner.

Specifically, try to answer the two following questions:

  • WHAT
    • Fix the FilesExt issue where it throws exception when the content size is 0, by checking if the size of the content is below a certain threshold. If it is below the threshold, upload using single-shot instead of multipart upload.
  • WHY
    • upload interface needs to support upload data with zero size.

How is this tested?

Unit test cases are updated to test the behavior change.

@yuanjieding-db yuanjieding-db force-pushed the yuanjie/fix-zero-size-file-upload branch from d9e6e97 to 3f1cea1 Compare October 27, 2025 13:57
@yuanjieding-db yuanjieding-db force-pushed the yuanjie/fix-zero-size-file-upload branch from 3f1cea1 to efb8b02 Compare October 27, 2025 13:58
@yuanjieding-db yuanjieding-db force-pushed the yuanjie/fix-zero-size-file-upload branch from efb8b02 to bf741e8 Compare October 27, 2025 14:39
@yuanjieding-db yuanjieding-db force-pushed the yuanjie/fix-zero-size-file-upload branch from bf741e8 to c6c2892 Compare October 27, 2025 14:41
@yuanjieding-db
Copy link
Collaborator Author

jenkins trigger all

parallelism=parallelism,
)
if ctx.use_parallel:
if ctx.use_parallel and ctx.content_length >= self._config.files_ext_multipart_upload_min_stream_size:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can content_length be none here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in the upload_from function, content_length will always be set using os.path.getsize(source_path), which only returns a number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, not in this PR. Let's avoid using the same uploadContext in both methods, as they may differ for different functions.
FWIW, I am planning to add a static type checker in the Python SDK, which will block such scenarios.

),
MultipartUploadTestCase(
"Multipart upload successful: empty file or empty seekable stream",
content_size=0, # less than part size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
content_size=0, # less than part size
content_size=0, # content with zero length

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

parallelism=parallelism,
)
if ctx.use_parallel:
if ctx.use_parallel and ctx.content_length >= self._config.files_ext_multipart_upload_min_stream_size:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, not in this PR. Let's avoid using the same uploadContext in both methods, as they may differ for different functions.
FWIW, I am planning to add a static type checker in the Python SDK, which will block such scenarios.

@yuanjieding-db yuanjieding-db force-pushed the yuanjie/fix-zero-size-file-upload branch from fce05cc to e15bcb5 Compare October 30, 2025 13:03
@yuanjieding-db yuanjieding-db force-pushed the yuanjie/fix-zero-size-file-upload branch from e15bcb5 to 504f30c Compare October 30, 2025 22:00
@github-actions
Copy link

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1088
  • Commit SHA: 504f30cb5dc80a9373b43a99d032c52afa226fcf

Checks will be approved automatically on success.

@yuanjieding-db yuanjieding-db added this pull request to the merge queue Oct 31, 2025
Merged via the queue into databricks:main with commit e890378 Oct 31, 2025
17 checks passed
@yuanjieding-db yuanjieding-db deleted the yuanjie/fix-zero-size-file-upload branch October 31, 2025 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants