@nicktindall
Contributor

@nicktindall nicktindall commented Apr 1, 2025

This PR includes a couple of bug fixes and brings the handler closer to the behaviour of the actual GCS API:

  • Fixed bug where 416 was being erroneously returned for zero-length blobs even with no Range header
  • Fixed bug where partial upload wouldn't be completed if the last PUT included no data
  • Return 206 (partial content) status when a Range header is specified
  • Return an ETag on object get (BlobReadChannel uses this to ensure we fail when the blob is updated between successive chunks being fetched)

The 416 on zero-length blobs was one of(?) the causes of #125668

@nicktindall nicktindall requested review from DaveCTurner and mhl-b April 1, 2025 00:44
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.1.0 labels Apr 1, 2025
@nicktindall nicktindall added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Apr 1, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Apr 1, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Apr 1, 2025
Contributor

@mhl-b mhl-b left a comment

LGTM

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM

exchange.sendResponseHeaders(RestStatus.REQUESTED_RANGE_NOT_SATISFIED.getStatus(), -1);
return;
}
if (range.start() >= blob.contents().length()) {
Contributor

I wonder, should we be checking the range end here? If you ask for bytes 0-10 of an 8-byte blob, we cannot satisfy that range.

Contributor Author

Yes, probably. I'll have a look at what the spec says; currently we just send as much of the blob as we have.

Contributor Author

I'm also adding ETags, because it looks like that's the mechanism that the BlobReadChannel uses to detect that the file it's streaming has changed under it. We can implement them trivially using generation I think.
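
A rough sketch of what "ETags from generation" could look like. The class, record, and method names are hypothetical, and the quoted-number format is an assumption chosen for illustration; it is not the handler code from this PR, nor the format the real GCS API uses.

```java
/** Hypothetical sketch: derive an ETag from a blob's generation. */
final class EtagSketch {

    /** Minimal stand-in for a stored blob; only the generation matters here. */
    record BlobVersion(String name, long generation) {}

    /** Assumed format: the generation number wrapped in double quotes. */
    static String etagFor(BlobVersion blob) {
        return "\"" + blob.generation() + "\"";
    }

    /**
     * True if the ETag seen on an earlier chunk still matches the current blob,
     * i.e. the blob was not overwritten between two chunk fetches.
     */
    static boolean unchanged(String etagFromEarlierChunk, BlobVersion current) {
        return etagFromEarlierChunk.equals(etagFor(current));
    }
}
```

Because every overwrite bumps the generation, a reader that remembers the ETag of the first chunk can detect a change on any later chunk with a plain string comparison.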

Contributor Author

So it looks like

  • We don't yet check that the file hasn't changed when resuming a download (I raised ES-11432 to address at some point)
  • We do need to allow ranges beyond the end of actual content (streaming uses this to request a chunk). There may be more to the story, but if we start enforcing the ranges all that breaks.

Contributor Author

I confirmed this by testing against the actual API:

  • If the start of the range is past the end of the file, you get a 416
  • If the end of the range is past the end of the file, but the start is in it, you get a 206 and as much of the file as can be delivered


container.deleteBlobsIgnoringIfNotExists(randomPurpose(), Iterators.single(key));
}
}
Contributor Author

There was a bug when the total number of bytes written was an exact multiple of the chunk size. In that case the client sends:
POST metadata
PUT chunk1
PUT chunk2
...
PUT with no data, and a Content-Range header to indicate we're done

We had assumed the completion header was always sent with the final chunk of data.
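
A minimal sketch of completion detection that does not depend on the final PUT carrying data. It assumes the Content-Range header takes one of the forms `bytes 0-99/*` (more to come), `bytes 100-199/200` (final chunk with data), or `bytes */200` (final request with no data); the class and method names are hypothetical and this is not the handler code from this PR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: decide whether a resumable-upload PUT completes the upload. */
final class UploadCompletionSketch {

    // Assumed header forms: "bytes <start>-<end>/<total or *>" or "bytes */<total or *>".
    private static final Pattern CONTENT_RANGE =
        Pattern.compile("bytes (?:(\\d+)-(\\d+)|\\*)/(\\d+|\\*)");

    /**
     * @param contentRange      value of the Content-Range header on this PUT
     * @param bytesStoredSoFar  bytes stored for the upload, including this PUT's body (possibly empty)
     * @return true if the declared total is known and everything has been received
     */
    static boolean isFinalRequest(String contentRange, long bytesStoredSoFar) {
        Matcher m = CONTENT_RANGE.matcher(contentRange);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised Content-Range: " + contentRange);
        }
        String total = m.group(3);
        if ("*".equals(total)) {
            // Total size not yet declared: the client intends to send more.
            return false;
        }
        // Total declared: complete once we hold that many bytes, whether or not this PUT had a body.
        return Long.parseLong(total) == bytesStoredSoFar;
    }
}
```

Keying the decision on the declared total rather than on "this PUT had a body" covers both the usual case and the empty final PUT described above.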

@nicktindall nicktindall changed the title Fix range handling for zero-length blobs Make GCS HttpHandler more compliant Apr 2, 2025
@nicktindall nicktindall merged commit 28dd8e1 into elastic:main Apr 2, 2025
17 checks passed
@nicktindall nicktindall deleted the Fix_range_handling_zero_length_blob branch April 2, 2025 03:16

Labels

:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Coordination Meta label for Distributed Coordination team >test Issues or PRs that are addressing/adding tests v9.1.0

4 participants