Make GCS HttpHandler more compliant #126007
Conversation
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
mhl-b left a comment:
LGTM
DaveCTurner left a comment:
LGTM
```java
    exchange.sendResponseHeaders(RestStatus.REQUESTED_RANGE_NOT_SATISFIED.getStatus(), -1);
    return;
}
if (range.start() >= blob.contents().length()) {
```
I wonder, should we be checking the range end here? If you ask for bytes 0-10 of an 8-byte blob, we cannot satisfy that range.
Yes, probably. I'll have a look at what the spec says; currently we just send as much of it as we have.
I'm also adding ETags, because it looks like that's the mechanism that the BlobReadChannel uses to detect that the file it's streaming has changed under it. We can implement them trivially using the generation, I think.
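The idea can be sketched roughly as follows. This is a hypothetical illustration (the class and method names are mine, not the actual HttpHandler code), assuming the fixture derives an ETag directly from the blob's generation number, which GCS bumps on every overwrite:

```java
// Hypothetical sketch: derive an ETag from the blob generation so a
// streaming reader can detect that the blob changed between ranged reads.
public class EtagSketch {
    // Strong ETag built from the generation, quoted as HTTP requires
    static String etagFor(long generation) {
        return "\"" + generation + "\"";
    }

    // The reader compares the ETag of each ranged response with the one
    // from the first response; a mismatch means the blob was overwritten.
    static boolean blobUnchanged(String firstEtag, String latestEtag) {
        return firstEtag.equals(latestEtag);
    }

    public static void main(String[] args) {
        System.out.println(etagFor(7));                            // "7"
        System.out.println(blobUnchanged(etagFor(7), etagFor(8))); // false
    }
}
```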
So it looks like:
- We don't yet check that the file hasn't changed when resuming a download (I raised ES-11432 to address at some point)
- We do need to allow ranges beyond the end of actual content (streaming uses this to request a chunk). There may be more to the story, but if we start enforcing the ranges all that breaks.
I confirmed this by testing against the actual API:
- If the start of the range is past the end of the file, you get a 416
- If the end of the range is past the end of the file, but the start is in it, you get a 206 and as much of the file as can be delivered
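The confirmed behaviour can be captured in a small range-resolution helper. This is a minimal sketch with hypothetical names, not the actual HttpHandler code:

```java
// Hypothetical sketch of the range semantics confirmed above:
//  - start past EOF -> 416 (Requested Range Not Satisfiable)
//  - end past EOF but start within the blob -> 206, end clamped to EOF
public class GcsRangeSketch {
    record RangeResult(int status, long start, long endInclusive) {}

    static RangeResult resolveRange(long start, long endInclusive, long blobLength) {
        if (start >= blobLength) {
            return new RangeResult(416, -1, -1);
        }
        // Deliver as much of the requested range as the blob contains
        return new RangeResult(206, start, Math.min(endInclusive, blobLength - 1));
    }

    public static void main(String[] args) {
        // bytes 10-20 of an 8-byte blob: unsatisfiable -> 416
        System.out.println(resolveRange(10, 20, 8));
        // bytes 0-10 of an 8-byte blob: 206, bytes 0-7 delivered
        System.out.println(resolveRange(0, 10, 8));
    }
}
```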
```java
        container.deleteBlobsIgnoringIfNotExists(randomPurpose(), Iterators.single(key));
    }
}
```
There was a bug when we wrote a blob whose length is an exact multiple of the chunk size. In that case the client sends:

POST metadata
PUT chunk1
PUT chunk2
...
PUT with no data, with a Content-Range header to indicate we're done

We had assumed the finishing header was sent with the final chunk.
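The Content-Range values involved in that sequence can be illustrated with a small sketch (hypothetical helper, illustrative numbers; the real handler parses these headers rather than building them):

```java
// Hypothetical sketch of the Content-Range headers in the sequence above.
// Intermediate chunks use "bytes first-last/*" (total size not yet known);
// when the blob length is an exact multiple of the chunk size, the upload
// is finished by a zero-byte PUT whose header is "bytes */total".
public class ContentRangeSketch {
    static String intermediateChunk(long firstByte, long lastByte) {
        return "bytes " + firstByte + "-" + lastByte + "/*";
    }

    static String finalEmptyPut(long totalBytes) {
        return "bytes */" + totalBytes;
    }

    public static void main(String[] args) {
        System.out.println(intermediateChunk(0, 1023)); // bytes 0-1023/*
        System.out.println(finalEmptyPut(2048));        // bytes */2048
    }
}
```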
This PR includes a couple of bug-fixes and some increased compliance with the actual GCS API:
- Handle the Range header correctly when one is specified
- Add ETags (BlobReadChannel uses this to ensure we fail when the blob is updated between successive chunks being fetched)
- Fix the 416 on zero-length blobs, which was one of(?) the causes of #125668