-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Make writerWithOffset fully delegate to the writer it wraps #126937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
writerWithOffset uses a lambda to create a RangeMissingHandler however, the RangeMissingHandler interface has a default implementation for `sharedInputStreamFactory`. This makes `writerWithOffset` delegate to the received writer only for the `fillCacheRange` method where the writer itself perhaps didn't have the `sharedInputStream` method invoked (always invoking `sharedInputStream` before `fillCacheRange` is part of the contract of the RangeMissingHandler interface) This PR makes `writerWithOffset` delegate the `sharedInputStream` to the underlying writer.
|
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
henningandersen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a concern.
|
|
||
| @Override | ||
| public SourceInputStreamFactory sharedInputStreamFactory(List<SparseFileTracker.Gap> gaps) { | ||
| return writer.sharedInputStreamFactory(gaps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this works - since it does not apply the offset in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, would the relativePos - writeOffset parameter passed in fillCacheRange not do the trick here?
My understanding on sharedInputStreamFactory is that it has to do with how much we can parallelise the calls to fillCacheRange but the place we read from is the relativePos passed in (which will take the offset into account as it'll be relativePos - writeOffset?
We could return null after calling writer.sharedInputStreamFactory(gaps) here if you think it's better? i.e.:
writer.sharedInputStreamFactory(gaps);
return null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to return the shard input stream factory there.
Is there a test demonstrating that it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache tests should have this covered - e.g. https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/blob-cache/src/test/java/org/elasticsearch/blobcache/shared/SharedBlobCacheServiceTests.java#L1157
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My link above still only overrides the fill cache method. I'll add one more test that calls maybeFetchRange with a real SequentialRangeMissingHandler instance. Thanks Henning!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henningandersen adding a unit test with SequentialRangeMissingHandler proven to be quite the mission (we'd end up mostly mocking things) so I added an integration test instead in https://github.com/elastic/elasticsearch-serverless/pull/3853
henningandersen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
@elasticmachine update branch |
|
@elasticmachine update branch |
|
@elasticmachine update branch |
|
@elasticmachine update branch |
…126937) writerWithOffset uses a lambda to create a RangeMissingHandler however, the RangeMissingHandler interface has a default implementation for `sharedInputStreamFactory`. This makes `writerWithOffset` delegate to the received writer only for the `fillCacheRange` method where the writer itself perhaps didn't have the `sharedInputStream` method invoked (always invoking `sharedInputStream` before `fillCacheRange` is part of the contract of the RangeMissingHandler interface) This PR makes `writerWithOffset` delegate the `sharedInputStream` to the underlying writer.
…126937) writerWithOffset uses a lambda to create a RangeMissingHandler however, the RangeMissingHandler interface has a default implementation for `sharedInputStreamFactory`. This makes `writerWithOffset` delegate to the received writer only for the `fillCacheRange` method where the writer itself perhaps didn't have the `sharedInputStream` method invoked (always invoking `sharedInputStream` before `fillCacheRange` is part of the contract of the RangeMissingHandler interface) This PR makes `writerWithOffset` delegate the `sharedInputStream` to the underlying writer.
writerWithOffsetuses a lambda to create a RangeMissingHandler however, the RangeMissingHandler interface has a default implementation forsharedInputStreamFactory.This makes
writerWithOffsetdelegate to the received writer only for thefillCacheRangemethod where the writer itself perhaps didn't have thesharedInputStreammethod invoked (always invokingsharedInputStreamFactorybeforefillCacheRangeis part of the contract of the RangeMissingHandler interface)This PR makes
writerWithOffsetdelegate thesharedInputStreamFactoryto the underlying writer.