Skip to content

Conversation

@romseygeek
Copy link
Contributor

We currently have two major implementations of ValueFetcher:

  • Source/SourceArrayValueFetcher, which pulls data from a SourceLookup
    passed directly to ValueFetcher.fetchValues()
  • DocValueFetcher, which pulls data from a SearchLookup passed in via
    the QueryExecutionContext when the fetcher is built. This also requires
    a separate method to be called when moving between segments.

This split confuses the API, and the second implementation in particular will
cause problems if we try and use these fetchers to build index-time scripts.

This commit adds a new interface, ValuesLookup, which wraps both source
and doc, and changes the signature of ValueFetcher.fetchValues() to
accept this in place of SourceLookup. Positioning of this ValuesLookup is
the responsibility of the caller, meaning that we can remove the setNextReader
method on ValueFetcher. Access to formatted doc values is moved into
LeafDocLookup.

Relates to #68984

@romseygeek romseygeek self-assigned this Feb 17, 2021
@romseygeek romseygeek marked this pull request as draft February 17, 2021 16:01
.startArray("field1");
for (int j = 0; j < numInnerObjects; j++) {
source.startObject().field("x", "y").endObject();
source.startObject().field("x", "y" + i + ":" + j).endObject();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used these to debug some deep weirdness with how InnerHits uses its lookups, but they're useful so I think they can stay

/**
* Value fetcher that loads from doc values.
*/
public final class DocValueFetcher implements ValueFetcher {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially all of this class is now moved into ValuesLookup.docValues() so it might be worth just removing it and replacing it with lambdas wherever it is used.

return parseLong(value, roundUp, now);
}

@Override
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use DocValueFormat as part of a hash key now (so that a lookup can access the same field with different formats and still use caching); this was the only impl with no hashCode()/equals()

return searchContext.searcher();
}

/**
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now moved to the wrapping SearchContext, so that inner hits can have their own version of the lookup that filters the source appropriately.

// Store the loaded source on the hit context so that fetch subphases can access it.
// Also make it available to scripts by storing it on the shared SearchLookup instance.
hitContext.sourceLookup().setSource(fieldsVisitor.source());
hitContext.valuesLookup().source().setSource(fieldsVisitor.source());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still pretty ugly - would be nice to more stored fields access into ValuesLookup as well so that the lookup can handle access to everything and we don't need to set the source separately - but that's a much bigger change and is not necessary for the current project.

public void setNextReader(LeafReaderContext readerContext) {
for (DocValueField f : fields) {
f.fetcher.setNextReader(readerContext);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled at the top-level FetchPhase now

}

@Override
public SearchLookup getSearchLookup() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly tricksy; inner hits goes and runs a new FetchPhase for each document, and because positioning of the ValuesLookup is now done by the FetchPhase we need to have a new lookup for each inner hits context so that documents with multiple nested types don't interfere with each other. We still only have one lookup per nested type throughout the phase so the overhead is kept low.

@Override
public FetchSubPhaseProcessor getProcessor(FetchContext searchContext) {
if (searchContext.innerHits() == null) {
if (searchContext.innerHits() == null || searchContext.innerHits().getInnerHits().isEmpty()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really related but I saw we were running inner hits for every doc even if none were asked for because by default we return an empty list here, not null.

}
}

public ScriptDocValues<?> get(String name, DocValueFormat format) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is largely analogous to get(field) below, with the addition of formatting. The FormatKey is so that a fields call can request the same field multiple times with different formats.

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:11
@romseygeek romseygeek closed this Aug 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant