ValueFetchers should access all values via `fetchValues()` #69137

romseygeek · 2021-02-17T16:01:25Z

We currently have two major implementations of ValueFetcher:

- Source/SourceArrayValueFetcher, which pulls data from a SourceLookup
  passed directly to ValueFetcher.fetchValues()
- DocValueFetcher, which pulls data from a SearchLookup passed in via
  the QueryExecutionContext when the fetcher is built. This also requires
  a separate method to be called when moving between segments.

This split confuses the API, and the second implementation in particular will
cause problems if we try to use these fetchers to build index-time scripts.

This commit adds a new interface, ValuesLookup, which wraps both source
and doc, and changes the signature of ValueFetcher.fetchValues() to
accept this in place of SourceLookup. Positioning of this ValuesLookup is
the responsibility of the caller, meaning that we can remove the setNextReader
method on ValueFetcher. Access to formatted doc values is moved into
LeafDocLookup.
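
A minimal sketch of the shape described above, assuming simplified stand-in types. `SourceView`, `DocView`, `ValueFetcherSketch` and the exact method names on `ValuesLookup` are hypothetical and only illustrate the idea; the real Elasticsearch interfaces are richer and may differ.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for source and doc-values access.
interface SourceView {
    Object extract(String path);
}

interface DocView {
    List<Object> values(String field);
}

// Sketch of a lookup that wraps both source and doc access.
// The caller is assumed to have positioned it on the current document.
interface ValuesLookup {
    SourceView source();

    DocView doc();
}

// Sketch of the unified signature: a single method and no setNextReader.
interface ValueFetcher {
    List<Object> fetchValues(ValuesLookup lookup);
}

final class ValueFetcherSketch {
    // A fetcher that reads from the source side of the lookup.
    static ValueFetcher fromSource(String path) {
        return lookup -> {
            Object value = lookup.source().extract(path);
            if (value == null) {
                return List.of();
            }
            return List.of(value);
        };
    }

    // A fetcher that reads from the doc-values side of the same lookup.
    static ValueFetcher fromDocValues(String field) {
        return lookup -> lookup.doc().values(field);
    }

    // Builds an in-memory lookup for one document (illustration only).
    static ValuesLookup lookupFor(Map<String, Object> source, Map<String, List<Object>> docValues) {
        return new ValuesLookup() {
            @Override
            public SourceView source() {
                return source::get;
            }

            @Override
            public DocView doc() {
                return field -> docValues.getOrDefault(field, List.of());
            }
        };
    }
}
```

Both kinds of fetcher receive the same argument in this sketch, so neither needs a lookup captured at construction time or a per-segment callback.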

Relates to #68984

WIP - still failures in InnerHitsIT

romseygeek · 2021-02-22T13:39:35Z

server/src/internalClusterTest/java/org/elasticsearch/search/fetch/subphase/InnerHitsIT.java

                    .startArray("field1");
            for (int j = 0; j < numInnerObjects; j++) {
-                source.startObject().field("x", "y").endObject();
+                source.startObject().field("x", "y" + i + ":" + j).endObject();


I used these to debug some deep weirdness with how InnerHits uses its lookups, but they're useful so I think they can stay

romseygeek · 2021-02-22T13:40:49Z

server/src/main/java/org/elasticsearch/index/mapper/DocValueFetcher.java

 /**
 * Value fetcher that loads from doc values.
 */
 public final class DocValueFetcher implements ValueFetcher {


Essentially all of this class is now moved into ValuesLookup.docValues() so it might be worth just removing it and replacing it with lambdas wherever it is used.

romseygeek · 2021-02-22T13:42:12Z

server/src/main/java/org/elasticsearch/search/DocValueFormat.java

            return parseLong(value, roundUp, now);
        }

+        @Override


We use DocValueFormat as part of a hash key now (so that a lookup can access the same field with different formats and still use caching); this was the only impl with no hashCode()/equals()

romseygeek · 2021-02-22T13:42:57Z

server/src/main/java/org/elasticsearch/search/fetch/FetchContext.java

        return searchContext.searcher();
    }

-    /**


This is now moved to the wrapping SearchContext, so that inner hits can have their own version of the lookup that filters the source appropriately.

romseygeek · 2021-02-22T13:44:24Z

server/src/main/java/org/elasticsearch/search/fetch/FetchPhase.java

                // Store the loaded source on the hit context so that fetch subphases can access it.
                // Also make it available to scripts by storing it on the shared SearchLookup instance.
-                hitContext.sourceLookup().setSource(fieldsVisitor.source());
+                hitContext.valuesLookup().source().setSource(fieldsVisitor.source());


This is still pretty ugly - it would be nice to move stored fields access into ValuesLookup as well, so that the lookup can handle access to everything and we don't need to set the source separately - but that's a much bigger change and is not necessary for the current project.

romseygeek · 2021-02-22T13:45:26Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/FetchDocValuesPhase.java

            public void setNextReader(LeafReaderContext readerContext) {
-                for (DocValueField f : fields) {
-                    f.fetcher.setNextReader(readerContext);
-                }


Handled at the top-level FetchPhase now

romseygeek · 2021-02-22T13:48:22Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/InnerHitsContext.java

+        }
+
+        @Override
+        public SearchLookup getSearchLookup() {


Slightly tricksy; inner hits goes and runs a new FetchPhase for each document, and because positioning of the ValuesLookup is now done by the FetchPhase we need to have a new lookup for each inner hits context so that documents with multiple nested types don't interfere with each other. We still only have one lookup per nested type throughout the phase so the overhead is kept low.
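
A toy illustration of the "one lookup per inner hits context" idea, not the actual InnerHitsContext code. `PositionedLookup` and `PerContextLookups` are hypothetical names, and keying by inner hits name is an assumption made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical positional lookup: it only remembers the last document set on it.
final class PositionedLookup {
    private int docId = -1;

    void setDocument(int docId) {
        this.docId = docId;
    }

    int currentDoc() {
        return docId;
    }
}

// Hands out one lookup per inner hits context and reuses it for later requests.
final class PerContextLookups {
    private final Map<String, PositionedLookup> byContext = new HashMap<>();

    PositionedLookup lookupFor(String innerHitsName) {
        // Created lazily on first use, then shared for the rest of the phase.
        return byContext.computeIfAbsent(innerHitsName, name -> new PositionedLookup());
    }
}
```

Because each context positions only its own instance, moving one lookup to a new document leaves the others where they were, and the number of lookups stays bounded by the number of contexts rather than the number of documents.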

romseygeek · 2021-02-22T13:49:14Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/InnerHitsPhase.java

    @Override
    public FetchSubPhaseProcessor getProcessor(FetchContext searchContext) {
-        if (searchContext.innerHits() == null) {
+        if (searchContext.innerHits() == null || searchContext.innerHits().getInnerHits().isEmpty()) {


Not really related, but I saw we were running inner hits for every doc even if none were asked for, because by default we return an empty list here, not null.

romseygeek · 2021-02-22T13:53:35Z

server/src/main/java/org/elasticsearch/search/lookup/LeafDocLookup.java

+        }
+    }
+
+    public ScriptDocValues<?> get(String name, DocValueFormat format) {


This is largely analogous to get(field) below, with the addition of formatting. The FormatKey is so that a fields call can request the same field multiple times with different formats.
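
A minimal sketch of how a composite key lets the same field be cached under several formats. The `FormatKey` fields, the `FormattedValuesCache` class and the use of a `String` in place of `DocValueFormat` are all assumptions for illustration; the PR's actual implementation is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical composite cache key: field name plus a format identifier.
final class FormatKey {
    private final String field;
    private final String format;

    FormatKey(String field, String format) {
        this.field = Objects.requireNonNull(field);
        this.format = Objects.requireNonNull(format);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof FormatKey)) {
            return false;
        }
        FormatKey other = (FormatKey) o;
        return field.equals(other.field) && format.equals(other.format);
    }

    @Override
    public int hashCode() {
        return Objects.hash(field, format);
    }
}

// Caches one loaded value per (field, format) pair and counts actual loads.
final class FormattedValuesCache {
    private final Map<FormatKey, String> cache = new HashMap<>();
    private int loads = 0;

    String get(String field, String format) {
        return cache.computeIfAbsent(new FormatKey(field, format), key -> {
            loads++; // stands in for the expensive doc-values load
            return field + " as " + format;
        });
    }

    int loads() {
        return loads;
    }
}
```

The cache only hits when both key components implement value-based `equals()` and `hashCode()`; with identity-based equality on either part, every request would count as a new key.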

romseygeek added 11 commits February 9, 2021 11:35

Handle ignored fields directly in SourceValueFetcher

c738b87

Merge remote-tracking branch 'origin/master' into fetch/ignored-fields

174f3bb

Merge remote-tracking branch 'origin/master' into fetch/ignored-fields

0b2572a

Rename DocValueFetcher.Leaf to FormattedDocValues

1486ab0

Merge branch 'fielddata/formatted-values' into fetch/values-lookup

c6665bf

ValueFetcher takes ValuesLookup in place of SourceLookup

7cbe6a1

WIP - still failures in InnerHitsIT

Merge remote-tracking branch 'origin/master' into fetch/values-lookup

7fd8be8

Use new search lookup for each inner hits context

dcfe174

cleanups

e5a9110

javadocs

35751c1

tidy

581a3db

romseygeek self-assigned this Feb 17, 2021

romseygeek marked this pull request as draft February 17, 2021 16:01

romseygeek added 5 commits February 18, 2021 12:24

flattened fields continue to not work here

d850a98

Merge remote-tracking branch 'origin/master' into fetch/values-lookup

7176455

Switch back to using name

b3afee5

some cleanups

a5b9f18

Don't need real LeafReaderContext in test any more

8667726

romseygeek commented Feb 22, 2021

romseygeek added 2 commits February 22, 2021 14:01

wut

7a7c20c

nested value fetcher for docvalues

7eb727c

elasticsearchmachine changed the base branch from master to main July 22, 2022 23:11

romseygeek closed this Aug 15, 2025
