Skip to content

Conversation

@Mikep86
Copy link
Contributor

@Mikep86 Mikep86 commented May 29, 2025

Adds simplified syntaxes for the linear and rrf retrievers:

GET my-index/_search
{
  "retriever": {
    "linear": {
      "fields": ["*"],
      "query": "my awesome query",
      "normalizer": "minmax"
    }
  }
}

GET my-index/_search
{
  "retriever": {
    "rrf": {
      "fields": ["*"],
      "query": "my awesome query"
    }
  }
}

This PR isn't quite complete as YAML tests for the rrf retriever haven't been added yet. But everything else should be in a reviewable state.

@Mikep86 Mikep86 added >enhancement auto-backport Automatically create backport pull requests when merged :SearchOrg/Relevance Label for the Search (solution/org) Relevance team :Search Relevance/Search Catch all for Search Relevance v8.19.0 v9.1.0 labels May 29, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @Mikep86, I've created a changelog YAML for you.

@Mikep86 Mikep86 requested review from jimczi, kderusso and pmpailis June 9, 2025 15:52
Comment on lines +383 to +393
- do:
catch: bad_request
search:
index: test-index
body:
retriever:
linear:
fields: [ "text_1", "text_2" ]
query: ""

- contains: { error.root_cause.0.reason: "[linear] [query] cannot be empty" }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should handle this by rewriting to a MatchNoneQueryBuilder instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be interesting to think about whether we want to do a match_none here or even a match_all if the query is not provided. Or even perhaps an exists query if fields are provided. We could look at how match_phrase supports a zero terms query.

However, I think all of that could be done as followup work and for this PR throwing is probably fine.

Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass - looks really good, nice work! 👏 I left a few commends and questions!

Comment on lines +383 to +393
- do:
catch: bad_request
search:
index: test-index
body:
retriever:
linear:
fields: [ "text_1", "text_2" ]
query: ""

- contains: { error.root_cause.0.reason: "[linear] [query] cannot be empty" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be interesting to think about whether we want to do a match_none here or even a match_all if the query is not provided. Or even perhaps an exists query if fields are provided. We could look at how match_phrase supports a zero terms query.

However, I think all of that could be done as followup work and for this PR throwing is probably fine.

assertEquals("[linear] does not support the simplified query format when querying remote indices", iae.getMessage());
}

private static ResolvedIndices createMockResolvedIndices(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some duplication between this and the RRF tests, could that be cleaned up a bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100%, there is duplication here. We could probably refactor this into a base class with extensions for linear & rrf. I didn't focus on this immediately in the interest of getting a complete PR out first. Time permitting, we can refcator in this PR or save it for a quick follow-up after feature freeze.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with it as a followup post FF

var localIndicesMetadata = resolvedIndices.getConcreteLocalIndicesMetadata();
if (localIndicesMetadata.size() > 1) {
throw new IllegalArgumentException(
"[" + NAME + "] does not support the simplified query format when querying multiple indices"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nitpick here about error messages. Also, could some more of this type of checking be shared with the linear retriever instead of duplicated?

+ " }"
+ " }"
+ "}";
String restContent = """
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, much more readable 😁

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split this pull request in two? One for the simplified linear retriever and one for RRF.
I wouldn't expect RRF and linear to implement the same logic and I find it difficult to read since we don't really describe how it works in the description.

@Mikep86
Copy link
Contributor Author

Mikep86 commented Jun 10, 2025

Closing this in favor of separate PRs for linear & RRF. Linear retriever PR is first: #129200

@Mikep86 Mikep86 closed this Jun 10, 2025
Mikep86 added a commit to Mikep86/elasticsearch that referenced this pull request Jun 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged >enhancement :Search Relevance/Search Catch all for Search Relevance :SearchOrg/Relevance Label for the Search (solution/org) Relevance team v8.19.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants