
Search result comparison #131

Closed
jzonthemtn wants to merge 4 commits into main from 108-search-result-comparison

Conversation

@jzonthemtn
Collaborator

Adds indexing of search configs and evaluating search configs for #108.

Signed-off-by: jzonthemtn <jeff.zemerick@mtnfog.com>
@jzonthemtn jzonthemtn marked this pull request as draft March 10, 2025 14:59
Collaborator

@wrigleyDan wrigleyDan left a comment


Thanks for taking a first stab at this. I now realize that the design we have in the Miro board had some flaws.

Adding a couple of points to clarify how I think search result comparison might work:

  1. User defines at least two search configurations. We don't have a finalized decision yet on what a search configuration exactly is. At the moment I think it is a Query DSL object plus an optional search pipeline (similar to what we can configure when we "run a query set", i.e. run an evaluation to calculate search metrics) or a search template.
  2. User creates a query set with one of the existing samplers.
  3. User defines the configuration of the search comparison job similarly to defining a config file for query set generation:

```
{
  "query_set_id": "abc",                  --> the created query set
  "search_configurations": ["X", "Y"],    --> the two search configurations to compare
  "k": 10,                                --> result list depth
  "index": "ecommerce"                    --> the index to run the queries of the query set with the search configurations
}
```

  4. User runs the search comparison, where search-comparison.json is the just-created JSON file:

```
java -jar ../target/search-evaluation-framework.jar -search-comparison search-comparison.json
```

The search result quality app now runs every query from the query set abc once for each search configuration (X, Y), stores the results, compares the results for each query and calculates the metrics Jaccard & RBO.
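To make the per-query comparison concrete, here is a minimal Python sketch of how the two metrics could be computed over a pair of top-k result lists. This is an illustration, not the framework's actual implementation; the RBO function is the truncated form of rank-biased overlap with an assumed persistence of p = 0.9, evaluated only down to depth k (so even identical lists score 1 - p**k rather than exactly 1).

```python
def jaccard(a, b):
    """Jaccard similarity of two result lists: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty result lists are trivially identical
    return len(sa & sb) / len(sa | sb)


def rbo(a, b, p=0.9):
    """Truncated rank-biased overlap of two ranked result lists.

    Agreement at each depth d is weighted by p**(d-1), so overlap near
    the top of the lists counts more than overlap near the bottom.
    Truncated at the shorter list's depth k, so identical lists score
    1 - p**k rather than 1.
    """
    k = min(len(a), len(b))
    if k == 0:
        return 0.0
    score = 0.0
    for d in range(1, k + 1):
        overlap = len(set(a[:d]) & set(b[:d]))  # shared docs in the top-d prefixes
        score += p ** (d - 1) * (overlap / d)
    return (1 - p) * score


# Comparing hypothetical top-k results from search configurations X and Y
# for a single query of the query set:
results_x = ["doc1", "doc2", "doc3", "doc4"]
results_y = ["doc2", "doc1", "doc3", "doc9"]
print(jaccard(results_x, results_y))  # 0.6 = 3 shared docs / 5 distinct docs
print(rbo(results_x, results_y))
```

Unlike Jaccard, RBO is order-sensitive: swapping the top and bottom of one list lowers its score even though the set overlap is unchanged.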

Let me know if that makes sense or if you need anything in addition to this.

@jzonthemtn
Collaborator Author

jzonthemtn commented Mar 11, 2025

> The search result quality app now runs every query from the query set abc once for each search configuration (X, Y), stores the results, compares the results for each query and calculates the metrics Jaccard & RBO.
>
> Let me know if that makes sense or if you need anything in addition to this.

That makes sense, and I think that's what I had in mind even if it's not quite what was on the Miro board. In this draft PR, the user can create (index) a search config. The next step is once the user has created at least two search configs, they can do the evaluation by specifying two search config IDs like in the JSON you gave above. Does that sound on track?

@wrigleyDan
Collaborator

> That makes sense, and I think that's what I had in mind even if it's not quite what was on the Miro board. In this draft PR, the user can create (index) a search config. The next step is once the user has created at least two search configs, they can do the evaluation by specifying two search config IDs like in the JSON you gave above. Does that sound on track?

Yes, that does sound on track. I think what triggered my long response was the impression that the configuration file that defines the result comparison evaluation job is indexed as a search configuration, and that shouldn't be the case.

The search configuration is something we had not thought about when initially talking about result list comparison; now it starts to take shape. As a starting point I suggest we define it as a name (string), a query (similar to running a query set, that's a Query DSL JSON object) and an optional search_pipeline.
Search configurations can then be referenced by their name in the config file to define the result list comparison job.
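Under that definition, an indexed search configuration document might look something like the following. The field names and values here are illustrative assumptions, not a finalized schema:

```json
{
  "name": "baseline_match",
  "query": { "match": { "title": "laptop" } },
  "search_pipeline": "rerank-pipeline"
}
```

The result list comparison job config would then reference such documents by their `name` in its `search_configurations` array.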

Does that sound reasonable or am I misunderstanding something?

@jzonthemtn
Collaborator Author

>> That makes sense, and I think that's what I had in mind even if it's not quite what was on the Miro board. In this draft PR, the user can create (index) a search config. The next step is once the user has created at least two search configs, they can do the evaluation by specifying two search config IDs like in the JSON you gave above. Does that sound on track?
>
> Yes, that does sound on track. I think what triggered my long response was the impression that the configuration file that defines the result comparison evaluation job is indexed as a search configuration, and that shouldn't be the case.
>
> The search configuration is something we had not thought about when initially talking about result list comparison; now it starts to take shape. As a starting point I suggest we define it as a name (string), a query (similar to running a query set, that's a Query DSL JSON object) and an optional search_pipeline. Search configurations can then be referenced by their name in the config file to define the result list comparison job.
>
> Does that sound reasonable or am I misunderstanding something?

You are right. I picked up the wrong set of info for the search config.

@jzonthemtn
Collaborator Author

I'm going to close this PR because the work has been moved over to https://github.com/o19s/search-relevance/issues/15.

@jzonthemtn jzonthemtn closed this Mar 26, 2025
