
[META] Auto-optimized hybrid search #407

@martin-gaievski

Description

This is a meta tracking issue for implementing the Auto-optimized hybrid search feature in the Search Relevance Workbench. The goal is to reduce hybrid search optimization from a multi-day manual process to a guided, mostly-automated workflow — from generating test queries to deploying the optimal search pipeline configuration.

Background

The Hybrid Optimizer experiment in the Search Relevance Workbench runs a grid search over normalization techniques, combination methods, and weight configurations (66 variants by default) to find the optimal hybrid search pipeline. However, the current workflow requires users to manually:

  1. Create query terms and import them
  2. Set up an LLM connector and generate relevance judgments
  3. Run the optimizer experiment
  4. Manually read raw results to identify the best configuration
  5. Manually create and deploy the search pipeline

This meta issue tracks the work to automate steps 1, 4, and 5 (and, to some extent, step 2), significantly reducing the barrier to finding and deploying optimal hybrid search configurations.
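The "66 variants" figure mentioned above is consistent with a grid of two normalization techniques, three combination techniques, and eleven complementary weight splits. The specific defaults below are assumptions for illustration, not taken from this issue:

```python
from itertools import product

# Assumed grid defaults (illustrative, not specified in this issue):
# two normalization techniques, three combination techniques, and
# lexical/neural weight splits from 0.0 to 1.0 in steps of 0.1.
NORMALIZATIONS = ["min_max", "l2"]
COMBINATIONS = ["arithmetic_mean", "harmonic_mean", "geometric_mean"]
WEIGHTS = [round(w / 10, 1) for w in range(11)]  # 0.0, 0.1, ..., 1.0

def grid_variants():
    """Enumerate every (normalization, combination, weights) configuration."""
    return [
        {"normalization": n, "combination": c, "weights": [w, round(1 - w, 1)]}
        for n, c, w in product(NORMALIZATIONS, COMBINATIONS, WEIGHTS)
    ]

print(len(grid_variants()))  # 2 * 3 * 11 = 66
```

Under these assumptions the optimizer evaluates each of the 66 configurations against the query set and judgments, then ranks them by the chosen metric.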

Components

The implementation is broken down into the following work streams:

  • Query Set Generation with LLM — Extend the query sets API (PUT /query_sets) with an llm_generated sampling mode that generates synthetic search queries from index documents using an LLM, eliminating manual query creation
  • Improved Judgment Coverage for Hybrid Optimizer — Add expandCoverage parameter to the judgment creation API that pools documents from multiple hybrid weight configurations to improve rating coverage from ~50% to ~71-78%
  • Aggregated Experiment Results — Pre-compute and store per-configuration aggregated metrics (mean NDCG, MAP, Precision) at experiment completion, with a new retrieval API, replacing the current approach of fetching 33K+ raw evaluation result documents
  • Experiment Results Summary UI — Replace the current raw results table in the Dashboards experiment view (which breaks at OpenSearch's 10K max_result_window) with a ranked configuration summary showing the best hybrid search configuration and all 66 variants with aggregated metrics
  • Deploy Optimal Configuration — New API and UI to deploy the best experiment result as an OpenSearch search pipeline and set it as the index-level default, completing the optimization workflow end-to-end
  • Judgment Cache Cleanup — Add configurable TTL-based cleanup for the judgment cache index to bound growth from repeated optimization runs
  • Documentation
  • Integration tests
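To make the Deploy Optimal Configuration step concrete: in terms of existing OpenSearch APIs, deploying a winning variant amounts to creating a search pipeline and setting it as the index-level default. The pipeline name, index name, and parameter values below are illustrative; the new component would automate requests equivalent to these:

```json
PUT /_search_pipeline/optimal-hybrid-pipeline
{
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": { "technique": "min_max" },
        "combination": {
          "technique": "arithmetic_mean",
          "parameters": { "weights": [0.3, 0.7] }
        }
      }
    }
  ]
}

PUT /my-index/_settings
{
  "index.search.default_pipeline": "optimal-hybrid-pipeline"
}
```

Once the `index.search.default_pipeline` setting points at the pipeline, every hybrid query against the index runs through the optimized normalization and combination configuration without clients having to name the pipeline explicitly.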

Relation to Existing Hybrid Optimizer

This feature builds on the existing Hybrid Optimizer (#107) and extends it with LLM-powered automation. The existing experiment API (PUT /experiments with type: HYBRID_OPTIMIZER) and grid search logic remain unchanged. The new components automate the preparation steps (query creation, judgment generation) and post-processing steps (results aggregation, pipeline deployment) that surround the optimizer.
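For reference, the unchanged entry point looks roughly like this. Only the `PUT /experiments` path and `type: HYBRID_OPTIMIZER` come from this issue; the remaining fields are illustrative placeholders for the query set, search configuration, and judgments an experiment needs:

```json
PUT /experiments
{
  "type": "HYBRID_OPTIMIZER",
  "querySetId": "<query-set-id>",
  "searchConfigurationList": ["<search-configuration-id>"],
  "judgmentList": ["<judgment-id>"],
  "size": 10
}
```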

Related Changes

Infrastructure and bug fixes that support this feature:

Future Work (not tracked here)

The following items are planned for later phases and will be tracked separately:
