From 576fa0ecc0cb94822da486af1881bf197c856e9c Mon Sep 17 00:00:00 2001 From: Mridula Date: Thu, 12 Jun 2025 17:37:50 +0100 Subject: [PATCH] Fix minscore propagation in text similarity reranker (#129223) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * propgating retrievers to inner retrievers * test feature taken care of * Small changes in concurrent multipart upload interfaces (#128977) Small changes in BlobContainer interface and wrapper. Relates ES-11815 * Unmute FollowingEngineTests#testProcessOnceOnPrimary() test (#129054) The reason the test fails is that operations contained _seq_no field with different doc value types (with no skippers and with skippers) and this isn't allowed, since field types need to be consistent in a Lucene index. The initial operations were generated not knowing about the fact the index mode was set to logsdb or time_series. Causing the operations to not have doc value skippers. However when replaying the operations via following engine, the operations did have doc value skippers. The fix is to set `index.seq_no.index_options` to `points_and_doc_values`, so that the initial operations are indexed without doc value skippers. This test doesn't gain anything from storing seqno with doc value skippers, so there is no loss of testing coverage. Closes #128541 * [Build] Add support for publishing to maven central (#128659) This ensures we package an aggregation zip with all artifacts we want to publish to maven central as part of a release. Running zipAggregation will produce a zip file in the build/nmcp/zip folder. The content of this zip is meant to match the maven artifacts we have currently declared as dra maven artifacts. * ESQL: Check for errors while loading blocks (#129016) Runs a sanity check after loading a block of values. Previously we were doing a quick check if assertions were enabled. Now we do two quick checks all the time. Better - we attach information about how a block was loaded when there's a problem. Relates to #128959 * Make `PhaseCacheManagementTests` project-aware (#129047) The functionality in `PhaseCacheManagement` was already project-aware, but these tests were still using deprecated methods. * Vector test tools (#128934) This adds some testing tools for verifying vector recall and latency directly without having to spin up an entire ES node and running a rally track. Its pretty barebones and takes inspiration from lucene-util, but I wanted access to our own formats and tooling to make our lives easier. Here is an example config file. This will build the initial index, run queries at num_candidates: 50, then again at num_candidates 100 (without reindexing, and re-using the cached nearest neighbors). ``` [{ "doc_vectors" : "path", "query_vectors" : "path", "num_docs" : 10000, "num_queries" : 10, "index_type" : "hnsw", "num_candidates" : 50, "k" : 10, "hnsw_m" : 16, "hnsw_ef_construction" : 200, "index_threads" : 4, "reindex" : true, "force_merge" : false, "vector_space" : "maximum_inner_product", "dimensions" : 768 }, { "doc_vectors" : "path", "query_vectors" : "path", "num_docs" : 10000, "num_queries" : 10, "index_type" : "hnsw", "num_candidates" : 100, "k" : 10, "hnsw_m" : 16, "hnsw_ef_construction" : 200, "vector_space" : "maximum_inner_product", "dimensions" : 768 } ] ``` To execute: ``` ./gradlew :qa:vector:checkVec --args="/Path/to/knn_tester_config.json" ``` Calling `./gradlew :qa:vector:checkVecHelp` gives some guidance on how to use it, additionally providing a way to run it via java directly (useful to bypass gradlew guff). * ES|QL: refactor generative tests (#129028) * Add a test of LOOKUP JOIN against a time series index (#129007) Add a spec test of `LOOKUP JOIN` against a time series index. * Make ILM `ClusterStateWaitStep` project-aware (#129042) This is part of an iterative process to make ILM project-aware. * Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {lookup-join.LookupJoinOnTimeSeriesIndex ASYNC} #129078 * Remove `ClusterState` param from ILM `AsyncBranchingStep` (#129076) The `ClusterState` parameter of the `asyncPredicate` is not used anywhere. * Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {lookup-join.LookupJoinOnTimeSeriesIndex SYNC} #129082 * Mute org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT test {p0=upgraded_cluster/70_ilm/Test Lifecycle Still There And Indices Are Still Managed} #129097 * Mute org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT test {p0=upgraded_cluster/90_ml_data_frame_analytics_crud/Get mixed cluster outlier_detection job} #129098 * Mute org.elasticsearch.packaging.test.DockerTests test081SymlinksAreFollowedWithEnvironmentVariableFiles #128867 * Threadpool merge executor is aware of available disk space (#127613) This PR introduces 3 new settings: indices.merge.disk.check_interval, indices.merge.disk.watermark.high, and indices.merge.disk.watermark.high.max_headroom that control if the threadpool merge executor starts executing new merges when the disk space is getting low. The intent of this change is to avoid the situation where in-progress merges exhaust the available disk space on the node's local filesystem. To this end, the thread pool merge executor periodically monitors the available disk space, as well as the current disk space estimates required by all in-progress (currently running) merges on the node, and will NOT schedule any new merges if the disk space is getting low (by default below the 5% limit of the total disk space, or 100 GB, whichever is smaller (same as the disk allocation flood stage level)). * Add option to include or exclude vectors from _source retrieval (#128735) This PR introduces a new include_vectors option to the _source retrieval context. When set to false, vectors are excluded from the returned _source. This is especially efficient when used with synthetic source, as it avoids loading vector fields entirely. By default, vectors remain included unless explicitly excluded. * Remove direct minScore propagation to inner retrievers * cleaned up skip * Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDiskSpaceTests testAvailableDiskSpaceMonitorWhenFileSystemStatErrors #129149 * Add transport version for ML inference Mistral chat completion (#129033) * Add transport version for ML inference Mistral chat completion * Add changelog for Mistral Chat Completion version fix * Revert "Add changelog for Mistral Chat Completion version fix" This reverts commit 7a57416bdc65805d5303ee3ee7db3368aad89689. * Correct index path validation (#129144) All we care about is if reindex is true or false. We shouldn't worry about force merge. Because if reindex is true, we will create the directory, if its false, we won't. * Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDiskSpaceTests testUnavailableBudgetBlocksNewMergeTasksFromStartingExecution #129148 * Implemented completion task for Google VertexAI (#128694) * Google Vertex AI completion model, response entity and tests * Fixed GoogleVertexAiServiceTest for Service configuration * Changelog * Removed downcasting and using `moveToFirstToken` * Create GoogleVertexAiChatCompletionResponseHandler for streaming and non streaming responses * Added unit tests * PR feedback * Removed googlevertexaicompletion model. Using just GoogleVertexAiChatCompletionModel for completion and chat completion * Renamed uri -> nonStreamingUri. Added streamingUri and getters in GoogleVertexAiChatCompletionModel * Moved rateLimitGroupHashing to subclasses of GoogleVertexAiModel * Fixed rate limit has of GoogleVertexAiRerankModel and refactored uri for GoogleVertexAiUnifiedChatCompletionRequest --------- Co-authored-by: lhoet-google Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com> * Fixing minscore filtering in the text similarity reranker * ES|QL - kNN function initial support (#127322) * Remove optional seed from ES|QL SAMPLE (#128887) * Remove optional seed from ES|QL SAMPLE * make it clear that seed is for testing * [Inference API] Add "rerank" task type to "elastic" provider (#126022) * Rename target destination for microbenchmarks (#128878) * Include direct memory and non-heap memory in ML memory calculations (take #2) (#128742) * Include direct memory and non-heap memory in ML memory calculations. * Reduce ML_ONLY heap size, so that direct memory is accounted for. * [CI] Auto commit changes from spotless * changelog * improve docs * Reuse direct memory to heap factor * feature flag --------- Co-authored-by: elasticsearchmachine * Throw better exception for unsupported aggregations over shape fields (#129139) * Update Test Framework To Handle Query Rewrites That Rely on Non-Null Searchers (#129160) * Update ReproduceInfoPrinter to correctly print a reproduction line for Lucene & build candidate upgrade tests (#129044) * Increment inference stats counter for shard bulk inference calls (#129140) This change updates the inference stats counter to include chunked inference calls performed by the shard bulk inference filter on all semantic text fields. It ensures that usage of inference on semantic text fields is properly recorded in the stats. * Synthetic source: avoid storing multi fields of type text and match_only_text by default. (#129126) Don't store text and match_only_text field by default when source mode is synthetic and a field is a multi field or when there is a suitable multi field. Without this change, ES would store field otherwise twice in a multi-field configuration. For example: ``` ... "os": { "properties": { "name": { "ignore_above": 1024, "type": "keyword", "fields": { "text": { "type": "match_only_text" } } } ... ``` In this case, two stored fields were added, one in case for the `name` field and one for `name.text` multi-field. This change prevents this, and would never store a stored field when text or match_only_text field is a multi-field. * Adding `scheduled_report_id` field to kibana reporting template (#127827) * Adding scheduled_report_id field to kibana reporting template * Incrementing stack template registry version * ES|QL: Add FORK generative tests (#129135) * ES|QL Completion command syntax change (#129189) * propagated minscore to rankdsocsretrieverbuilder * Modified the file to include minscore and the test case to verify it * Revert "Use IndexOrDocValuesQuery in NumberFieldType#termQuery implementations (#128293)" (#129206) This reverts commit de7c91c1d98c4b2b95302655103df2d6eb9cb423. * Fixed the rankdocsretriever builder * Update docs/changelog/129223.yaml * Update 129223.yaml * trying to introduce cluster featureS * included cluster features in the test * Fixed the merge issue * [CI] Auto commit changes from spotless * Removed local variable from RankDocsRetrieverBuilder * Update RankDocsRetrieverBuilder.java --------- Co-authored-by: Tanguy Leroux Co-authored-by: Martijn van Groningen Co-authored-by: Rene Groeschke Co-authored-by: Nik Everett Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com> Co-authored-by: Benjamin Trent Co-authored-by: Luigi Dell'Aquila Co-authored-by: Bogdan Pintea Co-authored-by: elasticsearchmachine <58790826+elasticsearchmachine@users.noreply.github.com> Co-authored-by: Albert Zaharovits Co-authored-by: Jim Ferenczi Co-authored-by: Jan-Kazlouski-elastic Co-authored-by: Leonardo Hoet <55866308+leo-hoet@users.noreply.github.com> Co-authored-by: lhoet-google Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com> Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com> Co-authored-by: Jan Kuipers <148754765+jan-elastic@users.noreply.github.com> Co-authored-by: Tim Grein Co-authored-by: Ievgen Degtiarenko Co-authored-by: elasticsearchmachine Co-authored-by: Ignacio Vera Co-authored-by: Mike Pellegrini Co-authored-by: Moritz Mack Co-authored-by: Ying Mao Co-authored-by: Ioana Tagirta Co-authored-by: Aurélien FOUCRET --- docs/changelog/129223.yaml | 5 + .../retriever/CompoundRetrieverBuilder.java | 3 +- .../retriever/RankDocsRetrieverBuilder.java | 7 +- .../RankDocsRetrieverBuilderTests.java | 2 +- .../xpack/inference/InferenceFeatures.java | 1 + .../TextSimilarityRankRetrieverBuilder.java | 24 ++-- .../70_text_similarity_rank_retriever.yml | 108 ++++++++++++++++++ 7 files changed, 133 insertions(+), 17 deletions(-) create mode 100644 docs/changelog/129223.yaml diff --git a/docs/changelog/129223.yaml b/docs/changelog/129223.yaml new file mode 100644 index 0000000000000..ec84ec52c8cf7 --- /dev/null +++ b/docs/changelog/129223.yaml @@ -0,0 +1,5 @@ +pr: 129223 +summary: Fix text similarity reranker does not propagate min score correctly +area: Search +type: bug +issues: [] diff --git a/server/src/main/java/org/elasticsearch/search/retriever/CompoundRetrieverBuilder.java b/server/src/main/java/org/elasticsearch/search/retriever/CompoundRetrieverBuilder.java index 6cf0af0ef1541..3f9353251b920 100644 --- a/server/src/main/java/org/elasticsearch/search/retriever/CompoundRetrieverBuilder.java +++ b/server/src/main/java/org/elasticsearch/search/retriever/CompoundRetrieverBuilder.java @@ -195,7 +195,8 @@ public void onFailure(Exception e) { RankDocsRetrieverBuilder rankDocsRetrieverBuilder = new RankDocsRetrieverBuilder( rankWindowSize, newRetrievers.stream().map(s -> s.retriever).toList(), - results::get + results::get, + this.minScore ); rankDocsRetrieverBuilder.retrieverName(retrieverName()); return rankDocsRetrieverBuilder; diff --git a/server/src/main/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilder.java b/server/src/main/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilder.java index a77f5327fbc26..0cdd5ab35adcd 100644 --- a/server/src/main/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilder.java +++ b/server/src/main/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilder.java @@ -33,13 +33,14 @@ public class RankDocsRetrieverBuilder extends RetrieverBuilder { final List sources; final Supplier rankDocs; - public RankDocsRetrieverBuilder(int rankWindowSize, List sources, Supplier rankDocs) { + public RankDocsRetrieverBuilder(int rankWindowSize, List sources, Supplier rankDocs, Float minScore) { this.rankWindowSize = rankWindowSize; this.rankDocs = rankDocs; if (sources == null || sources.isEmpty()) { throw new IllegalArgumentException("sources must not be null or empty"); } this.sources = sources; + this.minScore = minScore; } @Override @@ -48,7 +49,7 @@ public String getName() { } private boolean sourceHasMinScore() { - return minScore != null || sources.stream().anyMatch(x -> x.minScore() != null); + return this.minScore != null || sources.stream().anyMatch(x -> x.minScore() != null); } private boolean sourceShouldRewrite(QueryRewriteContext ctx) throws IOException { @@ -132,7 +133,7 @@ public void extractToSearchSourceBuilder(SearchSourceBuilder searchSourceBuilder searchSourceBuilder.size(rankWindowSize); } if (sourceHasMinScore()) { - searchSourceBuilder.minScore(this.minScore() == null ? Float.MIN_VALUE : this.minScore()); + searchSourceBuilder.minScore(this.minScore == null ? Float.MIN_VALUE : this.minScore); } if (searchSourceBuilder.size() + searchSourceBuilder.from() > rankDocResults.length) { searchSourceBuilder.size(Math.max(0, rankDocResults.length - searchSourceBuilder.from())); diff --git a/server/src/test/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilderTests.java b/server/src/test/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilderTests.java index eafab1d25c38e..165ad9b2de183 100644 --- a/server/src/test/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilderTests.java +++ b/server/src/test/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilderTests.java @@ -97,7 +97,7 @@ private List preFilters(QueryRewriteContext queryRewriteContext) t } private RankDocsRetrieverBuilder createRandomRankDocsRetrieverBuilder(QueryRewriteContext queryRewriteContext) throws IOException { - return new RankDocsRetrieverBuilder(randomIntBetween(1, 100), innerRetrievers(queryRewriteContext), rankDocsSupplier()); + return new RankDocsRetrieverBuilder(randomIntBetween(1, 100), innerRetrievers(queryRewriteContext), rankDocsSupplier(), null); } public void testExtractToSearchSourceBuilder() throws IOException { diff --git a/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java b/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java index 34a91c28de40c..69df0dca8dd94 100644 --- a/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java +++ b/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java @@ -44,6 +44,7 @@ public Set getTestFeatures() { SemanticInferenceMetadataFieldsMapper.EXPLICIT_NULL_FIXES, SEMANTIC_KNN_VECTOR_QUERY_REWRITE_INTERCEPTION_SUPPORTED, TextSimilarityRankRetrieverBuilder.TEXT_SIMILARITY_RERANKER_ALIAS_HANDLING_FIX, + TextSimilarityRankRetrieverBuilder.TEXT_SIMILARITY_RERANKER_MINSCORE_FIX, SemanticInferenceMetadataFieldsMapper.INFERENCE_METADATA_FIELDS_ENABLED_BY_DEFAULT, SEMANTIC_TEXT_HIGHLIGHTER_DEFAULT, SEMANTIC_KNN_FILTER_FIX, diff --git a/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankRetrieverBuilder.java b/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankRetrieverBuilder.java index f3564f2b9aa11..dad29d240f5ca 100644 --- a/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankRetrieverBuilder.java +++ b/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankRetrieverBuilder.java @@ -23,6 +23,7 @@ import org.elasticsearch.xcontent.XContentParser; import java.io.IOException; +import java.util.ArrayList; import java.util.List; import java.util.Objects; @@ -39,6 +40,7 @@ public class TextSimilarityRankRetrieverBuilder extends CompoundRetrieverBuilder public static final NodeFeature TEXT_SIMILARITY_RERANKER_ALIAS_HANDLING_FIX = new NodeFeature( "text_similarity_reranker_alias_handling_fix" ); + public static final NodeFeature TEXT_SIMILARITY_RERANKER_MINSCORE_FIX = new NodeFeature("text_similarity_reranker_minscore_fix"); public static final ParseField RETRIEVER_FIELD = new ParseField("retriever"); public static final ParseField INFERENCE_ID_FIELD = new ParseField("inference_id"); @@ -141,23 +143,21 @@ protected TextSimilarityRankRetrieverBuilder clone( protected RankDoc[] combineInnerRetrieverResults(List rankResults, boolean explain) { assert rankResults.size() == 1; ScoreDoc[] scoreDocs = rankResults.getFirst(); - TextSimilarityRankDoc[] textSimilarityRankDocs = new TextSimilarityRankDoc[scoreDocs.length]; + List filteredDocs = new ArrayList<>(); + // Filtering by min_score must be done here, after reranking. + // Applying min_score in the child retriever could prematurely exclude documents that would receive high scores from the reranker. for (int i = 0; i < scoreDocs.length; i++) { ScoreDoc scoreDoc = scoreDocs[i]; assert scoreDoc.score >= 0; - if (explain) { - textSimilarityRankDocs[i] = new TextSimilarityRankDoc( - scoreDoc.doc, - scoreDoc.score, - scoreDoc.shardIndex, - inferenceId, - field - ); - } else { - textSimilarityRankDocs[i] = new TextSimilarityRankDoc(scoreDoc.doc, scoreDoc.score, scoreDoc.shardIndex); + if (minScore == null || scoreDoc.score >= minScore) { + if (explain) { + filteredDocs.add(new TextSimilarityRankDoc(scoreDoc.doc, scoreDoc.score, scoreDoc.shardIndex, inferenceId, field)); + } else { + filteredDocs.add(new TextSimilarityRankDoc(scoreDoc.doc, scoreDoc.score, scoreDoc.shardIndex)); + } } } - return textSimilarityRankDocs; + return filteredDocs.toArray(new TextSimilarityRankDoc[0]); } @Override diff --git a/x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml b/x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml index 9a6ecffe29d4d..8c012801a00f7 100644 --- a/x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml +++ b/x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml @@ -287,3 +287,111 @@ setup: - match: { hits.total.value: 1 } - length: { hits.hits: 1 } - match: { hits.hits.0._id: "doc_1" } + +--- +"Text similarity reranker respects min_score": + + - requires: + cluster_features: "text_similarity_reranker_minscore_fix" + reason: test min score functionality + + - do: + index: + index: test-index + id: doc_2 + body: + text: "The phases of the Moon come from the position of the Moon relative to the Earth and Sun." + topic: [ "science" ] + subtopic: [ "astronomy" ] + inference_text_field: "10" + refresh: true + + - do: + search: + index: test-index + body: + track_total_hits: true + fields: [ "text", "topic" ] + retriever: + text_similarity_reranker: + retriever: + standard: + query: + bool: + should: + - constant_score: + filter: + term: { subtopic: "technology" } + boost: 10 + - constant_score: + filter: + term: { subtopic: "astronomy" } + boost: 1 + rank_window_size: 10 + inference_id: my-rerank-model + inference_text: "How often does the moon hide the sun?" + field: inference_text_field + min_score: 10 + size: 10 + + - match: { hits.total.value: 1 } + - length: { hits.hits: 1 } + - match: { hits.hits.0._id: "doc_2" } + +--- +"Text similarity reranker with min_score zero includes all docs": + + - requires: + cluster_features: "text_similarity_reranker_minscore_fix" + reason: test min score functionality + + - do: + search: + index: test-index + body: + track_total_hits: true + fields: [ "text", "topic" ] + retriever: + text_similarity_reranker: + retriever: + standard: + query: + match_all: {} + rank_window_size: 10 + inference_id: my-rerank-model + inference_text: "How often does the moon hide the sun?" + field: inference_text_field + min_score: 0 + size: 10 + + - match: { hits.total.value: 3 } + - length: { hits.hits: 3 } + +--- +"Text similarity reranker with high min_score excludes all docs": + + - requires: + cluster_features: "text_similarity_reranker_minscore_fix" + reason: test min score functionality + + - do: + search: + index: test-index + body: + track_total_hits: true + fields: [ "text", "topic" ] + retriever: + text_similarity_reranker: + retriever: + standard: + query: + match_all: {} + rank_window_size: 10 + inference_id: my-rerank-model + inference_text: "How often does the moon hide the sun?" + field: inference_text_field + min_score: 1000 + size: 10 + + - match: { hits.total.value: 0 } + - length: { hits.hits: 0 }