@dan-rubinstein dan-rubinstein commented Jul 2, 2025

Issue - #121567

This change adds the ability to use chunking for the elastic reranker as an alternative long document handling strategy to the existing truncation method. To enable chunking, include long_document_strategy (with the value set to chunk) in the service_settings of the rerank inference endpoint used to perform inference. The value can also be set explicitly to truncate to force truncation, but this is currently the default behavior. The max_chunks_per_doc value can optionally be included to limit the number of chunks sent for inference per document. If this value is not set, all chunks generated for the document will be sent. For example:

PUT _inference/rerank/my-elasticrerank-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".rerank-v1", 
    "num_threads": 1,
    "num_allocations": 1,
    "long_document_strategy": "chunk",
    "max_chunks_per_doc": 2
  }
}

When using chunking, documents are chunked before inference and the chunks (either all of them or a subset, depending on whether max_chunks_per_doc is set) are sent for inference. For each document, the relevance score returned to the user is the maximum score across that document's chunks.
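The behavior described above can be sketched roughly as follows. This is an illustrative sketch only, not the code merged in this PR: the class ChunkScoreSketch, the record ChunkScore, and both method names are hypothetical, and keeping the first N chunks is an assumption (the description only says the number of chunks is limited, not which ones are kept).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ChunkScoreSketch {

    /** Hypothetical pair: the document a chunk came from and the score the reranker gave that chunk. */
    record ChunkScore(int docIndex, float score) {}

    /**
     * Limits the chunks sent for inference. A null limit stands in for
     * "max_chunks_per_doc not set", in which case all chunks are sent.
     * ASSUMPTION: the first N chunks are the ones kept.
     */
    static List<String> limitChunks(List<String> chunks, Integer maxChunksPerDoc) {
        if (maxChunksPerDoc == null || maxChunksPerDoc >= chunks.size()) {
            return chunks;
        }
        return chunks.subList(0, maxChunksPerDoc);
    }

    /** Collapses per-chunk scores into one score per document by taking the maximum. */
    static Map<Integer, Float> maxScorePerDoc(List<ChunkScore> chunkScores) {
        Map<Integer, Float> best = new LinkedHashMap<>();
        for (ChunkScore chunkScore : chunkScores) {
            // Keep whichever is larger: the score seen so far or the new chunk's score.
            best.merge(chunkScore.docIndex(), chunkScore.score(), Float::max);
        }
        return best;
    }
}
```

With max_chunks_per_doc set to 2 as in the request above, a document that produces three chunks would contribute two chunk scores under this sketch, and the larger of the two would be reported as the document's relevance score.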

Testing

  • Unit tests + integration tests
  • Created an elastic reranker endpoint with no chunking configuration values and ensured that truncation worked as expected.
  • Created an elastic reranker endpoint with truncate selected for long document strategy and ensured that truncation worked as expected.
  • Created an elastic reranker endpoint with chunk selected for long document strategy and ensured that documents were chunked and all chunks were sent for inference.
  • Created an elastic reranker endpoint with chunk selected for long document strategy and max_chunks_per_doc set and ensured that only a subset of chunks was sent for inference.
  • (TODO) Created a non-elastic reranker elasticsearch service endpoint and ensured that inference is still working.

@dan-rubinstein dan-rubinstein added :ml Machine learning Team:ML Meta label for the ML team v9.2.0 labels Jul 2, 2025
@dan-rubinstein
Member Author

@elasticmachine merge upstream

@davidkyle davidkyle added the cloud-deploy Publish cloud docker image for Cloud-First-Testing label Jul 18, 2025

@dan-rubinstein dan-rubinstein marked this pull request as ready for review September 22, 2025 17:17
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Collaborator

Hi @dan-rubinstein, I've created a changelog YAML for you.

@davidkyle davidkyle left a comment

LGTM

private RankedDocsResults parseRankedDocResultsForChunks(RankedDocsResults rankedDocsResults) {
    List<RankedDocsResults.RankedDoc> updatedRankedDocs = new ArrayList<>();
    Set<Integer> docIndicesSeen = new HashSet<>();
    for (RankedDocsResults.RankedDoc rankedDoc : rankedDocsResults.getRankedDocs()) {

To be safe and to ensure the highest-scoring chunk is used, rankedDocsResults should be sorted. The results will almost certainly be sorted already, but just in case.

The sorting could be done in the RankedDocsResults constructor.

Member Author

Right, good catch. I added the sort at the end of this function, but to cover cases where the results aren't sorted it should be applied to the rankedDocsResults.getRankedDocs() result, so that we take the top result for each doc. I'll update this to sort the ranked docs before looping, and I'll also rename updatedRankedDocs to topRankedDocs, as I think that's a bit clearer about what we're trying to store.
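For readers following the thread, the "sort before looping, then keep the first hit per document" idea could look roughly like the sketch below. It is not the merged implementation: RankedChunk is a hypothetical stand-in for RankedDocsResults.RankedDoc, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TopChunkPerDocSketch {

    /** Hypothetical stand-in for a ranked chunk: the source document's index and the chunk's score. */
    record RankedChunk(int docIndex, float relevanceScore) {}

    /** Returns one entry per document: its highest-scoring chunk, in descending score order. */
    static List<RankedChunk> topRankedDocs(List<RankedChunk> rankedChunks) {
        // Sort a copy by descending score so the first chunk seen for each
        // document is guaranteed to be that document's best chunk.
        List<RankedChunk> sorted = new ArrayList<>(rankedChunks);
        sorted.sort((a, b) -> Float.compare(b.relevanceScore(), a.relevanceScore()));

        List<RankedChunk> topRankedDocs = new ArrayList<>();
        Set<Integer> docIndicesSeen = new HashSet<>();
        for (RankedChunk chunk : sorted) {
            // Set.add returns false when the document index was already recorded.
            if (docIndicesSeen.add(chunk.docIndex())) {
                topRankedDocs.add(chunk);
            }
        }
        return topRankedDocs;
    }
}
```

Sorting first is what makes the seen-set approach safe: without it, an unsorted input could cause a lower-scoring chunk to be recorded for a document before its best chunk is reached.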

@dan-rubinstein dan-rubinstein enabled auto-merge (squash) September 29, 2025 14:49
@dan-rubinstein dan-rubinstein merged commit 0cee213 into elastic:main Sep 29, 2025
35 checks passed

Labels

cloud-deploy Publish cloud docker image for Cloud-First-Testing >enhancement :ml Machine learning Team:ML Meta label for the ML team v9.2.0
