Adds new parameters to the elasticsearch inference API for the rerank task type #5476
Changes from 4 commits
2b23030
e36beb6
911b868
4357270
e8b530c
05f6ec6
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1306,6 +1306,23 @@ export class ElasticsearchServiceSettings { | |
| * The maximum value is 32. | ||
| */ | ||
| num_threads: integer | ||
| /** | ||
| * Only for the `rerank` task type. | ||
| * Controls the strategy used for processing long documents during inference. | ||
| * | ||
| * Possible values: | ||
| * - `truncate` (default): Processes only the beginning of each document. | ||
| * - `chunk`: Splits long documents into smaller parts (chunks) before inference. | ||
|
I'm not sure where it's best to clarify this, but with chunking enabled we still return a single score per document (the same as with truncation), and that score is the highest score of any of the document's chunks. To be clear, the structure of the response does not change; only the rerank relevance scores do.
||
| * | ||
| * To enable chunking, set this value to `chunk`. | ||
| */ | ||
| long_document_strategy?: string | ||
| /** | ||
| * Only for the `rerank` task type. | ||
| * Limits the number of chunks per document that are sent for inference when chunking is enabled. | ||
| * If not set, all chunks generated for the document are processed. | ||
| */ | ||
| max_chunks_per_doc?: integer | ||
| } | ||
|
|
||
| export class ElasticsearchTaskSettings { | ||
|
|
||
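The review comment on `chunk` above describes the scoring rule: each document still gets one relevance score, equal to the highest score among its chunks. A minimal sketch of that aggregation follows. The names `ChunkScore`, `RerankResult`, and `aggregateChunkScores` are hypothetical and do not come from the Elasticsearch codebase, and the result shape is a simplified assumption.

```typescript
// Hypothetical types for illustration only; not part of the specification.
interface ChunkScore {
  docIndex: number // position of the parent document in the rerank input
  score: number // relevance score assigned to this chunk
}

interface RerankResult {
  index: number
  relevance_score: number
}

// Collapse per-chunk scores into one score per document by keeping the
// maximum chunk score, then order documents by descending relevance.
function aggregateChunkScores(chunks: ChunkScore[]): RerankResult[] {
  const best: Record<number, number> = {}
  for (const chunk of chunks) {
    const current = best[chunk.docIndex]
    if (current === undefined || chunk.score > current) {
      best[chunk.docIndex] = chunk.score
    }
  }
  return Object.keys(best)
    .map((key) => ({ index: Number(key), relevance_score: best[Number(key)] }))
    .sort((a, b) => b.relevance_score - a.relevance_score)
}
```

Because only the maximum is kept, the number of results equals the number of input documents regardless of how many chunks each one produced, which matches the comment that the response structure stays the same.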
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -26,7 +26,7 @@ import { | |
| ElasticsearchTaskSettings, | ||
| ElasticsearchTaskType | ||
| } from '@inference/_types/CommonTypes' | ||
| - import { InferenceChunkingSettings } from '@inference/_types/Services' | ||
| + import { ElasticsearchInferenceChunkingSettings } from '@inference/_types/Services' | ||
|
|
||
| /** | ||
| * Create an Elasticsearch inference endpoint. | ||
|
|
@@ -78,10 +78,10 @@ export interface Request extends RequestBase { | |
| } | ||
| body: { | ||
| /** | ||
| - * The chunking configuration object. | ||
| + * The chunking configuration object. For the `rerank` task type, you can enable chunking by setting the `long_document_strategy` parameter to `chunk` in the `service_settings` object.
|
||
| * @ext_doc_id inference-chunking | ||
| */ | ||
| - chunking_settings?: InferenceChunkingSettings | ||
| + chunking_settings?: ElasticsearchInferenceChunkingSettings | ||
| /** | ||
| * The type of service supported for the specified task type. In this case, `elasticsearch`. | ||
| */ | ||
|
|
||
A quick clarification. For 9.2, these two values are only configurable for rerank endpoints using the elastic reranker model.
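
For reference, here is a rough sketch of how a client might assemble the request body for a rerank endpoint that uses the two new `service_settings` parameters from this PR. The helper `buildRerankEndpointBody` and its interfaces are hypothetical. The model ID `.rerank-v1` and the allocation and thread counts are assumed values for illustration. The spec types `long_document_strategy` as `string`; the narrower union here is a choice made for this sketch. The checks reflect the doc comments in the diff, not confirmed server-side validation.

```typescript
// Hypothetical client-side types; only long_document_strategy and
// max_chunks_per_doc are taken from the diff in this PR.
type LongDocumentStrategy = 'truncate' | 'chunk'

interface RerankServiceSettings {
  model_id: string
  num_allocations: number
  num_threads: number
  long_document_strategy?: LongDocumentStrategy
  max_chunks_per_doc?: number
}

interface CreateRerankEndpointBody {
  service: 'elasticsearch'
  service_settings: RerankServiceSettings
}

function buildRerankEndpointBody(
  strategy: LongDocumentStrategy,
  maxChunksPerDoc?: number
): CreateRerankEndpointBody {
  if (maxChunksPerDoc !== undefined) {
    // Sketch-level checks: the limit only applies when chunking is enabled.
    if (strategy !== 'chunk') {
      throw new Error('max_chunks_per_doc requires long_document_strategy "chunk"')
    }
    if (Math.floor(maxChunksPerDoc) !== maxChunksPerDoc || maxChunksPerDoc < 1) {
      throw new Error('max_chunks_per_doc must be a positive integer')
    }
  }
  const settings: RerankServiceSettings = {
    model_id: '.rerank-v1', // assumed ID of the Elastic reranker model
    num_allocations: 1, // assumed value
    num_threads: 1, // assumed value
    long_document_strategy: strategy
  }
  if (maxChunksPerDoc !== undefined) {
    settings.max_chunks_per_doc = maxChunksPerDoc
  }
  return { service: 'elasticsearch', service_settings: settings }
}
```

The resulting object, for example `buildRerankEndpointBody('chunk', 5)`, would be sent as the JSON body of the create-endpoint request for the `rerank` task type. Leaving out the second argument omits `max_chunks_per_doc`, in which case all chunks generated for a document are processed.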