[Inference API] Add parameters to perform inference API as alias to task_settings #114329
Add parameters as an alias for task_settings in the perform inference API.
examples:
put the endpoint:
{ "service": "cohere", "service_settings": { "model_id": "rerank-english-v3.0", "api_key": "<REDACTED>" }, "task_settings": { "return_documents": true } }
perform inference with task settings:
request:
{ "input": ["test1", "test2", "test3"], "query": "test", "task_settings": { "return_documents": true, "top_n": 1 } }
response:
{ "rerank": [ { "index": 0, "relevance_score": 9.927262E-4, "text": "test1" } ] }
perform inference with parameters:
request:
{ "input": ["test1", "test2", "test3"], "query": "test", "parameters": { "return_documents": true, "top_n": 1 } }
response:
{ "rerank": [ { "index": 0, "relevance_score": 9.927262E-4, "text": "test1" } ] }
without parameters or task_settings:
request:
{ "input": ["test1", "test2", "test3"], "query": "test" }
response:
{ "rerank": [ { "index": 0, "relevance_score": 9.927262E-4, "text": "test1" }, { "index": 1, "relevance_score": 7.1802974E-4, "text": "test2" }, { "index": 2, "relevance_score": 2.1152887E-4, "text": "test3" } ] }
with both parameters and task_settings (the latter is used, so top_n is 2 here):
request:
{ "input": ["test1", "test2", "test3"], "query": "test", "parameters": { "top_n": 1 }, "task_settings": { "top_n": 2 } }
response:
{ "rerank": [ { "index": 0, "relevance_score": 9.927262E-4, "text": "test1" }, { "index": 1, "relevance_score": 7.1802974E-4, "text": "test2" } ] }
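The four request shapes above can be summarized with a small sketch of how the per-request settings might be resolved. This is an illustration only, not the code in this PR: the function name effective_task_settings is hypothetical, and it assumes that task_settings is used whenever it is present and that the two objects are not merged key by key. The description only shows that task_settings wins in the example where both are sent.

```python
from typing import Any


def effective_task_settings(body: dict[str, Any]) -> dict[str, Any]:
    """Pick the per-request settings from a perform-inference request body.

    Assumption (not confirmed by the PR description): "task_settings" is used
    whenever it is present, otherwise "parameters", otherwise no overrides.
    """
    if "task_settings" in body:
        return dict(body["task_settings"])
    if "parameters" in body:
        return dict(body["parameters"])
    return {}


# Request bodies taken from the examples in the description.
base = {"input": ["test1", "test2", "test3"], "query": "test"}
with_task_settings = {**base, "task_settings": {"return_documents": True, "top_n": 1}}
with_parameters = {**base, "parameters": {"return_documents": True, "top_n": 1}}
with_both = {**base, "parameters": {"top_n": 1}, "task_settings": {"top_n": 2}}

for name, body in [
    ("task_settings", with_task_settings),
    ("parameters", with_parameters),
    ("neither", base),
    ("both", with_both),
]:
    print(name, effective_task_settings(body))
```

Under this assumption the first two requests resolve to the same settings, which matches their identical responses, and the request with both keys resolves to top_n 2, which matches the two results returned.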