Commit 0e0d624

[DOCS] Adds examples to inference processor docs (#116018) (#118130)
1 parent 82ebd0a commit 0e0d624

1 file changed: docs/reference/ingest/processors/inference.asciidoc (+67 -0 lines)
@@ -735,3 +735,70 @@ You can also specify the target field as follows:

In this case, {feat-imp} is exposed in the
`my_field.foo.feature_importance` field.

[discrete]
[[inference-processor-examples]]
==== {infer-cap} processor examples

The following example uses an <<inference-apis,{infer} endpoint>> in an {infer} processor, as part of an ingest pipeline named `query_helper_pipeline`, to perform a chat completion task.
The processor generates an {es} query from natural language input using a prompt designed for a completion task type.
Refer to <<put-inference-api-desc,this list>> to find the {infer} service you use, and check the corresponding examples of setting up an endpoint with the chat completion task type.

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/query_helper_pipeline
{
  "processors": [
    {
      "script": {
        "source": "ctx.prompt = 'Please generate an elasticsearch search query on index `articles_index` for the following natural language query. Dates are in the field `@timestamp`, document types are in the field `type` (options are `news`, `publication`), categories in the field `category` and can be multiple (options are `medicine`, `pharmaceuticals`, `technology`), and document names are in the field `title` which should use a fuzzy match. Ignore fields which cannot be determined from the natural language query context: ' + ctx.content" <1>
      }
    },
    {
      "inference": {
        "model_id": "openai_chat_completions", <2>
        "input_output": {
          "input_field": "prompt",
          "output_field": "query"
        }
      }
    },
    {
      "remove": {
        "field": "prompt"
      }
    }
  ]
}
--------------------------------------------------
// TEST[skip: An inference endpoint is required.]
<1> The `prompt` field contains the prompt used for the completion task, created with <<modules-scripting-painless,Painless>>.
`+ ctx.content` appends the natural language input to the prompt.
<2> The ID of the pre-configured {infer} endpoint, which utilizes the <<infer-service-openai,`openai` service>> with the `completion` task type.
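
For illustration only, the string concatenation performed by the script processor can be sketched in plain Python (this sketch is not part of the pipeline; the instruction text is abbreviated, and `build_prompt` is a hypothetical helper name):

```python
# Sketch of what the Painless script processor does: prepend a fixed
# instruction to the document's `content` field and store the result
# in a new `prompt` field, which the inference processor then reads.
def build_prompt(doc: dict) -> dict:
    # Abbreviated version of the instruction used in the pipeline above.
    instruction = (
        "Please generate an elasticsearch search query on index "
        "`articles_index` for the following natural language query. "
        "Ignore fields which cannot be determined from the natural "
        "language query context: "
    )
    enriched = dict(doc)  # copy; the ingest pipeline works on its own ctx
    enriched["prompt"] = instruction + doc["content"]
    return enriched

doc = {
    "content": "artificial intelligence in medicine articles "
               "published in the last 12 months"
}
enriched = build_prompt(doc)
```

The resulting `prompt` value is exactly what the inference processor sends as `input_field` to the completion endpoint.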

The following API request simulates running a document through the previously created ingest pipeline:

[source,console]
--------------------------------------------------
POST _ingest/pipeline/query_helper_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "content": "artificial intelligence in medicine articles published in the last 12 months" <1>
      }
    }
  ]
}
--------------------------------------------------
// TEST[skip: An inference processor with an inference endpoint is required.]
<1> The natural language query used to generate an {es} query within the prompt created by the {infer} processor.
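
After validating the pipeline with `_simulate`, it can be applied at index time by setting the `pipeline` query parameter on an index request. The request below is an illustrative sketch: the `articles_index` index and the sample `content` value are assumptions for this example, not part of the original docs.

[source,console]
--------------------------------------------------
POST articles_index/_doc?pipeline=query_helper_pipeline
{
  "content": "recent technology publications about pharmaceuticals" <1>
}
--------------------------------------------------
// TEST[skip: An inference endpoint is required.]
<1> The `content` field is read by the script processor; the generated {es} query is stored in the `query` field of the indexed document, and the intermediate `prompt` field is removed.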

[discrete]
[[infer-proc-readings]]
==== Further reading

* https://www.elastic.co/search-labs/blog/openwebcrawler-llms-semantic-text-resume-job-search[Which job is the best for you? Using LLMs and semantic_text to match resumes to jobs]
