docs/reference/inference/chat-completion-inference.asciidoc (4 changes: 3 additions & 1 deletion)

@@ -34,9 +34,11 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The chat completion {infer} API enables real-time responses for chat completion tasks by delivering answers incrementally, reducing response times during computation.
 It only works with the `chat_completion` task type for `openai` and `elastic` {infer} services.

+
 [NOTE]
 ====
-The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 ====

 [discrete]
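For context, a minimal sketch of what a `chat_completion` request could look like, assuming the `_unified` endpoint path mentioned in the note above, a hypothetical inference endpoint ID (`openai-chat`), and the OpenAI-style `messages` request body; the rendered API reference is the authoritative source for the exact request shape.

[source,console]
----
POST _inference/chat_completion/openai-chat/_unified
{
  "messages": [
    {
      "role": "user",
      "content": "Say this is a test"
    }
  ]
}
----

Because the `chat_completion` task type only supports streaming, the answer is delivered incrementally rather than as one complete response.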
docs/reference/inference/stream-inference.asciidoc (2 changes: 2 additions & 0 deletions)

@@ -40,6 +40,8 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The stream {infer} API enables real-time responses for completion tasks by delivering answers incrementally, reducing response times during computation.
 It only works with the `completion` and `chat_completion` task types.

+The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+
 [NOTE]
 ====
 include::inference-shared.asciidoc[tag=chat-completion-docs]
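For comparison, a sketch of a Stream {infer} API call for a `completion` task, again with a placeholder inference endpoint ID (`openai-completion`) and placeholder input text:

[source,console]
----
POST _inference/completion/openai-completion/_stream
{
  "input": "What is Elastic?"
}
----

As the added paragraph explains, the two APIs return differently structured responses, which is why endpoints using the `openai` or `elastic` service should go through the Chat completion {infer} API instead.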