From 023882ec455b34ec8e60480884a42de3c04e3393 Mon Sep 17 00:00:00 2001 From: kosabogi <105062005+kosabogi@users.noreply.github.com> Date: Thu, 5 Dec 2024 16:32:59 +0100 Subject: [PATCH] Adds warning to Create inference API page (#118073) --- docs/reference/inference/put-inference.asciidoc | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc index e7e25ec98b49d..2986f16916f3c 100644 --- a/docs/reference/inference/put-inference.asciidoc +++ b/docs/reference/inference/put-inference.asciidoc @@ -10,7 +10,6 @@ Creates an {infer} endpoint to perform an {infer} task. * For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>. ==== - [discrete] [[put-inference-api-request]] ==== {api-request-title} @@ -47,6 +46,14 @@ Refer to the service list in the <> API. In the response, look for `"state": "fully_allocated"` and ensure the `"allocation_count"` matches the `"target_allocation_count"`. +* Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources. +==== + + The following services are available through the {infer} API. You can find the available task types next to the service name. Click the links to review the configuration details of the services: