diff --git a/modules/deploying-a-llama-model-with-kserve.adoc b/modules/deploying-a-llama-model-with-kserve.adoc
index 3db13497..f5740191 100644
--- a/modules/deploying-a-llama-model-with-kserve.adoc
+++ b/modules/deploying-a-llama-model-with-kserve.adoc
@@ -10,7 +10,7 @@ To use Llama Stack and retrieval-augmented generation (RAG) workloads in {produc
 
 * You have logged in to {productname-long}.
 * You have cluster administrator privileges for your {openshift-platform} cluster.
-* You have installed the Llama Stack Operator.
+* You have activated the Llama Stack Operator.
 ifdef::upstream[]
 For more information, see link:{odhdocshome}/working-with-rag/#installing-the-llama-stack-operator_rag[Installing the Llama Stack Operator].
 endif::[]
diff --git a/modules/deploying-a-llamastackdistribution-instance.adoc b/modules/deploying-a-llamastackdistribution-instance.adoc
index 36ea07b8..36fade51 100644
--- a/modules/deploying-a-llamastackdistribution-instance.adoc
+++ b/modules/deploying-a-llamastackdistribution-instance.adoc
@@ -4,14 +4,7 @@
 = Deploying a LlamaStackDistribution instance
 
 [role='_abstract']
-You can integrate LlamaStack and its retrieval-augmented generation (RAG) capabilities with your deployed Llama 3.2 model served by vLLM. This integration enables you to build intelligent applications that combine large language models (LLMs) with real-time data retrieval, providing more accurate and contextually relevant responses for your AI workloads.
-
-When you create a `LlamaStackDistribution` custom resource (CR), specify the Llama Stack image `quay.io/opendatahub/llama-stack:odh` in the `spec.server.distribution.image` field. The image is hosted on link:https://quay.io[Quay.io], a secure registry that provides vulnerability scanning, role‑based access control, and globally distributed content delivery. Using this {org-name}–validated image ensures that your deployment automatically receives the latest security patches and compatibility updates. For more information about working with Quay.io, see link:https://docs.redhat.com/en/documentation/red_hat_quay/3/html/about_quay_io/quayio-overview[Quay.io overview].
-
-[IMPORTANT]
-====
-The Llama Stack image is hosted on link:https://quay.io[Quay.io] only during the Developer Preview phase of the Llama Stack integration with {productname-short}. When the Llama Stack integration reaches general availability, the image will be available on link:https://registry.redhat.io[registry.redhat.io].
-====
+You can integrate LlamaStack and its retrieval-augmented generation (RAG) capabilities with your deployed Llama 3.2 model served by vLLM. You can use this integration to build intelligent applications that combine large language models (LLMs) with real-time data retrieval, providing more accurate and contextually relevant responses for your AI workloads. When you create a `LlamaStackDistribution` custom resource (CR), specify `rh-dev` in the `spec.server.distribution.name` field.
 
 ifdef::self-managed[]
 ifdef::disconnected[]
@@ -42,7 +35,6 @@ endif::[]
 
 .Procedure
 
-
 . Open a new terminal window.
 .. Log in to your {openshift-platform} cluster from the CLI:
 .. In the upper-right corner of the OpenShift web console, click your user name and select *Copy login command*.
@@ -119,10 +111,13 @@ spec:
       name: llama-stack
       port: 8321
     distribution:
-      image: quay.io/opendatahub/llama-stack:odh
-    storage:
-      size: "5Gi"
+      name: rh-dev
 ----
++
+[NOTE]
+====
+The `rh-dev` value is an internal image reference. When you create the `LlamaStackDistribution` custom resource, the {productname-short} Operator automatically resolves `rh-dev` to the container image in the appropriate registry. This internal image reference allows the underlying image to be updated without requiring changes to your custom resource.
+====
 
 . Click *Create*.
 
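Reviewer note: for context, the CR produced by the last hunk looks like the following minimal sketch. Only the `spec` fields visible in the hunk (`name: llama-stack`, `port: 8321`, and `distribution.name: rh-dev`) come from this change; the `apiVersion`, `kind`, metadata values, and the surrounding `containerSpec` key are assumptions for illustration and may differ in your environment:

[source,yaml]
----
# Minimal LlamaStackDistribution sketch. Only the fields shown in the
# hunk above are confirmed by this change; apiVersion, kind, metadata,
# and the containerSpec parent key are assumptions for illustration.
apiVersion: llamastack.io/v1alpha1   # assumed group/version
kind: LlamaStackDistribution
metadata:
  name: my-llama-stack               # hypothetical name
  namespace: my-project              # hypothetical namespace
spec:
  server:
    containerSpec:
      name: llama-stack
      port: 8321
    distribution:
      name: rh-dev   # internal reference; the Operator resolves it to the actual image
----

Because `rh-dev` is resolved by the Operator at deployment time, one way to confirm the concrete image on a live cluster is to inspect the running server pod, for example with `oc get pods -n my-project -o yaml`, and check the container `image` field.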