RHAI-ENG-306-modify-docs-on-deploying-llamastackdistribution-instanc… #898

Merged
2 changes: 1 addition & 1 deletion modules/deploying-a-llama-model-with-kserve.adoc
@@ -10,7 +10,7 @@ To use Llama Stack and retrieval-augmented generation (RAG) workloads in {produc

* You have logged in to {productname-long}.
* You have cluster administrator privileges for your {openshift-platform} cluster.
-* You have installed the Llama Stack Operator.
+* You have activated the Llama Stack Operator.
ifdef::upstream[]
For more information, see link:{odhdocshome}/working-with-rag/#installing-the-llama-stack-operator_rag[Installing the Llama Stack Operator].
endif::[]
19 changes: 7 additions & 12 deletions modules/deploying-a-llamastackdistribution-instance.adoc
@@ -4,14 +4,7 @@
= Deploying a LlamaStackDistribution instance

[role='_abstract']
-You can integrate LlamaStack and its retrieval-augmented generation (RAG) capabilities with your deployed Llama 3.2 model served by vLLM. This integration enables you to build intelligent applications that combine large language models (LLMs) with real-time data retrieval, providing more accurate and contextually relevant responses for your AI workloads.
-
-When you create a `LlamaStackDistribution` custom resource (CR), specify the Llama Stack image `quay.io/opendatahub/llama-stack:odh` in the `spec.server.distribution.image` field. The image is hosted on link:https://quay.io[Quay.io], a secure registry that provides vulnerability scanning, role-based access control, and globally distributed content delivery. Using this {org-name}-validated image ensures that your deployment automatically receives the latest security patches and compatibility updates. For more information about working with Quay.io, see link:https://docs.redhat.com/en/documentation/red_hat_quay/3/html/about_quay_io/quayio-overview[Quay.io overview].
-
-[IMPORTANT]
-====
-The Llama Stack image is hosted on link:https://quay.io[Quay.io] only during the Developer Preview phase of the Llama Stack integration with {productname-short}. When the Llama Stack integration reaches general availability, the image will be available on link:https://registry.redhat.io[registry.redhat.io].
-====
+You can integrate LlamaStack and its retrieval-augmented generation (RAG) capabilities with your deployed Llama 3.2 model served by vLLM. You can use this integration to build intelligent applications that combine large language models (LLMs) with real-time data retrieval, providing more accurate and contextually relevant responses for your AI workloads. When you create a `LlamaStackDistribution` custom resource (CR), specify `rh-dev` in the `spec.server.distribution.name` field.
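
The CR described in the new paragraph above can be sketched as a complete manifest. This is a minimal illustration, not the exact manifest from the docs: the `apiVersion` and `metadata.name` values are assumptions, while the `spec` fields mirror the snippet shown later in this diff.

[source,yaml]
----
# Minimal LlamaStackDistribution sketch.
# apiVersion and metadata.name are assumptions; spec fields follow the diff below.
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: llama-stack-distribution
spec:
  server:
    containerSpec:
      name: llama-stack
      port: 8321
    distribution:
      name: rh-dev  # resolved by the Operator to the actual container image
----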

ifdef::self-managed[]
ifdef::disconnected[]
@@ -42,7 +35,6 @@ endif::[]

.Procedure


. Open a new terminal window.
.. Log in to your {openshift-platform} cluster from the CLI:
.. In the upper-right corner of the OpenShift web console, click your user name and select *Copy login command*.
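+
The login sub-steps above can be sketched as terminal commands. This is an illustrative fragment: the token and server URL are placeholders that you replace with the values from the *Copy login command* dialog.
+
[source,terminal]
----
# Paste the command from "Copy login command"; token and server are placeholders
oc login --token=<token> --server=https://api.<cluster-domain>:6443

# Confirm that you are logged in as the expected user
oc whoami
----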
@@ -119,10 +111,13 @@ spec:
name: llama-stack
port: 8321
   distribution:
-    image: quay.io/opendatahub/llama-stack:odh
-  storage:
-    size: "5Gi"
+    name: rh-dev
----
+
[NOTE]
====
The `rh-dev` value is an internal image reference. When you create the `LlamaStackDistribution` custom resource, the {productname-short} Operator automatically resolves `rh-dev` to the container image in the appropriate registry. Because the reference is resolved by the Operator, the underlying image can be updated without requiring changes to your custom resource.
====

. Click *Create*.
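
After you click *Create*, you can confirm from the CLI that the instance exists and that its server pod starts. This is a hedged sketch: the project name is a placeholder, and listing the CR by its kind name assumes the default behavior of `oc get` for custom resources.

[source,terminal]
----
# List LlamaStackDistribution resources in your project (project name is a placeholder)
oc get llamastackdistribution -n <project-name>

# Watch the server pod come up (pod names and labels vary by deployment)
oc get pods -n <project-name>
----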
