Commit 3c16a36

Merge pull request #877 from chtyler/RHAI-ENG-124-llamastack-activation
RHAI-ENG-124-llamastack-activation - initial draft of tech preview co…
2 parents fd5c0d9 + e9f7e73 commit 3c16a36

File tree

2 files changed: +45 −0 lines changed

modules/working-with-llama-stack.adoc

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
:_module-type: CONCEPT

[id="working-with-llama-stack_{context}"]
= Working with Llama Stack

[role="_abstract"]
Llama Stack is a unified AI runtime environment designed to simplify the deployment and management of generative AI workloads on {productname-short}. Llama Stack integrates LLM inference servers, vector databases, and retrieval services in a single stack, optimized for Retrieval-Augmented Generation (RAG) and agent-based AI workflows. In {openshift-platform}, the Llama Stack Operator manages the deployment lifecycle of these components, ensuring scalability, consistency, and integration with {productname-short} projects.

ifndef::upstream[]
[IMPORTANT]
====
ifdef::self-managed[]
Llama Stack integration is currently available in {productname-long} {vernum} as a Technology Preview feature.
endif::[]
ifdef::cloud-service[]
Llama Stack integration is currently available in {productname-long} as a Technology Preview feature.
endif::[]
Technology Preview features are not supported with {org-name} production service level agreements (SLAs) and might not be functionally complete.
{org-name} does not recommend using them in production.
These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of {org-name} Technology Preview features, see link:https://access.redhat.com/support/offerings/techpreview/[Technology Preview Features Support Scope].
====
endif::[]

Llama Stack includes the following components:

* **Inference model servers** such as vLLM, designed to efficiently serve large language models.
* **Vector storage** solutions, primarily Milvus, to store embeddings generated from your domain data.
* **Retrieval and embedding management** workflows using integrated tools, such as Docling, to handle continuous data ingestion and synchronization.
* **Integration with {productname-short}** by using the `LlamaStackDistribution` custom resource, simplifying configuration and deployment.
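
The `LlamaStackDistribution` custom resource is the entry point for configuring a deployment. The following is a minimal illustrative sketch only: the exact API version, distribution name, namespace, and model are assumptions for this example, and the available `spec` fields depend on the version of the Llama Stack Operator that is installed.

[source,yaml]
----
apiVersion: llamastack.io/v1alpha1  # assumed API group and version
kind: LlamaStackDistribution
metadata:
  name: example-llama-stack          # hypothetical resource name
  namespace: my-data-science-project # hypothetical project namespace
spec:
  replicas: 1
  server:
    distribution:
      name: rh-dev                   # assumed distribution name
    containerSpec:
      port: 8321                     # default Llama Stack server port
      env:
        - name: INFERENCE_MODEL      # model served by the inference backend
          value: "meta-llama/Llama-3.2-1B-Instruct"
----

After you create the resource, for example with `oc apply -f`, the operator reconciles it into the corresponding workloads in your project.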

ifdef::upstream[]
For information about how to deploy Llama Stack in {productname-short}, see link:{odhdocshome}/working-with-rag/#deploying-a-rag-stack-in-a-data-science-project_rag[Deploying a RAG stack in a Data Science Project].
endif::[]
ifndef::upstream[]
For information about how to deploy Llama Stack in {productname-short}, see link:{rhoaidocshome}{default-format-url}/working_with_rag/deploying-a-rag-stack-in-a-data-science-project_rag[Deploying a RAG stack in a Data Science Project].
endif::[]
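
After deploying, you can confirm that the operator has accepted and reconciled the custom resource from the command line. The resource and namespace names in this sketch are hypothetical and assume a cluster where the Llama Stack Operator is already installed:

[source,terminal]
----
# List LlamaStackDistribution resources in a project (names are illustrative)
$ oc get llamastackdistribution -n my-data-science-project

# Inspect the status and events of a specific resource
$ oc describe llamastackdistribution example-llama-stack -n my-data-science-project
----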

[role="_additional-resources"]
.Additional resources
* link:https://github.com/opendatahub-io/llama-stack-demos[Llama Stack Demos repository^]
* link:https://llama-stack-k8s-operator.pages.dev/[Llama Stack Kubernetes Operator documentation^]
* link:https://llama-stack.readthedocs.io/en/latest/[Llama Stack documentation^]

working-with-rag.adoc

Lines changed: 1 addition & 0 deletions
@@ -16,4 +16,5 @@ include::_artifacts/document-attributes-global.adoc[]
= Working with RAG

include::modules/overview-of-rag.adoc[leveloffset=+1]
include::modules/working-with-llama-stack.adoc[leveloffset=+1]
include::assemblies/deploying-a-rag-stack-in-a-data-science-project.adoc[leveloffset=+1]
