Commit 5fcdb75

Update the documentation to reflect use of lightspeed-stack (#68)

1 parent 543bebe

File tree: 2 files changed, +172 −106 lines

README.md (87 additions, 81 deletions)
# Ansible Chatbot (llama) Stack

This repository contains the necessary configuration to build a Docker Container Image for `ansible-chatbot-stack`.

`ansible-chatbot-stack` builds on top of `lightspeed-stack`, which wraps Meta's `llama-stack` AI framework.

`ansible-chatbot-stack` includes various customisations for:

- A remote vLLM inference provider (RHOSAI vLLM compatible)
- The inline sentence transformers (Meta)
- AAP RAG database files and configuration
- [Lightspeed external providers](https://github.com/lightspeed-core/lightspeed-providers)
- System Prompt injection

Build/Run overview:

```mermaid
flowchart TB
    %% Nodes
    LLAMA_STACK([fa:fa-layer-group llama-stack:x.y.z])
    LIGHTSPEED_STACK([fa:fa-layer-group lightspeed-stack:x.y.z])
    LIGHTSPEED_RUN_CONFIG{{fa:fa-wrench lightspeed-stack.yaml}}
    ANSIBLE_CHATBOT_STACK([fa:fa-layer-group ansible-chatbot-stack:x.y.z])
    ANSIBLE_CHATBOT_RUN_CONFIG{{fa:fa-wrench ansible-chatbot-run.yaml}}
    ANSIBLE_CHATBOT_DOCKERFILE{{fa:fa-wrench Containerfile}}
    ANSIBLE_LIGHTSPEED([fa:fa-layer-group ansible-ai-connect-service:x.y.z])
    LIGHTSPEED_PROVIDERS("fa:fa-code-branch lightspeed-providers:x.y.z")
    PYPI("fa:fa-database PyPI")

    %% Edge connections between nodes
    ANSIBLE_LIGHTSPEED -- Uses --> ANSIBLE_CHATBOT_STACK
    ANSIBLE_CHATBOT_STACK -- Consumes --> PYPI
    LIGHTSPEED_PROVIDERS -- Publishes --> PYPI
    ANSIBLE_CHATBOT_STACK -- Built from --> ANSIBLE_CHATBOT_DOCKERFILE
    ANSIBLE_CHATBOT_STACK -- Inherits from --> LIGHTSPEED_STACK
    ANSIBLE_CHATBOT_STACK -- Includes --> LIGHTSPEED_RUN_CONFIG
    ANSIBLE_CHATBOT_STACK -- Includes --> ANSIBLE_CHATBOT_RUN_CONFIG
    LIGHTSPEED_STACK -- Embeds --> LLAMA_STACK
    LIGHTSPEED_STACK -- Uses --> LIGHTSPEED_RUN_CONFIG
    LLAMA_STACK -- Uses --> ANSIBLE_CHATBOT_RUN_CONFIG
```

## Build

### Setup for Ansible Chatbot Stack

- External Providers YAML manifests must be present in `providers.d/` of your host's `llama-stack` directory.
- The Vector Database is copied from the latest `aap-rag-content` image to `./vector_db`.
- The embeddings model files are copied from the latest `aap-rag-content` image to `./embeddings_model`.

```shell
make setup
```
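After `make setup` completes, the RAG artifacts should exist locally; a minimal sketch to verify the expected layout (directory names are the ones listed above):

```shell
# Check for the directories `make setup` is expected to populate
# (./vector_db and ./embeddings_model, per the steps above).
for d in vector_db embeddings_model; do
  if [ -d "./$d" ]; then
    echo "$d: present"
  else
    echo "$d: missing"
  fi
done
```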
### Building Ansible Chatbot Stack

Builds the image `ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION`.

> Change the `ANSIBLE_CHATBOT_VERSION` version and inference parameters below accordingly.

```shell
export ANSIBLE_CHATBOT_VERSION=0.0.1

make build
```

64-
### Customizing the Ansible Chatbot Stack
67+
### Container file structure
68+
69+
#### Files from `lightspeed-stack` base image
70+
```commandline
71+
└── app-root/
72+
├── .venv/
73+
└── src/
74+
├── <lightspeed-stack files>
75+
└── lightspeed_stack.py
76+
````
77+
78+
#### Runtime files
79+
80+
> These are stored in a `PersistentVolumeClaim` for resilience
81+
```commandline
82+
└── .llama/
83+
└── data/
84+
└── distributions/
85+
└── ansible-chatbot/
86+
├── aap_faiss_store.db
87+
├── agents_store.db
88+
├── responses_store.db
89+
├── localfs_datasetio.db
90+
├── trace_store.db
91+
└── embeddings_model/
92+
```
93+
94+
#### Configuration files
95+
```commandline
96+
└── .llama/
97+
├── distributions/
98+
│ └── ansible-chatbot/
99+
│ ├── lightspeed-stack.yaml
100+
│ ├── ansible-chatbot-run.yaml
101+
│ ├── ansible-chatbot-version-info.json
102+
│ └── system-prompts/
103+
│ └── default.txt
104+
└── providers.d
105+
└── <llama-stack external providers>
106+
```
## Run

Runs the image `ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION` as a local container.

> Change the `ANSIBLE_CHATBOT_VERSION` version and inference parameters below accordingly.

```shell
export ANSIBLE_CHATBOT_VERSION=0.0.1
export ANSIBLE_CHATBOT_VLLM_URL=<YOUR_MODEL_SERVING_URL>
export ANSIBLE_CHATBOT_VLLM_API_TOKEN=<YOUR_MODEL_SERVING_API_TOKEN>
export ANSIBLE_CHATBOT_INFERENCE_MODEL=<YOUR_INFERENCE_MODEL>
export ANSIBLE_CHATBOT_INFERENCE_MODEL_FILTER=<YOUR_INFERENCE_MODEL_TOOLS_FILTERING>

make run
```
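Since `make run` depends on all of these variables, a small pre-flight check can catch a missing export early; a hedged sketch (the variable names are the ones exported above):

```shell
# Fail fast if any of the inference variables from this README are unset or empty.
missing=0
for v in ANSIBLE_CHATBOT_VERSION ANSIBLE_CHATBOT_VLLM_URL \
         ANSIBLE_CHATBOT_VLLM_API_TOKEN ANSIBLE_CHATBOT_INFERENCE_MODEL \
         ANSIBLE_CHATBOT_INFERENCE_MODEL_FILTER; do
  if [ -z "$(printenv "$v")" ]; then
    echo "missing: $v"
    missing=1
  fi
done
if [ "$missing" -eq 0 ]; then
  echo "environment ok"
fi
```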
## Basic tests

Runs basic tests against the local container.

> Change the `ANSIBLE_CHATBOT_VERSION` version and inference parameters below accordingly.

```shell
export ANSIBLE_CHATBOT_VERSION=0.0.1
export ANSIBLE_CHATBOT_VLLM_URL=<YOUR_MODEL_SERVING_URL>
export ANSIBLE_CHATBOT_VLLM_API_TOKEN=<YOUR_MODEL_SERVING_API_TOKEN>
export ANSIBLE_CHATBOT_INFERENCE_MODEL=<YOUR_INFERENCE_MODEL>
export ANSIBLE_CHATBOT_INFERENCE_MODEL_FILTER=<YOUR_INFERENCE_MODEL_TOOLS_FILTERING>

make run-test
```
## Deploy into a k8s cluster

…

If you need to re-build images, apply the following clean-ups right after:

```shell
make clean
```

## Appendix - Obtain a container shell
159165

160166
```shell
