
Commit 7f82332

progress
Signed-off-by: Adrian Cole <[email protected]>
1 parent dd4a381 commit 7f82332

6 files changed, +97 -47 lines changed


example-apps/chatbot-rag-app/README.md

Lines changed: 34 additions & 20 deletions
@@ -45,34 +45,48 @@ and configure its templated connection settings:
 
 ## Running the App
 
+This application contains two services:
+* create-index: Installs ELSER and ingests data into elasticsearch
+* api-frontend: Hosts the chatbot-rag-app application on http://localhost:4000
+
 There are two ways to run the app: via Docker or locally. Docker is advised for
 ease while locally is advised if you are making changes to the application.
 
 ### Run with docker
 
-Docker compose is the easiest way, as you get one-step to:
-* ingest data into elasticsearch
-* run the app, which listens on http://localhost:4000
+Docker compose is the easiest way to get started, as you don't need to have a
+working Python environment.
 
 **Double-check you have a `.env` file with all your variables set first!**
 
+#### Create your Elasticsearch index
+
+First, ingest the data into elasticsearch:
+```bash
+docker compose run -T --rm --pull always create-index
+```
+
+*Note*: This may take several minutes to complete
+
+#### Run the application
+
+Now, run the app, which listens on http://localhost:4000
 ```bash
-docker compose up --pull always --force-recreate
+docker compose run --rm --pull always api-frontend
 ```
 
-*Note*: First time creating the index can fail on timeout. Wait a few minutes
-and retry.
+#### Cleanup when finished
 
-Clean up when finished, like this:
+When you are done, clean up the services above like this:
 
 ```bash
 docker compose down
 ```
 
-### Run locally
+### Run with Python
 
-If you want to run this example with Python and Node.js, you need to do a few
-things listed in the [Dockerfile](Dockerfile). The below uses the same
+If you want to run this example with Python, you need to do a few things listed
+in the [Dockerfile](Dockerfile) to build it first. The below uses the same
 production mode as used in Docker to avoid problems in debug mode.
 
 **Double-check you have a `.env` file with all your variables set first!**
@@ -89,7 +103,7 @@ nvm use --lts
 (cd frontend; yarn install; REACT_APP_API_HOST=/api yarn build)
 ```
 
-#### Configure your python environment
+#### Configure your Python environment
 
 Before we can run the app, we need a working Python environment with the
 correct packages installed:
@@ -102,17 +116,16 @@ pip install "python-dotenv[cli]"
 pip install -r requirements.txt
 ```
 
-#### Run the ingest command
+#### Create your Elasticsearch index
 
 First, ingest the data into elasticsearch:
 ```bash
-FLASK_APP=api/app.py dotenv run -- flask create-index
+dotenv run -- flask create-index
 ```
 
-*Note*: First time creating the index can fail on timeout. Wait a few minutes
-and retry.
+*Note*: This may take several minutes to complete
 
-#### Run the app
+#### Run the application
 
 Now, run the app, which listens on http://localhost:4000
 ```bash
@@ -185,13 +198,14 @@ passages. Modify this script to index your own data.
 
 See [Langchain documentation][loader-docs] for more ways to load documents.
 
-### Building from source with docker
+### Running from source with Docker
 
-To build the app from source instead of using published images, pass the `--build`
-flag to Docker Compose.
+To build the app from source instead of using published images, pass the
+`--build` flag to Docker Compose instead of `--pull always`
 
+For example, to run the create-index service from source:
 ```bash
-docker compose up --build --force-recreate
+docker compose run --rm --build create-index
 ```
 
 ---
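Editor's note on the README change above: the local command drops the `FLASK_APP=api/app.py` prefix because this commit moves that variable into `.env` (see the env.example diff below), and `dotenv run` loads `.env` into the environment before invoking the Flask CLI. A rough Python equivalent of that step, as an illustrative sketch only (not repo code), assuming python-dotenv is installed and `.env` is in the working directory:

```python
# Illustrative sketch only: roughly what `dotenv run -- flask create-index`
# does once FLASK_APP lives in .env. Not part of the repository.
import subprocess

from dotenv import load_dotenv

load_dotenv(".env")  # exports FLASK_APP, PYTHONUNBUFFERED, ELASTICSEARCH_* into os.environ
subprocess.run(["flask", "create-index"], check=True)  # Flask CLI finds api/app.py via FLASK_APP
```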

example-apps/chatbot-rag-app/data/index_data.py

Lines changed: 30 additions & 15 deletions
@@ -1,6 +1,8 @@
 import json
 import os
+from sys import stdout
 import time
+from halo import Halo
 from warnings import warn
 
 from elasticsearch import (
@@ -135,30 +137,43 @@ def await_ml_tasks(max_timeout=600, interval=5):
         TimeoutError: If the timeout is reached and machine learning tasks are still running.
     """
     start_time = time.time()
+    ml_tasks = get_ml_tasks()
+    if not ml_tasks:
+        return  # likely a lost race on tasks
+
+    spinner = Halo(text="Awaiting ML tasks", spinner="dots")
+    if stdout.isatty():
+        spinner.start()
+    else:
+        print(f"Awaiting {len(ml_tasks)} ML tasks")
 
-    tasks = []  # Initialize tasks list
-    previous_task_count = 0  # Track the previous number of tasks
     while time.time() - start_time < max_timeout:
-        tasks = []
-        resp = es.tasks.list(detailed=True, actions=["cluster:monitor/xpack/ml/*"])
-        for node_id, node_info in resp["nodes"].items():
-            node_tasks = node_info.get("tasks", {})
-            for task_id, task_info in node_tasks.items():
-                tasks.append(task_info["action"])
-        if not tasks:
+        ml_tasks = get_ml_tasks()
+        if len(ml_tasks) == 0:
             break
-        current_task_count = len(tasks)
-        if current_task_count != previous_task_count:
-            warn(f"Awaiting {current_task_count} ML tasks")
-        previous_task_count = current_task_count
         time.sleep(interval)
 
-    if tasks:
+    if stdout.isatty():
+        spinner.stop()
+    else:
+        print(f"ML tasks complete")
+
+    if ml_tasks:
         raise TimeoutError(
-            f"Timeout reached. ML tasks are still running: {', '.join(tasks)}"
+            f"Timeout reached. ML tasks are still running: {', '.join(ml_tasks)}"
         )
 
 
+def get_ml_tasks():
+    """Return a list of ML task actions from the ES tasks API."""
+    tasks = []
+    resp = es.tasks.list(detailed=True, actions=["cluster:monitor/xpack/ml/*"])
+    for node_info in resp["nodes"].values():
+        for task_info in node_info.get("tasks", {}).values():
+            tasks.append(task_info["action"])
+    return tasks
+
+
 # Unless we run through flask, we can miss critical settings or telemetry signals.
 if __name__ == "__main__":
     main()
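Editor's note: the `await_ml_tasks` rework above boils down to a reusable pattern: animate a Halo spinner when attached to a terminal, and fall back to plain prints when output is piped (as under `docker compose run -T`). A minimal, self-contained sketch of that pattern, where `wait_until` and its `done` callback are illustrative names standing in for the real `get_ml_tasks()` check:

```python
# Sketch of the TTY-aware progress pattern used in await_ml_tasks above.
# `wait_until` and `done` are illustrative names, not repository code.
import time
from sys import stdout

from halo import Halo


def wait_until(done, max_timeout=600, interval=5, label="Awaiting ML tasks"):
    """Poll done() until it returns True, reporting progress appropriately."""
    interactive = stdout.isatty()
    spinner = Halo(text=label, spinner="dots")
    if interactive:
        spinner.start()  # animated spinner for interactive terminals
    else:
        print(label)  # plain log line when output is piped (docker, CI)

    start = time.time()
    try:
        while time.time() - start < max_timeout:
            if done():
                return
            time.sleep(interval)
        raise TimeoutError(f"Timed out after {max_timeout}s: {label}")
    finally:
        if interactive:
            spinner.stop()
        else:
            print(f"{label}: done")


# Usage against the helper added in this commit would look like:
# wait_until(lambda: not get_ml_tasks())
```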

example-apps/chatbot-rag-app/docker-compose.yml

Lines changed: 7 additions & 5 deletions
@@ -1,16 +1,16 @@
 name: chatbot-rag-app
 
 services:
-  ingest-data:
+  create-index:
     image: ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app
     build:
       context: .
-    container_name: ingest-data
+    container_name: create-index
     restart: 'no' # no need to re-ingest on successive runs
-    environment:
-      FLASK_APP: api/app.py
     env_file:
       - .env
+    # Add settings that allow `docker compose run` to use tty and accept Ctrl+C
+    stdin_open: true
     command: flask create-index
     volumes:
       # VertexAI uses a file for GOOGLE_APPLICATION_CREDENTIALS, not an API key
@@ -20,14 +20,16 @@ services:
 
   api-frontend:
     depends_on:
-      ingest-data:
+      create-index:
         condition: service_completed_successfully
     container_name: api-frontend
     image: ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app
     build:
       context: .
     env_file:
       - .env
+    # Add settings that allow `docker compose run` to use tty and accept Ctrl+C
+    stdin_open: true
     volumes:
       # VertexAI uses a file for GOOGLE_APPLICATION_CREDENTIALS, not an API key
       - ${HOME}/.config/gcloud:/root/.config/gcloud

example-apps/chatbot-rag-app/env.example

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,10 @@
 # Make a copy of this file with the name .env and assign values to variables
 
+# Location of the application routes
+FLASK_APP=api/app.py
+# Ensure print statements appear as they happen
+PYTHONUNBUFFERED=1
+
 # How you connect to Elasticsearch: change details to your instance
 ELASTICSEARCH_URL=http://localhost:9200
 ELASTICSEARCH_USER=elastic
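Editor's note on the `PYTHONUNBUFFERED=1` addition: without it, Python block-buffers stdout when it is not attached to a TTY, so the progress prints added in index_data.py could show up late under `docker compose run`. A small illustrative sketch (not repo code) of achieving the same effect in-process:

```python
# Sketch only: explicit flushing has the same visible effect as running the
# interpreter with PYTHONUNBUFFERED=1 (or `python -u`) when stdout is piped.
import sys
import time

for step in range(3):
    print(f"step {step}")  # may sit in the buffer when stdout is not a TTY
    sys.stdout.flush()     # force it out immediately, like PYTHONUNBUFFERED=1
    time.sleep(1)
```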

example-apps/chatbot-rag-app/requirements.in

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ langchain-elasticsearch
 tiktoken
 flask
 flask-cors
+halo
 
 # LLM dependencies
 langchain-openai

example-apps/chatbot-rag-app/requirements.txt

Lines changed: 20 additions & 7 deletions
@@ -22,11 +22,11 @@ attrs==25.1.0
     # via aiohttp
 blinker==1.9.0
     # via flask
-boto3==1.36.23
+boto3==1.36.24
     # via
     #   langchain-aws
     #   langtrace-python-sdk
-botocore==1.36.23
+botocore==1.36.24
     # via
     #   boto3
     #   s3transfer
@@ -46,7 +46,10 @@ click==8.1.8
 cohere==5.13.12
     # via langchain-cohere
 colorama==0.4.6
-    # via langtrace-python-sdk
+    # via
+    #   halo
+    #   langtrace-python-sdk
+    #   log-symbols
 dataclasses-json==0.6.7
     # via langchain-community
 deprecated==1.2.18
@@ -144,6 +147,8 @@ grpcio-status==1.70.0
     # via google-api-core
 h11==0.14.0
     # via httpcore
+halo==0.0.31
+    # via -r requirements.in
 httpcore==1.0.7
     # via httpx
 httpx==0.28.1
@@ -159,7 +164,7 @@ httpx-sse==0.4.0
     #   langchain-community
     #   langchain-google-vertexai
     #   langchain-mistralai
-huggingface-hub==0.28.1
+huggingface-hub==0.29.0
     # via
     #   tokenizers
     #   transformers
@@ -193,9 +198,9 @@ langchain-aws==0.2.13
     # via -r requirements.in
 langchain-cohere==0.4.2
     # via -r requirements.in
-langchain-community==0.3.17
+langchain-community==0.3.18
     # via langchain-cohere
-langchain-core==0.3.36
+langchain-core==0.3.37
     # via
     #   langchain
     #   langchain-aws
@@ -223,6 +228,8 @@ langsmith==0.3.8
     #   langchain-core
 langtrace-python-sdk==3.6.2
     # via -r requirements.in
+log-symbols==0.0.14
+    # via halo
 markupsafe==3.0.2
     # via
     #   jinja2
@@ -414,11 +421,15 @@ shapely==2.0.7
 simsimd==6.2.1
     # via elasticsearch
 six==1.17.0
-    # via python-dateutil
+    # via
+    #   halo
+    #   python-dateutil
 sniffio==1.3.1
     # via
     #   anyio
     #   openai
+spinners==0.0.24
+    # via halo
 sqlalchemy==2.0.38
     # via
     #   langchain
@@ -429,6 +440,8 @@ tenacity==9.0.0
     #   langchain
     #   langchain-community
     #   langchain-core
+termcolor==2.5.0
+    # via halo
 tiktoken==0.9.0
     # via
     #   -r requirements.in
