Commit 3de7f9e

Merge pull request #384 from DefangLabs/jordan/refactor-managed-llm-samples

Refactor managed llm samples

2 parents a049169 + 87a6faf, commit 3de7f9e

File tree

10 files changed: +114 -111 lines changed

samples/managed-llm-provider/README.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -36,7 +36,7 @@ The `x-defang-llm` property on the `llm` service must be set to `true` in order
 To run the application locally, you can use the following command:
 
 ```bash
-docker compose -f compose.dev.yaml up --build
+docker compose -f compose.local.yaml up --build
 ```
 
 ## Deployment
````

samples/managed-llm-provider/app/app.py

Lines changed: 4 additions & 22 deletions

```diff
@@ -3,10 +3,11 @@
 import os
 
 import requests
-from fastapi import FastAPI, Form, Request
+from fastapi import FastAPI, Form
 from fastapi.responses import HTMLResponse
 from fastapi.staticfiles import StaticFiles
 from fastapi.responses import JSONResponse
+from fastapi.responses import FileResponse
 
 app = FastAPI()
 app.mount("/static", StaticFiles(directory="static"), name="static")
@@ -22,33 +23,14 @@
 MODEL_ID = os.getenv("LLM_MODEL", "gpt-4-turbo")
 
 # Get the API key for the LLM
-# For development, you can use your local API key. In production, the LLM gateway service will override the need for it.
+# For development, you have the option to use your local API key. In production, the LLM gateway service will override the need for it.
 def get_api_key():
     return os.getenv("OPENAI_API_KEY", "")
 
 # Home page form
 @app.get("/", response_class=HTMLResponse)
 async def home():
-    return """
-    <html>
-        <head>
-            <title>Ask the AI Model</title>
-            <script type="text/javascript" src="./static/app.js"></script>
-        </head>
-        <body>
-            <h1>Ask the AI Model</h1>
-            <form method="post" id="askForm" onsubmit="event.preventDefault(); submitForm(event);">
-                <textarea id="prompt" name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
-                onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();this.form.dispatchEvent(new Event('submit', {cancelable:true}));}"></textarea>
-                <br><br>
-                <input type="submit" value="Ask">
-            </form>
-            <hr>
-            <h2>Model's Reply:</h2>
-            <p id="reply"></p>
-        </body>
-    </html>
-    """
+    return FileResponse("static/index.html", media_type="text/html")
 
 # Handle form submission
 @app.post("/ask", response_class=JSONResponse)
```
samples/managed-llm-provider/app/static/index.html (new file)

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+<html>
+    <head>
+        <title>Ask the AI Model</title>
+        <script type="text/javascript" src="./static/app.js"></script>
+    </head>
+    <body>
+        <h1>Ask the AI Model</h1>
+        <form method="post" id="askForm" onsubmit="event.preventDefault(); submitForm(event);">
+            <textarea id="prompt" name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
+            onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();this.form.dispatchEvent(new Event('submit', {cancelable:true}));}"></textarea>
+            <br><br>
+            <input type="submit" value="Ask">
+        </form>
+        <hr>
+        <h2>Model's Reply:</h2>
+        <p id="reply"></p>
+    </body>
+</html>
```

samples/managed-llm-provider/compose.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -7,7 +7,7 @@ services:
       - "8000:8000"
     restart: always
     environment:
-      - LLM_MODEL # LLM model ID used
+      - LLM_MODEL=default
       # For other models, see https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping
     healthcheck:
       test: ["CMD", "python3", "-c", "import sys, urllib.request; urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/"]
```

samples/managed-llm/README.md

Lines changed: 5 additions & 5 deletions

````diff
@@ -11,19 +11,19 @@ Using the [Defang OpenAI Access Gateway](#defang-openai-access-gateway), the fea
 
 This allows switching from OpenAI to the Managed LLMs on supported cloud platforms without modifying your application code.
 
-You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for local development and production environments.
+You can configure the `MODEL` and `LLM_URL` for the LLM separately for local development and production environments.
 * The `MODEL` is the LLM Model ID you are using.
-* The `ENDPOINT_URL` is the bridge that provides authenticated access to the LLM model.
+* The `LLM_URL` is the bridge that provides authenticated access to the LLM model.
 
 Ensure you have enabled model access for the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
 
-To learn about available LLM models in Defang, please see our [Model Mapping documentation](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping). 
+To learn about available LLM models in Defang, please see our [Model Mapping documentation](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping).
 
 For more about Managed LLMs in Defang, please see our [Managed LLMs documentation](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models).
 
 ### Defang OpenAI Access Gateway
 
-In the `compose.yaml` file, the `llm` service is used to route requests to the LLM API model. This is known as the [Defang OpenAI Access Gateway](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway). 
+In the `compose.yaml` file, the `llm` service is used to route requests to the LLM API model. This is known as the [Defang OpenAI Access Gateway](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway).
 
 The `x-defang-llm` property on the `llm` service must be set to `true` in order to use the OpenAI Access Gateway when deploying with Defang.
 
@@ -38,7 +38,7 @@ The `x-defang-llm` property on the `llm` service must be set to `true` in order
 To run the application locally, you can use the following command:
 
 ```bash
-docker compose -f compose.dev.yaml up --build
+docker compose -f compose.local.yaml up --build
 ```
 
 ## Deployment
````
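In application code, this separation typically reduces to reading both values from the environment with a local-friendly default. A hedged sketch using the README's names (the OpenAI fallback mirrors the sample app):

```python
import os

# Unset locally -> fall back to OpenAI's public API; in production the
# compose file points LLM_URL at the llm gateway service instead.
LLM_URL = os.getenv("LLM_URL", "https://api.openai.com/v1/")
MODEL = os.getenv("MODEL", "gpt-4-turbo")

print(f"Routing {MODEL} requests through {LLM_URL}")
```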

samples/managed-llm/app/app.py

Lines changed: 47 additions & 78 deletions

```diff
@@ -1,55 +1,40 @@
-import os
 import json
 import logging
-from fastapi import FastAPI, Form, Request
-from fastapi.responses import HTMLResponse
+import os
+
 import requests
+from fastapi import FastAPI, Form
+from fastapi.responses import HTMLResponse
+from fastapi.staticfiles import StaticFiles
+from fastapi.responses import JSONResponse
+from fastapi.responses import FileResponse
 
 app = FastAPI()
+app.mount("/static", StaticFiles(directory="static"), name="static")
 
 # Configure basic logging
 logging.basicConfig(level=logging.INFO)
 
+default_openai_base_url = "https://api.openai.com/v1/"
+
 # Set the environment variables for the chat model
-ENDPOINT_URL = os.getenv("ENDPOINT_URL", "https://api.openai.com/v1/chat/completions")
-# Fallback to OpenAI Model if not set in environment
-MODEL_ID = os.getenv("MODEL", "gpt-4-turbo")
+LLM_URL = os.getenv("LLM_URL", default_openai_base_url) + "chat/completions"
+# Fallback LLM Model if not set in environment
+MODEL_ID = os.getenv("LLM_MODEL", "gpt-4-turbo")
 
 # Get the API key for the LLM
-# For development, you can use your local API key. In production, the LLM gateway service will override the need for it.
+# For development, you have the option to use your local API key. In production, the LLM gateway service will override the need for it.
 def get_api_key():
-    return os.getenv("OPENAI_API_KEY", "API key not set")
+    return os.getenv("OPENAI_API_KEY", "")
 
 # Home page form
 @app.get("/", response_class=HTMLResponse)
 async def home():
-    return """
-    <html>
-        <head><title>Ask the AI Model</title></head>
-        <body>
-            <h1>Ask the AI Model</h1>
-            <form method="post" action="/ask" onsubmit="document.getElementById('loader').style.display='block'">
-                <textarea name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
-                onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();this.form.submit();}">
-                </textarea>
-                <br><br>
-                <input type="submit" value="Ask">
-            </form>
-        </body>
-
-    </html>
-    """
+    return FileResponse("static/index.html", media_type="text/html")
 
 # Handle form submission
-@app.post("/ask", response_class=HTMLResponse)
+@app.post("/ask", response_class=JSONResponse)
 async def ask(prompt: str = Form(...)):
-    headers = {
-        "Content-Type": "application/json"
-    }
-
-    api_key = get_api_key()
-    headers["Authorization"] = f"Bearer {api_key}"
-
     payload = {
         "model": MODEL_ID,
         "messages": [
@@ -58,59 +43,43 @@ async def ask(prompt: str = Form(...)):
         "stream": False
     }
 
+    reply = get_llm_response(payload)
+
+    return {"prompt": prompt, "reply": reply}
+
+def get_llm_response(payload):
+    api_key = get_api_key()
+    request_headers = {
+        "Content-Type": "application/json",
+        "Authorization": f"Bearer {api_key}"
+    }
+
     # Log request details
-    logging.info(f"Sending POST to {ENDPOINT_URL}")
-    logging.info(f"Request Headers: {headers}")
-    logging.info(f"Request Payload: {payload}")
+    logging.debug(f"Sending POST to {LLM_URL}")
+    logging.debug(f"Request Headers: {request_headers}")
+    logging.debug(f"Request Payload: {payload}")
 
     response = None
-    reply = None
     try:
-        response = requests.post(f"{ENDPOINT_URL}", headers=headers, data=json.dumps(payload))
+        response = requests.post(f"{LLM_URL}", headers=request_headers, data=json.dumps(payload))
     except requests.exceptions.HTTPError as errh:
-        reply = f"HTTP error:", errh
+        return f"HTTP error:", errh
     except requests.exceptions.ConnectionError as errc:
-        reply = f"Connection error:", errc
+        return f"Connection error:", errc
    except requests.exceptions.Timeout as errt:
-        reply = f"Timeout error:", errt
+        return f"Timeout error:", errt
     except requests.exceptions.RequestException as err:
-        reply = f"Unexpected error:", err
+        return f"Unexpected error:", err
 
-    if response is not None:
-        # logging.info(f"Response Status Code: {response.status_code}")
-        # logging.info(f"Response Headers: {response.headers}")
-        # logging.info(f"Response Body: {response.text}")
-        if response.status_code == 200:
-            data = response.json()
-            try:
-                reply = data["choices"][0]["message"]["content"]
-            except (KeyError, IndexError):
-                reply = "Model returned an unexpected response."
-        elif response.status_code == 400:
-            reply = f"Connect Error: {response.status_code} - {response.text}"
-        elif response.status_code == 500:
-            reply = f"Error from server: {response.status_code} - {response.text}"
-        else:
-            # Log error details
-            reply = f"Error from server: {response.status_code} - {response.text}"
-            logging.error(f"Error from server: {response.status_code} - {response.text}")
+    if response is None:
+        return f"Error: No response from server."
+    if response.status_code == 400:
+        return f"Connect Error: {response.status_code} - {response.text}"
+    if response.status_code == 500:
+        return f"Error from server: {response.status_code} - {response.text}"
 
-    # Return result
-    return f"""
-    <html>
-        <head><title>Ask the AI Model</title></head>
-        <body>
-            <h1>Ask the AI Model</h1>
-            <form method="post" action="/ask" onsubmit="document.getElementById('loader').style.display='block'">
-                <textarea name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
-                onkeydown="if(event.key==='Enter'&&!event.shiftKey){{event.preventDefault();this.form.submit();}}"></textarea><br><br>
-                <input type="submit" value="Ask">
-            </form>
-            <h2>You Asked:</h2>
-            <p>{prompt}</p>
-            <hr>
-            <h2>Model's Reply:</h2>
-            <p>{reply}</p>
-        </body>
-    </html>
-    """
+    try:
+        data = response.json()
+        return data["choices"][0]["message"]["content"]
+    except (KeyError, IndexError):
+        return "Model returned an unexpected response."
```
samples/managed-llm/app/static/app.js (new file)

Lines changed: 14 additions & 0 deletions

```diff
@@ -0,0 +1,14 @@
+async function submitForm(event) {
+    event.preventDefault();
+    const prompt = document.getElementById('prompt').value;
+    document.getElementById('reply').innerHTML = "Loading...";
+    const response = await fetch('/ask', {
+        method: 'POST',
+        headers: {
+            'Content-Type': 'application/x-www-form-urlencoded'
+        },
+        body: new URLSearchParams({prompt})
+    });
+    const data = await response.json();
+    document.getElementById('reply').innerHTML = data.reply || "No reply found.";
+}
```
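`app.js` posts the prompt form-encoded and expects a JSON body with a `reply` field, matching the `/ask` handler above. The same round-trip can be driven from Python as a quick smoke test (the localhost URL assumes the `8000:8000` port mapping from the compose file):

```python
import requests

# Mirrors what static/app.js sends: a form-encoded body, JSON back.
resp = requests.post(
    "http://localhost:8000/ask",
    data={"prompt": "Say hello in one sentence."},
)
resp.raise_for_status()
print(resp.json()["reply"])
```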
samples/managed-llm/app/static/index.html (new file)

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+<html>
+    <head>
+        <title>Ask the AI Model</title>
+        <script type="text/javascript" src="./static/app.js"></script>
+    </head>
+    <body>
+        <h1>Ask the AI Model</h1>
+        <form method="post" id="askForm" onsubmit="event.preventDefault(); submitForm(event);">
+            <textarea id="prompt" name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
+            onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();this.form.dispatchEvent(new Event('submit', {cancelable:true}));}"></textarea>
+            <br><br>
+            <input type="submit" value="Ask">
+        </form>
+        <hr>
+        <h2>Model's Reply:</h2>
+        <p id="reply"></p>
+    </body>
+</html>
```

samples/managed-llm/compose.local.yaml

Lines changed: 4 additions & 2 deletions

```diff
@@ -3,10 +3,12 @@ services:
     extends:
       file: compose.yaml
       service: app
+    volumes:
+      - ./app:/app
   llm:
     extends:
-        file: compose.yaml
-        service: llm
+      file: compose.yaml
+      service: llm
     # if using AWS Bedrock for local development, include this section:
     environment:
       - AWS_REGION=${AWS_REGION} # replace with your AWS region
```

samples/managed-llm/compose.yaml

Lines changed: 2 additions & 2 deletions

```diff
@@ -7,8 +7,8 @@ services:
       - "8000:8000"
     restart: always
     environment:
-      - ENDPOINT_URL=http://llm/api/v1/chat/completions # endpoint to the gateway service
-      - MODEL=default # LLM model ID used for the gateway. 
+      - LLM_URL=http://llm/api/v1/ # endpoint to the gateway service
+      - MODEL=default # LLM model ID used for the gateway.
       # For other models, see https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping
       - OPENAI_API_KEY=FAKE_TOKEN # the actual value will be ignored when using the gateway, but it should match the one in the llm service
     healthcheck:
```
