
Commit 2aebbf8 (2 parents: 4469657 + 8611066)

Merge pull request #369 from DefangLabs/linda-managed-llm

Add Managed LLM sample

File tree

12 files changed: +326 -0 lines changed

.github/workflows/deploy-changed-samples.yml

Lines changed: 1 addition & 0 deletions
```diff
@@ -82,6 +82,7 @@ jobs:
           TEST_MB_DB_PASS: ${{ secrets.TEST_MB_DB_PASS }}
           TEST_MB_DB_PORT: ${{ secrets.TEST_MB_DB_PORT }}
           TEST_MB_DB_USER: ${{ secrets.TEST_MB_DB_USER }}
+          TEST_MODEL: ${{ secrets.TEST_MODEL }}
           TEST_MONGO_INITDB_ROOT_USERNAME: ${{ secrets.TEST_MONGO_INITDB_ROOT_USERNAME }}
           TEST_MONGO_INITDB_ROOT_PASSWORD: ${{ secrets.TEST_MONGO_INITDB_ROOT_PASSWORD }}
           TEST_NC_DB: ${{ secrets.TEST_NC_DB }}
```
Lines changed: 2 additions & 0 deletions
```dockerfile
FROM mcr.microsoft.com/devcontainers/python:alpine3.13
```
Lines changed: 11 additions & 0 deletions
```json
{
  "build": {
    "dockerfile": "Dockerfile",
    "context": ".."
  },
  "features": {
    "ghcr.io/defanglabs/devcontainer-feature/defang-cli:1.0.4": {},
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/aws-cli:1": {}
  }
}
```

samples/managed-llm/.dockerignore

Lines changed: 14 additions & 0 deletions
```
# Default .dockerignore file for Defang
**/__pycache__
**/.git
**/.github
**/compose.*.yaml
**/compose.*.yml
**/compose.yaml
**/compose.yml
**/docker-compose.*.yaml
**/docker-compose.*.yml
**/docker-compose.yaml
**/docker-compose.yml
Dockerfile
*.Dockerfile
```
Lines changed: 25 additions & 0 deletions
```yaml
name: Deploy

on:
  push:
    branches:
      - main

jobs:
  deploy:
    environment: playground
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write

    steps:
      - name: Checkout Repo
        uses: actions/checkout@v4

      - name: Deploy
        uses: DefangLabs/[email protected]
        with:
          config-env-vars: MODEL
        env:
          MODEL: ${{ secrets.MODEL }}
```

samples/managed-llm/.gitignore

Lines changed: 3 additions & 0 deletions
```
.env
myenv
__pycache__/
```

samples/managed-llm/README.md

Lines changed: 74 additions & 0 deletions
# Managed LLM

[![1-click-deploy](https://raw.githubusercontent.com/DefangLabs/defang-assets/main/Logos/Buttons/SVG/deploy-with-defang.svg)](https://portal.defang.dev/redirect?url=https%3A%2F%2Fgithub.com%2Fnew%3Ftemplate_name%3Dsample-managed-llm-template%26template_owner%3DDefangSamples)

This sample application demonstrates the use of OpenAI-compatible Managed LLMs (Large Language Models) with Defang.

> Note: Using a Docker Model Provider? See our [*Managed LLM with Docker Model Provider*](https://github.com/DefangLabs/samples/tree/main/samples/managed-llm-provider) sample.

The managed LLM feature, provided by the Defang OpenAI Access Gateway, lets you use AWS Bedrock or Google Cloud Vertex AI through an OpenAI-compatible SDK, so you can switch from OpenAI to one of these cloud-native platforms without modifying your application code.
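As a minimal sketch of what that compatibility looks like in application code (this snippet is not part of the sample, which calls the HTTP API directly with `requests`; note that the SDK takes an API base URL, whereas this sample's `ENDPOINT_URL` holds the full chat-completions URL):

```python
# Minimal sketch, not part of this sample: the same OpenAI SDK code path works
# against OpenAI or the Defang OpenAI Access Gateway; only the environment changes.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),  # gateway base URL in production
    api_key=os.getenv("OPENAI_API_KEY", "unused-behind-the-gateway"),
)

completion = client.chat.completions.create(
    model=os.getenv("MODEL", "gpt-4-turbo"),
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```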
You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for local development and production environments.
* `MODEL` is the ID of the LLM model you are using.
* `ENDPOINT_URL` is the URL of the gateway that provides authenticated access to the LLM model.

Ensure you have enabled model access for the model you intend to use. To do this, check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access) settings.
## Defang OpenAI Access Gateway

In the `compose.yaml` file, the `llm` service routes requests to the LLM API. This service is known as the Defang OpenAI Access Gateway.

The `x-defang-llm` property on the `llm` service must be set to `true` in order to use the OpenAI Access Gateway when deploying with Defang.
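The `compose.yaml` itself is not shown in this diff; the following is an illustrative sketch of what such a service definition could look like (the gateway image name, ports, and service wiring are assumptions, not taken from this commit):

```yaml
# Illustrative sketch only; the actual compose.yaml is not part of this diff.
services:
  app:
    build:
      context: ./app
    ports:
      - "8000:8000"
    environment:
      MODEL: ${MODEL}
      # In production, point the app at the gateway service instead of OpenAI
      ENDPOINT_URL: http://llm/api/v1/chat/completions
  llm:
    image: defangio/openai-access-gateway  # assumed image name, for illustration
    x-defang-llm: true  # required for Defang to provision the gateway
```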
## Prerequisites

1. Download the [Defang CLI](https://github.com/DefangLabs/defang)
2. (Optional) If you are using [Defang BYOC](https://docs.defang.io/docs/concepts/defang-byoc), authenticate with your cloud provider account
3. (Optional, for local development) Install the [Docker CLI](https://docs.docker.com/engine/install/)

## Development

To run the application locally, use the following command:

```bash
docker compose -f compose.dev.yaml up --build
```

## Configuration

For this sample, you will need to provide the following [configuration](https://docs.defang.io/docs/concepts/configuration):

> Note: If you are using the 1-click deploy option, you can set these values as secrets in your GitHub repository and the action will deploy them for you.

### `MODEL`

The Model ID of the LLM you are using for your application. For example, `anthropic.claude-3-5-sonnet-20241022-v2:0`.

```bash
defang config set MODEL
```

## Deployment

> [!NOTE]
> Download the [Defang CLI](https://github.com/DefangLabs/defang)

### Defang Playground

Deploy your application to the Defang Playground by opening up your terminal and typing:

```bash
defang compose up
```

### BYOC

If you want to deploy to your own cloud account, you can [use Defang BYOC](https://docs.defang.io/docs/tutorials/deploy-to-your-cloud).

---

Title: Managed LLM

Short Description: An app using Managed LLMs with Defang's OpenAI Access Gateway.

Tags: LLM, OpenAI, Python, Bedrock, Vertex

Languages: Python

samples/managed-llm/app/Dockerfile

Lines changed: 22 additions & 0 deletions
```dockerfile
FROM public.ecr.aws/docker/library/python:3.12-slim

# Set working directory
WORKDIR /app

# Copy requirement files first (for better Docker layer caching)
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the code
COPY . .

# Expose the port that Uvicorn will run on
EXPOSE 8000

# Set environment variable for the port
ENV PORT=8000

# Run the app via a shell so the PORT environment variable is interpolated
CMD ["sh", "-c", "uvicorn app:app --host 0.0.0.0 --port $PORT"]
```

samples/managed-llm/app/app.py

Lines changed: 116 additions & 0 deletions
```python
import os
import json
import logging
from fastapi import FastAPI, Form
from fastapi.responses import HTMLResponse
import requests

app = FastAPI()

# Configure basic logging
logging.basicConfig(level=logging.INFO)

# Set the environment variables for the chat model
ENDPOINT_URL = os.getenv("ENDPOINT_URL", "https://api.openai.com/v1/chat/completions")
# Fall back to an OpenAI model if not set in the environment
MODEL_ID = os.getenv("MODEL", "gpt-4-turbo")

# Get the API key for the LLM.
# For development, you can use your local API key. In production, the LLM
# gateway service removes the need for it.
def get_api_key():
    return os.getenv("OPENAI_API_KEY", "API key not set")

# Home page form
@app.get("/", response_class=HTMLResponse)
async def home():
    return """
    <html>
      <head><title>Ask the AI Model</title></head>
      <body>
        <h1>Ask the AI Model</h1>
        <form method="post" action="/ask">
          <textarea name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
            onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();this.form.submit();}"></textarea>
          <br><br>
          <input type="submit" value="Ask">
        </form>
      </body>
    </html>
    """

# Handle form submission
@app.post("/ask", response_class=HTMLResponse)
async def ask(prompt: str = Form(...)):
    headers = {
        "Content-Type": "application/json"
    }

    api_key = get_api_key()
    headers["Authorization"] = f"Bearer {api_key}"

    payload = {
        "model": MODEL_ID,
        "messages": [
            {"role": "user", "content": prompt}
        ],
        "stream": False
    }

    # Log request details
    logging.info(f"Sending POST to {ENDPOINT_URL}")
    logging.info(f"Request Headers: {headers}")
    logging.info(f"Request Payload: {payload}")

    response = None
    reply = None
    try:
        # A timeout is needed for requests to ever raise Timeout below
        response = requests.post(ENDPOINT_URL, headers=headers, data=json.dumps(payload), timeout=60)
    except requests.exceptions.HTTPError as errh:
        reply = f"HTTP error: {errh}"
    except requests.exceptions.ConnectionError as errc:
        reply = f"Connection error: {errc}"
    except requests.exceptions.Timeout as errt:
        reply = f"Timeout error: {errt}"
    except requests.exceptions.RequestException as err:
        reply = f"Unexpected error: {err}"

    if response is not None:
        # logging.info(f"Response Status Code: {response.status_code}")
        # logging.info(f"Response Headers: {response.headers}")
        # logging.info(f"Response Body: {response.text}")
        if response.status_code == 200:
            data = response.json()
            try:
                reply = data["choices"][0]["message"]["content"]
            except (KeyError, IndexError):
                reply = "Model returned an unexpected response."
        elif response.status_code == 400:
            reply = f"Bad request: {response.status_code} - {response.text}"
        else:
            # Log error details
            reply = f"Error from server: {response.status_code} - {response.text}"
            logging.error(f"Error from server: {response.status_code} - {response.text}")

    # Return result
    return f"""
    <html>
      <head><title>Ask the AI Model</title></head>
      <body>
        <h1>Ask the AI Model</h1>
        <form method="post" action="/ask">
          <textarea name="prompt" autofocus="autofocus" rows="5" cols="60" placeholder="Enter your question here..."
            onkeydown="if(event.key==='Enter'&&!event.shiftKey){{event.preventDefault();this.form.submit();}}"></textarea><br><br>
          <input type="submit" value="Ask">
        </form>
        <h2>You Asked:</h2>
        <p>{prompt}</p>
        <hr>
        <h2>Model's Reply:</h2>
        <p>{reply}</p>
      </body>
    </html>
    """
```
Lines changed: 5 additions & 0 deletions
```
dotenv
fastapi
python-multipart
requests
uvicorn
```
