Commit e0d02ad

Merge pull request #372 from DefangLabs/linda-managed-llm-provider
Managed LLM (with Provider) Sample
2 parents: 2aebbf8 + 9e1de3a

12 files changed: +340 -0 lines changed

Lines changed: 2 additions & 0 deletions
FROM mcr.microsoft.com/devcontainers/python:alpine3.13
Lines changed: 11 additions & 0 deletions
{
  "build": {
    "dockerfile": "Dockerfile",
    "context": ".."
  },
  "features": {
    "ghcr.io/defanglabs/devcontainer-feature/defang-cli:1.0.4": {},
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/aws-cli:1": {}
  }
}
Lines changed: 14 additions & 0 deletions
# Default .dockerignore file for Defang
**/__pycache__
**/.git
**/.github
**/compose.*.yaml
**/compose.*.yml
**/compose.yaml
**/compose.yml
**/docker-compose.*.yaml
**/docker-compose.*.yml
**/docker-compose.yaml
**/docker-compose.yml
Dockerfile
*.Dockerfile
Lines changed: 25 additions & 0 deletions
name: Deploy

on:
  push:
    branches:
      - main

jobs:
  deploy:
    environment: playground
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write

    steps:
      - name: Checkout Repo
        uses: actions/checkout@v4

      - name: Deploy
        uses: DefangLabs/defang-github-action@v1 # action name restored; the exact pinned version was obscured in the source ("[email protected]")
        with:
          config-env-vars: MODEL
        env:
          MODEL: ${{ secrets.MODEL }}
Lines changed: 3 additions & 0 deletions
.env
myenv
__pycache__/
Lines changed: 73 additions & 0 deletions
# Managed LLM with Docker Model Provider

[![1-click-deploy](https://raw.githubusercontent.com/DefangLabs/defang-assets/main/Logos/Buttons/SVG/deploy-with-defang.svg)](https://portal.defang.dev/redirect?url=https%3A%2F%2Fgithub.com%2Fnew%3Ftemplate_name%3Dsample-managed-llm-provider-template%26template_owner%3DDefangSamples)

This sample application demonstrates using Managed LLMs with a Docker Model Provider, deployed with Defang.

> Note: This version uses a [Docker Model Provider](https://docs.docker.com/compose/how-tos/model-runner/#provider-services) for managing LLMs. For the version that uses Defang's [OpenAI Access Gateway](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway), please see our [*Managed LLM Sample*](https://github.com/DefangLabs/samples/tree/main/samples/managed-llm) instead.

The Docker Model Provider lets your application use AWS Bedrock or Google Cloud Vertex AI models. It is declared as a service in the `compose.yaml` file.

You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for local development and production environments (a sketch follows this list):
* `MODEL` is the ID of the LLM model you are using.
* `ENDPOINT_URL` is the endpoint that provides authenticated access to the LLM model.
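
For local development, one way to set these is an `environment` block on the app service in `compose.dev.yaml`. A minimal sketch with placeholder values (the PR's actual compose files are not shown in the diff above):

```yaml
# Hypothetical compose.dev.yaml excerpt -- values are placeholders, not from this PR
services:
  app:
    build: ./app
    environment:
      MODEL: "ai/llama3.2"                                # placeholder model ID
      ENDPOINT_URL: "http://llm/api/v1/chat/completions"  # placeholder endpoint
```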

Ensure you have enabled model access for the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).

### Docker Model Provider

In the `compose.yaml` file, the `llm` service routes requests to the LLM API using a [Docker Model Provider](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#docker-model-provider-services).

The `x-defang-llm` property on the `llm` service must be set to `true` in order to use the Docker Model Provider when deploying with Defang.
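
The PR's `compose.yaml` itself does not appear in the diff above, so here is a rough sketch of how such a provider service can be wired up (the service layout, option names, and values below are illustrative assumptions, not copied from this PR):

```yaml
# Illustrative sketch only -- not the compose.yaml from this PR
services:
  app:
    build: ./app
    ports:
      - "8000:8000"
    depends_on:
      - llm

  llm:
    x-defang-llm: true   # required so Defang provisions a managed LLM on deploy
    provider:
      type: model        # declares this service as a Docker Model Provider
      options:
        model: ${MODEL}  # model ID, e.g. set with `defang config set MODEL`
```
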
## Prerequisites

1. Download the [Defang CLI](https://github.com/DefangLabs/defang)
2. (Optional) If you are using [Defang BYOC](https://docs.defang.io/docs/concepts/defang-byoc), authenticate with your cloud provider account
3. (Optional, for local development) Install the [Docker CLI](https://docs.docker.com/engine/install/)

## Development

To run the application locally, use the following command:

```bash
docker compose -f compose.dev.yaml up --build
```

## Configuration

For this sample, you will need to provide the following [configuration](https://docs.defang.io/docs/concepts/configuration):

> Note: If you are using the 1-click deploy option, you can set these values as secrets in your GitHub repository and the action will automatically deploy them for you.

### `MODEL`

The model ID of the LLM you are using for your application. For example, `anthropic.claude-3-5-sonnet-20241022-v2:0`.

```bash
defang config set MODEL
```

## Deployment

> [!NOTE]
> Download the [Defang CLI](https://github.com/DefangLabs/defang)

### Defang Playground

Deploy your application to the Defang Playground by opening up your terminal and typing:

```bash
defang compose up
```

### BYOC

If you want to deploy to your own cloud account, you can [use Defang BYOC](https://docs.defang.io/docs/tutorials/deploy-to-your-cloud).

---

Title: Managed LLM with Docker Model Provider

Short Description: An app using Managed LLMs with a Docker Model Provider, deployed with Defang.

Tags: LLM, Python, Bedrock, Vertex, Docker Model Provider

Languages: Python
Lines changed: 27 additions & 0 deletions
# Default .dockerignore file for Defang
**/__pycache__
**/.direnv
**/.DS_Store
**/.envrc
**/.git
**/.github
**/.idea
**/.next
**/.vscode
**/compose.*.yaml
**/compose.*.yml
**/compose.yaml
**/compose.yml
**/docker-compose.*.yaml
**/docker-compose.*.yml
**/docker-compose.yaml
**/docker-compose.yml
**/node_modules
**/Thumbs.db
Dockerfile
*.Dockerfile
# Ignore our own binary, but only in the root to avoid ignoring subfolders
defang
defang.exe
# Ignore our project-level state
.defang
Lines changed: 22 additions & 0 deletions
FROM public.ecr.aws/docker/library/python:3.12-slim

# Set working directory
WORKDIR /app

# Copy requirement files first (for better Docker layer caching)
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the code
COPY . .

# Expose the port that Uvicorn will run on
EXPOSE 8000

# Set environment variable for the port
ENV PORT=8000

# Run the app via `sh -c` so the $PORT environment variable is interpolated
CMD ["sh", "-c", "uvicorn app:app --host 0.0.0.0 --port $PORT"]
Lines changed: 116 additions & 0 deletions
import os
import json
import logging

import requests
from fastapi import FastAPI, Form
from fastapi.responses import HTMLResponse

app = FastAPI()

# Configure basic logging
logging.basicConfig(level=logging.INFO)

# Set the environment variables for the chat model
ENDPOINT_URL = os.getenv("ENDPOINT_URL", "https://api.openai.com/v1/chat/completions")
# Fall back to an OpenAI model if MODEL is not set in the environment
MODEL_ID = os.getenv("MODEL", "gpt-4-turbo")

# Get the API key for the LLM.
# For development, you can use your local API key. In production, the LLM
# gateway service will override the need for it.
def get_api_key():
    return os.getenv("OPENAI_API_KEY", "API key not set")

# Home page form
@app.get("/", response_class=HTMLResponse)
async def home():
    return """
    <html>
      <head><title>Ask the AI Model</title></head>
      <body>
        <h1>Ask the AI Model</h1>
        <form method="post" action="/ask">
          <textarea name="prompt" autofocus rows="5" cols="60" placeholder="Enter your question here..."
            onkeydown="if(event.key==='Enter'&&!event.shiftKey){event.preventDefault();this.form.submit();}"></textarea>
          <br><br>
          <input type="submit" value="Ask">
        </form>
      </body>
    </html>
    """

# Handle form submission
@app.post("/ask", response_class=HTMLResponse)
async def ask(prompt: str = Form(...)):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {get_api_key()}",
    }

    payload = {
        "model": MODEL_ID,
        "messages": [
            {"role": "user", "content": prompt}
        ],
        "stream": False,
    }

    # Log request details
    logging.info(f"Sending POST to {ENDPOINT_URL}")
    logging.info(f"Request Headers: {headers}")
    logging.info(f"Request Payload: {payload}")

    response = None
    reply = None
    try:
        # A timeout is set so the Timeout handler below can actually trigger
        response = requests.post(ENDPOINT_URL, headers=headers, data=json.dumps(payload), timeout=60)
    except requests.exceptions.HTTPError as errh:
        reply = f"HTTP error: {errh}"
    except requests.exceptions.ConnectionError as errc:
        reply = f"Connection error: {errc}"
    except requests.exceptions.Timeout as errt:
        reply = f"Timeout error: {errt}"
    except requests.exceptions.RequestException as err:
        reply = f"Unexpected error: {err}"

    if response is not None:
        # logging.info(f"Response Status Code: {response.status_code}")
        # logging.info(f"Response Headers: {response.headers}")
        # logging.info(f"Response Body: {response.text}")
        if response.status_code == 200:
            data = response.json()
            try:
                reply = data["choices"][0]["message"]["content"]
            except (KeyError, IndexError):
                reply = "Model returned an unexpected response."
        elif response.status_code == 400:
            reply = f"Client error: {response.status_code} - {response.text}"
        else:
            # Log error details
            reply = f"Error from server: {response.status_code} - {response.text}"
            logging.error(f"Error from server: {response.status_code} - {response.text}")

    # Return result
    return f"""
    <html>
      <head><title>Ask the AI Model</title></head>
      <body>
        <h1>Ask the AI Model</h1>
        <form method="post" action="/ask">
          <textarea name="prompt" autofocus rows="5" cols="60" placeholder="Enter your question here..."
            onkeydown="if(event.key==='Enter'&&!event.shiftKey){{event.preventDefault();this.form.submit();}}"></textarea>
          <br><br>
          <input type="submit" value="Ask">
        </form>
        <h2>You Asked:</h2>
        <p>{prompt}</p>
        <hr>
        <h2>Model's Reply:</h2>
        <p>{reply}</p>
      </body>
    </html>
    """
Lines changed: 5 additions & 0 deletions
dotenv
fastapi
python-multipart
requests
uvicorn

0 commit comments