
Commit 4753ef6

update for llm deployment and model mapping

1 parent b382aca commit 4753ef6

File tree

5 files changed: +205 -25 lines changed


blog/2025-04-11-mar-product-updates.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Wow - another month has gone by, time flies when you're having fun!
Let us share some important updates regarding what we achieved at Defang in March:

- **Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps-aws-bedrock) without making any changes to your code. Support for GCP and DigitalOcean coming soon.
+ **Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon.

**Defang Pulumi Provider:** Last month, we announced a preview of the [Defang Pulumi Provider](https://github.com/DefangLabs/pulumi-defang), and this month we are excited to announce that V1 is now available in the [Pulumi Registry](https://www.pulumi.com/registry/packages/defang/). As much as we love Docker, we realize there are many real-world apps that have components that (currently) cannot be described completely in a Compose file. With the Defang Pulumi Provider, you can now leverage [the declarative simplicity of Defang with the imperative power of Pulumi](https://docs.defang.io/docs/concepts/pulumi#when-to-use-the-defang-pulumi-provider).

docs/concepts/managed-llms/openai-access-gateway.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ sidebar_position: 3000
Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer.
It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response.

- See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application
+ See [our tutorial](/docs/tutorials/deploying-openai-apps), which describes how to configure the OpenAI Access Gateway for your application.

## Docker Provider Services

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
---
title: Deploying your OpenAI Application to AWS Bedrock
sidebar_position: 50
---

# Deploying Your OpenAI Application to AWS Bedrock

Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**.

This tutorial shows you how **Defang** makes it easy.

Suppose you start with a compose file like this:

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

---

## Add an LLM Service to Your Compose File

You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock).

Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:

```diff
+ llm:
+   image: defangio/openai-access-gateway
+   x-defang-llm: true
+   ports:
+     - target: 80
+       published: 80
+       mode: host
+   environment:
+     - OPENAI_API_KEY
+     - REGION
```

### Notes:

- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use the target platform's AI services.
- New environment variables:
  - `REGION` is the region where the service runs (for AWS, this is the equivalent of `AWS_REGION`)

:::tip
**OpenAI Key**

You no longer need your original OpenAI API Key.
We recommend generating a random secret for authentication with the gateway:

```bash
defang config set OPENAI_API_KEY --random
```
:::

---

## Redirect Application Traffic

Modify your `app` service to send API calls to the `openai-access-gateway`:

```diff
services:
  app:
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
+     OPENAI_BASE_URL: "http://llm/api/v1"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Now, all OpenAI traffic will be routed through your gateway service and on to AWS Bedrock.
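To sanity-check the wiring, you can call the gateway's OpenAI-compatible endpoint directly from another service on the internal network. A minimal sketch, assuming the `/api/v1` base path shown above and the `OPENAI_API_KEY` secret you generated earlier:

```bash
# Hypothetical smoke test against the gateway's OpenAI-compatible API:
curl http://llm/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "messages": [{"role": "user", "content": "Hello from Defang!"}]
      }'
```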
---

## Selecting a Model

You should configure your application to specify the model you want to use.

```diff
services:
  app:
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
+     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Choose the correct `MODEL` depending on which cloud provider you are using.

:::info
**Choosing the Right Model**

- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
:::

Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest equivalent on the target platform. If no such match can be found, a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These are controlled by the environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
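As a sketch of what that could look like on the gateway service (both variables are optional, and the `ai/mistral` fallback value here is illustrative):

```diff
  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    environment:
      - OPENAI_API_KEY
      - REGION
+     - USE_MODEL_MAPPING=true    # default; shown only for clarity
+     - FALLBACK_MODEL=ai/mistral # illustrative fallback if no close Bedrock match exists
```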

# Complete Example Compose File

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]

  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    ports:
      - target: 80
        published: 80
        mode: host
    environment:
      - OPENAI_API_KEY
      - REGION
```

---

# Environment Variable Matrix

| Variable | AWS Bedrock |
|----------|-------------|
| `REGION` | Required |
| `MODEL`  | Bedrock model ID / Docker model name |

---

You now have a single app that can:

- Talk to **AWS Bedrock**
- Use the same OpenAI-compatible client code
- Easily switch cloud providers by changing a few environment variables

docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx renamed to docs/tutorials/deploying-openai-apps-gcp-vertex.mdx

Lines changed: 24 additions & 23 deletions
@@ -1,11 +1,11 @@
---
- title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+ title: Deploying your OpenAI Application to GCP Vertex AI
sidebar_position: 50
---

- # Deploying Your OpenAI Application to AWS Bedrock or GCP Vertex AI
+ # Deploying Your OpenAI Application to GCP Vertex AI

- Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**.
+ Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **GCP Vertex AI**.

This tutorial shows you how **Defang** makes it easy.
@@ -28,7 +28,7 @@ services:
## Add an LLM Service to Your Compose File

- You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).
+ You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex).

Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:
@@ -42,7 +42,7 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce
+   mode: host
+   environment:
+     - OPENAI_API_KEY
- +     - GCP_PROJECT_ID # if using GCP Vertex AI
+ +     - GCP_PROJECT_ID
+     - REGION
```

@@ -51,8 +51,8 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce
- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services.
- New environment variables:
-  - `REGION` is the zone where the services runs (for AWS this is equvilent of AWS_REGION)
-  - `GCP_PROJECT_ID` is needed if using **Vertex AI**. (e.g. `GCP_PROJECT_ID` = my-project-456789 and `REGION` = us-central1)
+  - `REGION` is the region where the service runs (e.g. us-central1)
+  - `GCP_PROJECT_ID` is the project to deploy to (e.g. my-project-456789)

:::tip
**OpenAI Key**
@@ -83,7 +83,7 @@ Modify your `app` service to send API calls to the `openai-access-gateway`:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

- Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock or GCP Vertex.
+ Now, all OpenAI traffic will be routed through your gateway service and on to GCP Vertex AI.

---

@@ -99,24 +99,25 @@ You should configure your application to specify the model you want to use.
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
- +     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # for Bedrock
- +     # MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
+ +     MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Choose the correct `MODEL` depending on which cloud provider you are using.

- Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively.

:::info
**Choosing the Right Model**

- - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+ :::

+ Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model on the target platform. If no such match can be found, it can fall back to a known existing model (e.g. `ai/mistral`). These are controlled by the environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
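With mapping enabled, the app can stay provider-neutral by naming a Docker model and letting the gateway resolve it. A sketch, assuming the gateway maps `ai/llama3.3` to a reasonable Vertex AI equivalent and that the `ai/mistral` fallback is illustrative:

```diff
services:
  app:
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
+     MODEL: "ai/llama3.3" # Docker model name, resolved by the gateway

  llm:
    environment:
      - OPENAI_API_KEY
      - GCP_PROJECT_ID
      - REGION
+     - FALLBACK_MODEL=ai/mistral # illustrative fallback if no close match exists
```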
# Complete Example Compose File
```yaml
@@ -129,7 +130,7 @@ services:
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
-     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # or your Vertex AI model path
+     MODEL: "google/gemini-2.5-pro-preview-03-25"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
@@ -142,25 +143,25 @@ services:
        mode: host
    environment:
      - OPENAI_API_KEY
-     - GCP_PROJECT_ID # required if using Vertex AI
+     - GCP_PROJECT_ID # required if using GCP Vertex AI
      - REGION
```

---

# Environment Variable Matrix

- | Variable | AWS Bedrock | GCP Vertex AI |
- |--------------------|-------------|---------------|
- | `GCP_PROJECT_ID` | _(not used)_ | Required |
- | `REGION` | Required | Required |
- | `MODEL` | Bedrock model ID / Docker model name | Vertex model / Docker model name |
+ | Variable | GCP Vertex AI |
+ |--------------------|---------------|
+ | `GCP_PROJECT_ID` | Required |
+ | `REGION` | Required |
+ | `MODEL` | Vertex model / Docker model name |

---

You now have a single app that can:

- - Talk to **AWS Bedrock** or **GCP Vertex AI**
+ - Talk to **GCP Vertex AI**
- Use the same OpenAI-compatible client code
- Easily switch cloud providers by changing a few environment variables
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
---
title: Deploying your OpenAI Application
sidebar_position: 50
---

# Deploying Your OpenAI Application

Defang currently supports managed LLMs on AWS Bedrock and GCP Vertex AI. Follow the link below for your platform:

- [AWS Bedrock](/docs/tutorials/deploying-openai-apps-aws-bedrock/)
- [GCP Vertex AI](/docs/tutorials/deploying-openai-apps-gcp-vertex/)
