
Commit fbf3244

Merge pull request #230 from DefangLabs/linda-fix-llm-docs
Improvements to Managed LLM docs
2 parents 8158798 + 59ced8e commit fbf3244

File tree: 7 files changed (+252, -40 lines)


blog/2025-04-11-mar-product-updates.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Wow - another month has gone by, time flies when you're having fun!
 Let us share some important updates regarding what we achieved at Defang in March:
 
-**Managed LLMs:** One of the coolest features we have released in a while is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps-aws-bedrock) without making any changes to your code. Support for GCP and DigitalOcean coming soon.
+**Managed LLMs:** One of the coolest features we have released in a while is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploy-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon.
 
 **Defang Pulumi Provider:** Last month, we announced a preview of the [Defang Pulumi Provider](https://github.com/DefangLabs/pulumi-defang), and this month we are excited to announce that V1 is now available in the [Pulumi Registry](https://www.pulumi.com/registry/packages/defang/). As much as we love Docker, we realize there are many real-world apps that have components that (currently) cannot be described completely in a Compose file. With the Defang Pulumi Provider, you can now leverage [the declarative simplicity of Defang with the imperative power of Pulumi](https://docs.defang.io/docs/concepts/pulumi#when-to-use-the-defang-pulumi-provider).

docs/concepts/managed-llms/managed-language-models.md

Lines changed: 19 additions & 9 deletions
@@ -8,12 +8,27 @@ sidebar_position: 3000
 Each cloud provider offers their own managed Large Language Model services. AWS offers Bedrock, GCP offers Vertex AI, and DigitalOcean offers their GenAI platform. Defang makes it easy to leverage these services in your projects.
 
+## Current Support
+
+| Provider | Managed Language Models |
+| --- | --- |
+| [Playground](/docs/providers/playground#managed-large-language-models) ||
+| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) ||
+| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) ||
+| [GCP Vertex AI](/docs/providers/gcp#managed-large-language-models) ||
+
 ## Usage
 
 In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the appropriate roles and permissions for you.
 
 ## Example
 
+:::info
+Ensure you have enabled model access for the model you intend to use:
+* [Configure AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html)
+* [Configure GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access)
+:::
+
 Assume you have a web service like the following, which uses the cloud-native SDK, for example:
 
 ```diff
@@ -26,13 +41,8 @@ Assume you have a web service like the following, which uses the cloud native SDK
 
 ## Deploying OpenAI-compatible apps
 
-If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway)
-
-## Current Support
-
-| Provider | Managed Language Models |
-| --- | --- |
-| [Playground](/docs/providers/playground#managed-large-language-models) ||
-| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) ||
-| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) ||
-| [GCP Vertex AI](/docs/providers/gcp#managed-large-language-models) ||
+If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway).
+
+:::tip
+Defang has a [*Managed LLM sample*](https://github.com/DefangLabs/samples/tree/main/samples/managed-llm) that uses the OpenAI Access Gateway, and a [*Managed LLM with Docker Model Provider sample*](https://github.com/DefangLabs/samples/tree/main/samples/managed-llm-provider) that uses a Docker Model Provider.
+:::
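
As a quick illustration of the `x-defang-llm` usage this page's diff describes, here is a minimal `compose.yaml` sketch; the service name, build context, and port are illustrative assumptions, not part of this commit:

```yaml
services:
  app:
    build:
      context: .        # hypothetical app that calls the cloud-native LLM SDK directly
    ports:
      - 8080:8080
    x-defang-llm: true  # Defang configures the roles and permissions for the managed LLM service
```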

docs/concepts/managed-llms/openai-access-gateway.md

Lines changed: 11 additions & 2 deletions
@@ -9,9 +9,9 @@ sidebar_position: 3000
 Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer.
 It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and reconstructs an OpenAI-compatible response.
 
-See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application
+See [our tutorial](/docs/tutorials/deploy-openai-apps), which describes how to configure the OpenAI Access Gateway for your application.
 
-## Docker Provider Services
+## Docker Model Provider Services
 
 As of Docker Compose v2.35 and Docker Desktop v4.41, Compose introduces a new service type called `provider` that allows you to declare platform capabilities required by your application.
 For AI models, you use the `model` type to declare model dependencies. This will expose an OpenAI-compatible API for your service. Check the [Docker Model Runner documentation](https://docs.docker.com/compose/how-tos/model-runner/) for more details.
 
@@ -32,8 +32,17 @@ services:
 ```
 
 Under the hood, when you use the `model` provider, Defang will deploy the **OpenAI Access Gateway** in a private network. This allows you to use the same code for both local development and cloud deployment.
+
 The `x-defang-llm` extension is used to configure the appropriate roles and permissions for your service. See the [Managed Language Models](/docs/concepts/managed-llms/managed-language-models/) page for more details.
 
+## Model Mapping
+
+Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found, it can fall back to a known existing model (e.g. `ai/mistral`).
+
+This can be configured through the following environment variables:
+* `USE_MODEL_MAPPING` (defaults to true) - configures whether or not model mapping should be enabled.
+* `FALLBACK_MODEL` (no default) - configures a model which will be used if model mapping fails to find a target model.
 
 ## Current Support
 
 | Provider | Managed Language Models |
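
As a minimal sketch of the Docker Model Provider pattern the renamed section describes, following the Docker Model Runner syntax linked above; the service names and model are illustrative assumptions:

```yaml
services:
  app:
    build:
      context: .
    depends_on:
      - llm        # Compose injects the model's connection details (URL, model name) into dependents
  llm:
    provider:
      type: model  # declares a model dependency; exposes an OpenAI-compatible API
      options:
        model: ai/llama3.3
    x-defang-llm: true  # on cloud deploys, Defang backs this with the OpenAI Access Gateway
```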
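And a sketch of the new model-mapping variables applied to a gateway service; the variable names come from the diff above, while the image and values are illustrative assumptions:

```yaml
services:
  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    environment:
      - USE_MODEL_MAPPING=true     # defaults to true; maps Docker-style names (ai/llama3.3) to platform model IDs
      - FALLBACK_MODEL=ai/mistral  # no default; used when no close match exists on the target platform
```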
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+  "label": "Deploy OpenAI Apps on Managed LLMs",
+  "position": 45,
+  "collapsible": true
+}

docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx renamed to docs/tutorials/deploy-openai-apps/aws-bedrock.mdx

Lines changed: 32 additions & 28 deletions
@@ -1,15 +1,21 @@
 ---
-title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+title: AWS Bedrock
 sidebar_position: 50
+description: Deploy OpenAI Apps to AWS Bedrock using Defang.
 ---
+import React from 'react';
+import {useColorMode} from '@docusaurus/theme-common';
 
-# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+# Deploy OpenAI Apps to AWS Bedrock
 
-Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**.
+Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**.
 
 This tutorial shows you how **Defang** makes it easy.
+:::info
+You must [configure AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) for each model you intend to use in your AWS account.
+:::
 
-Suppose you start with a compose file like this:
+Suppose you start with a `compose.yaml` file with one `app` service, like this:
 
 ```yaml
 services:
@@ -28,9 +34,7 @@ services:
 
 ## Add an LLM Service to Your Compose File
 
-You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).
-
-Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:
+You can use AWS Bedrock without changing your `app` code by introducing a new [`defangio/openai-access-gateway`](https://github.com/DefangLabs/openai-access-gateway) service. We'll call the new service `llm`. It will act as a proxy between your application and AWS Bedrock, transparently converting your OpenAI requests into AWS Bedrock requests and Bedrock responses into OpenAI responses. This allows you to use AWS Bedrock with your existing OpenAI client SDK.
 
 ```diff
 + llm:
@@ -42,16 +46,15 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce
4246
+ mode: host
4347
+ environment:
4448
+ - OPENAI_API_KEY
45-
+ - GCP_PROJECT_ID # if using GCP Vertex AI
46-
+ - GCP_REGION # if using GCP Vertex AI, AWS_REGION not necessary for Bedrock
49+
+ - REGION
4750
```
4851

4952
### Notes:
5053

5154
- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
5255
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services.
5356
- New environment variables:
54-
- `GCP_PROJECT_ID` and `GCP_REGION` are needed if using **Vertex AI**. (e.g.` GCP_PROJECT_ID` = my-project-456789 and `GCP_REGION` = us-central1)
57+
- `REGION` is the zone where the services runs (for AWS, this is the equivalent of AWS_REGION)
5558

5659
:::tip
5760
**OpenAI Key**
@@ -82,7 +85,7 @@ Modify your `app` service to send API calls to the `openai-access-gateway`:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
 ```
 
-Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock or GCP Vertex.
+Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock.
 
 ---
 

@@ -98,8 +101,7 @@ You should configure your application to specify the model you want to use.
     environment:
       OPENAI_API_KEY:
       OPENAI_BASE_URL: "http://llm/api/v1"
-+     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # for Bedrock
-+     # MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
++     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
 ```
@@ -109,10 +111,15 @@ Choose the correct `MODEL` depending on which cloud provider you are using.
 :::info
 **Choosing the Right Model**
 
-- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
-- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
 :::
 
-# Complete Example Compose File
+Alternatively, Defang supports [model mapping](/docs/concepts/managed-llms/openai-access-gateway/#model-mapping) through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest equivalent on the target platform. If no such match can be found, a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
+
+## Complete Example Compose File
 
 ```yaml
 services:
@@ -124,7 +131,7 @@ services:
     environment:
       OPENAI_API_KEY:
       OPENAI_BASE_URL: "http://llm/api/v1"
-      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # or your Vertex AI model path
+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
@@ -137,25 +144,22 @@ services:
     mode: host
     environment:
       - OPENAI_API_KEY
-      - GCP_PROJECT_ID # required if using Vertex AI
-      - GCP_REGION # required if using Vertex AI
+      - REGION
 ```
 
 ---
 
-# Environment Variable Matrix
+## Environment Variable Matrix
 
-| Variable | AWS Bedrock | GCP Vertex AI |
-|--------------------|-------------|---------------|
-| `GCP_PROJECT_ID` | _(not used)_ | Required |
-| `GCP_REGION` | _(not used)_ | Required |
-| `MODEL` | Bedrock model ID | Vertex model path |
+| Variable | AWS Bedrock |
+|----------|-------------|
+| `REGION` | Required |
+| `MODEL` | Bedrock model ID or Docker model name, for example `meta.llama3-3-70b-instruct-v1:0` or `ai/llama3.3` |
 
 ---
 
 You now have a single app that can:
 
-- Talk to **AWS Bedrock** or **GCP Vertex AI**
+- Talk to **AWS Bedrock**
 - Use the same OpenAI-compatible client code
-- Easily switch cloud providers by changing a few environment variables
+- Easily switch between models or cloud providers by changing a few environment variables
Lines changed: 12 additions & 0 deletions
 
@@ -0,0 +1,12 @@
+---
+title: Deploy OpenAI Apps on Managed LLMs
+sidebar_position: 45
+description: Deploy OpenAI Apps on Managed LLMs
+---
+
+# Deploy OpenAI Apps on Managed LLMs
+
+Defang currently supports using Managed LLMs on AWS Bedrock and GCP Vertex AI. Follow the links below for your specific platform.
+
+- [AWS Bedrock](/docs/tutorials/deploy-openai-apps/aws-bedrock/)
+- [GCP Vertex AI](/docs/tutorials/deploy-openai-apps/gcp-vertex/)
