2 changes: 1 addition & 1 deletion docs/concepts/managed-llms/managed-language-models.md
@@ -35,4 +35,4 @@ If you already have an OpenAI-compatible application, Defang makes it easy to de
| [Playground](/docs/providers/playground#managed-large-language-models) | ❌ |
| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
-| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ❌ |
+| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ✅ |
4 changes: 2 additions & 2 deletions docs/concepts/managed-llms/openai-access-gateway.md
@@ -15,6 +15,6 @@ See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock/) which des
| Provider | Managed Language Models |
| --- | --- |
| [Playground](/docs/providers/playground#managed-services) | ❌ |
-| [AWS Bedrock](/docs/providers/aws#managed-storage) | ✅ |
+| [AWS Bedrock](/docs/providers/aws#managed-llms) | ✅ |
| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
-| [GCP Vertex](/docs/providers/gcp#future-improvements) | ❌ |
+| [GCP Vertex](/docs/providers/gcp#managed-llms) | ✅ |
4 changes: 1 addition & 3 deletions docs/providers/gcp.md
@@ -61,9 +61,7 @@ The GCP provider does not currently support storing sensitive config values.

### Managed LLMs

-Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any services which use the Bedrock SDKs.
-
-When using [Managed LLMs](/docs/concepts/managed-llms/managed-language-models.md), the Defang CLI provisions an ElastiCache Redis cluster in your account.
+Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any services which use the [Google Vertex AI SDKs](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-sdk).

### Future Improvements

161 changes: 161 additions & 0 deletions docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
@@ -0,0 +1,161 @@
---
title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
sidebar_position: 50
---

# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI

Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, using **AWS Bedrock** or **GCP Vertex AI** as the LLM backend.

This tutorial shows you how **Defang** makes it easy.

Suppose you start with a compose file like this:

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

---

## Add an LLM Service to Your Compose File

You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).

Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:

```diff
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    ports:
+      - target: 80
+        published: 80
+        mode: host
+    environment:
+      - OPENAI_API_KEY
+      - GCP_PROJECT_ID  # if using GCP Vertex AI
+      - GCP_REGION      # if using GCP Vertex AI (AWS_REGION is not needed for Bedrock)
```

### Notes:

- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use the target platform's managed AI services.
- New environment variables:
  - `GCP_PROJECT_ID` and `GCP_REGION` are required when using **Vertex AI** (e.g., `GCP_PROJECT_ID=my-project-456789` and `GCP_REGION=us-central1`).

:::tip
**OpenAI Key**

You no longer need your original OpenAI API Key.
We recommend generating a random secret for authentication with the gateway:

```bash
defang config set OPENAI_API_KEY --random
```
:::

---

## Redirect Application Traffic

Modify your `app` service to send API calls to the `openai-access-gateway`:

```diff
 services:
   app:
     ports:
       - 3000:3000
     environment:
       OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Now all OpenAI traffic is routed through your gateway service and on to AWS Bedrock or GCP Vertex AI.
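
No application code changes are required beyond the environment. For instance, here is a minimal sketch assuming the official `openai` Python package (v1+), which reads both `OPENAI_API_KEY` and `OPENAI_BASE_URL` from the environment:

```python
from openai import OpenAI

# With OPENAI_BASE_URL pointing at http://llm/api/v1, every request goes to
# the openai-access-gateway service instead of api.openai.com.
client = OpenAI()  # picks up OPENAI_API_KEY and OPENAI_BASE_URL automatically
```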

---

## Selecting a Model

You should configure your application to specify the model you want to use.

```diff
 services:
   app:
     ports:
       - 3000:3000
     environment:
       OPENAI_API_KEY:
       OPENAI_BASE_URL: "http://llm/api/v1"
+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"  # for AWS Bedrock
+      # MODEL: "google/gemini-2.5-pro-preview-03-25"    # for GCP Vertex AI
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Choose the correct `MODEL` depending on which cloud provider you are using.

:::info
**Choosing the Right Model**

- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`). [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup).
:::
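
Putting the pieces together, here is a minimal end-to-end sketch of the app side, again assuming the official `openai` Python package and the `MODEL` variable from the compose file; only the environment differs between Bedrock and Vertex AI:

```python
import os

from openai import OpenAI

client = OpenAI()  # OPENAI_API_KEY and OPENAI_BASE_URL come from the environment

# MODEL holds a Bedrock model ID on AWS or a Vertex model path on GCP;
# the request itself is identical for both providers.
response = client.chat.completions.create(
    model=os.environ["MODEL"],
    messages=[{"role": "user", "content": "Hello from Defang!"}],
)
print(response.choices[0].message.content)
```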

## Complete Example Compose File

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"  # or your Vertex AI model path
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]

  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    ports:
      - target: 80
        published: 80
        mode: host
    environment:
      - OPENAI_API_KEY
      - GCP_PROJECT_ID  # required if using Vertex AI
      - GCP_REGION      # required if using Vertex AI
```
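
With both services defined, deploying should be a single `defang compose up`; the CLI detects the `x-defang-llm` extension and configures access to the matching Bedrock or Vertex AI service.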

---

## Environment Variable Matrix

| Variable | AWS Bedrock | GCP Vertex AI |
|--------------------|-------------|---------------|
| `GCP_PROJECT_ID` | _(not used)_| Required |
| `GCP_REGION` | _(not used)_| Required |
| `MODEL` | Bedrock model ID | Vertex model path |

---

You now have a single app that can:

- Talk to **AWS Bedrock** or **GCP Vertex AI**
- Use the same OpenAI-compatible client code
- Easily switch cloud providers by changing a few environment variables

116 changes: 0 additions & 116 deletions docs/tutorials/deploying-openai-apps-aws-bedrock.mdx

This file was deleted.