DefangLabs · commit111 · May 16, 2025 · May 15, 2025 · May 15, 2025 · May 15, 2025
@@ -25,7 +25,7 @@ Wow - another month has gone by, time flies when you're having fun!
 
 Let us share some important updates regarding what we achieved at Defang in March:
 
-**Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps-aws-bedrock) without making any changes to your code. Support for GCP and DigitalOcean coming soon.
+**Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploy-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon.
 
 **Defang Pulumi Provider:** Last month, we announced a preview of the [Defang Pulumi Provider](https://github.com/DefangLabs/pulumi-defang), and this month we are excited to announce that V1 is now available in the [Pulumi Registry](https://www.pulumi.com/registry/packages/defang/). As much as we love Docker, we realize there are many real-world apps that have components that (currently) cannot be described completely in a Compose file. With the Defang Pulumi Provider, you can now leverage [the declarative simplicity of Defang with the imperative power of Pulumi](https://docs.defang.io/docs/concepts/pulumi#when-to-use-the-defang-pulumi-provider).
 

@@ -8,10 +8,23 @@ sidebar_position: 3000
 
 Each cloud provider offers their own managed Large Language Model services. AWS offers Bedrock, GCP offers Vertex AI, and Digital Ocean offers their GenAI platform. Defang makes it easy to leverage these services in your projects.
 
+## Current Support
+
+| Provider | Managed Language Models |
+| --- | --- |
+| [Playground](/docs/providers/playground#managed-large-language-models) | ✅ |
+| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
+| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
+| [GCP Vertex AI](/docs/providers/gcp#managed-large-language-models) | ✅ |
+
 ## Usage
 
 In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the appropriate roles and permissions for you.
 
+:::tip
+Ensure you have the necessary permissions to access the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
+:::
+
 ## Example
 
 Assume you have a web service like the following, which uses the cloud native SDK, for example:
@@ -27,12 +40,3 @@ Assume you have a web service like the following, which uses the cloud native SD
 ## Deploying OpenAI-compatible apps
 
 If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway)
-
-## Current Support
-
-| Provider | Managed Language Models |
-| --- | --- |
-| [Playground](/docs/providers/playground#managed-large-language-models) | ✅ |
-| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
-| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
-| [GCP Vertex AI](/docs/providers/gcp#managed-large-language-models) | ✅ |
@@ -9,7 +9,7 @@ sidebar_position: 3000
 Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer.
 It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response.
 
-See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application
+See [our tutorial](/docs/tutorials/deploy-openai-apps), which describes how to configure the OpenAI Access Gateway for your application.
 
 ## Docker Provider Services
 
@@ -32,8 +32,13 @@ services:
 ```
 
 Under the hood, when you use the `model` provider, Defang will deploy the **OpenAI Access Gateway** in a private network. This allows you to use the same code for both local development and cloud deployment.
+
 The `x-defang-llm` extension is used to configure the appropriate roles and permissions for your service. See the [Managed Language Models](/docs/concepts/managed-llms/managed-language-models/) page for more details.
 
+## Model Mapping
+
+Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. The gateway takes a model name in the Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model on the target platform. If no match is found, it can fall back to a known existing model (e.g. `ai/mistral`). These two behaviours are controlled by the environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
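+
+A minimal sketch of how these variables might be set, assuming both are read by the gateway service itself. The service name `llm` and the values shown are illustrative, not taken from the gateway's documentation:
+
+```yaml
+services:
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    environment:
+      - OPENAI_API_KEY
+      # Assumption: mapping is on by default, so this line is optional.
+      - USE_MODEL_MAPPING=true
+      # Assumption: used only when no close match for the requested model is found.
+      - FALLBACK_MODEL=ai/mistral
+```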
+
 ## Current Support
 
 | Provider | Managed Language Models |

diff --git a/...ng-openai-apps-aws-bedrock-gcp-vertex.mdx → ...orials/deploy-openai-apps-aws-bedrock.mdx b/...ng-openai-apps-aws-bedrock-gcp-vertex.mdx → ...orials/deploy-openai-apps-aws-bedrock.mdx
@@ -1,11 +1,11 @@
 ---
-title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+title: Deploying your OpenAI Application to AWS Bedrock
 sidebar_position: 50
 ---
 
-# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+# Deploying Your OpenAI Application to AWS Bedrock
 
-Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**.  
+Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**.  
 
 This tutorial shows you how **Defang** makes it easy.
 
@@ -28,7 +28,7 @@ services:
 
 ## Add an LLM Service to Your Compose File
 
-You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).
+You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock).
 
 Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:
 
@@ -42,16 +42,15 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce
 +        mode: host
 +    environment:
 +      - OPENAI_API_KEY
-+      - GCP_PROJECT_ID # if using GCP Vertex AI
-+      - GCP_REGION # if using GCP Vertex AI, AWS_REGION not necessary for Bedrock
++      - REGION
 ```
 
 ### Notes:
 
 - The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
 - `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services.
 - New environment variables:
-  - `GCP_PROJECT_ID` and `GCP_REGION` are needed if using **Vertex AI**. (e.g.` GCP_PROJECT_ID` = my-project-456789 and `GCP_REGION` = us-central1)
+  - `REGION` is the region where the service runs (for AWS, this is the equivalent of `AWS_REGION`)
 
 :::tip
 **OpenAI Key**
@@ -82,7 +81,7 @@ Modify your `app` service to send API calls to the `openai-access-gateway`:
        test: ["CMD", "curl", "-f", "http://localhost:3000/"]
 ```
 
-Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock or GCP Vertex.
+Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock.
 
 ---
 
@@ -98,8 +97,7 @@ You should configure your application to specify the model you want to use.
      environment:
        OPENAI_API_KEY:
        OPENAI_BASE_URL: "http://llm/api/v1"
-+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # for Bedrock
-+      # MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
++      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:3000/"]
 ```
@@ -110,8 +108,14 @@ Choose the correct `MODEL` depending on which cloud provider you are using.
 **Choosing the Right Model**
 
 - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
-- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+:::
+
+Alternatively, Defang supports model mapping through the openai-access-gateway. The gateway takes a model name in the Docker naming convention (e.g. `ai/llama3.3`) and maps it to
+the closest equivalent on the target platform. If no match is found, a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These two behaviours are controlled by the
+environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
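+
+As a sketch, the `app` service could then request a Docker-style model name and leave the translation to a Bedrock model ID to the gateway. This assumes mapping is left at its default; the model name is only an example:
+
+```yaml
+services:
+  app:
+    environment:
+      OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
+      # Docker-style name; the gateway is expected to map it to the closest Bedrock model.
+      MODEL: "ai/llama3.3"
+```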
 
+
+:::info
 # Complete Example Compose File
 
 ```yaml
@@ -124,7 +128,7 @@ services:
     environment:
       OPENAI_API_KEY:
       OPENAI_BASE_URL: "http://llm/api/v1"
-      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # or your Vertex AI model path
+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
 
@@ -137,25 +141,24 @@ services:
         mode: host
     environment:
       - OPENAI_API_KEY
-      - GCP_PROJECT_ID     # required if using Vertex AI
-      - GCP_REGION         # required if using Vertex AI
+      - REGION
 ```
 
 ---
 
 # Environment Variable Matrix
 
-| Variable           | AWS Bedrock | GCP Vertex AI |
-|--------------------|-------------|---------------|
-| `GCP_PROJECT_ID`    | _(not used)_| Required      |
-| `GCP_REGION`        | _(not used)_| Required      |
-| `MODEL`             | Bedrock model ID | Vertex model path |
+| Variable           | AWS Bedrock |
+|--------------------|-------------|
+| `REGION`            | Required|
+| `MODEL`             | Bedrock model ID or Docker model name, for example `meta.llama3-3-70b-instruct-v1:0` or `ai/llama3.3` |
 
 ---
 
 You now have a single app that can:
 
-- Talk to **AWS Bedrock** or **GCP Vertex AI**
+- Talk to **AWS Bedrock**
 - Use the same OpenAI-compatible client code
 - Easily switch cloud providers by changing a few environment variables
+:::
 
diff --git a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx
@@ -0,0 +1,171 @@
+---
+title: Deploy OpenAI Apps to GCP Vertex AI
+sidebar_position: 50
+---
+
+# Deploy OpenAI Apps to GCP Vertex AI
+
+Let's assume you have an application that uses an OpenAI client library and you want to deploy it to the cloud using **GCP Vertex AI**.
+
+This tutorial shows you how **Defang** makes it easy.
+
+Suppose you start with a compose file like this:
+
+```yaml
+services:
+  app:
+    build:
+      context: .
+    ports:
+      - 3000:3000
+    environment:
+      OPENAI_API_KEY:
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+---
+
+## Add an LLM Service to Your Compose File
+
+You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex).
+
+Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:
+
+```diff
++  llm:
++    image: defangio/openai-access-gateway
++    x-defang-llm: true
++    ports:
++      - target: 80
++        published: 80
++        mode: host
++    environment:
++      - OPENAI_API_KEY
++      - GCP_PROJECT_ID
++      - REGION
+```
+
+### Notes:
+
+- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
+- `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services.
+- New environment variables:
+  - `REGION` is the region where the service runs (e.g. `us-central1`)
+  - `GCP_PROJECT_ID` is your project to deploy to (e.g. my-project-456789)
+
+:::tip
+**OpenAI Key**
+
+You no longer need your original OpenAI API Key.  
+We recommend generating a random secret for authentication with the gateway:
+
+```bash
+defang config set OPENAI_API_KEY --random
+```
+:::
+
+---
+
+## Redirect Application Traffic
+
+Modify your `app` service to send API calls to the `openai-access-gateway`:
+
+```diff
+ services:
+   app:
+     ports:
+       - 3000:3000
+     environment:
+       OPENAI_API_KEY:
++      OPENAI_BASE_URL: "http://llm/api/v1"
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+Now, all OpenAI traffic will be routed through your gateway service and onto GCP Vertex AI.
+
+---
+
+## Selecting a Model
+
+You should configure your application to specify the model you want to use.
+
+```diff
+ services:
+   app:
+     ports:
+       - 3000:3000
+     environment:
+       OPENAI_API_KEY:
+       OPENAI_BASE_URL: "http://llm/api/v1"
++      MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+Choose the correct `MODEL` depending on which cloud provider you are using.
+
+Ensure you have the necessary permissions to access the model you intend to use. 
+To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
+
+:::info
+**Choosing the Right Model**
+
+- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+:::
+
+Alternatively, Defang supports model mapping through the openai-access-gateway. The gateway takes a model name in the Docker naming convention (e.g. `ai/llama3.3`) and maps it to
+the closest matching one on the target platform. If no match is found, it can fall back to a known existing model (e.g. `ai/mistral`). These two behaviours are controlled by the
+environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
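+
+If you would rather pin an exact Vertex model path, mapping can presumably be switched off. A sketch, assuming `USE_MODEL_MAPPING` is read by the gateway service and accepts `false` (neither detail is confirmed here):
+
+```yaml
+services:
+  app:
+    environment:
+      OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
+      # Full Vertex model path, used as written when mapping is off.
+      MODEL: "google/gemini-2.5-pro-preview-03-25"
+
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    environment:
+      - OPENAI_API_KEY
+      - GCP_PROJECT_ID
+      - REGION
+      # Assumption: disables Docker-style name mapping.
+      - USE_MODEL_MAPPING=false
+```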
+
+
+:::info
+# Complete Example Compose File
+
+```yaml
+services:
+  app:
+    build:
+      context: .
+    ports:
+      - 3000:3000
+    environment:
+      OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
+      MODEL: "google/gemini-2.5-pro-preview-03-25"
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    ports:
+      - target: 80
+        published: 80
+        mode: host
+    environment:
+      - OPENAI_API_KEY
+      - GCP_PROJECT_ID     # required if using GCP Vertex AI
+      - REGION
+```
+
+---
+
+# Environment Variable Matrix
+
+| Variable           | GCP Vertex AI |
+|--------------------|---------------|
+| `GCP_PROJECT_ID`    | Required      |
+| `REGION`            | Required      |
+| `MODEL`             | Vertex model or Docker model name, for example `publishers/meta/models/llama-3.3-70b-instruct-maas` or `ai/llama3.3` |
+
+---
+
+You now have a single app that can:
+
+- Talk to **GCP Vertex AI**
+- Use the same OpenAI-compatible client code
+- Easily switch cloud providers by changing a few environment variables
+:::
+
diff --git a/docs/tutorials/deploy-openai-apps.mdx b/docs/tutorials/deploy-openai-apps.mdx
@@ -0,0 +1,11 @@
+---
+title: Deploy your OpenAI Apps
+sidebar_position: 45
+---
+
+# Deploy Your OpenAI Apps
+
+Defang currently supports managed LLMs on AWS Bedrock and GCP Vertex AI. Follow the link below for your platform.
+
+- [AWS Bedrock](/docs/tutorials/deploy-openai-apps-aws-bedrock/)
+- [GCP Vertex AI](/docs/tutorials/deploy-openai-apps-gcp-vertex/)