
Commit 4753ef6

update for llm deployment and model mapping

1 parent b382aca commit 4753ef6

File tree

5 files changed: +205 -25 lines changed


blog/2025-04-11-mar-product-updates.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Wow - another month has gone by, time flies when you're having fun!
Let us share some important updates regarding what we achieved at Defang in March:

- **Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps-aws-bedrock) without making any changes to your code. Support for GCP and DigitalOcean coming soon.
+ **Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon.

**Defang Pulumi Provider:** Last month, we announced a preview of the [Defang Pulumi Provider](https://github.com/DefangLabs/pulumi-defang), and this month we are excited to announce that V1 is now available in the [Pulumi Registry](https://www.pulumi.com/registry/packages/defang/). As much as we love Docker, we realize there are many real-world apps that have components that (currently) cannot be described completely in a Compose file. With the Defang Pulumi Provider, you can now leverage [the declarative simplicity of Defang with the imperative power of Pulumi](https://docs.defang.io/docs/concepts/pulumi#when-to-use-the-defang-pulumi-provider).

docs/concepts/managed-llms/openai-access-gateway.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ sidebar_position: 3000
Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer.
It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response.

- See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application
+ See [our tutorial](/docs/tutorials/deploying-openai-apps), which describes how to configure the OpenAI Access Gateway for your application.

## Docker Provider Services

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
---
title: Deploying your OpenAI Application to AWS Bedrock
sidebar_position: 50
---

# Deploying Your OpenAI Application to AWS Bedrock

Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**.

This tutorial shows you how **Defang** makes it easy.

Suppose you start with a compose file like this:

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

---

## Add an LLM Service to Your Compose File

You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock).

Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:

```diff
+ llm:
+   image: defangio/openai-access-gateway
+   x-defang-llm: true
+   ports:
+     - target: 80
+       published: 80
+       mode: host
+   environment:
+     - OPENAI_API_KEY
+     - REGION
```

### Notes:

- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use the target platform's AI services.
- New environment variables:
  - `REGION` is the region where the service runs (for AWS, this is the equivalent of `AWS_REGION`)

:::tip
**OpenAI Key**

You no longer need your original OpenAI API Key.
We recommend generating a random secret for authentication with the gateway:

```bash
defang config set OPENAI_API_KEY --random
```
:::

---

## Redirect Application Traffic

Modify your `app` service to send API calls to the `openai-access-gateway`:

```diff
services:
  app:
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
+     OPENAI_BASE_URL: "http://llm/api/v1"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Now, all OpenAI traffic will be routed through your gateway service and on to AWS Bedrock.
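To sanity-check the wiring, you can call the gateway's OpenAI-compatible endpoint directly from another service on the internal network. A minimal sketch, assuming the `/api/v1` base path shown above and the `OPENAI_API_KEY` secret you generated earlier:

```bash
# Hypothetical smoke test against the gateway's OpenAI-compatible API:
curl http://llm/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "messages": [{"role": "user", "content": "Hello from Defang!"}]
      }'
```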
---

## Selecting a Model

You should configure your application to specify the model you want to use.

```diff
services:
  app:
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
+     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Choose the correct `MODEL` depending on which cloud provider you are using.

:::info
**Choosing the Right Model**

- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
:::

Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest equivalent on the target platform. If no such match can be found, a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These are controlled by the environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
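As a sketch of what that could look like on the gateway service (both variables are optional, and the `ai/mistral` fallback value here is illustrative):

```diff
  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    environment:
      - OPENAI_API_KEY
      - REGION
+     - USE_MODEL_MAPPING=true    # default; shown only for clarity
+     - FALLBACK_MODEL=ai/mistral # illustrative fallback if no close Bedrock match exists
```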

# Complete Example Compose File

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]

  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    ports:
      - target: 80
        published: 80
        mode: host
    environment:
      - OPENAI_API_KEY
      - REGION
```

---

# Environment Variable Matrix

| Variable | AWS Bedrock |
|----------|-------------|
| `REGION` | Required |
| `MODEL`  | Bedrock model ID / Docker model name |

---

You now have a single app that can:

- Talk to **AWS Bedrock**
- Use the same OpenAI-compatible client code
- Easily switch cloud providers by changing a few environment variables

docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx renamed to docs/tutorials/deploying-openai-apps-gcp-vertex.mdx

Lines changed: 24 additions & 23 deletions
@@ -1,11 +1,11 @@
---
- title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+ title: Deploying your OpenAI Application to GCP Vertex AI
sidebar_position: 50
---

- # Deploying Your OpenAI Application to AWS Bedrock or GCP Vertex AI
+ # Deploying Your OpenAI Application to GCP Vertex AI

- Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**.
+ Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **GCP Vertex AI**.

This tutorial shows you how **Defang** makes it easy.
@@ -28,7 +28,7 @@ services:
## Add an LLM Service to Your Compose File

- You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).
+ You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex).

Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:
@@ -42,7 +42,7 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce
+   mode: host
+   environment:
+     - OPENAI_API_KEY
- +     - GCP_PROJECT_ID # if using GCP Vertex AI
+ +     - GCP_PROJECT_ID
+     - REGION
```

@@ -51,8 +51,8 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce
- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services.
- New environment variables:
-  - `REGION` is the zone where the services runs (for AWS this is equvilent of AWS_REGION)
-  - `GCP_PROJECT_ID` is needed if using **Vertex AI**. (e.g. `GCP_PROJECT_ID` = my-project-456789 and `REGION` = us-central1)
+  - `REGION` is the region where the service runs (e.g. us-central1)
+  - `GCP_PROJECT_ID` is the project to deploy to (e.g. my-project-456789)

:::tip
**OpenAI Key**
@@ -83,7 +83,7 @@ Modify your `app` service to send API calls to the `openai-access-gateway`:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

- Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock or GCP Vertex.
+ Now, all OpenAI traffic will be routed through your gateway service and on to GCP Vertex AI.

---

@@ -99,24 +99,25 @@ You should configure your application to specify the model you want to use.
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
- +     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # for Bedrock
- +     # MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
+ +     MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Choose the correct `MODEL` depending on which cloud provider you are using.

- Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively.

:::info
**Choosing the Right Model**

- - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+ :::

+ Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model on the target platform. If no such match can be found, it can fall back to a known existing model (e.g. `ai/mistral`). These are controlled by the environment variables `USE_MODEL_MAPPING` (defaults to true) and `FALLBACK_MODEL` (no default), respectively.
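With mapping enabled, the app can stay provider-neutral by naming a Docker model and letting the gateway resolve it. A sketch, assuming the gateway maps `ai/llama3.3` to a reasonable Vertex AI equivalent and that the `ai/mistral` fallback is illustrative:

```diff
services:
  app:
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
+     MODEL: "ai/llama3.3" # Docker model name, resolved by the gateway

  llm:
    environment:
      - OPENAI_API_KEY
      - GCP_PROJECT_ID
      - REGION
+     - FALLBACK_MODEL=ai/mistral # illustrative fallback if no close match exists
```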
# Complete Example Compose File
```yaml
@@ -129,7 +130,7 @@ services:
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
-     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # or your Vertex AI model path
+     MODEL: "google/gemini-2.5-pro-preview-03-25"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
@@ -142,25 +143,25 @@ services:
        mode: host
    environment:
      - OPENAI_API_KEY
-     - GCP_PROJECT_ID # required if using Vertex AI
+     - GCP_PROJECT_ID # required if using GCP Vertex AI
      - REGION
```

---

# Environment Variable Matrix

- | Variable | AWS Bedrock | GCP Vertex AI |
- |--------------------|-------------|---------------|
- | `GCP_PROJECT_ID` | _(not used)_ | Required |
- | `REGION` | Required | Required |
- | `MODEL` | Bedrock model ID / Docker model name | Vertex model / Docker model name |
+ | Variable | GCP Vertex AI |
+ |--------------------|---------------|
+ | `GCP_PROJECT_ID` | Required |
+ | `REGION` | Required |
+ | `MODEL` | Vertex model / Docker model name |

---

You now have a single app that can:

- - Talk to **AWS Bedrock** or **GCP Vertex AI**
+ - Talk to **GCP Vertex AI**
- Use the same OpenAI-compatible client code
- Easily switch cloud providers by changing a few environment variables
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
---
title: Deploying your OpenAI Application
sidebar_position: 50
---

# Deploying Your OpenAI Application

Defang currently supports managed LLMs on AWS Bedrock and GCP Vertex AI. Follow the link below for your platform:

- [AWS Bedrock](/docs/tutorials/deploying-openai-apps-aws-bedrock/)
- [GCP Vertex AI](/docs/tutorials/deploying-openai-apps-gcp-vertex/)
