From 69dd8afcd4d8209b0b621e7440ce7802a4e2ba1f Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Sun, 27 Apr 2025 19:27:50 -0700
Subject: [PATCH 1/3] update docs

---
 ...ing-openai-apps-aws-bedrock-gcp-vertex.mdx | 162 ++++++++++++++++++
 .../deploying-openai-apps-aws-bedrock.mdx     | 116 -------------
 2 files changed, 162 insertions(+), 116 deletions(-)
 create mode 100644 docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
 delete mode 100644 docs/tutorials/deploying-openai-apps-aws-bedrock.mdx

diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
new file mode 100644
index 000000000..ed880c9e1
--- /dev/null
+++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
@@ -0,0 +1,162 @@
+---
+title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+sidebar_position: 50
+---
+
+# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
+
+Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, using **AWS Bedrock** or **GCP Vertex AI** as the LLM backend.
+
+This tutorial shows you how **Defang** makes that easy.
+
+Suppose you start with a compose file like this:
+
+```yaml
+services:
+  app:
+    build:
+      context: .
+    ports:
+      - 3000:3000
+    environment:
+      OPENAI_API_KEY:
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+---
+
+## Add an LLM Service to Your Compose File
+
+You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).
+
+Add **Defang's openai-access-gateway** service:
+
+```diff
++ llm:
++   image: defangio/openai-access-gateway
++   x-defang-llm: true
++   ports:
++     - target: 80
++       published: 80
++       mode: host
++   environment:
++     - OPENAI_API_KEY
++     - GCP_PROJECT_ID
++     - GCP_REGION
+```
+
+### Notes
+
+- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
+- `x-defang-llm: true` signals to **Defang** that this service should be configured to use the target platform's managed AI services (Bedrock on AWS, Vertex AI on GCP).
+- New environment variables:
+  - `GCP_PROJECT_ID` and `GCP_REGION` are needed if you are using **Vertex AI** (e.g., `GCP_PROJECT_ID=my-project-456789` and `GCP_REGION=us-central1`).
+
+:::tip
+**OpenAI Key**
+
+You no longer need your original OpenAI API Key.
+We recommend generating a random secret for authentication with the gateway:
+
+```bash
+defang config set OPENAI_API_KEY --random
+```
+:::
+
+---
+
+## Redirect Application Traffic
+
+Modify your `app` service to send API calls to the `openai-access-gateway`:
+
+```diff
+ services:
+   app:
+     ports:
+       - 3000:3000
+     environment:
+       OPENAI_API_KEY:
++      OPENAI_BASE_URL: "http://llm/api/v1"
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+Now, all OpenAI traffic will route through your gateway service.
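+
+As a quick sanity check, you can hit the gateway directly with an OpenAI-style request. This is just a sketch — it assumes you are running locally with port 80 published as above, and that the model named is one you have access to:
+
+```bash
+curl http://localhost/api/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -d '{
+    "model": "anthropic.claude-3-sonnet-20240229-v1:0",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```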
+
+---
+
+## Selecting a Model
+
+You should configure your application to specify the model you want to use.
+
+```diff
+ services:
+   app:
+     ports:
+       - 3000:3000
+     environment:
+       OPENAI_API_KEY:
+       OPENAI_BASE_URL: "http://llm/api/v1"
++      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # for Bedrock
++      # MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+Choose the correct `MODEL` depending on which cloud provider you are using.
+
+:::info
+**Choosing the Right Model**
+
+- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`).
+- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`).
+[See available Vertex models here.](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+:::
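+
+To make this concrete, here is a minimal sketch of how your app code might consume these variables with the `openai` Node SDK — illustrative only; any OpenAI-compatible client library works the same way:
+
+```ts
+import OpenAI from "openai";
+
+// OPENAI_BASE_URL points at the llm gateway service, and OPENAI_API_KEY is
+// the random secret you set with `defang config set OPENAI_API_KEY --random`.
+const client = new OpenAI({
+  baseURL: process.env.OPENAI_BASE_URL,
+  apiKey: process.env.OPENAI_API_KEY,
+});
+
+// MODEL selects the Bedrock model ID or Vertex model path configured above.
+const completion = await client.chat.completions.create({
+  model: process.env.MODEL ?? "anthropic.claude-3-sonnet-20240229-v1:0",
+  messages: [{ role: "user", content: "Hello!" }],
+});
+console.log(completion.choices[0].message.content);
+```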
+
+## Complete Example Compose File
+
+```yaml
+services:
+  app:
+    build:
+      context: .
+    ports:
+      - 3000:3000
+    environment:
+      OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # or your Vertex AI model path
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    ports:
+      - target: 80
+        published: 80
+        mode: host
+    environment:
+      - OPENAI_API_KEY
+      - GCP_PROJECT_ID # required if using Vertex AI
+      - GCP_REGION # required if using Vertex AI
+```
+
+---
+
+## Environment Variable Matrix
+
+| Variable         | AWS Bedrock      | GCP Vertex AI     |
+|------------------|------------------|-------------------|
+| `GCP_PROJECT_ID` | _(not used)_     | Required          |
+| `GCP_REGION`     | _(not used)_     | Required          |
+| `MODEL`          | Bedrock model ID | Vertex model path |
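+
+If you keep the provider-specific values in Defang config rather than hardcoding them in your compose file, setting them might look like this (a sketch — the values are placeholders, and neither variable is needed for Bedrock):
+
+```bash
+defang config set GCP_PROJECT_ID   # e.g. my-project-456789
+defang config set GCP_REGION       # e.g. us-central1
+```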
+
+---
+
+You now have a single app that can:
+
+- Talk to **AWS Bedrock** or **GCP Vertex AI**
+- Use the same OpenAI-compatible client code
+- Easily switch cloud providers by changing a few environment variables
+
diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx
deleted file mode 100644
index a21b73146..000000000
--- a/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx
+++ /dev/null
@@ -1,116 +0,0 @@
----
-title: Deploying your OpenAI application to AWS using Bedrock
-sidebar_position: 50
----
-
-# Deploying your OpenAI application to AWS using Bedrock
-
-Let's assume you have an app which is using one of the OpenAI client libraries and you want to deploy your app to AWS so you can leverage Bedrock. This tutorial will show you how Defang makes it easy.
-
-Assume you have a compose file like this:
-
-```yaml
-services:
-  app:
-    build:
-      context: .
-    ports:
-      - 3000:3000
-    environment:
-      OPENAI_API_KEY:
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
-```
-
-## Add an LLM service to your compose file
-
-The first step is to add a new service to your compose file: the `defangio/openai-access-gateway`. This service provides an OpenAI compatible interface to AWS Bedrock. It's easy to configure, first you need to add it to your compose file:
-
-```diff
-+ llm:
-+   image: defangio/openai-access-gateway
-+   x-defang-llm: true
-+   ports:
-+     - target: 80
-+       published: 80
-+       mode: host
-+   environment:
-+     - OPENAI_API_KEY
-```
-
-A few things to note here. First the image is a fork of [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with a few modifications to make it easier to use. The source code is available [here](https://github.com/DefangLabs/openai-access-gateway). Second: the `x-defang-llm` property. Defang uses extensions like this to signal special handling of certain kinds of services. In this case, it signals to Defang that we need to configure the appropriate IAM Roles and Policies to support your application.
-
-:::warning
-**Your OpenAI key**
-
-You no longer need to use your original OpenAI API key. We do recommend using _something_ in its place, but feel free to generate a new secret and set it with `defang config set OPENAI_API_KEY --random`.
-
-This is used to authenticate your application service with the openai-access-gateway.
-:::
-
-
-## Redirecting application traffic
-
-Then you need to configure your application to redirect traffic to the openai-access-gateway, like this:
-
-```diff
- services:
-   app:
-     ports:
-       - 3000:3000
-     environment:
-       OPENAI_API_KEY:
-+      OPENAI_BASE_URL: "http://llm/api/v1"
-     healthcheck:
-       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
-```
-
-## Selecting a model
-
-You will also need to configure your application to use one of the bedrock models. We recommend configuring an environment variable called `MODEL` like this:
-
-```diff
- services:
-   app:
-     ports:
-       - 3000:3000
-     environment:
-       OPENAI_API_KEY:
-       OPENAI_BASE_URL: "http://llm/api/v1"
-+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
-     healthcheck:
-       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
-```
-
-:::warning
-**Enabling bedrock model access**
-
-AWS currently requires access to be manually configured on a per-model basis in each account. See this guide for [how to enable model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html).
-:::
-
-## Complete Example Compose File
-
-```yaml
-services:
-  app:
-    build:
-      context: .
-    ports:
-      - 3000:3000
-    environment:
-      OPENAI_API_KEY:
-      OPENAI_BASE_URL: "http://llm/api/v1"
-      MODEL: "us:anthropic.claude-3-sonnet-20240229-v1:0"
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
-  llm:
-    image: defangio/openai-access-gateway
-    x-defang-llm: true
-    ports:
-      - target: 80
-        published: 80
-        mode: host
-    environment:
-      - OPENAI_API_KEY
-
-```

From 333f8e44cf58e3b786eacc11216e799bfda64afa Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Mon, 28 Apr 2025 09:24:28 -0700
Subject: [PATCH 2/3] Apply suggestions from code review

Document review text updates

Co-authored-by: Jordan Stephens
---
 ...deploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
index ed880c9e1..c27320efa 100644
--- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
+++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx
@@ -30,7 +30,7 @@ services:
 
 You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).
 
-Add **Defang's openai-access-gateway** service:
+Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:
 
 ```diff
 + llm:
@@ -42,8 +42,8 @@
 +       mode: host
 +   environment:
 +     - OPENAI_API_KEY
-+     - GCP_PROJECT_ID
-+     - GCP_REGION
++     - GCP_PROJECT_ID # if using GCP Vertex AI
++     - GCP_REGION # if using GCP Vertex AI (AWS_REGION is not needed for Bedrock)
 ```
@@ -82,7 +82,7 @@ Modify your `app` service to send API calls to the `openai-access-gateway`:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
 ```
 
-Now, all OpenAI traffic will route through your gateway service.
+Now, all OpenAI traffic will be routed through your gateway service and on to AWS Bedrock or GCP Vertex AI.
 
 ---
@@ -109,9 +109,8 @@ Choose the correct `MODEL` depending on which cloud provider you are using.
 :::info
 **Choosing the Right Model**
 
-- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`).
-- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`).
-[See available Vertex models here.](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)
+- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
+- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`). [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup).
 :::
 
 ## Complete Example Compose File

From ffd40a573ff889ebfbaf01979388977ab862df6d Mon Sep 17 00:00:00 2001
From: Eric Liu
Date: Mon, 28 Apr 2025 09:39:27 -0700
Subject: [PATCH 3/3] updated to add gcp llm

---
 docs/concepts/managed-llms/managed-language-models.md | 2 +-
 docs/concepts/managed-llms/openai-access-gateway.md   | 4 ++--
 docs/providers/gcp.md                                 | 4 +---
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md
index 04feff431..360053b0b 100644
--- a/docs/concepts/managed-llms/managed-language-models.md
+++ b/docs/concepts/managed-llms/managed-language-models.md
@@ -35,4 +35,4 @@ If you already have an OpenAI-compatible application, Defang makes it easy to de
 | [Playground](/docs/providers/playground#managed-large-language-models) | ❌ |
 | [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
 | [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
-| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ❌ |
+| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ✅ |

diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md
index aa1cf0a64..0433622a4 100644
--- a/docs/concepts/managed-llms/openai-access-gateway.md
+++ b/docs/concepts/managed-llms/openai-access-gateway.md
@@ -15,6 +15,6 @@ See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock/) which des
 | Provider | Managed Language Models |
 | --- | --- |
 | [Playground](/docs/providers/playground#managed-services) | ❌ |
-| [AWS Bedrock](/docs/providers/aws#managed-storage) | ✅ |
+| [AWS Bedrock](/docs/providers/aws#managed-llms) | ✅ |
 | [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
-| [GCP Vertex](/docs/providers/gcp#future-improvements) | ❌ |
+| [GCP Vertex](/docs/providers/gcp#managed-llms) | ✅ |

diff --git a/docs/providers/gcp.md b/docs/providers/gcp.md
index 62a6f8128..8760f94ac 100644
--- a/docs/providers/gcp.md
+++ b/docs/providers/gcp.md
@@ -61,9 +61,7 @@ The GCP provider does not currently support storing sensitive config values.
 
 ### Managed LLMs
 
-Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any services which use the Bedrock SDKs.
-
-When using [Managed LLMs](/docs/concepts/managed-llms/managed-language-models.md), the Defang CLI provisions an ElastiCache Redis cluster in your account.
+Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any services which use the [Google Vertex AI SDKs](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-sdk).
 
 ### Future Improvements