
Commit 7861a1c

Merge pull request #219 from DefangLabs/eric-add-gcp-vertex

GCP Vertex AI

2 parents: 2bff9b6 + ffd40a5

File tree

5 files changed: +165 −122 lines


docs/concepts/managed-llms/managed-language-models.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -35,4 +35,4 @@ If you already have an OpenAI-compatible application, Defang makes it easy to de
 | [Playground](/docs/providers/playground#managed-large-language-models) ||
 | [AWS Bedrock](/docs/providers/aws#managed-large-language-models) ||
 | [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) ||
-| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | |
+| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | |
```

docs/concepts/managed-llms/openai-access-gateway.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -15,6 +15,6 @@ See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock/) which des
 | Provider | Managed Language Models |
 | --- | --- |
 | [Playground](/docs/providers/playground#managed-services) ||
-| [AWS Bedrock](/docs/providers/aws#managed-storage) ||
+| [AWS Bedrock](/docs/providers/aws#managed-llms) ||
 | [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) ||
-| [GCP Vertex](/docs/providers/gcp#future-improvements) | |
+| [GCP Vertex](/docs/providers/gcp#managed-llms) | |
```

docs/providers/gcp.md

Lines changed: 1 addition & 3 deletions

```diff
@@ -61,9 +61,7 @@ The GCP provider does not currently support storing sensitive config values.
 
 ### Managed LLMs
 
-Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any services which use the Bedrock SDKs.
-
-When using [Managed LLMs](/docs/concepts/managed-llms/managed-language-models.md), the Defang CLI provisions an ElastiCache Redis cluster in your account.
+Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any services which use the [Google Vertex AI SDKs](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-sdk).
 
 ### Future Improvements
```

Lines changed: 161 additions & 0 deletions

New file (@@ -0,0 +1,161 @@):
---
title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI
sidebar_position: 50
---

# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI

Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**.

This tutorial shows you how **Defang** makes it easy.

Suppose you start with a compose file like this:
```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

---

## Add an LLM Service to Your Compose File

You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex).

Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service:

```diff
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    ports:
+      - target: 80
+        published: 80
+        mode: host
+    environment:
+      - OPENAI_API_KEY
+      - GCP_PROJECT_ID  # if using GCP Vertex AI
+      - GCP_REGION      # if using GCP Vertex AI; AWS_REGION is not needed for Bedrock
```

### Notes

- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use the target platform's AI services.
- New environment variables:
  - `GCP_PROJECT_ID` and `GCP_REGION` are needed if using **Vertex AI** (e.g. `GCP_PROJECT_ID=my-project-456789` and `GCP_REGION=us-central1`).

:::tip
**OpenAI Key**

You no longer need your original OpenAI API Key.
We recommend generating a random secret for authentication with the gateway:

```bash
defang config set OPENAI_API_KEY --random
```
:::

---

## Redirect Application Traffic

Modify your `app` service to send API calls to the `openai-access-gateway`:

```diff
 services:
   app:
     ports:
       - 3000:3000
     environment:
       OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock or GCP Vertex.
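On the application side, any OpenAI-compatible client that honors `OPENAI_BASE_URL` will now target the gateway instead of the public OpenAI API. A minimal sketch of that resolution logic (the helper function below is illustrative only, not part of Defang or any SDK):

```python
import os

def resolve_chat_endpoint() -> str:
    """Build the chat-completions URL the way an OpenAI-compatible
    client would: use OPENAI_BASE_URL when set, else the public API."""
    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return base.rstrip("/") + "/chat/completions"

# Inside the app container, the compose file above sets:
os.environ["OPENAI_BASE_URL"] = "http://llm/api/v1"
print(resolve_chat_endpoint())  # http://llm/api/v1/chat/completions
```

The `llm` hostname resolves via the compose service name, so no application code changes are needed beyond setting the environment variable.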

---

## Selecting a Model

You should configure your application to specify the model you want to use.

```diff
 services:
   app:
     ports:
       - 3000:3000
     environment:
       OPENAI_API_KEY:
       OPENAI_BASE_URL: "http://llm/api/v1"
+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"  # for Bedrock
+      # MODEL: "google/gemini-2.5-pro-preview-03-25"    # for Vertex AI
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

Choose the correct `MODEL` depending on which cloud provider you are using.

:::info
**Choosing the Right Model**

- For **AWS Bedrock**, use a Bedrock model ID, e.g. `anthropic.claude-3-sonnet-20240229-v1:0` ([see available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html)).
- For **GCP Vertex AI**, use a full model path, e.g. `google/gemini-2.5-pro-preview-03-25` ([see available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup)).
:::

## Complete Example Compose File

```yaml
services:
  app:
    build:
      context: .
    ports:
      - 3000:3000
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"  # or your Vertex AI model path
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]

  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    ports:
      - target: 80
        published: 80
        mode: host
    environment:
      - OPENAI_API_KEY
      - GCP_PROJECT_ID  # required if using Vertex AI
      - GCP_REGION      # required if using Vertex AI
```
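With both services defined, a request from `app` to the gateway is a standard OpenAI-style chat-completions call. A hedged sketch of the JSON payload an OpenAI-compatible client would POST to `http://llm/api/v1/chat/completions` (the message text is just an example):

```python
import json
import os

# Model comes from the MODEL variable set in the compose file above.
payload = {
    "model": os.environ.get("MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"),
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

# Bearer auth uses the random secret stored via `defang config set OPENAI_API_KEY --random`.
headers = {
    "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
```

Because the gateway speaks the OpenAI wire format, this same payload works unchanged whether the backend is Bedrock or Vertex AI; only `MODEL` differs.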

---

## Environment Variable Matrix

| Variable         | AWS Bedrock      | GCP Vertex AI     |
|------------------|------------------|-------------------|
| `GCP_PROJECT_ID` | _(not used)_     | Required          |
| `GCP_REGION`     | _(not used)_     | Required          |
| `MODEL`          | Bedrock model ID | Vertex model path |
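A startup check derived from the matrix above can catch a missing variable before the first request fails. This is a hypothetical sketch (the function and provider names are illustrative, not a Defang or gateway API):

```python
import os

def missing_llm_env(provider: str) -> list[str]:
    """Return required-but-unset variable names for the chosen
    provider, following the matrix above."""
    required = ["OPENAI_API_KEY", "MODEL"]
    if provider == "vertex":  # GCP Vertex AI needs two extra variables
        required += ["GCP_PROJECT_ID", "GCP_REGION"]
    return [name for name in required if not os.environ.get(name)]

# Simulate a fresh container environment for the example:
for name in ("GCP_PROJECT_ID", "GCP_REGION"):
    os.environ.pop(name, None)
os.environ.update({"OPENAI_API_KEY": "s3cret",
                   "MODEL": "google/gemini-2.5-pro-preview-03-25"})

print(missing_llm_env("bedrock"))  # []
print(missing_llm_env("vertex"))   # ['GCP_PROJECT_ID', 'GCP_REGION']
```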

---

You now have a single app that can:

- Talk to **AWS Bedrock** or **GCP Vertex AI**
- Use the same OpenAI-compatible client code
- Easily switch cloud providers by changing a few environment variables

docs/tutorials/deploying-openai-apps-aws-bedrock.mdx

Lines changed: 0 additions & 116 deletions
This file was deleted.
