rough draft documenting x-defang-llm

jordanstephens · jordanstephens · commit 2cc1742bd664 · 2025-04-02T11:43:38.000-07:00
diff --git a/docs/concepts/managed-llms/_category_.json b/docs/concepts/managed-llms/_category_.json
@@ -0,0 +1,5 @@
+{
+  "label": "Managed LLMs",
+  "position": 425,
+  "collapsible": true
+}
diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md
@@ -0,0 +1,38 @@
+---
+title: Leveraging Managed Language Models with Defang
+description: Defang makes it easy to leverage cloud-native managed language models.
+sidebar_position: 3000
+---
+
+# Managed Language Models
+
+Each cloud provider offers its own managed Large Language Model service. AWS offers Bedrock, GCP offers Vertex AI, and DigitalOcean offers its GenAI platform. Defang makes it easy to leverage these services in your projects.
+
+## Usage
+
+To use cloud-native managed language models from your Defang services, add the `x-defang-llm` extension to the service config. Defang will then configure the appropriate roles and permissions for you.
+
+## Example
+
+Assume you have a web service that uses the cloud-native SDK. Add the `x-defang-llm` extension to that service, like this:
+
+```diff
+ services:
+     app:
+         build:
+             context: .
++        x-defang-llm: true
+```
+
+## Deploying OpenAI-compatible apps
+
+If you already have an OpenAI-compatible application, Defang makes it easy to deploy it on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-gateway.md).
+
+## Current Support
+
+| Provider | Managed Language Models |
+| --- | --- |
+| [Playground](/docs/providers/playground#managed-large-language-models) | ❌ |
+| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
+| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
+| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ❌ |
diff --git a/docs/concepts/managed-llms/openai-gateway.md b/docs/concepts/managed-llms/openai-gateway.md
@@ -0,0 +1,20 @@
+---
+title: Deploying OpenAI-compatible apps with Defang
+description: Defang makes it easy to leverage cloud-native managed language models for your OpenAI-compatible application.
+sidebar_position: 3000
+---
+
+# Deploying OpenAI-compatible applications to cloud-native managed language models with Defang
+
+Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer. It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response.
+
+See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx), which describes how to configure the OpenAI Access Gateway for your application.
+
+## Current Support
+
+| Provider | Managed Language Models |
+| --- | --- |
+| [Playground](/docs/providers/playground#managed-large-language-models) | ❌ |
+| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
+| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
+| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ❌ |
diff --git a/docs/providers/aws/aws.md b/docs/providers/aws/aws.md
@@ -72,6 +72,12 @@ When using [Managed Postgres](/docs/concepts/managed-storage/managed-postgres.md
 
 When using [Managed Redis](/docs/concepts/managed-storage/managed-redis.md), the Defang CLI provisions an ElastiCache Redis cluster in your account.
 
+### Managed large language models
+
+Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. Add this extension to any service that uses the Bedrock SDKs.
+
+When using [Managed LLMs](/docs/concepts/managed-llms/managed-language-models.md), Defang configures the appropriate IAM roles and policies in your account so that the service can access Bedrock.
+
 ### Managed Resources
 
 Defang will create and manage the following resources in your AWS account from its bootstrap CloudFormation template:
diff --git a/docs/providers/digitalocean/digitalocean.md b/docs/providers/digitalocean/digitalocean.md
@@ -7,7 +7,7 @@ sidebar_position: 010
 # DigitalOcean
 
 :::info
-The Defang DigitalOcean Provider is available for Public Preview as of October 2024. 
+The Defang DigitalOcean Provider is available for Public Preview as of October 2024.
 :::
 
 :::success DigitalOcean Credits
@@ -76,5 +76,6 @@ The following features are still in development for DigitalOcean:
 - [Custom Domains](/docs/concepts//domains.mdx)
 - [Managed Redis](/docs/concepts//managed-storage/managed-redis.md)
 - [Managed Postgres](/docs/concepts/managed-storage/managed-postgres.md)
+- [Managed Language Models](/docs/concepts/managed-llms/managed-language-models.md)
 
 Stay tuned for future updates!
diff --git a/docs/providers/gcp.md b/docs/providers/gcp.md
@@ -59,6 +59,12 @@ The Provider builds and deploys your services using [Google Cloud Run](https://c
 
 The GCP provider does not currently support storing sensitive config values.
 
+### Managed large language models
+
+Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension. This extension is not yet supported by the GCP provider.
+
+See [Managed LLMs](/docs/concepts/managed-llms/managed-language-models.md) for the list of providers that currently support it.
+
 ### Future Improvements
 
 The following features are in active development for GCP:
diff --git a/docs/providers/playground.md b/docs/providers/playground.md
@@ -19,3 +19,7 @@ Overall, the Defang Playground is very similar to deploying to your own cloud ac
 ### Managed services
 
 In essence, the Playground does not support any [managed storage](../concepts/managed-storage) services, ie. `x-defang-postgres` and `x-defang-redis` are ignored when deploying to the Playground. You can however run both Postgres and Redis as regular container services for testing purposes.
+
+### Managed large language models
+
+Defang offers integration with managed, cloud-native large language model services with the `x-defang-llm` service extension when deploying to your own cloud account with BYOC. This extension is not supported in the Defang Playground.
diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx
@@ -0,0 +1,128 @@
+---
+title: Deploying your OpenAI application to AWS and using Bedrock
+sidebar_position: 50
+---
+
+# Deploying your OpenAI application to AWS and using Bedrock
+
+Let's assume you have an app that uses one of the OpenAI client libraries, and you want to deploy it to AWS so you can leverage Bedrock. This tutorial shows how Defang makes that easy.
+
+Assume you have a compose file like this:
+
+```yaml
+services:
+  app:
+    build:
+      context: .
+    ports:
+      - target: 3000
+        published: 3000
+        protocol: tcp
+        mode: ingress
+    environment:
+      OPENAI_API_KEY:
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+## Add an llm service to your compose file
+
+The first step is to add a new service to your compose file: the `defangio/openai-access-gateway`. This service provides an OpenAI-compatible interface to AWS Bedrock. Add it to your compose file like this:
+
+```diff
++  llm:
++    image: defangio/openai-access-gateway
++    x-defang-llm: true
++    ports:
++      - target: 80
++        published: 80
++        protocol: tcp
++        mode: host
++    environment:
++      - OPENAI_API_KEY
++    healthcheck:
++      test: ["CMD", "curl", "-f", "http://localhost/health"]
+```
+
+A few things to note here. First, the image is a fork of [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with a few modifications to make it easier to use. The source code is available [here](https://github.com/DefangLabs/openai-access-gateway). Second, note the `x-defang-llm` property. Defang uses extensions like this to signal special handling of certain kinds of services. In this case, it signals to Defang that it needs to configure the appropriate IAM roles and policies to support your application.
+
+## Redirecting application traffic
+
+Next, configure your application to send its OpenAI traffic to the openai-access-gateway by setting `OPENAI_BASE_URL`, like this:
+
+```diff
+ services:
+   app:
+     ports:
+       - target: 3000
+         published: 3000
+         protocol: tcp
+         mode: ingress
+     environment:
+       OPENAI_API_KEY:
++      OPENAI_BASE_URL: "http://llm/api/v1"
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+## Selecting a model
+
+You will also need to configure your application to use one of the Bedrock models.
+We recommend configuring an environment variable called `MODEL`, like this:
+
+```diff
+ services:
+   app:
+     ports:
+       - target: 3000
+         published: 3000
+         protocol: tcp
+         mode: ingress
+     environment:
+       OPENAI_API_KEY:
+       OPENAI_BASE_URL: "http://llm/api/v1"
++      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+```
+
+## Enabling Bedrock model access
+
+AWS currently requires access to be manually configured on a per-model basis in each account. See this guide for [how to enable model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html).
+
+## Your OpenAI key
+
+It's worth noting that you no longer need to use your original OpenAI API key. We still recommend setting _something_ in its place, so feel free to generate a new secret and set it with `defang config set OPENAI_API_KEY`.
+
+## Complete Example Compose File
+
+```yaml
+services:
+  app:
+    build:
+      context: .
+    ports:
+      - target: 3000
+        published: 3000
+        protocol: tcp
+        mode: ingress
+    environment:
+      OPENAI_API_KEY:
+      OPENAI_BASE_URL: "http://llm/api/v1"
+      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    ports:
+      - target: 80
+        published: 80
+        protocol: tcp
+        mode: host
+    environment:
+      - OPENAI_API_KEY
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost/health"]
+
+```