From bf0ebb833b94730e0104cbb88be40cc707bc66ce Mon Sep 17 00:00:00 2001 From: commit111 Date: Wed, 14 May 2025 18:24:28 -0700 Subject: [PATCH 01/19] add model access tip --- docs/concepts/managed-llms/managed-language-models.md | 4 ++++ .../deploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 5 ++++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 570adb028..39b4296e7 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -12,6 +12,10 @@ Each cloud provider offers their own managed Large Language Model services. AWS In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the approprate roles and permissions for you. +:::tip +Ensure you have the necessary permissions to access the model you intend to use. For example, if you are using AWS Bedrock, verify that your account has [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html). +::: + ## Example Assume you have a web service like the following, which uses the cloud native SDK, for example: diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index c27320efa..3b48a0630 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -106,11 +106,15 @@ You should configure your application to specify the model you want to use. Choose the correct `MODEL` depending on which cloud provider you are using. +Ensure you have the necessary permissions to access the model you intend to use. +For example, if you are using AWS Bedrock, verify that your account has [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html). + :::info **Choosing the Right Model** - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). 
- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup) +::: # Complete Example Compose File @@ -158,4 +162,3 @@ You now have a single app that can: - Talk to **AWS Bedrock** or **GCP Vertex AI** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables - From 4d5df8907a0ad665bb35ee3a7311ef71fee2570c Mon Sep 17 00:00:00 2001 From: commit111 Date: Wed, 14 May 2025 18:24:52 -0700 Subject: [PATCH 02/19] make title shorter --- .../deploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index 3b48a0630..4a7740dab 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -1,11 +1,11 @@ --- -title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI +title: Deploy OpenAI Apps to AWS Bedrock or GCP Vertex AI sidebar_position: 50 --- -# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI +# Deploy OpenAI Apps to AWS Bedrock or GCP Vertex AI -Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**. +Let's assume you have an application that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**. This tutorial shows you how **Defang** makes it easy. From 2b49556bef95b54055b93889fef0ef0424a5cb97 Mon Sep 17 00:00:00 2001 From: commit111 Date: Wed, 14 May 2025 18:24:59 -0700 Subject: [PATCH 03/19] add period --- docs/concepts/managed-llms/openai-access-gateway.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index 2b0b6b079..663c366bd 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -9,7 +9,7 @@ sidebar_position: 3000 Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer. It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response. -See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application +See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application. ## Docker Provider Services @@ -32,6 +32,7 @@ services: ``` Under the hood, when you use the `model` provider, Defang will deploy the **OpenAI Access Gateway** in a private network. This allows you to use the same code for both local development and cloud deployment. + The `x-defang-llm` extension is used to configure the appropriate roles and permissions for your service. 
See the [Managed Language Models](/docs/concepts/managed-llms/managed-language-models/) page for more details. ## Current Support From 54b33f88620cdbde480e3abcd1996b139c3610cd Mon Sep 17 00:00:00 2001 From: commit111 Date: Wed, 14 May 2025 18:26:07 -0700 Subject: [PATCH 04/19] add period again --- docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index 4a7740dab..c5d65195d 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -113,7 +113,7 @@ For example, if you are using AWS Bedrock, verify that your account has [model a **Choosing the Right Model** - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). -- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup) +- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup). ::: # Complete Example Compose File From a1ff99f21d1136c0f4c7181409d2b25176160b71 Mon Sep 17 00:00:00 2001 From: Eric Liu Date: Thu, 15 May 2025 00:19:38 -0700 Subject: [PATCH 05/19] update docs to include model mapping --- .../concepts/managed-llms/openai-access-gateway.md | 4 ++++ ...eploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 14 ++++++++++---- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index 2b0b6b079..eb7789acf 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -34,6 +34,10 @@ services: Under the hood, when you use the `model` provider, Defang will deploy the **OpenAI Access Gateway** in a private network. This allows you to use the same code for both local development and cloud deployment. The `x-defang-llm` extension is used to configure the appropriate roles and permissions for your service. See the [Managed Language Models](/docs/concepts/managed-llms/managed-language-models/) page for more details. +## Model Mapping + +Defang supports model mapping through the openai-access-gateway on AWS and GCP. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. 
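For illustration, a minimal sketch of how this might look in a compose file (the model names are placeholders, and this assumes the gateway container is what reads `USE_MODEL_MAPPING` and `FALLBACK_MODEL`; mapping is on by default):

```yaml
services:
  app:
    environment:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "ai/llama3.3"            # Docker-style name; mapped to the closest provider model
  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    environment:
      - OPENAI_API_KEY
      - FALLBACK_MODEL=ai/mistral     # optional: used only if no close match exists
```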
+ ## Current Support | Provider | Managed Language Models | diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index c27320efa..2cdb9820f 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -43,7 +43,7 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce + environment: + - OPENAI_API_KEY + - GCP_PROJECT_ID # if using GCP Vertex AI -+ - GCP_REGION # if using GCP Vertex AI, AWS_REGION not necessary for Bedrock ++ - REGION ``` ### Notes: @@ -51,7 +51,8 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce - The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements. - `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services. - New environment variables: - - `GCP_PROJECT_ID` and `GCP_REGION` are needed if using **Vertex AI**. (e.g.` GCP_PROJECT_ID` = my-project-456789 and `GCP_REGION` = us-central1) + - `REGION` is the zone where the services runs (for AWS this is equvilent of AWS_REGION) + - `GCP_PROJECT_ID` is needed if using **Vertex AI**. (e.g.` GCP_PROJECT_ID` = my-project-456789 and `REGION` = us-central1) :::tip **OpenAI Key** @@ -106,6 +107,10 @@ You should configure your application to specify the model you want to use. Choose the correct `MODEL` depending on which cloud provider you are using. +Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to +the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment +variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. + :::info **Choosing the Right Model** @@ -138,7 +143,7 @@ services: environment: - OPENAI_API_KEY - GCP_PROJECT_ID # required if using Vertex AI - - GCP_REGION # required if using Vertex AI + - REGION ``` --- @@ -148,7 +153,7 @@ services: | Variable | AWS Bedrock | GCP Vertex AI | |--------------------|-------------|---------------| | `GCP_PROJECT_ID` | _(not used)_| Required | -| `GCP_REGION` | _(not used)_| Required | +| `REGION` | Required| Required | | `MODEL` | Bedrock model ID | Vertex model path | --- @@ -158,4 +163,5 @@ You now have a single app that can: - Talk to **AWS Bedrock** or **GCP Vertex AI** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables +::: From b382aca062eabba9c777ff1ae4dfebf058194c18 Mon Sep 17 00:00:00 2001 From: Eric Liu Date: Thu, 15 May 2025 00:33:12 -0700 Subject: [PATCH 06/19] llm gateway updates --- .../managed-llms/managed-language-models.md | 18 +++++++++--------- ...ying-openai-apps-aws-bedrock-gcp-vertex.mdx | 4 ++-- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 570adb028..3fa024721 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -8,6 +8,15 @@ sidebar_position: 3000 Each cloud provider offers their own managed Large Language Model services. 
AWS offers Bedrock, GCP offers Vertex AI, and Digital Ocean offers their GenAI platform. Defang makes it easy to leverage these services in your projects. +## Current Support + +| Provider | Managed Language Models | +| --- | --- | +| [Playground](/docs/providers/playground#managed-large-language-models) | ✅ | +| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ | +| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ | +| [GCP Vertex AI](/docs/providers/gcp#managed-large-language-models) | ✅ | + ## Usage In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the approprate roles and permissions for you. @@ -27,12 +36,3 @@ Assume you have a web service like the following, which uses the cloud native SD ## Deploying OpenAI-compatible apps If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway) - -## Current Support - -| Provider | Managed Language Models | -| --- | --- | -| [Playground](/docs/providers/playground#managed-large-language-models) | ✅ | -| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ | -| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ | -| [GCP Vertex AI](/docs/providers/gcp#managed-large-language-models) | ✅ | diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index 2cdb9820f..a3cdf8313 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -3,7 +3,7 @@ title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI sidebar_position: 50 --- -# Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI +# Deploying Your OpenAI Application to AWS Bedrock or GCP Vertex AI Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**. 
@@ -154,7 +154,7 @@ services: |--------------------|-------------|---------------| | `GCP_PROJECT_ID` | _(not used)_| Required | | `REGION` | Required| Required | -| `MODEL` | Bedrock model ID | Vertex model path | +| `MODEL` | Bedrock model ID / Docker model name | Vertex model / Docker model name | --- From 4753ef61271aac470b739ca9de4f42e7bdcfd276 Mon Sep 17 00:00:00 2001 From: Eric Liu Date: Thu, 15 May 2025 13:30:56 -0700 Subject: [PATCH 07/19] update for for llm deployment and model mapping --- blog/2025-04-11-mar-product-updates.md | 2 +- .../managed-llms/openai-access-gateway.md | 2 +- .../deploying-openai-apps-aws-bedrock.mdx | 164 ++++++++++++++++++ ...x => deploying-openai-apps-gcp-vertex.mdx} | 47 ++--- docs/tutorials/deploying-openai-apps.mdx | 15 ++ 5 files changed, 205 insertions(+), 25 deletions(-) create mode 100644 docs/tutorials/deploying-openai-apps-aws-bedrock.mdx rename docs/tutorials/{deploying-openai-apps-aws-bedrock-gcp-vertex.mdx => deploying-openai-apps-gcp-vertex.mdx} (71%) create mode 100644 docs/tutorials/deploying-openai-apps.mdx diff --git a/blog/2025-04-11-mar-product-updates.md b/blog/2025-04-11-mar-product-updates.md index 959af5e4b..8070ab681 100644 --- a/blog/2025-04-11-mar-product-updates.md +++ b/blog/2025-04-11-mar-product-updates.md @@ -25,7 +25,7 @@ Wow - another month has gone by, time flies when you're having fun! Let us share some important updates regarding what we achieved at Defang in March: -**Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps-aws-bedrock) without making any changes to your code. Support for GCP and DigitalOcean coming soon. +**Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon. **Defang Pulumi Provider:** Last month, we announced a preview of the [Defang Pulumi Provider](https://github.com/DefangLabs/pulumi-defang), and this month we are excited to announce that V1 is now available in the [Pulumi Registry](https://www.pulumi.com/registry/packages/defang/). As much as we love Docker, we realize there are many real-world apps that have components that (currently) cannot be described completely in a Compose file. With the Defang Pulumi Provider, you can now leverage [the declarative simplicity of Defang with the imperative power of Pulumi](https://docs.defang.io/docs/concepts/pulumi#when-to-use-the-defang-pulumi-provider). 
diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index eb7789acf..ee342b342 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -9,7 +9,7 @@ sidebar_position: 3000 Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer. It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response. -See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex/) which describes how to configure the OpenAI Access Gateway for your application +See [our tutorial](/docs/tutorials/deploying-openai-apps) which describes how to configure the OpenAI Access Gateway for your application ## Docker Provider Services diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx new file mode 100644 index 000000000..30c2b1956 --- /dev/null +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx @@ -0,0 +1,164 @@ +--- +title: Deploying your OpenAI Application to AWS Bedrock +sidebar_position: 50 +--- + +# Deploying Your OpenAI Application to AWS Bedrock + +Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**. + +This tutorial shows you how **Defang** makes it easy. + +Suppose you start with a compose file like this: + +```yaml +services: + app: + build: + context: . + ports: + - 3000:3000 + environment: + OPENAI_API_KEY: + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:3000/"] +``` + +--- + +## Add an LLM Service to Your Compose File + +You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock). + +Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service: + +```diff ++ llm: ++ image: defangio/openai-access-gateway ++ x-defang-llm: true ++ ports: ++ - target: 80 ++ published: 80 ++ mode: host ++ environment: ++ - OPENAI_API_KEY ++ - REGION +``` + +### Notes: + +- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements. +- `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services. +- New environment variables: + - `REGION` is the zone where the services runs (for AWS this is the equvilent of AWS_REGION) + +:::tip +**OpenAI Key** + +You no longer need your original OpenAI API Key. +We recommend generating a random secret for authentication with the gateway: + +```bash +defang config set OPENAI_API_KEY --random +``` +::: + +--- + +## Redirect Application Traffic + +Modify your `app` service to send API calls to the `openai-access-gateway`: + +```diff + services: + app: + ports: + - 3000:3000 + environment: + OPENAI_API_KEY: ++ OPENAI_BASE_URL: "http://llm/api/v1" + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:3000/"] +``` + +Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock. + +--- + +## Selecting a Model + +You should configure your application to specify the model you want to use. 
+ +```diff + services: + app: + ports: + - 3000:3000 + environment: + OPENAI_API_KEY: + OPENAI_BASE_URL: "http://llm/api/v1" ++ MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:3000/"] +``` + +Choose the correct `MODEL` depending on which cloud provider you are using. + +:::info +**Choosing the Right Model** + +- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). +::: + +Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to +the closest equilavent on the target platform. If no such match can be found a fallback can be defined to use a known existing model (e.g. ai/mistral). These environment +variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. + + +:::info +# Complete Example Compose File + +```yaml +services: + app: + build: + context: . + ports: + - 3000:3000 + environment: + OPENAI_API_KEY: + OPENAI_BASE_URL: "http://llm/api/v1" + MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:3000/"] + + llm: + image: defangio/openai-access-gateway + x-defang-llm: true + ports: + - target: 80 + published: 80 + mode: host + environment: + - OPENAI_API_KEY + - REGION +``` + +--- + +# Environment Variable Matrix + +| Variable | AWS Bedrock | +|--------------------|-------------| +| `REGION` | Required| +| `MODEL` | Bedrock model ID / Docker model name | + +--- + +You now have a single app that can: + +- Talk to **GCP Vertex AI** +- Use the same OpenAI-compatible client code +- Easily switch cloud providers by changing a few environment variables +::: + diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-gcp-vertex.mdx similarity index 71% rename from docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx rename to docs/tutorials/deploying-openai-apps-gcp-vertex.mdx index a3cdf8313..a96d9b459 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-gcp-vertex.mdx @@ -1,11 +1,11 @@ --- -title: Deploying your OpenAI Application to AWS Bedrock or GCP Vertex AI +title: Deploying your OpenAI Application to GCP Vertex AI sidebar_position: 50 --- -# Deploying Your OpenAI Application to AWS Bedrock or GCP Vertex AI +# Deploying Your OpenAI Application to GCP Vertex AI -Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud, either on **AWS Bedrock** or **GCP Vertex AI**. +Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **GCP Vertex AI**. This tutorial shows you how **Defang** makes it easy. @@ -28,7 +28,7 @@ services: ## Add an LLM Service to Your Compose File -You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock or Vertex). +You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex). 
Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service: @@ -42,7 +42,7 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce + mode: host + environment: + - OPENAI_API_KEY -+ - GCP_PROJECT_ID # if using GCP Vertex AI ++ - GCP_PROJECT_ID + - REGION ``` @@ -51,8 +51,8 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce - The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements. - `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services. - New environment variables: - - `REGION` is the zone where the services runs (for AWS this is equvilent of AWS_REGION) - - `GCP_PROJECT_ID` is needed if using **Vertex AI**. (e.g.` GCP_PROJECT_ID` = my-project-456789 and `REGION` = us-central1) + - `REGION` is the zone where the services runs (e.g. us-central1) + - `GCP_PROJECT_ID` is your project to deploy to (e.g. my-project-456789) :::tip **OpenAI Key** @@ -83,7 +83,7 @@ Modify your `app` service to send API calls to the `openai-access-gateway`: test: ["CMD", "curl", "-f", "http://localhost:3000/"] ``` -Now, all OpenAI traffic will be routed through your gateway service and onto AWS Bedrock or GCP Vertex. +Now, all OpenAI traffic will be routed through your gateway service and onto GCP Vertex AI. --- @@ -99,24 +99,25 @@ You should configure your application to specify the model you want to use. environment: OPENAI_API_KEY: OPENAI_BASE_URL: "http://llm/api/v1" -+ MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # for Bedrock -+ # MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI ++ MODEL: "google/gemini-2.5-pro-preview-03-25" # for Vertex AI healthcheck: test: ["CMD", "curl", "-f", "http://localhost:3000/"] ``` Choose the correct `MODEL` depending on which cloud provider you are using. -Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to -the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment -variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. - :::info **Choosing the Right Model** -- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). - For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup) +::: +Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to +the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment +variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. 
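For instance, a sketch of the `app` service using a Docker-style model name instead of a full Vertex AI path (illustrative only; the gateway would resolve it to the closest matching Vertex AI model, or to `FALLBACK_MODEL` if one is set):

```yaml
services:
  app:
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "ai/llama3.3"   # Docker-style name, mapped by the gateway
```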
+ + +:::info # Complete Example Compose File ```yaml @@ -129,7 +130,7 @@ services: environment: OPENAI_API_KEY: OPENAI_BASE_URL: "http://llm/api/v1" - MODEL: "anthropic.claude-3-sonnet-20240229-v1:0" # or your Vertex AI model path + MODEL: "google/gemini-2.5-pro-preview-03-25" healthcheck: test: ["CMD", "curl", "-f", "http://localhost:3000/"] @@ -142,7 +143,7 @@ services: mode: host environment: - OPENAI_API_KEY - - GCP_PROJECT_ID # required if using Vertex AI + - GCP_PROJECT_ID # required if using GCP Vertex AI - REGION ``` @@ -150,17 +151,17 @@ services: # Environment Variable Matrix -| Variable | AWS Bedrock | GCP Vertex AI | -|--------------------|-------------|---------------| -| `GCP_PROJECT_ID` | _(not used)_| Required | -| `REGION` | Required| Required | -| `MODEL` | Bedrock model ID / Docker model name | Vertex model / Docker model name | +| Variable | GCP Vertex AI | +|--------------------|---------------| +| `GCP_PROJECT_ID` | Required | +| `REGION` | Required | +| `MODEL` | Vertex model / Docker model name | --- You now have a single app that can: -- Talk to **AWS Bedrock** or **GCP Vertex AI** +- Talk to **GCP Vertex AI** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables ::: diff --git a/docs/tutorials/deploying-openai-apps.mdx b/docs/tutorials/deploying-openai-apps.mdx new file mode 100644 index 000000000..f9a84c043 --- /dev/null +++ b/docs/tutorials/deploying-openai-apps.mdx @@ -0,0 +1,15 @@ +--- +title: Deploying your OpenAI Application +sidebar_position: 50 +--- + +# Deploying Your OpenAI application + +Defang currently supports LLM using AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. + +- [AWS Bedrock](/docs/tutorials/deploying-openai-apps-aws-bedrock/) +- [GCP Vertex AI](/docs/tutorials/deploying-openai-apps-gcp-vertex/). + + + + From d465ab86d4027b5e26a3e770d606bbe8be97d0ee Mon Sep 17 00:00:00 2001 From: commit111 Date: Thu, 15 May 2025 14:03:21 -0700 Subject: [PATCH 08/19] add gcp to permissions link --- docs/concepts/managed-llms/managed-language-models.md | 2 +- .../deploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 39b4296e7..6eea0a4be 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -13,7 +13,7 @@ Each cloud provider offers their own managed Large Language Model services. AWS In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the approprate roles and permissions for you. :::tip -Ensure you have the necessary permissions to access the model you intend to use. For example, if you are using AWS Bedrock, verify that your account has [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html). +Ensure you have the necessary permissions to access the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model permissions](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-partner-models?hl=en#grant-permissions). 
::: ## Example diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index c5d65195d..d0b3d5ac5 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -106,8 +106,8 @@ You should configure your application to specify the model you want to use. Choose the correct `MODEL` depending on which cloud provider you are using. -Ensure you have the necessary permissions to access the model you intend to use. -For example, if you are using AWS Bedrock, verify that your account has [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html). +Ensure you have the necessary permissions to access the model you intend to use. +To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model permissions](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-partner-models?hl=en#grant-permissions). :::info **Choosing the Right Model** From c355c51a655815ccefb6b7004cdbe845d2ce6456 Mon Sep 17 00:00:00 2001 From: commit111 Date: Thu, 15 May 2025 14:43:37 -0700 Subject: [PATCH 09/19] edit gcp link --- docs/concepts/managed-llms/managed-language-models.md | 2 +- docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 6eea0a4be..4d9a07d39 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -13,7 +13,7 @@ Each cloud provider offers their own managed Large Language Model services. AWS In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the approprate roles and permissions for you. :::tip -Ensure you have the necessary permissions to access the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model permissions](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-partner-models?hl=en#grant-permissions). +Ensure you have the necessary permissions to access the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access). ::: ## Example diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx index d0b3d5ac5..e5a5d61a6 100644 --- a/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx +++ b/docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx @@ -107,7 +107,7 @@ You should configure your application to specify the model you want to use. Choose the correct `MODEL` depending on which cloud provider you are using. Ensure you have the necessary permissions to access the model you intend to use. 
-To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model permissions](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-partner-models?hl=en#grant-permissions). +To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access). :::info **Choosing the Right Model** From 7778339c8b938ae57e6dec39a6d929584dcbd636 Mon Sep 17 00:00:00 2001 From: Eric Liu Date: Thu, 15 May 2025 17:46:10 -0700 Subject: [PATCH 10/19] merge updates to links --- blog/2025-04-11-mar-product-updates.md | 2 +- .../managed-llms/openai-access-gateway.md | 2 +- ...ock.mdx => deploy-openai-apps-aws-bedrock.mdx} | 0 ...rtex.mdx => deploy-openai-apps-gcp-vertex.mdx} | 8 -------- docs/tutorials/deploy-openai-apps.mdx | 15 +++++++++++++++ docs/tutorials/deploying-openai-apps.mdx | 15 --------------- 6 files changed, 17 insertions(+), 25 deletions(-) rename docs/tutorials/{deploying-openai-apps-aws-bedrock.mdx => deploy-openai-apps-aws-bedrock.mdx} (100%) rename docs/tutorials/{deploying-openai-apps-gcp-vertex.mdx => deploy-openai-apps-gcp-vertex.mdx} (85%) create mode 100644 docs/tutorials/deploy-openai-apps.mdx delete mode 100644 docs/tutorials/deploying-openai-apps.mdx diff --git a/blog/2025-04-11-mar-product-updates.md b/blog/2025-04-11-mar-product-updates.md index 8070ab681..6bc3828ef 100644 --- a/blog/2025-04-11-mar-product-updates.md +++ b/blog/2025-04-11-mar-product-updates.md @@ -25,7 +25,7 @@ Wow - another month has gone by, time flies when you're having fun! Let us share some important updates regarding what we achieved at Defang in March: -**Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploying-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon. +**Managed LLMs:** One of the coolest features we have released in a bit is [support for Managed LLMs (such as AWS Bedrock) through the `x-defang-llm` compose service extension](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models). When coupled with the `defang/openai-access-gateway` service image, Defang offers the easiest way to [migrate your OpenAI-compatible application to cloud-native managed LLMs](https://docs.defang.io/docs/tutorials/deploy-openai-apps) without making any changes to your code. Support for GCP and DigitalOcean coming soon. **Defang Pulumi Provider:** Last month, we announced a preview of the [Defang Pulumi Provider](https://github.com/DefangLabs/pulumi-defang), and this month we are excited to announce that V1 is now available in the [Pulumi Registry](https://www.pulumi.com/registry/packages/defang/). As much as we love Docker, we realize there are many real-world apps that have components that (currently) cannot be described completely in a Compose file. 
With the Defang Pulumi Provider, you can now leverage [the declarative simplicity of Defang with the imperative power of Pulumi](https://docs.defang.io/docs/concepts/pulumi#when-to-use-the-defang-pulumi-provider). diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index 7f3a1f04c..d18fd81da 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -9,7 +9,7 @@ sidebar_position: 3000 Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer. It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response. -See [our tutorial](/docs/tutorials/deploying-openai-apps) which describes how to configure the OpenAI Access Gateway for your application +See [our tutorial](/docs/tutorials/deploy-openai-apps) which describes how to configure the OpenAI Access Gateway for your application ## Docker Provider Services diff --git a/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx similarity index 100% rename from docs/tutorials/deploying-openai-apps-aws-bedrock.mdx rename to docs/tutorials/deploy-openai-apps-aws-bedrock.mdx diff --git a/docs/tutorials/deploying-openai-apps-gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx similarity index 85% rename from docs/tutorials/deploying-openai-apps-gcp-vertex.mdx rename to docs/tutorials/deploy-openai-apps-gcp-vertex.mdx index 9999c38b2..38328fa0f 100644 --- a/docs/tutorials/deploying-openai-apps-gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx @@ -112,12 +112,7 @@ To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazo :::info **Choosing the Right Model** -<<<<<<< HEAD:docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx -- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). -- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup). -======= - For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup) ->>>>>>> eric-update-for-play-ground:docs/tutorials/deploying-openai-apps-gcp-vertex.mdx ::: Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. 
ai/lama3.3) and maps it to @@ -172,8 +167,5 @@ You now have a single app that can: - Talk to **GCP Vertex AI** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables -<<<<<<< HEAD:docs/tutorials/deploying-openai-apps-aws-bedrock-gcp-vertex.mdx -======= ::: ->>>>>>> eric-update-for-play-ground:docs/tutorials/deploying-openai-apps-gcp-vertex.mdx diff --git a/docs/tutorials/deploy-openai-apps.mdx b/docs/tutorials/deploy-openai-apps.mdx new file mode 100644 index 000000000..ba97a1bf3 --- /dev/null +++ b/docs/tutorials/deploy-openai-apps.mdx @@ -0,0 +1,15 @@ +--- +title: Deploy your OpenAI Apps +sidebar_position: 45 +--- + +# Deploy Your OpenAI Apps + +Defang currently supports LLM using AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. + +- [AWS Bedrock](/docs/tutorials/deploy-openai-apps-aws-bedrock/) +- [GCP Vertex AI](/docs/tutorials/deploy-openai-apps-gcp-vertex/). + + + + diff --git a/docs/tutorials/deploying-openai-apps.mdx b/docs/tutorials/deploying-openai-apps.mdx deleted file mode 100644 index f9a84c043..000000000 --- a/docs/tutorials/deploying-openai-apps.mdx +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: Deploying your OpenAI Application -sidebar_position: 50 ---- - -# Deploying Your OpenAI application - -Defang currently supports LLM using AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. - -- [AWS Bedrock](/docs/tutorials/deploying-openai-apps-aws-bedrock/) -- [GCP Vertex AI](/docs/tutorials/deploying-openai-apps-gcp-vertex/). - - - - From be3be604ffa7767f3c59ba821262b57fd229a901 Mon Sep 17 00:00:00 2001 From: Eric Liu Date: Thu, 15 May 2025 20:26:46 -0700 Subject: [PATCH 11/19] review updates --- docs/concepts/managed-llms/openai-access-gateway.md | 2 +- docs/tutorials/deploy-openai-apps-aws-bedrock.mdx | 8 ++++---- docs/tutorials/deploy-openai-apps-gcp-vertex.mdx | 4 ++-- docs/tutorials/deploy-openai-apps.mdx | 4 ---- 4 files changed, 7 insertions(+), 11 deletions(-) diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index d18fd81da..b18798154 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -37,7 +37,7 @@ The `x-defang-llm` extension is used to configure the appropriate roles and perm ## Model Mapping -Defang supports model mapping through the openai-access-gateway on AWS and GCP. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. +Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. 
## Current Support diff --git a/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx index 30c2b1956..29de2cdaa 100644 --- a/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx @@ -110,11 +110,11 @@ Choose the correct `MODEL` depending on which cloud provider you are using. - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). ::: -Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to +Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/lama3.3`) and maps it to the closest equilavent on the target platform. If no such match can be found a fallback can be defined to use a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. - + :::info # Complete Example Compose File @@ -151,13 +151,13 @@ services: | Variable | AWS Bedrock | |--------------------|-------------| | `REGION` | Required| -| `MODEL` | Bedrock model ID / Docker model name | +| `MODEL` | Bedrock model ID or Docker model name, for example `meta.llama3-3-70b-instruct-v1:0` or `ai/lama3.3` | --- You now have a single app that can: -- Talk to **GCP Vertex AI** +- Talk to **AWS Bedrock** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables ::: diff --git a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx index 38328fa0f..98643e3ae 100644 --- a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx @@ -117,7 +117,7 @@ To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazo Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment -variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. +variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. :::info @@ -158,7 +158,7 @@ services: |--------------------|---------------| | `GCP_PROJECT_ID` | Required | | `REGION` | Required | -| `MODEL` | Vertex model / Docker model name | +| `MODEL` | Vertex model or Docker model name, for example `publishers/meta/models/llama-3.3-70b-instruct-maas` or `ai/llama3.3` | --- diff --git a/docs/tutorials/deploy-openai-apps.mdx b/docs/tutorials/deploy-openai-apps.mdx index ba97a1bf3..9493cfd74 100644 --- a/docs/tutorials/deploy-openai-apps.mdx +++ b/docs/tutorials/deploy-openai-apps.mdx @@ -9,7 +9,3 @@ Defang currently supports LLM using AWS Bedrock and GCP Vertex AI. Follow the li - [AWS Bedrock](/docs/tutorials/deploy-openai-apps-aws-bedrock/) - [GCP Vertex AI](/docs/tutorials/deploy-openai-apps-gcp-vertex/). - - - - From 418771ecfd9d133de7d7565b8a622f26f4906ec6 Mon Sep 17 00:00:00 2001 From: "Linda L." 
Date: Fri, 16 May 2025 10:00:46 -0700 Subject: [PATCH 12/19] Apply suggestions from code review --- .../managed-llms/managed-language-models.md | 2 +- .../concepts/managed-llms/openai-access-gateway.md | 4 ++-- docs/tutorials/deploy-openai-apps-aws-bedrock.mdx | 14 +++++++------- docs/tutorials/deploy-openai-apps-gcp-vertex.mdx | 8 ++++---- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index ee4fe4c43..6b6ffed2e 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -39,4 +39,4 @@ Assume you have a web service like the following, which uses the cloud native SD ## Deploying OpenAI-compatible apps -If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway) +If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway). diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index b18798154..185539456 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -9,7 +9,7 @@ sidebar_position: 3000 Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer. It handles incoming OpenAI requests, translates those requests to the appropriate cloud-native API, handles the native response, and re-constructs an OpenAI-compatible response. -See [our tutorial](/docs/tutorials/deploy-openai-apps) which describes how to configure the OpenAI Access Gateway for your application +See [our tutorial](/docs/tutorials/deploy-openai-apps) which describes how to configure the OpenAI Access Gateway for your application. ## Docker Provider Services @@ -37,7 +37,7 @@ The `x-defang-llm` extension is used to configure the appropriate roles and perm ## Model Mapping -Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. +Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. 
## Current Support diff --git a/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx index 29de2cdaa..d90c7c7ec 100644 --- a/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx @@ -1,9 +1,9 @@ --- -title: Deploying your OpenAI Application to AWS Bedrock +title: Deploy OpenAI Apps to AWS Bedrock sidebar_position: 50 --- -# Deploying Your OpenAI Application to AWS Bedrock +# Deploy OpenAI Apps to AWS Bedrock Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**. @@ -50,7 +50,7 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce - The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements. - `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services. - New environment variables: - - `REGION` is the zone where the services runs (for AWS this is the equvilent of AWS_REGION) + - `REGION` is the zone where the services runs (for AWS, this is the equivalent of AWS_REGION) :::tip **OpenAI Key** @@ -110,9 +110,9 @@ Choose the correct `MODEL` depending on which cloud provider you are using. - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). ::: -Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/lama3.3`) and maps it to -the closest equilavent on the target platform. If no such match can be found a fallback can be defined to use a known existing model (e.g. ai/mistral). These environment -variables are USE_MODEL_MAPPING (default to true) and FALLBACK_MODEL (no default), respectively. +Alternatively, Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to +the closest equivalent on the target platform. If no such match can be found ,a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These environment +variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. :::info @@ -151,7 +151,7 @@ services: | Variable | AWS Bedrock | |--------------------|-------------| | `REGION` | Required| -| `MODEL` | Bedrock model ID or Docker model name, for example `meta.llama3-3-70b-instruct-v1:0` or `ai/lama3.3` | +| `MODEL` | Bedrock model ID or Docker model name, for example `meta.llama3-3-70b-instruct-v1:0` or `ai/llama3.3` | --- diff --git a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx index 98643e3ae..30338341e 100644 --- a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx @@ -28,7 +28,7 @@ services: ## Add an LLM Service to Your Compose File -You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex). +You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex AI). 
Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service: @@ -112,11 +112,11 @@ To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazo :::info **Choosing the Right Model** -- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`) [See available Vertex models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup) +- For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`). [See available Vertex AI models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup). ::: -Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. ai/lama3.3) and maps it to -the closest matching one on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. ai/mistral). These environment +Alternatively, Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to +the closest matching one on the target platform. If no such match can be found, it can fallback onto a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. From e6fbc320d01b82179f4975101a0d50dc7097603c Mon Sep 17 00:00:00 2001 From: commit111 Date: Fri, 16 May 2025 10:30:11 -0700 Subject: [PATCH 13/19] move llm pages into folders --- docs/tutorials/deploy-openai-apps.mdx | 11 ----------- .../deploy-openai-apps/_category_.json | 5 +++++ .../aws-bedrock.mdx} | 17 ++++++++--------- .../deploy-openai-apps/deploy-openai-apps.mdx | 11 +++++++++++ .../gcp-vertex.mdx} | 15 +++++++-------- 5 files changed, 31 insertions(+), 28 deletions(-) delete mode 100644 docs/tutorials/deploy-openai-apps.mdx create mode 100644 docs/tutorials/deploy-openai-apps/_category_.json rename docs/tutorials/{deploy-openai-apps-aws-bedrock.mdx => deploy-openai-apps/aws-bedrock.mdx} (91%) create mode 100644 docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx rename docs/tutorials/{deploy-openai-apps-gcp-vertex.mdx => deploy-openai-apps/gcp-vertex.mdx} (92%) diff --git a/docs/tutorials/deploy-openai-apps.mdx b/docs/tutorials/deploy-openai-apps.mdx deleted file mode 100644 index 9493cfd74..000000000 --- a/docs/tutorials/deploy-openai-apps.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: Deploy your OpenAI Apps -sidebar_position: 45 ---- - -# Deploy Your OpenAI Apps - -Defang currently supports LLM using AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. - -- [AWS Bedrock](/docs/tutorials/deploy-openai-apps-aws-bedrock/) -- [GCP Vertex AI](/docs/tutorials/deploy-openai-apps-gcp-vertex/). 
diff --git a/docs/tutorials/deploy-openai-apps/_category_.json b/docs/tutorials/deploy-openai-apps/_category_.json new file mode 100644 index 000000000..9fdae3a63 --- /dev/null +++ b/docs/tutorials/deploy-openai-apps/_category_.json @@ -0,0 +1,5 @@ +{ + "label": "Deploy OpenAI Apps on Managed LLMs", + "position": 45, + "collapsible": true +} diff --git a/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx similarity index 91% rename from docs/tutorials/deploy-openai-apps-aws-bedrock.mdx rename to docs/tutorials/deploy-openai-apps/aws-bedrock.mdx index d90c7c7ec..48fc38c19 100644 --- a/docs/tutorials/deploy-openai-apps-aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx @@ -1,7 +1,9 @@ --- -title: Deploy OpenAI Apps to AWS Bedrock +title: AWS Bedrock sidebar_position: 50 --- +import React from 'react'; +import {useColorMode} from '@docusaurus/theme-common'; # Deploy OpenAI Apps to AWS Bedrock @@ -9,7 +11,7 @@ Let's assume you have an app that uses an OpenAI client library and you want to This tutorial shows you how **Defang** makes it easy. -Suppose you start with a compose file like this: +Suppose you start with a Compose file like this: ```yaml services: @@ -107,16 +109,15 @@ Choose the correct `MODEL` depending on which cloud provider you are using. :::info **Choosing the Right Model** -- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`) [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). +- For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). ::: Alternatively, Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest equivalent on the target platform. If no such match can be found ,a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. - -:::info -# Complete Example Compose File + +## Complete Example Compose File ```yaml services: @@ -146,7 +147,7 @@ services: --- -# Environment Variable Matrix +## Environment Variable Matrix | Variable | AWS Bedrock | |--------------------|-------------| @@ -160,5 +161,3 @@ You now have a single app that can: - Talk to **AWS Bedrock** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables -::: - diff --git a/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx b/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx new file mode 100644 index 000000000..ae206d30f --- /dev/null +++ b/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx @@ -0,0 +1,11 @@ +--- +title: Deploy OpenAI Apps on Managed LLMs +sidebar_position: 45 +--- + +# Deploy OpenAI Apps on Managed LLMs + +Defang currently supports using Managed LLMs with AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. 
+ +- [AWS Bedrock](/docs/tutorials/deploy-openai-apps/aws-bedrock/) +- [GCP Vertex AI](/docs/tutorials/deploy-openai-apps/gcp-vertex/) diff --git a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx similarity index 92% rename from docs/tutorials/deploy-openai-apps-gcp-vertex.mdx rename to docs/tutorials/deploy-openai-apps/gcp-vertex.mdx index 30338341e..3b9da9b2f 100644 --- a/docs/tutorials/deploy-openai-apps-gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx @@ -1,7 +1,9 @@ --- -title: Deploy OpenAI Apps to GCP Vertex AI +title: GCP Vertex AI sidebar_position: 50 --- +import React from 'react'; +import {useColorMode} from '@docusaurus/theme-common'; # Deploy OpenAI Apps to GCP Vertex AI @@ -9,7 +11,7 @@ Let's assume you have an application that uses an OpenAI client library and you This tutorial shows you how **Defang** makes it easy. -Suppose you start with a compose file like this: +Suppose you start with a Compose file like this: ```yaml services: @@ -120,8 +122,7 @@ the closest matching one on the target platform. If no such match can be found, variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. -:::info -# Complete Example Compose File +## Complete Example Compose File ```yaml services: @@ -152,13 +153,13 @@ services: --- -# Environment Variable Matrix +## Environment Variable Matrix | Variable | GCP Vertex AI | |--------------------|---------------| | `GCP_PROJECT_ID` | Required | | `REGION` | Required | -| `MODEL` | Vertex model or Docker model name, for example `publishers/meta/models/llama-3.3-70b-instruct-maas` or `ai/llama3.3` | +| `MODEL` | Vertex model ID or Docker model name, for example `publishers/meta/models/llama-3.3-70b-instruct-maas` or `ai/llama3.3` | --- @@ -167,5 +168,3 @@ You now have a single app that can: - Talk to **GCP Vertex AI** - Use the same OpenAI-compatible client code - Easily switch cloud providers by changing a few environment variables -::: - From cc04ac2bdd5edf9bba9aa913d70edc15ef1624c7 Mon Sep 17 00:00:00 2001 From: commit111 Date: Fri, 16 May 2025 10:34:38 -0700 Subject: [PATCH 14/19] add metadata descriptions --- docs/tutorials/deploy-openai-apps/aws-bedrock.mdx | 1 + docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx | 3 ++- docs/tutorials/deploy-openai-apps/gcp-vertex.mdx | 1 + 3 files changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx index 48fc38c19..152b16b13 100644 --- a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx @@ -1,6 +1,7 @@ --- title: AWS Bedrock sidebar_position: 50 +description: Deploy OpenAI Apps to AWS Bedrock using Defang. --- import React from 'react'; import {useColorMode} from '@docusaurus/theme-common'; diff --git a/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx b/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx index ae206d30f..e7637d1cb 100644 --- a/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx +++ b/docs/tutorials/deploy-openai-apps/deploy-openai-apps.mdx @@ -1,11 +1,12 @@ --- title: Deploy OpenAI Apps on Managed LLMs sidebar_position: 45 +description: Deploy OpenAI Apps on Managed LLMs --- # Deploy OpenAI Apps on Managed LLMs -Defang currently supports using Managed LLMs with AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. 
+Defang currently supports using Managed LLMs on AWS Bedrock and GCP Vertex AI. Follow the link below for your specific platform. - [AWS Bedrock](/docs/tutorials/deploy-openai-apps/aws-bedrock/) - [GCP Vertex AI](/docs/tutorials/deploy-openai-apps/gcp-vertex/) diff --git a/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx index 3b9da9b2f..f8e19129a 100644 --- a/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx @@ -1,6 +1,7 @@ --- title: GCP Vertex AI sidebar_position: 50 +description: Deploy OpenAI Apps to GCP Vertex AI using Defang. --- import React from 'react'; import {useColorMode} from '@docusaurus/theme-common'; From e920df09f24c157a4d37b9bdd521bf012eab3161 Mon Sep 17 00:00:00 2001 From: commit111 Date: Fri, 16 May 2025 11:05:08 -0700 Subject: [PATCH 15/19] rename docker model provider services + fix comma --- docs/concepts/managed-llms/openai-access-gateway.md | 2 +- docs/tutorials/deploy-openai-apps/aws-bedrock.mdx | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index 185539456..839e57d79 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -11,7 +11,7 @@ It handles incoming OpenAI requests, translates those requests to the appropriat See [our tutorial](/docs/tutorials/deploy-openai-apps) which describes how to configure the OpenAI Access Gateway for your application. -## Docker Provider Services +## Docker Model Provider Services As of Docker Compose v2.35 and Docker Desktop v4.41, Compose introduces a new service type called `provider` that allows you to declare platform capabilities required by your application. For AI models, you use the `model` type to declare model dependencies. This will expose an OpenAI compatible API for your service. Check the [Docker Model Runner documentation](https://docs.docker.com/compose/how-tos/model-runner/) for more details. diff --git a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx index 152b16b13..1b2b80491 100644 --- a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx @@ -114,7 +114,7 @@ Choose the correct `MODEL` depending on which cloud provider you are using. ::: Alternatively, Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to -the closest equivalent on the target platform. If no such match can be found ,a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These environment +the closest equivalent on the target platform. If no such match can be found, a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. From fc84200e780895f020d71c9092c01916636cfa92 Mon Sep 17 00:00:00 2001 From: "Linda L." 
Date: Fri, 16 May 2025 11:11:05 -0700 Subject: [PATCH 16/19] Apply suggestions from code review Co-authored-by: Jordan Stephens --- .../managed-llms/managed-language-models.md | 2 +- .../managed-llms/openai-access-gateway.md | 6 +++++- .../deploy-openai-apps/aws-bedrock.mdx | 11 ++++++----- .../tutorials/deploy-openai-apps/gcp-vertex.mdx | 17 +++++++++-------- 4 files changed, 21 insertions(+), 15 deletions(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 6b6ffed2e..8623e3206 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -21,7 +21,7 @@ Each cloud provider offers their own managed Large Language Model services. AWS In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the approprate roles and permissions for you. -:::tip +:::info Ensure you have the necessary permissions to access the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access). ::: diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index 839e57d79..0c9b31336 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -37,7 +37,11 @@ The `x-defang-llm` extension is used to configure the appropriate roles and perm ## Model Mapping -Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. +Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. `ai/mistral`). + +This can be configured through the following environment variables: +* `USE_MODEL_MAPPING` (default to true) - configures whether or not model mapping should be enabled. +* `FALLBACK_MODEL` (no default) - configure a model which will be used if model mapping fails to find a target model. ## Current Support diff --git a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx index 1b2b80491..46dcc81c7 100644 --- a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx @@ -11,8 +11,11 @@ import {useColorMode} from '@docusaurus/theme-common'; Let's assume you have an app that uses an OpenAI client library and you want to deploy it to the cloud on **AWS Bedrock**. This tutorial shows you how **Defang** makes it easy. 
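As a minimal sketch of the two model-mapping variables introduced in this change (the service name and the fallback value are illustrative, not prescribed), they can be set on the gateway service like so:

```yaml
services:
  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    environment:
      - USE_MODEL_MAPPING=true     # defaults to true; maps Docker-style names such as ai/llama3.3
      - FALLBACK_MODEL=ai/mistral  # only used when no close match exists on the target platform
```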
+:::info +You must [configure AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) for each model you intend to use in your AWS account. +::: -Suppose you start with a Compose file like this: +Suppose you start with a `compose.yaml` file with one `app` service, like this: ```yaml services: @@ -31,9 +34,7 @@ services: ## Add an LLM Service to Your Compose File -You need to add a new service that acts as a proxy between your app and the backend LLM provider (Bedrock). - -Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service: +You can use AWS Bedrock without changing your `app` code by introducing a new [`defangio/openai-access-gateway`](https://github.com/DefangLabs/openai-access-gateway) service. We'll call the new service `llm`. This new service will act as a proxy between your application and AWS Bedrock, and will transparently handle converting your OpenAI requests into AWS Bedrock requests and Bedrock responses into OpenAI responses. This allows you to use AWS Bedrock with your existing OpenAI client SDK. ```diff + llm: @@ -161,4 +162,4 @@ You now have a single app that can: - Talk to **AWS Bedrock** - Use the same OpenAI-compatible client code -- Easily switch cloud providers by changing a few environment variables +- Easily switch between models or cloud providers by changing a few environment variables diff --git a/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx index f8e19129a..ed7ee1807 100644 --- a/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx @@ -11,8 +11,11 @@ import {useColorMode} from '@docusaurus/theme-common'; Let's assume you have an application that uses an OpenAI client library and you want to deploy it to the cloud using **GCP Vertex AI**. This tutorial shows you how **Defang** makes it easy. +:::info +You must [configure GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access) for each model you intend to use in your GCP account. +::: -Suppose you start with a Compose file like this: +Suppose you start with a `compose.yaml` file with one `app` service, like this: ```yaml services: @@ -31,9 +34,7 @@ services: ## Add an LLM Service to Your Compose File -You need to add a new service that acts as a proxy between your app and the backend LLM provider (Vertex AI). - -Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway)** service: +You can use Vertex AI without changing your `app` code by introducing a new [`defangio/openai-access-gateway`](https://github.com/DefangLabs/openai-access-gateway) service. We'll call the new service `llm`. This new service will act as a proxy between your application and Vertex AI, and will transparently handle converting your OpenAI requests into Vertex AI requests and Vertex AI responses into OpenAI responses. This allows you to use Vertex AI with your existing OpenAI client SDK. ```diff + llm: @@ -54,8 +55,8 @@ Add **Defang's [openai-access-gateway](https://github.com/DefangLabs/openai-acce - The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements. - `x-defang-llm: true` signals to **Defang** that this service should be configured to use target platform AI services. - New environment variables: - - `REGION` is the zone where the services runs (e.g. 
us-central1) - - `GCP_PROJECT_ID` is your project to deploy to (e.g. my-project-456789) + - `REGION` is the zone where the services runs (e.g. `us-central1`) + - `GCP_PROJECT_ID` is your project to deploy to (e.g. `my-project-456789`) :::tip **OpenAI Key** @@ -148,7 +149,7 @@ services: mode: host environment: - OPENAI_API_KEY - - GCP_PROJECT_ID # required if using GCP Vertex AI + - GCP_PROJECT_ID - REGION ``` @@ -168,4 +169,4 @@ You now have a single app that can: - Talk to **GCP Vertex AI** - Use the same OpenAI-compatible client code -- Easily switch cloud providers by changing a few environment variables +- Easily switch between models or cloud providers by changing a few environment variables From a848ec258bc83f1dc11dec6eabcc2605d5070563 Mon Sep 17 00:00:00 2001 From: commit111 Date: Fri, 16 May 2025 11:15:39 -0700 Subject: [PATCH 17/19] apply code review changes --- docs/concepts/managed-llms/managed-language-models.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 8623e3206..0b1c69e61 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -21,12 +21,14 @@ Each cloud provider offers their own managed Large Language Model services. AWS In order to leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the approprate roles and permissions for you. +## Example + :::info -Ensure you have the necessary permissions to access the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access). +Ensure you have enabled model access for the model you intend to use: +* [Configure AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) +* [Configure GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access) ::: -## Example - Assume you have a web service like the following, which uses the cloud native SDK, for example: ```diff From e195fea6d022dfaae4a546914f7cf3c680ffa893 Mon Sep 17 00:00:00 2001 From: commit111 Date: Fri, 16 May 2025 11:21:37 -0700 Subject: [PATCH 18/19] add links + comma --- docs/concepts/managed-llms/openai-access-gateway.md | 2 +- docs/tutorials/deploy-openai-apps/aws-bedrock.mdx | 2 +- docs/tutorials/deploy-openai-apps/gcp-vertex.mdx | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/concepts/managed-llms/openai-access-gateway.md b/docs/concepts/managed-llms/openai-access-gateway.md index 0c9b31336..093243df6 100644 --- a/docs/concepts/managed-llms/openai-access-gateway.md +++ b/docs/concepts/managed-llms/openai-access-gateway.md @@ -37,7 +37,7 @@ The `x-defang-llm` extension is used to configure the appropriate roles and perm ## Model Mapping -Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found it can fallback onto a known existing model (e.g. `ai/mistral`). 
+Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway) on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found, it can fallback onto a known existing model (e.g. `ai/mistral`). This can be configured through the following environment variables: * `USE_MODEL_MAPPING` (default to true) - configures whether or not model mapping should be enabled. diff --git a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx index 46dcc81c7..fd24b7e7b 100644 --- a/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx +++ b/docs/tutorials/deploy-openai-apps/aws-bedrock.mdx @@ -114,7 +114,7 @@ Choose the correct `MODEL` depending on which cloud provider you are using. - For **AWS Bedrock**, use a Bedrock model ID (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`). [See available Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). ::: -Alternatively, Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to +Alternatively, Defang supports [model mapping](/docs/concepts/managed-llms/openai-access-gateway/#model-mapping) through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest equivalent on the target platform. If no such match can be found, a fallback can be defined to use a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. diff --git a/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx index ed7ee1807..dfa48d095 100644 --- a/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx +++ b/docs/tutorials/deploy-openai-apps/gcp-vertex.mdx @@ -119,7 +119,7 @@ To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazo - For **GCP Vertex AI**, use a full model path (e.g., `google/gemini-2.5-pro-preview-03-25`). [See available Vertex AI models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#client-setup). ::: -Alternatively, Defang supports model mapping through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to +Alternatively, Defang supports [model mapping](/docs/concepts/managed-llms/openai-access-gateway/#model-mapping) through the [openai-access-gateway](https://github.com/DefangLabs/openai-access-gateway). This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching one on the target platform. If no such match can be found, it can fallback onto a known existing model (e.g. `ai/mistral`). These environment variables are `USE_MODEL_MAPPING` (default to true) and `FALLBACK_MODEL` (no default), respectively. 
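To make the `MODEL` choice concrete, here is a hedged sketch that reuses the example IDs from the tutorials above; the exact service shape is illustrative, and whether your app or the gateway reads the variable depends on your setup:

```yaml
services:
  app:
    build: .
    environment:
      # AWS Bedrock: a Bedrock model ID
      - MODEL=anthropic.claude-3-sonnet-20240229-v1:0
      # GCP Vertex AI would use a full model path instead, e.g.
      #   MODEL=google/gemini-2.5-pro-preview-03-25
      # With model mapping enabled, a Docker-style name such as ai/llama3.3 also works
```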
From 59ced8e073205ad6e16a19e5630014294db3d05e Mon Sep 17 00:00:00 2001 From: commit111 Date: Fri, 16 May 2025 12:10:29 -0700 Subject: [PATCH 19/19] add sample tip --- docs/concepts/managed-llms/managed-language-models.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/concepts/managed-llms/managed-language-models.md b/docs/concepts/managed-llms/managed-language-models.md index 0b1c69e61..6c240c782 100644 --- a/docs/concepts/managed-llms/managed-language-models.md +++ b/docs/concepts/managed-llms/managed-language-models.md @@ -42,3 +42,7 @@ Assume you have a web service like the following, which uses the cloud native SD ## Deploying OpenAI-compatible apps If you already have an OpenAI-compatible application, Defang makes it easy to deploy on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/managed-llms/openai-access-gateway). + +:::tip +Defang has a [*Managed LLM sample*](https://github.com/DefangLabs/samples/tree/main/samples/managed-llm) that uses the OpenAI Access Gateway, and a [*Managed LLM with Docker Model Provider sample*](https://github.com/DefangLabs/samples/tree/main/samples/managed-llm-provider) that uses a Docker Model Provider. +:::
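For readers comparing the two samples linked in that tip, the Docker Model Provider approach declares the model as a `provider` service instead of running the gateway image yourself. A minimal sketch, assuming illustrative service names and an example model, might look like:

```yaml
services:
  app:
    build: .
    depends_on:
      - llm

  llm:
    x-defang-llm: true     # marks the service for Defang's managed-LLM configuration
    provider:
      type: model          # Compose model provider service
      options:
        model: ai/llama3.3 # example Docker-style model name
```

The linked samples are the authoritative references for the exact variables each approach injects into the `app` service.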