x-defang-llm documentation #192
Merged
---|---|---|
@@ -0,0 +1,5 @@
```json
{
  "label": "Managed LLMs",
  "position": 425,
  "collapsible": true
}
```
@@ -0,0 +1,38 @@
---
title: Leveraging Managed Language Models with Defang
description: Defang makes it easy to leverage cloud-native managed language models.
sidebar_position: 3000
---

# Managed Language Models

Each cloud provider offers its own managed Large Language Model service: AWS offers Bedrock, GCP offers Vertex, and DigitalOcean offers its GenAI platform. Defang makes it easy to leverage these services in your projects.

## Usage

To leverage cloud-native managed language models from your Defang services, all you need to do is add the `x-defang-llm` extension to the service config and Defang will configure the appropriate roles and permissions for you.

## Example

Suppose you have a web service that calls the model through the cloud provider's native SDK. Enable managed LLM support by adding the `x-defang-llm` extension:

```diff
services:
  app:
    build:
      context: .
+   x-defang-llm: true
```
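For illustration, such an app service might call Bedrock through the AWS SDK for Python. The sketch below is hypothetical (the helper names and the Claude messages payload shape are assumptions, not part of Defang itself); with `x-defang-llm` enabled, Defang provisions the IAM permissions the `invoke_model` call needs:

```python
import json


def build_claude_payload(prompt: str, max_tokens: int = 256) -> str:
    """Build the Bedrock messages-API body for an Anthropic Claude model."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })


def ask_bedrock(prompt: str) -> str:
    # Hypothetical usage sketch. With x-defang-llm enabled, Defang grants this
    # service the IAM permissions needed for bedrock:InvokeModel, so no API
    # keys are managed by the application.
    import boto3  # imported lazily so the payload helper stays stdlib-only

    client = boto3.client("bedrock-runtime")
    resp = client.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=build_claude_payload(prompt),
    )
    return json.loads(resp["body"].read())["content"][0]["text"]
```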

## Deploying OpenAI-compatible apps

If you already have an OpenAI-compatible application, Defang makes it easy to deploy it on your favourite cloud's managed LLM service. See our [OpenAI Access Gateway](/docs/concepts/openai-access-gateway.md) for details.

## Current Support

| Provider | Managed Language Models |
| --- | --- |
| [Playground](/docs/providers/playground#managed-large-language-models) | ❌ |
| [AWS Bedrock](/docs/providers/aws#managed-large-language-models) | ✅ |
| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
| [GCP Vertex](/docs/providers/gcp#managed-large-language-models) | ❌ |
@@ -0,0 +1,20 @@
---
title: Deploying OpenAI-compatible apps with Defang
description: Defang makes it easy to leverage cloud-native managed language models for your OpenAI-compatible application.
sidebar_position: 3000
---

# Deploying OpenAI-compatible applications to cloud-native managed language models with Defang

Defang makes it easy to deploy on your favourite cloud's managed LLM service with our [OpenAI Access Gateway](https://github.com/DefangLabs/openai-access-gateway). This service sits between your application and the cloud service and acts as a compatibility layer: it accepts incoming OpenAI requests, translates them to the appropriate cloud-native API, handles the native response, and reconstructs an OpenAI-compatible response.

See [our tutorial](/docs/tutorials/deploying-openai-apps-aws-bedrock.mdx), which describes how to configure the OpenAI Access Gateway for your application.

## Current Support

| Provider | Managed Language Models |
| --- | --- |
| [Playground](/docs/providers/playground#managed-services) | ❌ |
| [AWS Bedrock](/docs/providers/aws#managed-storage) | ✅ |
| [DigitalOcean GenAI](/docs/providers/digitalocean#future-improvements) | ❌ |
| [GCP Vertex](/docs/providers/gcp#future-improvements) | ❌ |
@@ -0,0 +1,128 @@
---
title: Deploying your OpenAI application to AWS and using Bedrock
sidebar_position: 50
---

# Deploying your OpenAI application to AWS and using Bedrock

Let's assume you have an app that uses one of the OpenAI client libraries and you want to deploy it to AWS so you can leverage Bedrock. This tutorial shows how Defang makes that easy.

Assume you have a compose file like this:

```yaml
services:
  app:
    build:
      context: .
    ports:
      - target: 3000
        published: 3000
        protocol: tcp
        mode: ingress
    environment:
      OPENAI_API_KEY:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

## Add an LLM service to your compose file

The first step is to add a new service, `defangio/openai-access-gateway`, to your compose file. This service provides an OpenAI-compatible interface to AWS Bedrock. It's easy to configure: first, add it to your compose file:

```diff
+  llm:
+    image: defangio/openai-access-gateway
+    x-defang-llm: true
+    ports:
+      - target: 80
+        published: 80
+        protocol: tcp
+        mode: host
+    environment:
+      - OPENAI_API_KEY
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost/health"]
```

A few things to note here. First, the image is a fork of [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway) with a few modifications to make it easier to use; the source code is available [here](https://github.com/DefangLabs/openai-access-gateway). Second, the `x-defang-llm` property: Defang uses extensions like this to signal special handling for certain kinds of services. In this case, it tells Defang to configure the appropriate IAM roles and policies to support your application.

## Redirecting application traffic

Then you need to configure your application to redirect traffic to the openai-access-gateway, like this:

```diff
services:
  app:
    ports:
      - target: 3000
        published: 3000
        protocol: tcp
        mode: ingress
    environment:
      OPENAI_API_KEY:
+     OPENAI_BASE_URL: "http://llm/api/v1"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```

## Selecting a model

You will also need to configure your application to use one of the Bedrock models. We recommend setting an environment variable called `MODEL`, like this:

```diff
services:
  app:
    ports:
      - target: 3000
        published: 3000
        protocol: tcp
        mode: ingress
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
+     MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
```
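On the application side, the client then only needs the base URL, key, and model from the environment. A hypothetical sketch (assuming the official `openai` Python client; the helper names are illustrative, while the environment variable names match the compose file):

```python
import os


def llm_settings() -> dict:
    """Read the gateway endpoint and model from the environment set in compose."""
    return {
        "base_url": os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": os.environ.get("MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"),
    }


def make_client():
    # Hypothetical usage: the same application code talks to api.openai.com
    # or to the gateway; only the environment variables change.
    from openai import OpenAI  # imported lazily so the settings helper stays stdlib-only

    cfg = llm_settings()
    return OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"]), cfg["model"]
```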

## Enabling Bedrock model access

AWS currently requires model access to be enabled manually on a per-model basis in each account. See this guide on [how to enable model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html).

## Your OpenAI key

It's worth noting that you no longer need to use your original OpenAI API key. We do recommend using _something_ in its place, so feel free to generate a new secret and set it with `defang config set OPENAI_API_KEY`.

## Complete Example Compose File

```yaml
services:
  app:
    build:
      context: .
    ports:
      - target: 3000
        published: 3000
        protocol: tcp
        mode: ingress
    environment:
      OPENAI_API_KEY:
      OPENAI_BASE_URL: "http://llm/api/v1"
      MODEL: "anthropic.claude-3-sonnet-20240229-v1:0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
  llm:
    image: defangio/openai-access-gateway
    x-defang-llm: true
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
    environment:
      - OPENAI_API_KEY
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
```