14 changes: 2 additions & 12 deletions samples/managed-llm-provider/README.md

@@ -14,6 +14,8 @@ You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for loca
 
 Ensure you have enabled model access for the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
 
+To learn about available LLM models in Defang, please see our [Model Mapping documentation](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping).
+
 For more about Managed LLMs in Defang, please see our [Managed LLMs documentation](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models).
 
 ### Docker Model Provider

@@ -36,18 +38,6 @@ To run the application locally, you can use the following command:
 docker compose -f compose.dev.yaml up --build
 ```
 
-## Configuration
-
-For this sample, you will need to provide the following [configuration](https://docs.defang.io/docs/concepts/configuration):
-
-> Note that if you are using the 1-click deploy option, you can set these values as secrets in your GitHub repository and the action will automatically deploy them for you.
-
-### `MODEL`
-The Model ID of the LLM you are using for your application. For example, `anthropic.claude-3-haiku-20240307-v1:0`.
-```bash
-defang config set MODEL
-```
-
 ## Deployment
 
 > [!NOTE]
3 changes: 2 additions & 1 deletion samples/managed-llm-provider/compose.yaml

@@ -8,7 +8,8 @@ services:
     restart: always
     environment:
       - ENDPOINT_URL=http://llm/api/v1/chat/completions # endpoint to the Provider Service
-      - MODEL=anthropic.claude-3-haiku-20240307-v1:0 # LLM model ID used in the Provider Service
+      - MODEL=us.amazon.nova-micro-v1:0 # LLM model ID used in the Provider Service
+        # For other models, see https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping
     healthcheck:
       test: ["CMD", "python3", "-c", "import sys, urllib.request; urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/"]
       interval: 30s
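For context on how these two variables are consumed, here is a minimal sketch of an app calling the Provider Service, assuming it exposes an OpenAI-compatible chat completions endpoint at `ENDPOINT_URL`; the sample's actual application code may differ:

```python
# Hypothetical illustration (not the sample's actual code): calling the Provider
# Service using the MODEL and ENDPOINT_URL values from compose.yaml. The request
# and response shapes assume an OpenAI-compatible chat completions API.
import json
import os
import urllib.request

ENDPOINT_URL = os.environ.get("ENDPOINT_URL", "http://llm/api/v1/chat/completions")
MODEL = os.environ.get("MODEL", "us.amazon.nova-micro-v1:0")

def chat(prompt: str) -> str:
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses carry the generated text under choices[0].message.content
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Say hello in one sentence."))
```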
14 changes: 2 additions & 12 deletions samples/managed-llm/README.md

@@ -17,6 +17,8 @@ You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for loca
 
 Ensure you have enabled model access for the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
 
+To learn about available LLM models in Defang, please see our [Model Mapping documentation](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping).
+
 For more about Managed LLMs in Defang, please see our [Managed LLMs documentation](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models).
 
 ### Defang OpenAI Access Gateway

@@ -39,18 +41,6 @@ To run the application locally, you can use the following command:
 docker compose -f compose.dev.yaml up --build
 ```
 
-## Configuration
-
-For this sample, you will need to provide the following [configuration](https://docs.defang.io/docs/concepts/configuration):
-
-> Note that if you are using the 1-click deploy option, you can set these values as secrets in your GitHub repository and the action will automatically deploy them for you.
-
-### `MODEL`
-The Model ID of the LLM you are using for your application. For example, `anthropic.claude-3-haiku-20240307-v1:0`.
-```bash
-defang config set MODEL
-```
-
 ## Deployment
 
 > [!NOTE]
5 changes: 2 additions & 3 deletions samples/managed-llm/compose.yaml

@@ -8,7 +8,8 @@ services:
     restart: always
     environment:
       - ENDPOINT_URL=http://llm/api/v1/chat/completions # endpoint to the gateway service
-      - MODEL=anthropic.claude-3-haiku-20240307-v1:0 # LLM model ID used for the gateway
+      - MODEL=us.amazon.nova-micro-v1:0 # LLM model ID used for the gateway.
+        # For other models, see https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping
       - OPENAI_API_KEY=FAKE_TOKEN # the actual value will be ignored when using the gateway, but it should match the one in the llm service
     healthcheck:
       test: ["CMD", "python3", "-c", "import sys, urllib.request; urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/"]

@@ -29,8 +30,6 @@ services:
         mode: host
     environment:
       - OPENAI_API_KEY=FAKE_TOKEN # this value must match the one in the app service
-      - USE_MODEL_MAPPING=false
-      - DEBUG=true
       # if using GCP for BYOC deployment, add these environment variables:
       # - GCP_PROJECT_ID=${GCP_PROJECT_ID}
       # - GCP_REGION=${GCP_REGION}
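For context, here is a minimal sketch of a client talking to the OpenAI Access Gateway with these settings, assuming the gateway is OpenAI-compatible and that its base URL is `ENDPOINT_URL` without the `/chat/completions` suffix; the model ID and `FAKE_TOKEN` value mirror the compose file above, and the sample's actual code may differ:

```python
# Hypothetical illustration (not the sample's actual code): calling the Defang
# OpenAI Access Gateway through the OpenAI Python client. The base_url is an
# assumption derived from ENDPOINT_URL in compose.yaml.
import os
from openai import OpenAI

client = OpenAI(
    base_url="http://llm/api/v1",  # ENDPOINT_URL minus the /chat/completions suffix
    api_key=os.environ.get("OPENAI_API_KEY", "FAKE_TOKEN"),  # ignored by the gateway, but must match the llm service
)

response = client.chat.completions.create(
    model=os.environ.get("MODEL", "us.amazon.nova-micro-v1:0"),
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Since the gateway handles the cloud credentials itself, the API key here only needs to match the value configured on the `llm` service, as the compose comments above note.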