diff --git a/samples/managed-llm-provider/README.md b/samples/managed-llm-provider/README.md
index 543a77af..470f5adc 100644
--- a/samples/managed-llm-provider/README.md
+++ b/samples/managed-llm-provider/README.md
@@ -14,6 +14,8 @@ You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for loca
 Ensure you have enabled model access for the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
 
+To learn about available LLM models in Defang, please see our [Model Mapping documentation](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping).
+
 For more about Managed LLMs in Defang, please see our [Managed LLMs documentation](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models).
 
 ### Docker Model Provider
@@ -36,18 +38,6 @@ To run the application locally, you can use the following command:
 docker compose -f compose.dev.yaml up --build
 ```
 
-## Configuration
-
-For this sample, you will need to provide the following [configuration](https://docs.defang.io/docs/concepts/configuration):
-
-> Note that if you are using the 1-click deploy option, you can set these values as secrets in your GitHub repository and the action will automatically deploy them for you.
-
-### `MODEL`
-The Model ID of the LLM you are using for your application. For example, `anthropic.claude-3-haiku-20240307-v1:0`.
-```bash
-defang config set MODEL
-```
-
 ## Deployment
 
 > [!NOTE]
diff --git a/samples/managed-llm-provider/compose.yaml b/samples/managed-llm-provider/compose.yaml
index 22aa83dc..f939abe8 100644
--- a/samples/managed-llm-provider/compose.yaml
+++ b/samples/managed-llm-provider/compose.yaml
@@ -8,7 +8,8 @@ services:
     restart: always
     environment:
       - ENDPOINT_URL=http://llm/api/v1/chat/completions # endpoint to the Provider Service
-      - MODEL=anthropic.claude-3-haiku-20240307-v1:0 # LLM model ID used in the Provider Service
+      - MODEL=default # LLM model ID used in the Provider Service
+      # For other models, see https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping
     healthcheck:
       test: ["CMD", "python3", "-c", "import sys, urllib.request; urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/"]
       interval: 30s
diff --git a/samples/managed-llm/README.md b/samples/managed-llm/README.md
index 6cd81c54..bee55401 100644
--- a/samples/managed-llm/README.md
+++ b/samples/managed-llm/README.md
@@ -17,6 +17,8 @@ You can configure the `MODEL` and `ENDPOINT_URL` for the LLM separately for loca
 Ensure you have enabled model access for the model you intend to use. To do this, you can check your [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) or [GCP Vertex AI model access](https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access).
 
+To learn about available LLM models in Defang, please see our [Model Mapping documentation](https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping).
+
 For more about Managed LLMs in Defang, please see our [Managed LLMs documentation](https://docs.defang.io/docs/concepts/managed-llms/managed-language-models).
 
 ### Defang OpenAI Access Gateway
@@ -39,18 +41,6 @@ To run the application locally, you can use the following command:
 docker compose -f compose.dev.yaml up --build
 ```
 
-## Configuration
-
-For this sample, you will need to provide the following [configuration](https://docs.defang.io/docs/concepts/configuration):
-
-> Note that if you are using the 1-click deploy option, you can set these values as secrets in your GitHub repository and the action will automatically deploy them for you.
-
-### `MODEL`
-The Model ID of the LLM you are using for your application. For example, `anthropic.claude-3-haiku-20240307-v1:0`.
-```bash
-defang config set MODEL
-```
-
 ## Deployment
 
 > [!NOTE]
diff --git a/samples/managed-llm/compose.yaml b/samples/managed-llm/compose.yaml
index 31ce0754..fcb3c156 100644
--- a/samples/managed-llm/compose.yaml
+++ b/samples/managed-llm/compose.yaml
@@ -8,7 +8,8 @@ services:
     restart: always
     environment:
       - ENDPOINT_URL=http://llm/api/v1/chat/completions # endpoint to the gateway service
-      - MODEL=anthropic.claude-3-haiku-20240307-v1:0 # LLM model ID used for the gateway
+      - MODEL=default # LLM model ID used for the gateway.
+      # For other models, see https://docs.defang.io/docs/concepts/managed-llms/openai-access-gateway#model-mapping
       - OPENAI_API_KEY=FAKE_TOKEN # the actual value will be ignored when using the gateway, but it should match the one in the llm service
     healthcheck:
       test: ["CMD", "python3", "-c", "import sys, urllib.request; urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/"]
@@ -29,8 +30,6 @@ services:
         mode: host
     environment:
       - OPENAI_API_KEY=FAKE_TOKEN # this value must match the one in the app service
-      - USE_MODEL_MAPPING=false
-      - DEBUG=true
      # if using GCP for BYOC deployment, add these environment variables:
      # - GCP_PROJECT_ID=${GCP_PROJECT_ID}
      # - GCP_REGION=${GCP_REGION}