`docs/concepts/managed-llms/openai-access-gateway.md` (4 additions, 0 deletions)
Under the hood, when you use the `model` provider, Defang will deploy the **OpenAI Access Gateway** in a private network. This allows you to use the same code for both local development and cloud deployment.
The `x-defang-llm` extension is used to configure the appropriate roles and permissions for your service. See the [Managed Language Models](/docs/concepts/managed-llms/managed-language-models/) page for more details.
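As a sketch, a service that uses the `model` provider together with the `x-defang-llm` extension might be declared like this in a compose file (the service name and layout are illustrative, not taken from this page):

```yaml
services:
  app:
    build:
      context: .
    x-defang-llm: true   # tells Defang to configure platform AI roles/permissions
    environment:
      - OPENAI_API_KEY   # placeholder; the gateway handles cloud authentication
```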
## Model Mapping
Defang supports model mapping through the openai-access-gateway on AWS and GCP. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching model name on the target platform. If no such match can be found, it can fall back to a known existing model (e.g. `ai/mistral`). This behavior is controlled by the environment variables `USE_MODEL_MAPPING` (defaults to `true`) and `FALLBACK_MODEL` (no default), respectively.
      - GCP_REGION # if using GCP Vertex AI; AWS_REGION is not necessary for Bedrock
      - REGION
```
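The mapping behavior described above can be tuned through its two environment variables; a minimal sketch (the fallback value shown is illustrative):

```yaml
    environment:
      - USE_MODEL_MAPPING=true    # map Docker-style model names (this is the default)
      - FALLBACK_MODEL=ai/mistral # used only when no close match exists on the platform
```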
### Notes:
- The container image is based on [aws-samples/bedrock-access-gateway](https://github.com/aws-samples/bedrock-access-gateway), with enhancements.
- `x-defang-llm: true` signals to **Defang** that this service should be configured to use the target platform's AI services.
- New environment variables:
  - `REGION` is the region where the service runs (for AWS, this is the equivalent of `AWS_REGION`).
  - `GCP_PROJECT_ID` is needed if using **Vertex AI** (e.g. `GCP_PROJECT_ID=my-project-456789` and `REGION=us-central1`).
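Putting these notes together, a gateway service definition might look like the following sketch (the image name and the example values are assumptions for illustration, not confirmed by this page):

```yaml
services:
  llm:
    image: defangio/openai-access-gateway  # assumed image name
    x-defang-llm: true
    environment:
      - OPENAI_API_KEY                     # placeholder; platform credentials are used
      - REGION=us-central1                 # region where the service runs
      - GCP_PROJECT_ID=my-project-456789   # only needed for Vertex AI
```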
:::tip
**OpenAI Key**
You should configure your application to specify the model you want to use.
Choose the correct `MODEL` depending on which cloud provider you are using.
Alternatively, Defang supports model mapping through the openai-access-gateway. This takes a model with a Docker naming convention (e.g. `ai/llama3.3`) and maps it to the closest matching one on the target platform. If no such match can be found, it can fall back to a known existing model (e.g. `ai/mistral`). This behavior is controlled by the environment variables `USE_MODEL_MAPPING` (defaults to `true`) and `FALLBACK_MODEL` (no default), respectively.
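Because the gateway exposes an OpenAI-compatible API, the model choice can live entirely in configuration while the application code stays the same everywhere. A minimal sketch (the variable names `MODEL` and `OPENAI_BASE_URL`, the default URL, and the default model name are illustrative assumptions):

```python
import os

# Read the provider-specific model name and gateway URL from the environment,
# so the application code itself is identical for local and cloud deployments.
MODEL = os.environ.get("MODEL", "ai/llama3.3")          # assumed variable name
BASE_URL = os.environ.get("OPENAI_BASE_URL", "http://llm/api/v1")  # illustrative default

def chat_completions_url(base_url: str) -> str:
    """Build the OpenAI-style chat-completions endpoint from a base URL."""
    return base_url.rstrip("/") + "/chat/completions"

print(MODEL, chat_completions_url(BASE_URL))
```

Swapping cloud providers then only requires changing `MODEL` (and letting model mapping handle the translation), not the code.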