Skip to content

Commit 1386fd2

Browse files
authored
feat(inference): update custom models
1 parent 7dcfe6e commit 1386fd2

File tree

1 file changed

+29
-0
lines changed

1 file changed

+29
-0
lines changed

pages/managed-inference/reference-content/supported-models.mdx

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,8 +58,37 @@ The model files need to include:
5858
- `tokenizer_config.json` file as `chat_template` field
5959
- `chat_template.json` file as `chat_template` field
6060

61+
The model type need to either be:
62+
- `chat`
63+
- `vision`
64+
- `multimodal` (`chat` and `vision` currently)
65+
- `embedding`
66+
6167
For security reasons, models containing arbitrary code execution such as [`pickle`](https://docs.python.org/3/library/pickle.html) format are not supported.
6268

69+
### Supported API
70+
71+
Depending on the model type, specific endpoints and features will be supported.
72+
73+
#### Chat models
74+
75+
Chat API will be expposed for this model under `/v1/chat/completions` endpoint.
76+
**Structured outputs** or **Function calling** are not yet supported for custom models.
77+
78+
#### Vision models
79+
80+
Chat API will be expposed for this model under `/v1/chat/completions` endpoint.
81+
**Structured outputs** or **Function calling** are not yet supported for custom models.
82+
83+
#### Multimodal models (vision and chat)
84+
85+
These models will be treated similarly to both Chat and Vision models.
86+
87+
#### Embedding models
88+
89+
Embeddings API will be exposed for this model under `/v1/embeddings` endpoint.
90+
91+
6392
### Custom model lifecycle
6493

6594
Currently, custom model deployments are considered to be valid for a long term, and we will ensure any updatse or changes to Managed Inference will not impact existing deployments.

0 commit comments

Comments
 (0)