**pages/public_cloud/ai_machine_learning/endpoints_guide_07_virtual_models/guide.en-gb.md** (11 additions, 9 deletions)

---
title: AI Endpoints - Using Virtual Models
excerpt: Learn how to use OVHcloud AI Endpoints Virtual Models
updated: 2025-08-14
---

> [!primary]

Choosing the right Large Language Model (LLM) is not always straightforward. Models vary in strengths, performance, cost, and licensing, and new ones appear regularly, often outperforming previous options. This rapid evolution makes it essential to match your choice to your specific needs, while staying ready to adapt as better models emerge.

To make this easier, we developed a system of virtual models: instead of requesting a hard-coded model, you specify the expected specifications of the model you need (size, price, etc.), and we resolve it to the best matching model currently in our catalog. In this guide, we'll cover the different capabilities of this feature and how to use it with your OpenAI-compatible code.

## Requirements
The examples provided in this guide can be used with one of the following environments:

>**Python**
>>
>>```bash
>> pip install openai
>>```
>>
>**Curl**
>>
>> A standard terminal, with [curl](https://curl.se/) installed on the system.
>>

Follow the instructions in the [AI Endpoints - Getting Started](/pages/public_cl

## Model DSL
When you request an LLM generation through our unified endpoint, you can provide a model DSL query in the OpenAI-compliant `model` field instead of a hardcoded model name.

These queries are divided into three parts: tag, ranker, and condition:

- **Tag**: A tag can be a model series (llama, mistral, codestral, etc.), a publisher (meta-llama, mistralai, etc.) or a use case tag (code_chat, code_completion, summarization, etc.)
- **Ranker**: The ranker defines a model's capability compared to other models. We support multiple rankers such as fastest, cheapest, biggest, latest or smallest.
- **Condition**: The condition allows you to filter models based on strict requirements on some of the model specifications, such as context_size, max_tokens and input_cost. These conditions support basic operators (<, >, =).

Below are some example queries and the models they currently resolve to. Please note that the resolved model can change, as we continuously update our catalog with new model releases.

| Model Query | Current Target Model | Usage |
|-----------|-----------|-----------|
| code_chat@latest | Qwen3-32B | The most recently released model optimized for code chat tasks |
| meta-llama@latest | Llama-3.1-8B-Instruct | The latest Meta-released LLaMA model |
| mistral@latest?context_size > 100000 | Mistral-Small-3.2-24B-Instruct-2506 | The latest Mistral model with a context window greater than 100k tokens |
| llama@biggest?input_cost<0.5 | Llama-3.1-8B-Instruct | The largest LLaMA model whose input token cost is under €0.50 per 1M tokens |

You can visit our [catalog](https://endpoints.ai.cloud.ovh.net/catalog) to learn more about the different model specifications.
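
Putting the three parts together, a query follows the pattern `tag@ranker?condition`, with the condition being optional. As a small illustration (the strings below are taken from the table above):

```python
# Model DSL query strings follow the pattern tag@ranker?condition.
# These examples mirror the table above; the ?condition part is optional.
queries = [
    "code_chat@latest",                      # use case tag + ranker
    "meta-llama@latest",                     # publisher tag + ranker
    "mistral@latest?context_size > 100000",  # ranker plus a context-size condition
    "llama@biggest?input_cost<0.5",          # ranker plus a cost condition
]
```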

### Example Usage

The following code sample provides a simple example of how to query our API with a model query.
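
The snippet below is a minimal sketch using the OpenAI Python SDK; the base URL and the `OVH_AI_ENDPOINTS_ACCESS_TOKEN` environment variable are assumptions to adapt to your own AI Endpoints setup.

```python
# A minimal sketch, not the official sample: the endpoint URL and the
# token environment variable are assumptions; adapt them to your setup.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://oai.endpoints.kepler.ai.cloud.ovh.net/v1",  # assumed unified endpoint URL
    api_key=os.environ["OVH_AI_ENDPOINTS_ACCESS_TOKEN"],          # assumed env variable name
)

# The model DSL query goes in the standard `model` field; AI Endpoints
# resolves it to the best matching model currently in the catalog.
response = client.chat.completions.create(
    model="mistral@latest?context_size > 100000",
    messages=[{"role": "user", "content": "Explain virtual models in one paragraph."}],
)
print(response.choices[0].message.content)
```

Swapping the query string for any other query from the table above changes which model handles the request, with no other code changes.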
0 commit comments