
Commit ca203c8

feat(ai): Support for Qwen2.5-coder-32b-instruct (#4098)
1 parent 086ab7c commit ca203c8

File tree: 5 files changed, +152 -1 lines changed

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
---
meta:
  title: How to query code models
  description: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
content:
  h1: How to query code models
  paragraph: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
tags: generative-apis ai-data language-models code-models chat-completions-api
dates:
  validation: 2024-12-09
  posted: 2024-12-09
---

Scaleway's Generative APIs service allows users to interact with powerful code models hosted on the platform.

Code models are language models specialized in **understanding code**, **generating code**, and **fixing code**.

As such, they are available through the same interfaces as language models:
- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-language-models/#accessing-the-playground) for testing models, adjusting parameters, and observing how these changes affect the output in real time.
- Via the [Chat API](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api)

For more information on how to query language models, read [our dedicated documentation](/ai-data/generative-apis/how-to/query-language-models/).

Code models are also ideal AI assistants when added to IDEs (integrated development environments).

<Macro id="requirements" />

- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
- An IDE such as VS Code or JetBrains

## Install Continue in your IDE

[Continue](https://www.continue.dev/) is an [open-source code assistant](https://github.com/continuedev/continue) that connects AI models to your IDE.

To get Continue, simply hit `Install`:
- on the [Continue extension page in the Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue)
- or on the [Continue extension page in the JetBrains Marketplace](https://plugins.jetbrains.com/plugin/22707-continue)

## Configure Scaleway as an API provider in Continue

Continue's `config.json` file defines the models and providers used for chat, autocompletion, and other features.
Here is an example configuration with Scaleway's OpenAI-compatible provider:

```json
"models": [
  {
    "model": "qwen2.5-coder-32b-instruct",
    "title": "Qwen2.5-coder",
    "apiBase": "https://api.scaleway.ai/v1/",
    "provider": "openai",
    "apiKey": "###SCW SECRET KEY###",
    "useLegacyCompletionsEndpoint": false
  }
]
```

<Message type="tip">
  The `config.json` file is typically stored as `$HOME/.continue/config.json` on Linux/macOS systems, and `%USERPROFILE%\.continue\config.json` on Windows.
</Message>

Read more about how to set up your `config.json` in the [official Continue documentation](https://docs.continue.dev/reference).
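The same OpenAI-compatible endpoint that Continue talks to can also be queried directly from code. Below is a minimal sketch in Python using only the standard library; the `build_chat_request` helper and the `SCW_SECRET_KEY` environment variable are illustrative names introduced here, not part of Scaleway's or Continue's API.

```python
import json
import os
import urllib.request

# Scaleway's OpenAI-compatible base URL, as used in the Continue config above.
API_BASE = "https://api.scaleway.ai/v1"


def build_chat_request(prompt: str, model: str = "qwen2.5-coder-32b-instruct") -> dict:
    """Build a Chat Completions payload for a code-generation prompt."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful code assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 512,
        # A low temperature keeps generated code more deterministic.
        "temperature": 0.3,
    }


def query_code_model(prompt: str, api_key: str) -> str:
    """Send the request and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__" and "SCW_SECRET_KEY" in os.environ:
    # Only runs when your IAM secret key is exported in the environment.
    print(query_code_model("Write a Python function that reverses a string.", os.environ["SCW_SECRET_KEY"]))
```

Any OpenAI-compatible client library would work the same way, since only the base URL and API key differ from a standard OpenAI setup.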

ai-data/generative-apis/reference-content/rate-limits.mdx

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,7 @@ content:
 paragraph: Find our service limits in tokens per minute and queries per minute
 tags: generative-apis ai-data rate-limits
 dates:
-  validation: 2024-10-30
+  validation: 2024-12-09
   posted: 2024-08-27
 ---

@@ -25,6 +25,7 @@ Any model served through Scaleway Generative APIs gets limited by:
 | `llama-3.1-70b-instruct` | 300 | 100K |
 | `mistral-nemo-instruct-2407`| 300 | 100K |
 | `pixtral-12b-2409`| 300 | 100K |
+| `qwen2.5-32b-instruct`| 300 | 100K |

 ### Embedding models

ai-data/generative-apis/reference-content/supported-models.mdx

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ Our [Chat API](/ai-data/generative-apis/how-to/query-language-models) has built-
 | Meta | `llama-3.1-70b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) |
 | Mistral | `mistral-nemo-instruct-2407` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
 | Mistral | `pixtral-12b-2409` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Pixtral-12B-2409) |
+| Qwen | `qwen-2.5-coder-32b-instruct` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |

 <Message type="tip">
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
---
meta:
  title: Understanding the Qwen2.5-Coder-32B-Instruct model
  description: Deploy your own secure Qwen2.5-Coder-32B-Instruct model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the Qwen2.5-Coder-32B-Instruct model
  paragraph: This page provides information on the Qwen2.5-Coder-32B-Instruct model
tags:
dates:
  validation: 2024-12-08
  posted: 2024-12-08
categories:
  - ai-data
---

## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [Qwen](https://qwenlm.github.io/) |
| License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
| Compatible Instances | H100, H100-2 (INT8) |
| Context Length | Up to 128k tokens |

## Model names

```bash
qwen/qwen2.5-coder-32b-instruct:int8
```

## Compatible Instances

| Instance type | Max context length |
| ------------- |--------------------|
| H100 | 128k (INT8) |
| H100-2 | 128k (INT8) |

## Model introduction

Qwen2.5-coder is an intelligent programming assistant familiar with more than 40 programming languages.
With Qwen2.5-coder deployed at Scaleway, your company can benefit from code generation, AI-assisted code repair, and code reasoning.

## Why is it useful?

- Qwen2.5-coder achieved the best performance on multiple popular code generation benchmarks (EvalPlus, LiveCodeBench, BigCodeBench), outranking many open-source models and providing performance competitive with GPT-4o.
- This model is versatile. While demonstrating strong and comprehensive coding abilities, it also possesses good general and mathematical skills.

## How to use it

### Sending Managed Inference requests

To perform inference tasks with your Qwen2.5-coder model deployed at Scaleway, use the following command:

```bash
curl -s \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  --request POST \
  --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
  --data '{"model":"qwen/qwen2.5-coder-32b-instruct:int8", "messages":[{"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful code assistant."},{"role": "user","content": "Write a quick sort algorithm."}], "max_tokens": 1000, "temperature": 0.8, "stream": false}'
```

<Message type="tip">
  The model name allows Scaleway to put your prompts in the expected format.
</Message>

<Message type="note">
  Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
</Message>

### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the LLM based on the input provided in the request.

<Message type="note">
  Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
</Message>
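The response handling described above can be sketched in Python. The payload below is an illustrative example of the OpenAI-compatible Chat Completions shape returned by the server; the `content` and `usage` values are made up for the example, not real model output.

```python
import json

# Illustrative response body (hypothetical values) following the
# OpenAI-compatible Chat Completions shape.
sample_response = json.loads("""
{
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "def quick_sort(arr): ..."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 42, "completion_tokens": 120, "total_tokens": 162}
}
""")


def extract_reply(response: dict) -> str:
    """Return the generated text from a chat completion response."""
    return response["choices"][0]["message"]["content"]


print(extract_reply(sample_response))  # the generated code lives under choices[0].message.content
```

The `usage` object is also worth reading in production, since token counts are what rate limits and billing are based on.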

menu/navigation.json

Lines changed: 8 additions & 0 deletions
@@ -720,6 +720,10 @@
         {
           "label": "Moshiko-0.1-8b model",
           "slug": "moshiko-0.1-8b"
+        },
+        {
+          "label": "Qwen2.5-coder-32b-instruct model",
+          "slug": "qwen2.5-coder-32b-instruct"
         }
       ],
       "label": "Additional Content",
@@ -757,6 +761,10 @@
           "label": "Query embedding models",
           "slug": "query-embedding-models"
         },
+        {
+          "label": "Query code models",
+          "slug": "query-code-models"
+        },
         {
           "label": "Use structured outputs",
           "slug": "use-structured-outputs"

0 commit comments

Comments
 (0)