59 changes: 59 additions & 0 deletions ai-data/generative-apis/how-to/query-code-models.mdx
@@ -0,0 +1,59 @@
---
meta:
title: How to query code models
description: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
content:
h1: How to query code models
paragraph: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
tags: generative-apis ai-data language-models code-models chat-completions-api
dates:
validation: 2024-12-08
posted: 2024-12-08
---

Scaleway's Generative APIs service allows users to interact with powerful code models hosted on the platform.

Code models are inherently language models specialized in *understanding code*, *generating code* and *fixing code*.

As such, they are available through the same interfaces as language models:
- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-language-models/#accessing-the-playground), allowing you to test models, adapt parameters, and observe how these changes affect the output in real time.
- The [Chat API](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api) lets you query models programmatically.

For more information on how to query language models, refer to [our dedicated documentation](/ai-data/generative-apis/how-to/query-language-models/).
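As a quick sketch, this is the OpenAI-compatible request body the Chat API expects for a code-focused prompt. The helper function name and the default `max_tokens`/`temperature` values are illustrative, not part of the API:

```python
import json

# A minimal sketch of an OpenAI-compatible chat-completions request body.
# The helper name and default values are illustrative, not part of the API.
def code_chat_payload(prompt: str, model: str = "qwen2.5-coder-32b-instruct",
                      max_tokens: int = 512, temperature: float = 0.3) -> str:
    """Return a JSON chat-completions body asking a code model for help."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful code assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return json.dumps(body)

payload = code_chat_payload("Write a function that reverses a string.")
print(payload)
```

Sending this body to the Chat API endpoint with your IAM API key in the `Authorization` header is all the integration a code model needs.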

Code models are also ideal AI assistants when added to IDEs (integrated development environments).

<Macro id="requirements" />

- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
- An IDE such as VS Code or JetBrains

## Install Continue in your IDE

[Continue](https://www.continue.dev/) is an [open-source code assistant](https://github.com/continuedev/continue) to connect AI models to your IDE.

To get Continue, simply hit `Install`:
- on the [Continue extension page in Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue)
- or on the [Continue extension page in JetBrains Marketplace](https://plugins.jetbrains.com/plugin/22707-continue)

## Configure Scaleway as an API provider in Continue

Continue's `config.json` file defines the models and providers used for chat, autocompletion, and other features.
Here is an example configuration:

```json
{
  "models": [
    {
      "model": "qwen2.5-coder-32b-instruct",
      "title": "Qwen2.5-coder",
      "apiBase": "https://api.scaleway.ai/v1/",
      "provider": "openai",
      "apiKey": "###SCW SECRET KEY###",
      "useLegacyCompletionsEndpoint": false
    }
  ]
}
```

Read more about how to set up your `config.json` on the [official Continue documentation](https://docs.continue.dev/reference).
> **Review comment (Collaborator):** The file is typically stored as `$HOME/.continue/config.json` on Linux/macOS systems and `%USERPROFILE%\.continue\config.json` on Windows systems. It may be nice to add this as a note, because it is not very well documented.

@@ -25,6 +25,7 @@ Our [Chat API](/ai-data/generative-apis/how-to/query-language-models) has built-
| Meta | `llama-3.1-70b-instruct` | 128k | [Llama 3.1 Community License Agreement](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) |
| Mistral | `mistral-nemo-instruct-2407` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
| Mistral | `pixtral-12b-2409` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Pixtral-12B-2409) |
| Qwen | `qwen-2.5-coder-32b-instruct` | 128k | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |


<Message type="tip">
@@ -0,0 +1,80 @@
---
meta:
title: Understanding the Qwen2.5-Coder-32B-Instruct model
description: Deploy your own secure Qwen2.5-Coder-32B-Instruct model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
h1: Understanding the Qwen2.5-Coder-32B-Instruct model
paragraph: This page provides information on the Qwen2.5-Coder-32B-Instruct model
tags:
dates:
validation: 2024-12-08
posted: 2024-12-08
categories:
- ai-data
---

## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [Qwen](https://qwenlm.github.io/) |
| License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
| Compatible Instances | H100, H100-2 (INT8) |
| Context Length | up to 128k tokens |

## Model names

```bash
qwen/qwen2.5-coder-32b-instruct:int8
```

## Compatible Instances

| Instance type | Max context length |
|---------------|--------------------|
| H100          | 128k (INT8)        |
| H100-2        | 128k (INT8)        |

## Model introduction

Qwen2.5-coder is your intelligent programming assistant, familiar with more than 40 programming languages.
With Qwen2.5-coder deployed at Scaleway, your company can benefit from code generation, AI-assisted code repair, and code reasoning.

## Why is it useful?

- Qwen2.5-coder achieved the best performance on multiple popular code generation benchmarks (EvalPlus, LiveCodeBench, BigCodeBench), outranking many open-source models and delivering performance competitive with GPT-4o.
- This model is versatile. While demonstrating strong and comprehensive coding abilities, it also possesses good general and mathematical skills.

## How to use it

### Sending Managed Inference requests

To perform inference tasks with your Qwen2.5-coder deployed at Scaleway, use the following command:

```bash
curl -s \
-H "Authorization: Bearer <IAM API key>" \
-H "Content-Type: application/json" \
--request POST \
--url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
--data '{"model":"qwen/qwen2.5-coder-32b-instruct:int8", "messages":[{"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful code assistant."},{"role": "user","content": "Write a quick sort algorithm."}], "max_tokens": 1000, "temperature": 1, "stream": false}'
```

> **Review comment (Collaborator):** temperature may be a bit high there no?

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
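For reference, the same request can be prepared with Python's standard library instead of `curl`. This is a sketch only: `<IAM API key>` and `<Deployment UUID>` remain placeholders exactly as in the curl command, so the request is built but intentionally not sent:

```python
import json
import urllib.request

# Build the same chat-completions request as the curl example, using only
# the standard library. The URL and key below are placeholders to replace.
body = {
    "model": "qwen/qwen2.5-coder-32b-instruct:int8",
    "messages": [
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful code assistant."},
        {"role": "user", "content": "Write a quick sort algorithm."},
    ],
    "max_tokens": 1000,
    "temperature": 1,
    "stream": False,
}
request = urllib.request.Request(
    "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Authorization": "Bearer <IAM API key>",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(request) would send it once real values are filled in.
print(request.get_method(), json.loads(request.data)["model"])
```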

<Message type="tip">
The model name allows Scaleway to put your prompts in the expected format.
</Message>

> **Review comment (Collaborator):** Not sure there is added value here, what did you want to express?

<Message type="note">
Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
</Message>

### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the LLM based on the input provided in the request.

<Message type="note">
Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
</Message>
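Processing the response described above can be sketched as follows. The `raw` string mimics the OpenAI-compatible JSON shape such endpoints return; the field values are made up for illustration:

```python
import json

# Sketch of handling a chat-completions response body. `raw` mimics the
# OpenAI-compatible JSON shape; the content and token counts are made up.
raw = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "def quick_sort(arr): ..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 42, "completion_tokens": 120, "total_tokens": 162}
}
"""

data = json.loads(raw)
answer = data["choices"][0]["message"]["content"]
print(answer)
print("finish_reason:", data["choices"][0]["finish_reason"])
print("total tokens:", data["usage"]["total_tokens"])
```

Checking `finish_reason` (for example, `"length"` means the reply was cut off by `max_tokens`) and tracking `usage` are good habits when integrating model output into an application.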
8 changes: 8 additions & 0 deletions menu/navigation.json
@@ -720,6 +720,10 @@
{
"label": "Moshiko-0.1-8b model",
"slug": "moshiko-0.1-8b"
},
{
"label": "Qwen2.5-coder-32b-instruct model",
"slug": "qwen2.5-coder-32b-instruct"
}
],
"label": "Additional Content",
@@ -757,6 +761,10 @@
"label": "Query embedding models",
"slug": "query-embedding-models"
},
{
"label": "Query code models",
"slug": "query-code-models"
},
{
"label": "Use structured outputs",
"slug": "use-structured-outputs"