feat(ai): introducing Qwen-2.5-coder #4092
@@ -0,0 +1,59 @@
---
meta:
  title: How to query code models
  description: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
content:
  h1: How to query code models
  paragraph: Learn how to interact with powerful language models specialized in code using Scaleway's Generative APIs service.
tags: generative-apis ai-data language-models code-models chat-completions-api
dates:
  validation: 2024-12-08
  posted: 2024-12-08
---

Scaleway's Generative APIs service allows users to interact with powerful code models hosted on the platform.

Code models are language models specialized in *understanding code*, *generating code*, and *fixing code*.

As such, they are available through the same interfaces as language models:
- The Scaleway [console](https://console.scaleway.com) provides a complete [playground](/ai-data/generative-apis/how-to/query-language-models/#accessing-the-playground), allowing you to test models, adapt parameters, and observe how these changes affect the output in real time.
- The [Chat API](/ai-data/generative-apis/how-to/query-language-models/#querying-language-models-via-api)

For more information on how to query language models, see [our dedicated documentation](/ai-data/generative-apis/how-to/query-language-models/).
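As a minimal sketch of the Chat API pattern described above, the snippet below builds an OpenAI-compatible chat completions request with the standard library only. The endpoint URL, the `SCW_SECRET_KEY` environment variable, and the `build_chat_request` helper are illustrative assumptions, not part of the official docs:

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint; adjust to your own setup.
CHAT_URL = "https://api.scaleway.ai/v1/chat/completions"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a chat-completions request asking a code model a question."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "qwen2.5-coder-32b-instruct",
    "Explain what this regex does: ^\\d{4}-\\d{2}-\\d{2}$",
    os.environ.get("SCW_SECRET_KEY", "dummy-key"),
)
# urllib.request.urlopen(req) would send it once SCW_SECRET_KEY is set.
print(req.get_full_url())
```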
Code models are also ideal AI assistants when added to IDEs (integrated development environments).

<Macro id="requirements" />

- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) for API authentication
- An IDE such as VS Code or JetBrains

## Install Continue in your IDE

[Continue](https://www.continue.dev/) is an [open-source code assistant](https://github.com/continuedev/continue) that connects AI models to your IDE.

To install Continue, click `Install`:
- on the [Continue extension page in the Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue)
- or on the [Continue extension page in the JetBrains Marketplace](https://plugins.jetbrains.com/plugin/22707-continue)

## Configure Scaleway as an API provider in Continue

Continue's `config.json` file defines the models and providers used for chat, autocompletion, and other features.
Here is an example configuration:

```json
"models": [
    {
      "model": "qwen2.5-coder-32b-instruct",
      "title": "Qwen2.5-coder",
      "apiBase": "https://api.scaleway.ai/v1/",
      "provider": "openai",
      "apiKey": "###SCW SECRET KEY###",
      "useLegacyCompletionsEndpoint": false
    }
]
```

Read more about how to set up your `config.json` in the [official Continue documentation](https://docs.continue.dev/reference).
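Beyond chat, Continue can also use a model for inline autocompletion. As a sketch following Continue's configuration reference (verify the `tabAutocompleteModel` key against the Continue version you run; the title is an arbitrary label), you could add:

```json
"tabAutocompleteModel": {
  "model": "qwen2.5-coder-32b-instruct",
  "title": "Qwen2.5-coder autocomplete",
  "apiBase": "https://api.scaleway.ai/v1/",
  "provider": "openai",
  "apiKey": "###SCW SECRET KEY###"
}
```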
Collaborator comment: The file is typically stored as
@@ -0,0 +1,80 @@
---
meta:
  title: Understanding the Qwen2.5-Coder-32B-Instruct model
  description: Deploy your own secure Qwen2.5-Coder-32B-Instruct model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the Qwen2.5-Coder-32B-Instruct model
  paragraph: This page provides information on the Qwen2.5-Coder-32B-Instruct model
tags:
dates:
  validation: 2024-12-08
  posted: 2024-12-08
categories:
  - ai-data
---

## Model overview

| Attribute            | Details                            |
|----------------------|------------------------------------|
| Provider             | [Qwen](https://qwenlm.github.io/)  |
| License              | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
| Compatible Instances | H100, H100-2 (INT8)                |
| Context Length       | up to 128k tokens                  |
## Model names

```bash
qwen/qwen2.5-coder-32b-instruct:int8
```

## Compatible Instances

| Instance type | Max context length |
|---------------|--------------------|
| H100          | 128k (INT8)        |
| H100-2        | 128k (INT8)        |

## Model introduction

Qwen2.5-coder is an intelligent programming assistant familiar with more than 40 programming languages.
With Qwen2.5-coder deployed at Scaleway, your company can benefit from code generation, AI-assisted code repair, and code reasoning.

## Why is it useful?

- Qwen2.5-coder achieved the best performance on multiple popular code generation benchmarks (EvalPlus, LiveCodeBench, BigCodeBench), outranking many open-source models and delivering performance competitive with GPT-4o.
- This model is versatile. While demonstrating strong and comprehensive coding abilities, it also possesses good general and mathematical skills.
## How to use it

### Sending Managed Inference requests

To perform inference tasks with your Qwen2.5-coder model deployed at Scaleway, use the following command:

```bash
curl -s \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  --request POST \
  --url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
  --data '{"model":"qwen/qwen2.5-coder-32b-instruct:int8", "messages":[{"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful code assistant."},{"role": "user","content": "Write a quick sort algorithm."}], "max_tokens": 1000, "temperature": 1, "stream": false}'
```

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
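If you prefer scripting over `curl`, the same request can be sketched in Python with the standard library only. The placeholder variables and the guard around the network call are illustrative, not part of the official docs:

```python
import json
import urllib.request

API_KEY = "<IAM API key>"              # replace with your IAM API key
DEPLOYMENT_UUID = "<Deployment UUID>"  # replace with your deployment UUID

# Same body as the curl example above.
payload = {
    "model": "qwen/qwen2.5-coder-32b-instruct:int8",
    "messages": [
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful code assistant."},
        {"role": "user", "content": "Write a quick sort algorithm."},
    ],
    "max_tokens": 1000,
    "temperature": 1,
    "stream": False,
}

url = f"https://{DEPLOYMENT_UUID}.ifr.fr-par.scaleway.com/v1/chat/completions"

# Only send once the placeholders above have been filled in.
if "<" not in API_KEY:
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```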
<Message type="tip">
  The model name allows Scaleway to put your prompts in the expected format.
Collaborator comment: Not sure there is added value here, what did you want to express?
</Message>

<Message type="note">
  Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
</Message>
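To illustrate the note above, a well-formed `messages` array uses only the three roles, each entry carrying a `content` field. The example conversation below (with shortened content) is illustrative:

```python
# A well-formed messages array: an optional system prompt first,
# then alternating user/assistant turns.
messages = [
    {"role": "system", "content": "You are a helpful code assistant."},
    {"role": "user", "content": "Write a quick sort algorithm."},
    {"role": "assistant", "content": "def quick_sort(arr): ..."},
    {"role": "user", "content": "Now add type hints."},
]

# Quick sanity check before sending the request.
allowed_roles = {"system", "user", "assistant"}
for message in messages:
    assert message["role"] in allowed_roles and "content" in message
```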
### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the model based on the input provided in the request.

<Message type="note">
  Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
</Message>