
Commit 88732fb

tgenaitay authored
feat(ai): models updates (#4112)
* fix(ai): remove list to avoid duplicate
* fix(ai): changed to scaleway as provider
* feat(ai): introducing Llama 3.3
* fix(ai): changed to alphabetical order
* fix(ai): changed posted dates
* Apply suggestions from code review

Co-authored-by: Rowena Jones <[email protected]>
Co-authored-by: Jessica <[email protected]>
1 parent 3ec7c74 commit 88732fb

File tree

5 files changed: +101 −21 lines changed


ai-data/generative-apis/how-to/query-code-models.mdx

Lines changed: 1 addition & 2 deletions
```diff
@@ -48,8 +48,7 @@ Here is an example configuration with Scaleway's OpenAI-compatible provider:
   {
     "model": "qwen2.5-coder-32b-instruct",
     "title": "Qwen2.5-coder",
-    "apiBase": "https://api.scaleway.ai/v1/",
-    "provider": "openai",
+    "provider": "scaleway",
     "apiKey": "###SCW SECRET KEY###"
   }
 ]
```

ai-data/generative-apis/how-to/use-function-calling.mdx

Lines changed: 1 addition & 4 deletions
```diff
@@ -25,10 +25,7 @@ Function calling allows a large language model (LLM) to interact with external t
 
 ## Supported models
 
-* llama-3.1-8b-instruct
-* llama-3.1-70b-instruct
-* mistral-nemo-instruct-2407
-* pixtral-12b-2409
+All the [chat models](/ai-data/generative-apis/reference-content/supported-models/#chat-models) hosted by Scaleway support function calling.
 
 ## Understanding function calling
 
```

ai-data/managed-inference/reference-content/function-calling-support.mdx

Lines changed: 2 additions & 1 deletion
```diff
@@ -7,7 +7,7 @@ content:
   paragraph: Function calling allows models to connect to external tools.
 tags:
 dates:
-  validation: 2024-11-18
+  validation: 2024-12-12
   posted: 2024-10-25
 categories:
   - ai-data
@@ -27,6 +27,7 @@ The following models in Scaleway's Managed Inference library can call tools as p
 
 * meta/llama-3.1-8b-instruct
 * meta/llama-3.1-70b-instruct
+* meta/llama-3.3-70b-instruct
 * mistral/mistral-7b-instruct-v0.3
 * mistral/mistral-nemo-instruct-2407
 * mistral/pixtral-12b-2409
```
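The models in the list above accept tool definitions in the OpenAI-compatible chat completions format. As a rough illustration only (the `get_weather` tool, its description, and its parameter schema are hypothetical, not taken from this commit), a request body with a `tools` array might be assembled like this:

```python
import json

# Hypothetical tool definition for illustration; the name and schema
# are assumptions, not part of the Scaleway documentation.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "Name of the city"}
            },
            "required": ["city"],
        },
    },
}

# Chat completions request body in the OpenAI-compatible format,
# targeting the newly listed meta/llama-3.3-70b-instruct model.
payload = {
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",
}

print(json.dumps(payload, indent=2))
```

This only sketches the request shape; the model's reply would contain a `tool_calls` entry for the application to execute and feed back as a follow-up message.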
ai-data/managed-inference/reference-content/llama-3.3-70b-instruct.mdx

Lines changed: 79 additions & 0 deletions
````diff
@@ -0,0 +1,79 @@
+---
+meta:
+  title: Understanding the Llama-3.3-70b-instruct model
+  description: Deploy your own secure Llama-3.3-70b-instruct model with Scaleway Managed Inference. Privacy-focused, fully managed.
+content:
+  h1: Understanding the Llama-3.3-70b-instruct model
+  paragraph: This page provides information on the Llama-3.3-70b-instruct model
+tags:
+dates:
+  validation: 2024-12-12
+  posted: 2024-12-12
+categories:
+  - ai-data
+---
+
+## Model overview
+
+| Attribute            | Details                                                        |
+|----------------------|----------------------------------------------------------------|
+| Provider             | [Meta](https://www.llama.com/)                                 |
+| License              | [Llama 3.3 community](https://www.llama.com/llama3_3/license/) |
+| Compatible Instances | H100-2 (BF16)                                                  |
+| Context length       | Up to 70k tokens                                               |
+
+## Model names
+
+```bash
+meta/llama-3.3-70b-instruct:bf16
+```
+
+## Compatible Instances
+
+| Instance type | Max context length |
+|---------------|--------------------|
+| H100-2        | 62k (BF16)         |
+
+## Model introduction
+
+Released December 6, 2024, Meta’s Llama 3.3 70b is a fine-tune of the [Llama 3.1 70b](/ai-data/managed-inference/reference-content/llama-3.1-70b) model.
+This model is still text-only (text in/text out). However, Llama 3.3 was designed to approach the performance of Llama 3.1 405B on some applications.
+
+## Why is it useful?
+
+- Llama 3.3 uses the same prompt format as Llama 3.1. Prompts written for Llama 3.1 work unchanged with Llama 3.3.
+- Llama 3.3 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai.
+
+## How to use it
+
+### Sending Managed Inference requests
+
+To perform inference tasks with your Llama-3.3 deployed at Scaleway, use the following command:
+
+```bash
+curl -s \
+-H "Authorization: Bearer <IAM API key>" \
+-H "Content-Type: application/json" \
+--request POST \
+--url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
+--data '{"model":"meta/llama-3.3-70b-instruct:bf16", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
+```
+
+Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
+
+<Message type="tip">
+  The model name allows Scaleway to put your prompts in the expected format.
+</Message>
+
+<Message type="note">
+  Ensure that the `messages` array is properly formatted with roles (system, user, assistant) and content.
+</Message>
+
+### Receiving Inference responses
+
+Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
+Process the output data according to your application's needs. The response will contain the output generated by the LLM model based on the input provided in the request.
+
+<Message type="note">
+  Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/ai-data/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
+</Message>
````
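The curl command added in this page can also be built in Python. The sketch below only constructs the request; actually sending it requires a live deployment and the third-party `requests` package (shown commented out). The `<IAM API key>` and `<Deployment UUID>` placeholders mirror the documentation and must be replaced with real values:

```python
import json

# Placeholders as in the documentation page; replace with your real
# IAM API key and the Deployment UUID you are targeting.
API_KEY = "<IAM API key>"
DEPLOYMENT_UUID = "<Deployment UUID>"
ENDPOINT = f"https://{DEPLOYMENT_UUID}.ifr.fr-par.scaleway.com/v1/chat/completions"

# Same request body as the curl example in the new page.
payload = {
    "model": "meta/llama-3.3-70b-instruct:bf16",
    "messages": [
        {"role": "user", "content": "There is a llama in my garden, what should I do?"}
    ],
    "max_tokens": 500,
    "temperature": 0.7,
    "stream": False,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)

# To actually send the request against a live deployment:
# import requests
# response = requests.post(ENDPOINT, headers=headers, data=body)
# print(response.json()["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, the official `openai` Python client pointed at the deployment URL would work as well; the raw-HTTP form above matches the curl example most directly.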

menu/navigation.json

Lines changed: 18 additions & 14 deletions
```diff
@@ -669,6 +669,10 @@
           "label": "Support for function calling in Scaleway Managed Inference",
           "slug": "function-calling-support"
         },
+        {
+          "label": "BGE-Multilingual-Gemma2 model",
+          "slug": "bge-multilingual-gemma2"
+        },
         {
           "label": "Llama-3-8b-instruct model",
           "slug": "llama-3-8b-instruct"
@@ -690,13 +694,17 @@
           "slug": "llama-3.1-nemotron-70b-instruct"
         },
         {
-          "label": "Mistral-nemo-instruct-2407 model",
-          "slug": "mistral-nemo-instruct-2407"
+          "label": "Llama-3.3-70b-instruct model",
+          "slug": "llama-3.3-70b-instruct"
         },
         {
          "label": "Mistral-7b-instruct-v0.3 model",
          "slug": "mistral-7b-instruct-v0.3"
         },
+        {
+          "label": "Mistral-nemo-instruct-2407 model",
+          "slug": "mistral-nemo-instruct-2407"
+        },
         {
           "label": "Mixtral-8x7b-instruct-v0.1 model",
           "slug": "mixtral-8x7b-instruct-v0.1"
@@ -705,18 +713,6 @@
           "label": "Molmo-72b-0924 model",
           "slug": "molmo-72b-0924"
         },
-        {
-          "label": "Sentence-t5-xxl model",
-          "slug": "sentence-t5-xxl"
-        },
-        {
-          "label": "BGE-Multilingual-Gemma2 model",
-          "slug": "bge-multilingual-gemma2"
-        },
-        {
-          "label": "Pixtral-12b-2409 model",
-          "slug": "pixtral-12b-2409"
-        },
         {
           "label": "Moshika-0.1-8b model",
           "slug": "moshika-0.1-8b"
@@ -725,9 +721,17 @@
           "label": "Moshiko-0.1-8b model",
           "slug": "moshiko-0.1-8b"
         },
+        {
+          "label": "Pixtral-12b-2409 model",
+          "slug": "pixtral-12b-2409"
+        },
         {
           "label": "Qwen2.5-coder-32b-instruct model",
           "slug": "qwen2.5-coder-32b-instruct"
+        },
+        {
+          "label": "Sentence-t5-xxl model",
+          "slug": "sentence-t5-xxl"
         }
       ],
       "label": "Additional Content",
```
