
Commit 8d58430

Merge pull request #110 from mistralai/doc/v0.0.63

Update docs to v0.0.63

2 parents 6a0a7a3 + 99237c7

File tree: 7 files changed, +307 −51 lines

docs/capabilities/finetuning.mdx

Lines changed: 1 addition & 1 deletion
@@ -222,7 +222,7 @@ curl https://api.mistral.ai/v1/files \

 ## Create a fine-tuning job
 The next step is to create a fine-tuning job.
-- model: the specific model you would like to fine-tune. The choices are `open-mistral-7b` (v0.3) and `mistral-small-latest` (`mistral-small-2402`).
+- model: the specific model you would like to fine-tune. The choices are `open-mistral-7b` (v0.3), `mistral-small-latest` (`mistral-small-2402`), `codestral-latest` (`codestral-2405`), `open-mistral-nemo`, and `mistral-large-latest` (`mistral-large-2407`).
 - training_files: a collection of training file IDs, which can consist of a single file or multiple files
 - validation_files: a collection of validation file IDs, which can consist of a single file or multiple files
 - hyperparameters: two adjustable hyperparameters, "training_step" and "learning_rate", that users can modify.
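
Putting these fields together, a job creation call might look like the following. This is a hedged sketch: the `/v1/fine_tuning/jobs` path, the `$MISTRAL_API_KEY` variable, and the plural `training_steps` spelling are assumptions not shown in this diff, so check the API reference for the authoritative schema.

```bash
# Hedged sketch: create a fine-tuning job with the fields described above.
# The endpoint path and hyperparameter field names are assumptions; the file
# IDs are placeholders for IDs returned by the files endpoint.
curl https://api.mistral.ai/v1/fine_tuning/jobs \
  -X POST \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "open-mistral-7b",
    "training_files": ["<training_file_id>"],
    "validation_files": ["<validation_file_id>"],
    "hyperparameters": {
      "training_steps": 10,
      "learning_rate": 0.0001
    }
  }'
```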

docs/deployment/cloud/overview.mdx

Lines changed: 1 addition & 0 deletions
@@ -9,5 +9,6 @@ In particular, Mistral's optimized commercial models are available on:
 - [Azure AI](../azure)
 - [AWS Bedrock](../aws)
+- [Google Cloud Vertex AI Model Garden](../vertex)
 - Snowflake Cortex

docs/deployment/cloud/vertex.mdx

Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
---
id: vertex
title: Vertex AI
sidebar_position: 3.23
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


You can deploy the following Mistral AI models from Google Cloud Vertex AI's Model Garden:

- Mistral NeMo
- Codestral (instruct and FIM modes)
- Mistral Large

## Pre-requisites

In order to query the model, you will need:

- Access to a Google Cloud Project with the Vertex AI API enabled
- The relevant IAM permissions to enable the model and query endpoints, through the following roles:
  - The [Vertex AI User IAM role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.user)
  - The Consumer Procurement Entitlement Manager role

On the client side, you will also need:

- The `gcloud` CLI to authenticate against the Google Cloud APIs; refer to [this page](https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp) for more details.
- A Python virtual environment with the `mistralai-google-cloud` client package installed.
- The following environment variables properly set up (a minimal setup sketch follows this list):
  - `GOOGLE_PROJECT_ID`: a Google Cloud Project ID with the Vertex AI API enabled
  - `GOOGLE_REGION`: a Google Cloud region where Mistral models are available (e.g. `europe-west4`)

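As a minimal setup sketch (the project ID is hypothetical, and the `pip` and `gcloud` invocations are generic assumptions rather than commands taken from this page):

```bash
# Hedged sketch of the client-side setup described above.
export GOOGLE_PROJECT_ID="my-project-id"   # hypothetical project ID
export GOOGLE_REGION="europe-west4"        # a region where Mistral models are available

python -m venv .venv && source .venv/bin/activate
pip install mistralai-google-cloud
gcloud auth application-default login      # set up Application Default Credentials
```
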
## Querying the models (instruct mode)

<Tabs>
<TabItem value="python" label="Python">

```python
import httpx
import google.auth
from google.auth.transport.requests import Request
import os


def get_credentials() -> str:
    credentials, project_id = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(Request())
    return credentials.token


def build_endpoint_url(
    region: str,
    project_id: str,
    model_name: str,
    model_version: str,
    streaming: bool = False,
) -> str:
    base_url = f"https://{region}-aiplatform.googleapis.com/v1/"
    project_fragment = f"projects/{project_id}"
    location_fragment = f"locations/{region}"
    specifier = "streamRawPredict" if streaming else "rawPredict"
    model_fragment = f"publishers/mistralai/models/{model_name}@{model_version}"
    url = f"{base_url}{'/'.join([project_fragment, location_fragment, model_fragment])}:{specifier}"
    return url


# Retrieve Google Cloud Project ID and Region from environment variables
project_id = os.environ.get("GOOGLE_PROJECT_ID")
region = os.environ.get("GOOGLE_REGION")

# Retrieve Google Cloud credentials.
access_token = get_credentials()

model = "mistral-nemo"  # Replace with the model you want to use
model_version = "2407"  # Replace with the model version you want to use
is_streamed = False  # Change to True to stream token responses

# Build URL
url = build_endpoint_url(
    project_id=project_id,
    region=region,
    model_name=model,
    model_version=model_version,
    streaming=is_streamed,
)

# Define query headers
headers = {
    "Authorization": f"Bearer {access_token}",
    "Accept": "application/json",
}

# Define POST payload
data = {
    "model": model,
    "messages": [{"role": "user", "content": "Who is the best French painter?"}],
    "stream": is_streamed,
}

# Make the call
with httpx.Client() as client:
    resp = client.post(url, json=data, headers=headers, timeout=None)
    print(resp.text)
```

</TabItem>
<TabItem value="curl" label="cURL">

```bash
MODEL="mistral-nemo"
MODEL_VERSION="2407"

url="https://$GOOGLE_REGION-aiplatform.googleapis.com/v1/projects/$GOOGLE_PROJECT_ID/locations/$GOOGLE_REGION/publishers/mistralai/models/$MODEL@$MODEL_VERSION:rawPredict"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  $url \
  --data '{
    "model": "'"$MODEL"'",
    "temperature": 0,
    "messages": [
      {"role": "user", "content": "What is the best French cheese?"}
    ]
  }'
```

</TabItem>
</Tabs>
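
To stream tokens instead, the URL specifier changes from `rawPredict` to `streamRawPredict` and `stream` is set to `true` in the payload, as the Python helper above encodes. A hedged cURL variant of the same call:

```bash
# Hedged sketch: the streaming variant of the call above. Only the URL
# specifier (:streamRawPredict) and the "stream" flag differ.
MODEL="mistral-nemo"
MODEL_VERSION="2407"

url="https://$GOOGLE_REGION-aiplatform.googleapis.com/v1/projects/$GOOGLE_PROJECT_ID/locations/$GOOGLE_REGION/publishers/mistralai/models/$MODEL@$MODEL_VERSION:streamRawPredict"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  $url \
  --data '{
    "model": "'"$MODEL"'",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Who is the best French painter?"}
    ]
  }'
```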

## Querying Codestral in FIM mode

<Tabs>
<TabItem value="python" label="Python">

```python
import httpx
import google.auth
from google.auth.transport.requests import Request
import os


def get_credentials() -> str:
    credentials, project_id = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(Request())
    return credentials.token


def build_endpoint_url(
    region: str,
    project_id: str,
    model_name: str,
    model_version: str,
    streaming: bool = False,
) -> str:
    base_url = f"https://{region}-aiplatform.googleapis.com/v1/"
    project_fragment = f"projects/{project_id}"
    location_fragment = f"locations/{region}"
    specifier = "streamRawPredict" if streaming else "rawPredict"
    model_fragment = f"publishers/mistralai/models/{model_name}@{model_version}"
    url = f"{base_url}{'/'.join([project_fragment, location_fragment, model_fragment])}:{specifier}"
    return url


# Retrieve Google Cloud Project ID and Region from environment variables
project_id = os.environ.get("GOOGLE_PROJECT_ID")
region = os.environ.get("GOOGLE_REGION")

# Retrieve Google Cloud credentials.
access_token = get_credentials()

model = "codestral"
model_version = "2405"
is_streamed = False  # Change to True to stream token responses

# Build URL
url = build_endpoint_url(
    project_id=project_id,
    region=region,
    model_name=model,
    model_version=model_version,
    streaming=is_streamed,
)

# Define query headers
headers = {
    "Authorization": f"Bearer {access_token}",
    "Accept": "application/json",
}

# Define POST payload: the model completes the code between prompt and suffix
data = {
    "model": model,
    "prompt": "def count_words_in_file(file_path: str) -> int:",
    "suffix": "return n_words",
}

# Make the call
with httpx.Client() as client:
    resp = client.post(url, json=data, headers=headers, timeout=None)
    print(resp.text)
```

</TabItem>
<TabItem value="curl" label="cURL">

```bash
MODEL="codestral"
MODEL_VERSION="2405"

url="https://$GOOGLE_REGION-aiplatform.googleapis.com/v1/projects/$GOOGLE_PROJECT_ID/locations/$GOOGLE_REGION/publishers/mistralai/models/$MODEL@$MODEL_VERSION:rawPredict"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  $url \
  --data '{
    "model": "'"$MODEL"'",
    "prompt": "def count_words_in_file(file_path: str) -> int:",
    "suffix": "return n_words"
  }'
```

</TabItem>
</Tabs>
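
The raw response is JSON; to pull out just the completed code, the body can be piped through `jq`. The `.choices[0].message.content` path is an assumption based on Mistral's standard chat-completion response schema, so inspect the raw output to confirm:

```bash
# Hedged sketch: the same FIM call as above, piped through jq to print only
# the completion text. The JSON path is assumed from the standard schema.
curl -s \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  $url \
  --data '{
    "model": "'"$MODEL"'",
    "prompt": "def count_words_in_file(file_path: str) -> int:",
    "suffix": "return n_words"
  }' | jq -r '.choices[0].message.content'
```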

## Going further

For more information and examples, you can check:

- The Google Cloud [Partner Models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/mistral) documentation page.
- The Vertex Model Cards for [Mistral Large](https://console.cloud.google.com/vertex-ai/publishers/mistralai/model-garden/mistral-large), [Mistral NeMo](https://console.cloud.google.com/vertex-ai/publishers/mistralai/model-garden/mistral-nemo) and [Codestral](https://console.cloud.google.com/vertex-ai/publishers/mistralai/model-garden/codestral).
- The [Getting Started Colab Notebook](https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/mistralai_intro.ipynb) for Mistral models on Vertex, along with the [source file on GitHub](https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/main/notebooks/official/generative_ai/mistralai_intro.ipynb).

docs/getting-started/Open-weight-models.mdx

Lines changed: 9 additions & 17 deletions
@@ -4,22 +4,12 @@ title: Open-weight models
 sidebar_position: 1.4
 ---

-We open-source both pre-trained models and fine-tuned models. These models are not tuned for safety as we want to empower users to test and refine moderation based on their use cases. For safer models, follow our [guardrailing tutorial](/capabilities/guardrailing).
-
-| Model | Available Open-weight | Available via API | Description | Max Tokens | API Endpoints |
-|--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|
-| Mistral 7B | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of the release, it matched the capabilities of models up to 30B parameters. Learn more on our [blog post](https://mistral.ai/news/announcing-mistral-7b/) | 32k | `open-mistral-7b` |
-| Mixtral 8x7B | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A sparse mixture of experts model. As such, it leverages up to 45B parameters but only uses about 12B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-of-experts/) | 32k | `open-mixtral-8x7b` |
-| Mixtral 8x22B | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A bigger sparse mixture of experts model. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-8x22b/) | 64k | `open-mixtral-8x22b` |
-| Codestral | :heavy_check_mark: <br/> MNPL | :heavy_check_mark: | A cutting-edge generative model that has been specifically designed and optimized for code generation tasks, including fill-in-the-middle and code completion | 32k | `codestral-latest` |
-| Codestral Mamba | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `open-codestral-mamba` |
-| Mathstral | :heavy_check_mark: <br/> Apache2 | | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA |
-| Mistral NeMo | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A 12B model built in partnership with Nvidia. It is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes. Learn more on our [blog post](https://mistral.ai/news/mistral-nemo/) | 128k | `open-mistral-nemo` |
+We open-source both pre-trained models and instruction-tuned models. These models are not tuned for safety as we want to empower users to test and refine moderation based on their use cases. For safer models, follow our [guardrailing tutorial](/capabilities/guardrailing).

 ## License
 - Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral Mamba, Mathstral, and Mistral NeMo are under the [Apache 2.0 License](https://choosealicense.com/licenses/apache-2.0/), which permits their use without any constraints.
 - Codestral is under the [Mistral AI Non-Production (MNPL) License](https://mistral.ai/licences/MNPL-0.1.md).
-
+- Mistral Large is under the [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md).

 ## Downloading

@@ -37,10 +27,11 @@ We open-source both pre-trained models and fine-tuned models. These models are n
 | Mixtral-8x22B-Instruct-v0.1/ <br/> Mixtral-8x22B-Instruct-v0.3 | [Hugging Face](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-Instruct-v0.3.tar) (md5sum: `471a02a6902706a2f1e44a693813855b`) | - 32768 vocabulary size |
 | Mixtral-8x22B-v0.3 | [raw_weights](https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-v0.3.tar) (md5sum: `a2fa75117174f87d1197e3a4eb50371a`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
 | Codestral-22B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/Codestral-22B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/codestral-22b-v0-1/codestral-22B-v0.1.tar) (md5sum: `1ea95d474a1d374b1d1b20a8e0159de3`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
-| Codestral-Mamba-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mamba-codestral-7B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/codestral-mamba-7b-v0-1/codestral-mamba-7B-v0.1.tar)(md5sum: `d3993e4024d1395910c55db0d11db163`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
-| Mathstral-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mathstral-7B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/mathstral-7b-v0-1/mathstral-7B-v0.1.tar)(md5sum: `5f05443e94489c261462794b1016f10b`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
-| Mistral-NeMo-Base-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-base-2407.tar)(md5sum: `c5d079ac4b55fc1ae35f51f0a3c0eb83`) | - 131k vocabulary size <br/> - Supports tekken.json tokenizer |
-| Mistral-NeMo-Instruct-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-instruct-2407.tar)(md5sum: `296fbdf911cb88e6f0be74cd04827fe7`) | - 131k vocabulary size <br/> - Supports tekken.json tokenizer <br/> - Supports function calling |
+| Codestral-Mamba-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mamba-codestral-7B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/codestral-mamba-7b-v0-1/codestral-mamba-7B-v0.1.tar) (md5sum: `d3993e4024d1395910c55db0d11db163`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
+| Mathstral-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mathstral-7B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/mathstral-7b-v0-1/mathstral-7B-v0.1.tar) (md5sum: `5f05443e94489c261462794b1016f10b`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
+| Mistral-NeMo-Base-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-base-2407.tar) (md5sum: `c5d079ac4b55fc1ae35f51f0a3c0eb83`) | - 131k vocabulary size <br/> - Supports tekken.json tokenizer |
+| Mistral-NeMo-Instruct-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-instruct-2407.tar) (md5sum: `296fbdf911cb88e6f0be74cd04827fe7`) | - 131k vocabulary size <br/> - Supports tekken.json tokenizer <br/> - Supports function calling |
+| Mistral-Large-Instruct-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-large-2407/mistral-large-instruct-2407.tar) (md5sum: `fc602155f9e39151fba81fcaab2fa7c4`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer <br/> - Supports function calling |
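
The md5sums in the table can be used to verify a download. A generic sketch (the `wget` invocation is an assumption; the URL and checksum are taken from the Codestral Mamba row above):

```bash
# Hedged sketch: download one raw-weights tarball and verify it against the
# md5sum listed in the table above.
wget https://models.mistralcdn.com/codestral-mamba-7b-v0-1/codestral-mamba-7B-v0.1.tar
echo "d3993e4024d1395910c55db0d11db163  codestral-mamba-7B-v0.1.tar" | md5sum -c -
```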
## Sizes
@@ -53,7 +44,8 @@ We open-source both pre-trained models and fine-tuned models. These models are n
 | Codestral-22B-v0.1 | 22.2B | 22.2B | 60 |
 | Codestral-Mamba-7B-v0.1 | 7.3B | 7.3B | 16 |
 | Mathstral-7B-v0.1 | 7.3B | 7.3B | 16 |
-| Mistral-NeMo-12B-v0.1 | 12B | 12B | 28 - bf16 <br/> 16 - fp8 |
+| Mistral-NeMo-Instruct-2407 | 12B | 12B | 28 - bf16 <br/> 16 - fp8 |
+| Mistral-Large-Instruct-2407 | 123B | 123B | 228 |

 ## How to run?
 Check out [mistral-inference](https://github.com/mistralai/mistral-inference/), a Python package for running our models. You can install `mistral-inference` by

docs/getting-started/changelog.mdx

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@ sidebar_position: 1.8

 This is the list of changes to the Mistral API.

+July 24, 2024
+- We released Mistral Large 2 (`mistral-large-2407`).
+- We added fine-tuning support for Codestral, Mistral NeMo, and Mistral Large. The model choices for fine-tuning are now `open-mistral-7b` (v0.3), `mistral-small-latest` (`mistral-small-2402`), `codestral-latest` (`codestral-2405`), `open-mistral-nemo`, and `mistral-large-latest` (`mistral-large-2407`).
+
 July 18, 2024
 - We released Mistral NeMo (`open-mistral-nemo`).
