Releases: mistralai/mistral-common
v1.9.1 Patch Release
Refactor online streaming processing and allow for a dynamic streaming delay
What's Changed
- Add AGENTS.md by @juliendenize in #182
- fix: correct typos 'occurence' and 'recieved' by @thecaptain789 in #185
- [Audio] Refactor streaming logic by @patrickvonplaten in #187
New Contributors
- @thecaptain789 made their first contribution in #185
Full Changelog: v1.9.0...v1.9.1
v1.9.0 - Stream my audio 🎙️
Mistral-Common can now process streaming transcription requests:
```python
import numpy as np

from mistral_common.audio import Audio
from mistral_common.protocol.instruct.chunk import RawAudio
from mistral_common.protocol.transcription.request import (
    StreamingMode,
    TranscriptionRequest,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# 1. Load the tokenizer with audio support
tokenizer = MistralTokenizer.from_hf_hub(
    "mistralai/Voxtral-Mini-4B-Realtime-2602"
)

# 2. Create sample audio data (or load from a file)
sampling_rate = 16_000
duration_s = 2.0
audio_array = np.random.uniform(-1, 1, size=int(duration_s * sampling_rate)).astype(np.float32)
audio = Audio(
    audio_array=audio_array,
    sampling_rate=sampling_rate,
    format="wav",
)

# 3. Create the streaming transcription request
request = TranscriptionRequest(
    audio=RawAudio(
        data=audio.to_base64("wav"),
        format="wav",
    ),
    streaming=StreamingMode.ONLINE,  # or StreamingMode.OFFLINE
    language=None,
)

# 4. Encode the request
tokenized = tokenizer.encode_transcription(request)

# 5. Access the results
print(f"Tokens: {tokenized.tokens}")
print(f"Number of tokens: {len(tokenized.tokens)}")
print(f"Number of audio segments: {len(tokenized.audios)}")
```

See https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602 for more info.
What's Changed
- Add new token logic asrstr by @patrickvonplaten in #172
- [Backward comp] Still need the _control_tokens for vLLM by @patrickvonplaten in #173
- Release 1.8.8 by @juliendenize in #174
- [Audio] Update padding by @patrickvonplaten in #175
- [Audio] Improve padding for streaming by @patrickvonplaten in #177
- Add audio_encoder to Tokenizer V13 by @amosyou in #180
- Release v1.9.0 - Audio streaming by @patrickvonplaten in #179
- Fix image tests with downloads by @juliendenize in #181
- Enhance accessibility by @juliendenize in #176
Full Changelog: v1.8.7...v1.9.0
v1.8.8: Backward comp
What's Changed
- Add new token logic asrstr by @patrickvonplaten in #172
- [Backward comp] Still need the _control_tokens for vLLM by @patrickvonplaten in #173
Full Changelog: v1.8.7...v1.8.8
v1.8.7: Refactoring and bug fixes.
What's Changed
- Remove the index field from assistant tool_calls. by @tobrun in #165
- Rename get control -> to get special & add is_special by @patrickvonplaten in #164
- Add TextChunk support to ToolMessage by @juliendenize in #170
- Version 1.8.7 by @juliendenize in #171
Full Changelog: v1.8.6...v1.8.7
v1.8.6: rm Python 3.9, bug fixes.
What's Changed
- Remove deprecated imports in docs. by @juliendenize in #138
- Add normalizer and validator utils by @juliendenize in #140
- Refactor private aggregate messages for InstructRequestNormalizer by @juliendenize in #141
- test: improve unit test for is_opencv_installed by @PrasanaaV in #143
- Optimize spm decode function by @juliendenize in #144
- Add get_one_valid_tokenizer_file by @juliendenize in #142
- Remove Python 3.9 support by @juliendenize in #145
- Correctly pass `revision` and `token` to hf_api by @juliendenize in #149
- Fix assertion in test_convert_text_chunk and tool_call by @patrickvonplaten in #152
- Pins GH actions by @arcanis in #160
- Add usage restrictions regarding third-party rights. by @juliendenize in #161
- Improve tekken logging message for vocabulary by @juliendenize in #162
- Set version 1.8.6 by @juliendenize in #151
New Contributors
- @PrasanaaV made their first contribution in #143
- @arcanis made their first contribution in #160
Full Changelog: v1.8.5...v1.8.6
v1.8.5: Patch Release
What's Changed
- Make model field optional in TranscriptionRequest by @juliendenize in #128
- Remove all responses and embedding requests. Add transcription docs. by @juliendenize in #133
- Add chunk file by @juliendenize in #129
- allow message content to be empty string by @mingfang in #135
- Add test empty content for AssistantMessage v7 by @juliendenize in #136
- v1.8.5 by @juliendenize in #137
Full Changelog: v1.8.4...v1.8.5
v1.8.4: Optional dependencies and random padding on ChatCompletionResponseStreamResponse
What's Changed
- Update experimental.md by @juliendenize in #124
- Make sentencepiece optional and refactor optional imports by @juliendenize in #126
- Improve UX for contributing by @juliendenize in #127
- feat: allow random padding on ChatCompletionResponseStreamResponse by @aac228 in #131
Full Changelog: v1.8.3...v1.8.4
v1.8.3: Add an experimental REST API
What's Changed
- Add a FastAPI app by @juliendenize in #113
We released an experimental REST API leveraging FastAPI. It handles requests end to end: tokenization, generation via calls to an engine, and detokenization.
For detailed documentation, see https://mistralai.github.io/mistral-common/usage/experimental/.
Here is how to launch the server:
```sh
pip install mistral-common[server]

mistral_common serve mistralai/Magistral-Small-2507 \
  --host 127.0.0.1 --port 8000 \
  --engine-url http://127.0.0.1:8080 --engine-backend llama_cpp \
  --timeout 60
```

Then you can see the Swagger UI at http://localhost:8000.
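Once the server is running, you can probe it over HTTP. The `/openapi.json` route is standard FastAPI; the tokenization path and payload below are illustrative guesses rather than a confirmed API, so check the Swagger UI for the real routes:

```python
import json
from urllib.request import Request, urlopen

# Standard FastAPI route: lists every endpoint the experimental app exposes
with urlopen("http://localhost:8000/openapi.json") as resp:
    schema = json.load(resp)
print(sorted(schema["paths"]))

# Hypothetical tokenization call -- replace the path and payload with what Swagger shows
payload = json.dumps({"messages": [{"role": "user", "content": "Hello!"}]}).encode()
req = Request(
    "http://localhost:8000/v1/tokenize",  # illustrative path, not confirmed
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    print(json.load(resp))
```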
Full Changelog: v1.8.2...v1.8.3
v1.8.2: Add ThinkChunk
What's Changed
- Add think chunk by @juliendenize in #122
Now you can use TextChunk and ThinkChunk in your SystemMessage or AssistantMessage:

```python
from mistral_common.protocol.instruct.messages import SystemMessage, TextChunk, ThinkChunk

system_message = SystemMessage(
    content=[
        TextChunk(text="First draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:"),
        ThinkChunk(
            thinking="Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response. Use the same language as the input.",
            closed=True,
        ),
        TextChunk(text="Here, provide a self-contained response."),
    ],
)
```
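To see how these chunks are rendered, you can encode a full request that uses the system message. A minimal sketch: the model id is reused from the server example above, and both the assumption that its tokenizer version supports think chunks and the user question are illustrative:

```python
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# Assumption: this reasoning model's tokenizer knows the think tokens
tokenizer = MistralTokenizer.from_hf_hub("mistralai/Magistral-Small-2507")

request = ChatCompletionRequest(
    messages=[system_message, UserMessage(content="How many primes are below 100?")],
)
tokenized = tokenizer.encode_chat_completion(request)
print(tokenized.text)  # inspect how the think chunk is rendered
```

Full Changelog: v1.8.1...v1.8.2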
v1.8.1: Add AudioURLChunk
What's Changed
- Add AudioURLChunk by @juliendenize in #120
Now you can use http(s) URLs, file paths, and base64 strings (without specifying the format) in your content chunks, thanks to AudioURLChunk!
```python
from mistral_common.protocol.instruct.messages import AudioURL, AudioURLChunk, TextChunk, UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

repo_id = "mistralai/Voxtral-Mini-3B-2507"
tokenizer = MistralTokenizer.from_hf_hub(repo_id)

text_chunk = TextChunk(
    text="What do you think about this audio?"
)
user_msg = UserMessage(
    content=[
        AudioURLChunk(audio_url=AudioURL(url="https://freewavesamples.com/files/Ouch-6.wav")),
        text_chunk,
    ]
)

request = ChatCompletionRequest(messages=[user_msg])
tokenized = tokenizer.encode_chat_completion(request)

# pass tokenized.tokens to your favorite audio model
print(tokenized.tokens)
print(tokenized.audios)

# print text to visually see tokens
print(tokenized.text)
```
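The release also mentions file paths; here is a minimal sketch of the local-file variant, reusing the tokenizer loaded above. The path is illustrative:

```python
from mistral_common.protocol.instruct.messages import AudioURL, AudioURLChunk, UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# A local file path inside AudioURL -- the path is illustrative
local_msg = UserMessage(
    content=[AudioURLChunk(audio_url=AudioURL(url="/path/to/recording.wav"))],
)
tokenized = tokenizer.encode_chat_completion(ChatCompletionRequest(messages=[local_msg]))
print(len(tokenized.audios))
```

Full Changelog: v1.8.0...v1.8.1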