Commit 9991a74

feat(genapi): update faq for audio models
1 parent 2386556 commit 9991a74

1 file changed, +4 -1 lines changed

pages/generative-apis/faq.mdx

Lines changed: 4 additions & 1 deletion
@@ -83,7 +83,10 @@ Note that in this example, the first line where the free tier applies will not d
 ### What are tokens and how are they counted?
 A token is the minimum unit of content that is seen and processed by a model. Hence, token definitions depend on input types:
 - For text, on average, `1` token corresponds to `~4` characters, and thus `0.75` words (as words are on average five characters long)
-- For images, `1` token corresponds to a square of pixels. For example, `mistral-small-3.1-24b-instruct-2503` model image tokens of `28x28` pixels (28-pixels height, and 28-pixels width, hence `784` pixels in total).
+- For images, `1` token corresponds to a square of pixels. For example, `mistral-small-3.1-24b-instruct-2503` model image tokens are `28x28` pixels (28-pixels height, and 28-pixels width, hence `784` pixels in total).
+- For audio:
+  - `1` token corresponds to a time duration. For example, `voxtral-small-24b-2507` model audio tokens are `80` milliseconds long.
+  - Some models process audio in chunks with a minimum duration. For example, the `voxtral-small-24b-2507` model processes audio in `30`-second chunks. This means a `13`-second audio clip is counted as `375` tokens (`30` seconds / `0.08` seconds), and a `178`-second audio clip is counted as `2 250` tokens (`30` seconds * `6` / `0.08` seconds, because `178` seconds rounds up to `6` chunks).

 The exact token count and definition depend on [tokenizers](https://huggingface.co/learn/llm-course/en/chapter2/4) used by each model. When this difference is significant (such as for image processing), you can find detailed information in each model's documentation (for instance in [`mistral-small-3.1-24b-instruct-2503` size limit documentation](/managed-inference/reference-content/model-catalog/#mistral-small-31-24b-instruct-2503)). When the model is open, you can also find this information in the model files on platforms such as Hugging Face, usually in the `tokenizer_config.json` file.
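
The audio arithmetic added in this hunk can be sketched in a few lines of Python. This is only an illustration of the rounding rule described in the FAQ text, not the billing implementation: the function names `estimate_audio_tokens` and `estimate_text_tokens` are hypothetical, the `80` ms token and `30`-second chunk values are the ones quoted for `voxtral-small-24b-2507`, and the text estimate uses the average of about `4` characters per token, so actual tokenizer counts may differ.

```python
import math


def estimate_audio_tokens(duration_seconds, chunk_seconds=30, token_ms=80):
    """Estimate audio tokens, assuming audio is rounded up to whole chunks.

    Defaults are the values quoted in the FAQ for voxtral-small-24b-2507;
    other models may use different values (assumption for illustration).
    """
    if duration_seconds <= 0:
        return 0
    # Audio shorter than a chunk boundary is counted as a full chunk.
    chunks = math.ceil(duration_seconds / chunk_seconds)
    # Work in integer milliseconds to avoid floating-point rounding.
    return chunks * chunk_seconds * 1000 // token_ms


def estimate_text_tokens(text, chars_per_token=4):
    """Rough text estimate using the ~4 characters per token average."""
    return math.ceil(len(text) / chars_per_token)


print(estimate_audio_tokens(13))   # one 30-second chunk
print(estimate_audio_tokens(178))  # six 30-second chunks
```

With these defaults, the two calls reproduce the `375` and `2 250` token figures given in the FAQ examples.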
