
Commit aa30753

Copilot and gewarren committed
Add data package requirement and cross-references
Co-authored-by: gewarren <[email protected]>
1 parent ae6a023 commit aa30753

File tree

2 files changed: +7 -0 lines changed


docs/ai/conceptual/understanding-tokens.md

Lines changed: 1 addition & 0 deletions
@@ -103,6 +103,7 @@ Generative AI services might also be limited regarding the maximum number of tok
 
 ## Related content
 
+- [Use Microsoft.ML.Tokenizers for text tokenization](../how-to/use-tokenizers.md)
 - [How generative AI and LLMs work](how-genai-and-llms-work.md)
 - [Understand embeddings](embeddings.md)
 - [Work with vector databases](vector-databases.md)

docs/ai/how-to/use-tokenizers.md

Lines changed: 6 additions & 0 deletions
@@ -24,6 +24,12 @@ Install the Microsoft.ML.Tokenizers NuGet package:
 dotnet add package Microsoft.ML.Tokenizers
 ```
 
+For Tiktoken models (like GPT-4), you also need to install the corresponding data package:
+
+```dotnetcli
+dotnet add package Microsoft.ML.Tokenizers.Data.O200kBase
+```
+
 ## Key features
 
 The Microsoft.ML.Tokenizers library provides:
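The added paragraph only covers installation; as a sketch of how the two packages fit together afterward, the following assumes the `TiktokenTokenizer.CreateForModel` factory from Microsoft.ML.Tokenizers, with an illustrative model name that uses the `o200k_base` encoding supplied by the data package:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML.Tokenizers;

// Assumption for illustration: "gpt-4o" resolves to the o200k_base encoding,
// whose vocabulary data comes from Microsoft.ML.Tokenizers.Data.O200kBase.
Tokenizer tokenizer = TiktokenTokenizer.CreateForModel("gpt-4o");

string text = "Hello, World!";

// Count tokens without materializing the full encoding.
int count = tokenizer.CountTokens(text);

// Or encode the text into token IDs.
IReadOnlyList<int> ids = tokenizer.EncodeToIds(text);

Console.WriteLine($"'{text}' -> {count} tokens");
```

Without the data package referenced in the diff, creating a Tiktoken tokenizer for such a model fails at runtime, since the encoding's vocabulary is shipped separately to keep the core package small.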
