Commit 3decfdf

Merge pull request #244361 from farzad528/docs-editor/vector-search-how-to-chunk-doc-1688772307
Update vector-search-how-to-chunk-documents.md
2 parents 86862cc + e478cc6 commit 3decfdf

1 file changed: +2 -1 lines changed

articles/search/vector-search-how-to-chunk-documents.md

Lines changed: 2 additions & 1 deletion
@@ -19,7 +19,7 @@ This article describes several approaches for chunking large documents so that y

 ## Why is chunking important?

-The models used to generate embedding vectors have maximum limits on the text fragments provided as input. For example, the maximum length of input text for the [Azure OpenAI](/azure/cognitive-services/openai/how-to/embeddings) embedding models is 8,191 tokens. Given that each token is around 4 tokens for common OpenAI models, this maximum limit is equivalent to around 6000 words of text. If you're using these models to generate embeddings, it's critical that the input text stays under the limit. Partitioning your content into chunks ensures that your data can be processed by the Large Language Models (LLM) used for indexing and queries.
+The models used to generate embedding vectors have maximum limits on the text fragments provided as input. For example, the maximum length of input text for the [Azure OpenAI](/azure/cognitive-services/openai/how-to/embeddings) embedding models is 8,191 tokens. Given that each token is around 4 characters of text for common OpenAI models, this maximum limit is equivalent to around 6000 words of text. If you're using these models to generate embeddings, it's critical that the input text stays under the limit. Partitioning your content into chunks ensures that your data can be processed by the Large Language Models (LLM) used for indexing and queries.

 ## How chunking fits into the workflow

@@ -127,3 +127,4 @@ This sample is built on LangChain, Azure OpenAI, and Azure Cognitive Search.
 + [Learn how to generate embeddings](/azure/cognitive-services/openai/how-to/embeddings?tabs=console)
 + [Tutorial: Explore Azure OpenAI Service embeddings and document search](/azure/cognitive-services/openai/tutorials/embeddings?tabs=command-line)

+
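
As a rough check of the corrected sentence: at about 4 characters per token, the 8,191-token limit works out to roughly 32,764 characters. The Python sketch below applies that rule of thumb to keep chunks under the limit. It is only an illustration under stated assumptions: the function names are hypothetical and not part of the article or any Azure SDK, and the character-based estimate is an approximation, since real token counts depend on the model's tokenizer.

```python
MAX_TOKENS = 8191      # input limit quoted in the changed paragraph
CHARS_PER_TOKEN = 4    # rule of thumb from the paragraph; an approximation only


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def chunk_by_estimated_tokens(text: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Greedily pack whitespace-separated words into chunks whose
    estimated token count stays at or below max_tokens.

    Assumption: no single word exceeds the character budget; such a
    word would be emitted as its own oversized chunk.
    """
    max_chars = max_tokens * CHARS_PER_TOKEN
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk is non-empty
        added = len(word) + (1 if current else 0)
        if current and current_len + added > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [], 0
            added = len(word)
        current.append(word)
        current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks
```

For production use, counting tokens with the tokenizer that matches the embedding model would be more reliable than a character estimate; the sketch only shows how the limit translates into a chunking budget.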

0 commit comments