
Commit 8095258

Merge pull request #270458 from aahill/ingestion-2
comma
2 parents 1febf05 + 284f376 commit 8095258

File tree

2 files changed: +3 -2 lines changed


articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 1 addition & 1 deletion
@@ -328,7 +328,7 @@ Azure OpenAI On Your Data processes your documents by splitting them into chunks
 
 #### Setting chunk size for your use case
 
-The default chunk size is 1024 tokens. However, given the uniqueness of your data, you might find a different chunk size (such as 256, 512, or 1,536 tokens) more effective.
+The default chunk size is 1,024 tokens. However, given the uniqueness of your data, you might find a different chunk size (such as 256, 512, or 1,536 tokens) more effective.
 
 Adjusting the chunk size can enhance your chatbot's performance. While finding the optimal chunk size requires some trial and error, start by considering the nature of your dataset. A smaller chunk size is generally better for datasets with direct facts and less context, while a larger chunk size might be beneficial for more contextual information, though it could affect retrieval performance.

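The chunk-size tradeoff described in this file can be sketched with a toy splitter. This is an illustrative approximation, not the Azure OpenAI ingestion pipeline: whitespace splitting stands in for a real tokenizer, and `chunk_document` is a name invented here for the example.

```python
def chunk_document(text: str, chunk_size: int = 1024) -> list[str]:
    """Split text into pieces of at most `chunk_size` tokens.

    Whitespace words stand in for tokens; a real ingestion pipeline
    would use a model tokenizer instead.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]

doc = "alpha beta gamma delta epsilon zeta"
# A smaller chunk size yields more, narrower chunks of the same document.
print(chunk_document(doc, chunk_size=2))
# ['alpha beta', 'gamma delta', 'epsilon zeta']
```

As the article's diff notes, narrower chunks tend to suit fact-lookup datasets, while wider chunks keep more surrounding context together per retrieved unit.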
articles/ai-services/openai/whats-new.md

Lines changed: 2 additions & 1 deletion
@@ -24,9 +24,10 @@ Azure OpenAI Studio now provides a Risks & Safety dashboard for each of your dep
 
 [Use the Risks & Safety monitor](./how-to/risks-safety-monitor.md)
 
-### Elasticsearch database support for Azure OpenAI On Your Data
+### Azure OpenAI On Your Data updates
 
 - You can now connect to an Elasticsearch vector database to be used with [Azure OpenAI On Your Data](./concepts/use-your-data.md?tabs=elasticsearch#supported-data-sources).
+- You can use the [chunk size parameter](./concepts/use-your-data.md#chunk-size-preview) during data ingestion to set the maximum number of tokens of any given chunk of data in your index.
 
 ### 2024-02-01 general availability (GA) API released
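Before picking a chunk size for ingestion, it can help to gauge how many chunks a typical document would produce. The sketch below uses the rough rule of thumb of about four characters per token for English text; both helper names and the heuristic itself are assumptions for illustration, not part of the Azure OpenAI API.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: ~4 characters per token, a common rough rule
    # of thumb for English text with OpenAI-style tokenizers.
    return max(1, len(text) // 4)

def chunks_needed(text: str, chunk_size: int) -> int:
    # Ceiling division: number of chunks at the given chunk size.
    return -(-estimate_tokens(text) // chunk_size)

doc = "x" * 8000  # ~2,000 estimated tokens
print(chunks_needed(doc, 1024))  # 2 chunks at the default size
print(chunks_needed(doc, 256))   # 8 chunks at a smaller size
```

A document that fits in one or two chunks at the default 1,024 tokens may fragment into many small pieces at 256, which changes what each retrieval hit contains.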
