- |**Chunking Strategy**|**FixedLength** or **TokenSize**| String enum |**FixedLength**: Split the content, based on the number of characters <br><br>**TokenSize**: Split the content, based on the number of tokens. <br><br>Default: **FixedLength**||
+ |**Chunking Strategy**|**FixedLength** or **TokenSize**| String enum |**FixedLength**: Split the content, based on the number of characters <br><br>**TokenSize**: Split the content, based on the number of tokens. <br><br>Default: **FixedLength**| Not applicable |
|**Text**| <*content-to-chunk*> | Any | The content to chunk. | See [Limits and configuration reference guide](logic-apps-limits-and-config.md#character-limits)|
|**MaxPageLength**| <*max-char-per-chunk*> | Integer | The maximum number of characters per content chunk. <br><br>Default: **5000**| Minimum: **1**|
|**PageOverlapLength**| <*number-of-overlapping-characters*> | Integer | The number of characters from the end of the previous chunk to include in the next chunk. This setting helps you avoid losing important information when splitting content into chunks and preserves continuity and context across chunks. <br><br>Default: **0** - No overlapping characters exist. | Minimum: **0**|
|**Language**| <*language*> | String | The [language](/azure/ai-services/language-service/language-detection/language-support) to use for the resulting chunks. <br><br>Default: **en-us**| Not applicable |
- |**TokenSize**| <*max-tokens-per-chunk*> | Integer | The maximum number of tokens per content chunk. <br><br>Default: None |- Minimum: 1 <br><br>- Maximum: 8000 |
- |**Encoding model**| <*encoding-method*> | String enum | The [encoding method]() to use: **cl100k_base**, **cl200k_base**, **p50k_base**, **p50k_edit**, **r50k_base** <br><br>Default: None| Not applicable |
+ |**TokenSize**| <*max-tokens-per-chunk*> | Integer | The maximum number of tokens per content chunk. <br><br>Default: None | Minimum: **1** <br>Maximum: **8000**|
+ |**Encoding model**| <*encoding-method*> | String enum | The encoding model to use: <br><br>- Default: **cl100k_base (gpt4, gpt-3.5-turbo, gpt-35-turbo)** <br><br>- **r50k_base (gpt-3)** <br><br>- **p50k_base (gpt-3)** <br><br>- **p50k_edit (gpt-3)** <br><br>- **cl200k_base (gpt-4o)** <br><br>For more information, see [OpenAI - Models overview](https://platform.openai.com/docs/models/overview).| Not applicable |
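
For reviewers who want to sanity-check how **MaxPageLength** and **PageOverlapLength** interact under the **FixedLength** strategy, the following is a minimal Python sketch. It is not the connector's implementation. The function name `chunk_fixed_length` is hypothetical, and the sketch assumes that the overlap counts toward the maximum chunk length and must be smaller than it. The table does not state either detail.

```python
def chunk_fixed_length(text: str,
                       max_page_length: int = 5000,
                       page_overlap_length: int = 0) -> list[str]:
    """Split text into chunks of at most max_page_length characters.

    Assumption (not confirmed by the table): each chunk after the first
    starts with the last page_overlap_length characters of the previous
    chunk, and those characters count toward max_page_length.
    """
    if max_page_length < 1:
        raise ValueError("max_page_length must be at least 1")
    if page_overlap_length < 0:
        raise ValueError("page_overlap_length must be at least 0")
    if page_overlap_length >= max_page_length:
        # Assumed constraint: otherwise the window would never advance.
        raise ValueError("page_overlap_length must be less than max_page_length")

    step = max_page_length - page_overlap_length
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_page_length])
        if start + max_page_length >= len(text):
            break  # The last chunk reached the end of the text.
        start += step
    return chunks


if __name__ == "__main__":
    for chunk in chunk_fixed_length("abcdefghij", max_page_length=4, page_overlap_length=1):
        print(chunk)
```

With the default overlap of **0**, the chunks are simply consecutive slices of the text. A nonzero overlap repeats the tail of each chunk at the head of the next one.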