Skip to content

Commit 3f512b6

Browse files
Merge pull request #228657 from Joe-Darling/joedarling/SplitSkillChanges
Updated Split skill default value to match updated value and updated associated language skills with a tip to use that default value for the best results when using the skill
2 parents 29da782 + 565044c commit 3f512b6

8 files changed

+8
-8
lines changed

articles/search/cognitive-search-skill-custom-entity-lookup.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ Microsoft.Skills.Text.CustomEntityLookupSkill
2424

2525
## Data limits
2626

27-
+ The maximum input record size supported is 256 MB. If you need to break up your data before sending it to the custom entity lookup skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
27+
+ The maximum input record size supported is 256 MB. If you need to break up your data before sending it to the custom entity lookup skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
2828
+ The maximum size of the custom entity definition is 10 MB if it's provided as an external file, specified through the "entitiesDefinitionUri" parameter.
2929
+ If the entities are defined inline using the "inlineEntitiesDefinition" parameter, the maximum size is 10 KB.
3030

articles/search/cognitive-search-skill-entity-linking-v3.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ Microsoft.Skills.Text.V3.EntityLinkingSkill
2525

2626
## Data limits
2727

28-
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the EntityLinking skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
28+
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the EntityLinking skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
2929

3030
## Skill parameters
3131

articles/search/cognitive-search-skill-entity-recognition.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ The **Entity Recognition** skill (v2) extracts entities of different types from
2727
Microsoft.Skills.Text.EntityRecognitionSkill
2828

2929
## Data limits
30-
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the key phrase extractor, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
30+
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the key phrase extractor, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
3131

3232
## Skill parameters
3333

articles/search/cognitive-search-skill-keyphrases.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ This capability is useful if you need to quickly identify the main talking point
2323
Microsoft.Skills.Text.KeyPhraseExtractionSkill
2424

2525
## Data limits
26-
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the key phrase extractor, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
26+
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the key phrase extractor, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
2727

2828
## Skill parameters
2929

articles/search/cognitive-search-skill-named-entity-recognition.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ The **Named Entity Recognition** skill (v2) extracts named entities from text. A
2828
Microsoft.Skills.Text.NamedEntityRecognitionSkill
2929

3030
## Data limits
31-
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the key phrase extractor, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
31+
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the key phrase extractor, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
3232

3333
## Skill parameters
3434

articles/search/cognitive-search-skill-pii-detection.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ Microsoft.Skills.Text.PIIDetectionSkill
2525

2626
## Data limits
2727

28-
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to chunk your data before sending it to the skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
28+
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to chunk your data before sending it to the skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
2929

3030
## Skill parameters
3131

articles/search/cognitive-search-skill-text-translation.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,7 @@ Microsoft.Skills.Text.TranslationSkill
3131

3232
## Data limits
3333

34-
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the text translation skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md).
34+
The maximum size of a record should be 50,000 characters as measured by [`String.Length`](/dotnet/api/system.string.length). If you need to break up your data before sending it to the text translation skill, consider using the [Text Split skill](cognitive-search-skill-textsplit.md). If you do use a text split skill, set the page length to 5000 for the best performance.
3535

3636
## Skill parameters
3737

articles/search/cognitive-search-skill-textsplit.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ Parameters are case-sensitive.
2626
| Parameter name | Description |
2727
|--------------------|-------------|
2828
| `textSplitMode` | Either `pages` or `sentences` |
29-
| `maximumPageLength` | Only applies if `textSplitMode` is set to `pages`. This refers to the maximum page length in characters as measured by `String.Length`. The minimum value is 300, the maximum is 100000, and the default value is 10000. The algorithm will do its best to break the text on sentence boundaries, so the size of each chunk may be slightly less than `maximumPageLength`. |
29+
| `maximumPageLength` | Only applies if `textSplitMode` is set to `pages`. This refers to the maximum page length in characters as measured by `String.Length`. The minimum value is 300, the maximum is 100000, and the default value is 5000. The algorithm will do its best to break the text on sentence boundaries, so the size of each chunk may be slightly less than `maximumPageLength`. |
3030
| `defaultLanguageCode` | (optional) One of the following language codes: `am, bs, cs, da, de, en, es, et, fr, he, hi, hr, hu, fi, id, is, it, ja, ko, lv, no, nl, pl, pt-PT, pt-BR, ru, sk, sl, sr, sv, tr, ur, zh-Hans`. Default is English (en). Few things to consider:<ul><li>Providing a language code is useful to avoid cutting a word in half for non-whitespace languages such as Chinese, Japanese, and Korean.</li><li>If you do not know the language (i.e. you need to split the text for input into the [LanguageDetectionSkill](cognitive-search-skill-language-detection.md)), the default of English (en) should be sufficient. </li></ul> |
3131

3232

0 commit comments

Comments
 (0)