Commit 4ed54c8

Merge pull request #106430 from nicholasdbrady/patch-1
Update prepare-dataset.md
2 parents d0ce998 + 6b1d789 commit 4ed54c8

1 file changed, +1 -0 lines changed

articles/cognitive-services/openai/how-to/prepare-dataset.md

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ The first step of customizing your model is to prepare a high quality dataset. T
 - Each completion should start with a whitespace due to our tokenization, which tokenizes most words with a preceding whitespace.
 - Each completion should end with a fixed stop sequence to inform the model when the completion ends. A stop sequence could be `\n`, `###`, or any other token that doesn't appear in any completion.
 - For inference, you should format your prompts in the same way as you did when creating the training dataset, including the same separator. Also specify the same stop sequence to properly truncate the completion.
+- The dataset cannot exceed 100 Mb in total file size.

 ## Best practices
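
The rules in this hunk (leading whitespace, a fixed stop sequence, and the new size cap) could be checked before upload with something like the sketch below. It is only an illustration, not part of the commit: it assumes the dataset is a JSONL file whose records carry a `completion` key, and it reads "100 Mb" as 100 × 1024 × 1024 bytes. The helper name `validate_dataset` and its parameters are hypothetical.

```python
import json
import os

# Assumption: the "100 Mb" limit in the diff is read as 100 * 1024 * 1024 bytes.
MAX_BYTES = 100 * 1024 * 1024


def validate_dataset(path, stop_sequence="\n", max_bytes=MAX_BYTES):
    """Return a list of problems found in a JSONL training file.

    Hypothetical helper: assumes one JSON object per line with a
    "completion" key.
    """
    problems = []

    # Rule added by this commit: total file size must stay under the cap.
    if os.path.getsize(path) > max_bytes:
        problems.append(f"file exceeds {max_bytes} bytes")

    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines between records
            completion = json.loads(line).get("completion", "")

            # Each completion should start with a whitespace character.
            if not completion[:1].isspace():
                problems.append(f"line {number}: completion lacks leading whitespace")

            # Each completion should end with the fixed stop sequence.
            if not completion.endswith(stop_sequence):
                problems.append(f"line {number}: completion lacks stop sequence")

    return problems
```

Calling `validate_dataset("train.jsonl", stop_sequence="###")` returns an empty list when every record passes; the file name here is a placeholder.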

0 commit comments
