Replies: 1 comment
-
Could you clarify what you mean by "without pre-tokenizing"? The dataset in your first code block is not tokenized; Axolotl tokenizes it.
Hi @winglian,
I have a question about fine-tuning a model without pre-tokenizing the dataset; I am not sure which configuration settings are correct for this.
If the original fine-tuning configuration, which pre-tokenizes the dataset, is as follows:
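(Sketching it with placeholder values, since my actual dataset path and prompt format are not the point; the path and type below are hypothetical:)

```yaml
# Standard fine-tuning setup: Axolotl tokenizes the dataset up front
# and caches it under dataset_prepared_path before training starts.
datasets:
  - path: my_org/my_finetune_data   # hypothetical dataset path
    type: alpaca                    # hypothetical prompt format
dataset_prepared_path: last_run_prepared
val_set_size: 0.05
```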
Would replacing it with the following configuration be appropriate for this need:
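(Again with a placeholder path; as I understand it, `pretraining_dataset` streams and tokenizes the data on the fly rather than pre-tokenizing it, and the exact keys may differ between Axolotl versions:)

```yaml
# Streaming setup: the dataset is tokenized on the fly instead of
# being pre-tokenized and cached.
pretraining_dataset: my_org/my_finetune_data  # hypothetical dataset path
max_steps: 1000  # a streamed dataset has no known length, so a step count is typically needed
```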
Would replacing the initial configuration with the `pretraining_dataset` configuration be suitable for my purpose of fine-tuning without pre-tokenizing? Are there any specific implications or differences I should be aware of when opting for the `pretraining_dataset` configuration over the `datasets` configuration in this context?
I look forward to your guidance. Thank you in advance.