Fine-Tuning Gemma 270M (KeyError: input_ids) #3509
Hi, I want to fine-tune on this dataset using the Gemma 270M notebook: https://huggingface.co/datasets/neurae/dnd_style_intents

However, I'm getting a `KeyError: input_ids` error. Here's my notebook: https://gist.github.com/cemalgnlts/1701373bab2f4e2365dd628b9d038e31

I'm not sure what I should do. I didn't have any issues with the chess dataset, but the error appeared as soon as I switched to a different dataset. Thanks.
Replies: 1 comment 1 reply
Actually, the `train_on_responses_only` function expects pre-tokenized text. After reviewing your notebook, I think you applied the chat template and passed raw text directly to `train_on_responses_only`, but it requires tokenized input, which is why it's throwing the `'input_ids'` error.
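
To illustrate the failure mode, here is a minimal pure-Python sketch. It is not Unsloth's implementation: `toy_tokenize` and `mask_prompt_tokens` are hypothetical stand-ins, and the `<user>`/`<model>` markers are made up for the example. It only shows why a response-masking step that reads `input_ids` fails on a raw-text row and works once the row has been tokenized.

```python
# Illustrative stand-ins only; these are NOT Unsloth's real functions.

VOCAB = {}  # hypothetical toy vocabulary, filled on first use


def toy_tokenize(text):
    """Map each whitespace-separated word to an integer id."""
    return {"input_ids": [VOCAB.setdefault(w, len(VOCAB)) for w in text.split()]}


def mask_prompt_tokens(example, response_marker_id):
    """Keep labels only for tokens after the response marker.

    Everything up to and including the marker is set to -100, the label
    value that PyTorch's cross-entropy loss ignores by default.
    """
    input_ids = example["input_ids"]  # KeyError here if the row is raw text
    start = input_ids.index(response_marker_id) + 1
    labels = [-100] * start + input_ids[start:]
    return {"input_ids": input_ids, "labels": labels}


# A row that only has chat-templated text, with no tokenization applied yet.
raw_example = {"text": "<user> attack the goblin <model> intent: combat"}

try:
    mask_prompt_tokens(raw_example, response_marker_id=0)
    raw_error = None
except KeyError as err:
    raw_error = err.args[0]  # name of the missing key

# Tokenize first, then mask: the same function now succeeds.
tokenized = toy_tokenize(raw_example["text"])
masked = mask_prompt_tokens(tokenized, VOCAB["<model>"])

print("error on raw text:", raw_error)
print("labels after tokenizing:", masked["labels"])
```

In the real notebook, the equivalent fix would be to make sure the dataset reaching `train_on_responses_only` already contains an `input_ids` column, rather than only the templated text.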