Fine-Tuning Gemma 270M (KeyError: input_ids) #3509

cemalgnlts · 2025-10-25T20:45:07Z

Hi,

I want to fine-tune Gemma 270M on this dataset using the Gemma 270M notebook: https://huggingface.co/datasets/neurae/dnd_style_intents

However, I'm getting the following error: In [11] KeyError: 'input_ids'

Here's my notebook: https://gist.github.com/cemalgnlts/1701373bab2f4e2365dd628b9d038e31

I'm not sure what to do. I didn't have any issues with the chess dataset, but this error appeared when I switched to a different dataset.

Thanks.

Answered by Parveshiiii

Parveshiiii · 2025-11-06T12:53:43Z

The train_on_responses_only function expects a tokenized dataset, not raw text. After reviewing your notebook, I think you applied the chat template and passed the raw text directly to train_on_responses_only. It needs tokenized input with an input_ids field, which is why it throws KeyError: 'input_ids'.
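
To make the failure mode concrete, here is a toy sketch in plain Python of what response-only masking does and why it needs input_ids. This is not Unsloth's code: toy_tokenize, mask_prompt_tokens, the `<user>`/`<model>` markers, and the sample sentence are all made up for illustration, and a real tokenizer would produce different ids.

```python
# Toy illustration only: NOT Unsloth's implementation of train_on_responses_only.
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def toy_tokenize(text, vocab):
    """Hypothetical whitespace tokenizer: maps each word to an integer id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def mask_prompt_tokens(example, response_marker_ids):
    """Build labels that keep only the tokens after the response marker.

    Reads example["input_ids"], so an example holding only raw text
    raises KeyError: 'input_ids'.
    """
    input_ids = example["input_ids"]
    labels = [IGNORE_INDEX] * len(input_ids)
    n = len(response_marker_ids)
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == response_marker_ids:
            # Everything after the marker is the response; train on it.
            labels[start + n:] = input_ids[start + n:]
            break
    return {**example, "labels": labels}


vocab = {}
text = "<user> attack the goblin <model> intent: combat"
marker_ids = toy_tokenize("<model>", vocab)

# Case 1: raw text only, as after applying a chat template without tokenizing.
raw_example = {"text": text}
try:
    mask_prompt_tokens(raw_example, marker_ids)
    error_key = None
except KeyError as err:
    error_key = err.args[0]
print("missing key:", error_key)

# Case 2: the same example after tokenization.
tokenized_example = {"text": text, "input_ids": toy_tokenize(text, vocab)}
masked = mask_prompt_tokens(tokenized_example, marker_ids)
print("labels:", masked["labels"])
```

The first case fails on the lookup of input_ids, which mirrors the error in the question. The second case works because the example was tokenized first. In the actual notebook, the dataset held by the trainer would likewise need an input_ids column before the masking step runs.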

1 reply

cemalgnlts Dec 6, 2025
Author

This was my first attempt, and I hadn't paid attention. Thank you.
