Unexpected generated sentences when using llama

Hi,
I used the code in this repo to fine-tune the Open LLaMA model. To reduce the fine-tuning time, I used only one prompt for the training, validation, and test sets on Beauty when generating the dataset. I used random indexing and otherwise kept the original settings in your repo. When I evaluated, I found that the generated output sequences were full of unexpected characters, like '(@*$^)(*Y(8'. Also, when I use the code to fine-tune other LLaMA-series models, the generated sentences become full of '!'.
Can anyone give a hint about this? Is this a tokenizer problem?

Thanks


Unexpected generated sentences when using llama #36
