
Packing logic for SFT incorrectly passed to TitanTrainer #546

@joecummings

Description


🐛 Describe the bug

See title. This causes incorrect masking. The model can and will still learn under it, so the loss will eventually go down, but the model will perform badly if you evaluate it.
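For context, a minimal sketch of what packing-aware masking is expected to do: each packed sample should only attend to earlier tokens within its own sample, never across sample boundaries. This is illustrative only; the function and variable names below (`build_packed_causal_mask`, `seq_lens`) are hypothetical and are not TitanTrainer's actual API.

```python
# Illustrative sketch only -- not TitanTrainer code.
import torch

def build_packed_causal_mask(seq_lens: list[int]) -> torch.Tensor:
    """Block-diagonal causal mask: each token attends only to earlier
    tokens from the same packed sample."""
    total = sum(seq_lens)
    # Assign each token a document id based on which packed sample it came from.
    doc_ids = torch.repeat_interleave(
        torch.arange(len(seq_lens)), torch.tensor(seq_lens)
    )
    same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)          # (total, total)
    causal = torch.tril(torch.ones(total, total, dtype=torch.bool))  # lower triangular
    return same_doc & causal

# Example: two samples of length 3 and 2 packed into one 5-token sequence.
mask = build_packed_causal_mask([3, 2])
# Tokens 3-4 (second sample) must not attend to tokens 0-2 (first sample).
# If the packing metadata is dropped or passed incorrectly, the mask
# degenerates to plain causal attention over the whole packed sequence,
# i.e. the incorrect masking described in this issue.
```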

Versions

No response
