
Packing logic for SFT incorrectly passed to TitanTrainer #546

@joecummings

Description


🐛 Describe the bug

See title. This causes incorrect masking. The model can and will still learn under it, so the loss will eventually go down, but the model will perform badly if you evaluate it.
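For context, a minimal sketch of what packing-aware masking is expected to do: each packed sample should only attend to earlier tokens within its own sample, never across sample boundaries. This is illustrative only; the function and variable names below (`build_packed_causal_mask`, `seq_lens`) are hypothetical and are not TitanTrainer's actual API.

```python
# Illustrative sketch only -- not TitanTrainer code.
import torch

def build_packed_causal_mask(seq_lens: list[int]) -> torch.Tensor:
    """Block-diagonal causal mask: each token attends only to earlier
    tokens from the same packed sample."""
    total = sum(seq_lens)
    # Assign each token a document id based on which packed sample it came from.
    doc_ids = torch.repeat_interleave(
        torch.arange(len(seq_lens)), torch.tensor(seq_lens)
    )
    same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)          # (total, total)
    causal = torch.tril(torch.ones(total, total, dtype=torch.bool))  # lower triangular
    return same_doc & causal

# Example: two samples of length 3 and 2 packed into one 5-token sequence.
mask = build_packed_causal_mask([3, 2])
# Tokens 3-4 (second sample) must not attend to tokens 0-2 (first sample).
# If the packing metadata is dropped or passed incorrectly, the mask
# degenerates to plain causal attention over the whole packed sequence,
# i.e. the incorrect masking described in this issue.
```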

Versions

No response
