Skip to content

Conversation

@dushyantbehl
Copy link
Collaborator

While doing our testing of using Iterable Datasets in few training runs we hit this error where the collator being used at training time was RemoveColumnsCollator. It seems that at times trl internally wraps the collator we pass either DataCollatorForSeq2Seq or DataCollatorForCompletionOnlyLM inside this RemoveColumnsCollator to remove columns at batch creation time.

The simple fix for making this collator work with the padding free plugin is to unwrap the actual collator and try to build a replacement for the same.

@dushyantbehl dushyantbehl requested a review from fabianlim as a code owner March 12, 2025 10:21
@dushyantbehl
Copy link
Collaborator Author

@fabianlim opening up the PR for simple review for now..I will be doing more testing and sharing the error details we noticed before and after this patch.
Please advice if you want me to add some unit tests for this change.

@kmehant
Copy link
Collaborator

kmehant commented Mar 12, 2025

related - huggingface/transformers#34830

@dushyantbehl Just in case if you wish to fix this from transformers end there is some interest from HF team for a PR. If you wish to work on this, you can take this up as you see it fit.

@dushyantbehl
Copy link
Collaborator Author

Thanks @kmehant surely I'll check it out and fix

@dushyantbehl dushyantbehl marked this pull request as ready for review March 17, 2025 12:05
@kmehant
Copy link
Collaborator

kmehant commented Apr 10, 2025

related - #87

@kmehant
Copy link
Collaborator

kmehant commented Apr 10, 2025

From @fabianlim's comment we should ideally disable or gracefully fail when multipack is requested when allowing for streaming + padding free. May be this is something to be added to fms-hf-tuning.

@dushyantbehl can you check other concerns from Fabian from this issue and see if we have covered - #87 (comment)?

@dushyantbehl
Copy link
Collaborator Author

dushyantbehl commented Apr 10, 2025

If I read @fabianlim 's comment on #87 (comment)

It basically comes down to not supporting arbitrary collators inside RemoveColumnsCollator but only support DataCollatorForSeq2Seq or DataCollatorForCompletionOnlyLM and that is exactly what this PR is doing...it checks inside the remove columns collator and performs all checks to see if it is indeed one of the two supported or not.
So I am assuming the PR is to his spirit and not breaking the ideology. Let me know if you disagree here.

On the use of multipack + streaming we can just have a check in the fms-hf-tuning repo and gracefully fail if multipack is requested with streaming.

@kmehant
Copy link
Collaborator

kmehant commented Jun 12, 2025

@fabianlim FYA ^ thank you.

@kmehant
Copy link
Collaborator

kmehant commented Nov 25, 2025

@dushyantbehl can you rebase and fix the conflicts?

@dushyantbehl
Copy link
Collaborator Author

@kmehant I would let you or @romitjain pick this up and we can close this PR.

@dushyantbehl
Copy link
Collaborator Author

@kmehant the change was minimal so I just rebased the PR lets go ahead with this one itself so its less work for you both

@kmehant kmehant merged commit 8c05e58 into foundation-model-stack:main Nov 25, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants