Remove ftfy pin #1189
Signed-off-by: Sarah Yurick <[email protected]>
1 file reviewed, no comments
jrbourbeau
left a comment
LGTM
VibhuJawa
left a comment
LGTM
Signed-off-by: Sarah Yurick <[email protected]> Signed-off-by: Lawrence Lane <[email protected]>
Signed-off-by: Sarah Yurick <[email protected]>
Closes #906.
The only part of our codebase that uses ftfy is the UnicodeReformatter, which is tested here: https://github.com/NVIDIA-NeMo/Curator/blob/main/tests/stages/text/modules/test_modifiers.py. Assuming the test still passes, we should be safe to merge.
Greptile Overview
Updated On: 2025-10-20 21:51:00 UTC
Greptile Summary
This PR removes the version pin on the ftfy dependency, changing it from ftfy==6.1.1 to ftfy in the pyproject.toml file. The change addresses issue #906 and allows the package manager to install any compatible version of ftfy rather than being locked to version 6.1.1. Within the NeMo Curator codebase, ftfy is exclusively used by the UnicodeReformatter module for text cleaning operations, which has existing test coverage in tests/stages/text/modules/test_modifiers.py. This change aligns with best practices for dependency management by relying on semantic versioning rather than hard pins, giving users more flexibility while still maintaining compatibility through ftfy's API stability.
Important Files Changed
Changed Files
ftfy==6.1.1 to ftfy) in the text_cpu optional dependencies
Confidence score: 4/5
UnicodeReformatter), though slightly lowered because the PR depends on runtime validation rather than compile-time guarantees that newer ftfy versions maintain API compatibility
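For reference, the kind of runtime validation this relies on could look roughly like the sketch below. It is not the project's actual test: it calls ftfy.fix_text directly rather than going through UnicodeReformatter, and the helper names installed_version and smoke_check are hypothetical, introduced only for illustration.

```python
import importlib.metadata
from typing import Callable, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def smoke_check(fix_text: Callable[[str], str]) -> bool:
    """Check that a text-fixing callable repairs a simple mojibake sample.

    The sample is "café" encoded as UTF-8 and wrongly decoded as Latin-1,
    a common encoding mix-up that a Unicode fixer is expected to undo.
    """
    broken = "café".encode("utf-8").decode("latin-1")
    return fix_text(broken) == "café"


if __name__ == "__main__":
    version = installed_version("ftfy")
    if version is None:
        print("ftfy is not installed; nothing to check")
    else:
        import ftfy  # imported lazily so the helpers work without it

        print(f"ftfy {version}: smoke check passed = {smoke_check(ftfy.fix_text)}")
```

Running a check like this against whichever ftfy version the resolver picks gives a quick signal before the full test suite runs. It does not replace the existing coverage in test_modifiers.py.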