Copilot AI commented Jan 27, 2026

Fixing Hugging Face Transformers API Compatibility Issues

Summary

Fixed CI test failures caused by breaking API changes in Hugging Face transformers v5.0+, while maintaining backward compatibility with v4.26+.

Completed Changes

  • Identified two main issues:
    1. as_target_tokenizer() context manager is deprecated (line 214 in flaml/automl/nlp/huggingface/utils.py)
    2. tokenizer= parameter in Seq2SeqTrainer/TrainerForAuto is deprecated (line 1204 in flaml/automl/model.py)
  • Removed the as_target_tokenizer() context manager from the tokenize_onedataframe() function (see the first sketch below this list)
    • This context manager is deprecated in transformers v4.29+ and removed in v5.0+
    • The function is used for all tasks (classification, regression, seq2seq)
    • Not needed as FLAML manually handles label creation for seq2seq tasks
  • Added backward-compatible parameter selection for TrainerForAuto initialization (see the second sketch below this list)
    • Uses processing_class for transformers >= 4.44.0
    • Uses tokenizer for transformers < 4.44.0 (v4.26-4.43)
    • Falls back to tokenizer if version check fails
    • Compatible with FLAML's requirement of transformers >= 4.26
  • Improved exception handling to use specific exception types
  • Ran pre-commit run --all-files to ensure all formatting is correct
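
For context, a minimal sketch of the tokenizer API change behind the first fix. This is illustrative only — the model name and strings are placeholders, and FLAML's own fix simply drops the context manager since it builds labels manually:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

# Deprecated pattern (removed in transformers v5.0+):
#   with tokenizer.as_target_tokenizer():
#       labels = tokenizer(["Hallo"], truncation=True)

# Current pattern: pass targets via text_target in a single call.
batch = tokenizer(
    ["translate English to German: Hello"],
    text_target=["Hallo"],
    truncation=True,
)
print(batch["labels"])  # tokenized target ids
```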
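
And a minimal sketch of the version-aware parameter selection from the second fix, using a hypothetical helper name (trainer_tokenizer_kwarg) rather than FLAML's actual code, which inlines the check where TrainerForAuto is constructed:

```python
def trainer_tokenizer_kwarg() -> str:
    """Return the keyword under which the tokenizer is passed to the Trainer.

    Hypothetical helper for illustration; not FLAML's actual function.
    """
    try:
        import transformers
        from packaging import version

        if version.parse(transformers.__version__) >= version.parse("4.44.0"):
            return "processing_class"  # renamed parameter, transformers >= 4.44.0
    except (ImportError, AttributeError, ValueError):
        pass  # version check failed; fall back to the older parameter name
    return "tokenizer"  # transformers v4.26-4.43, or fallback

# Usage: TrainerForAuto(model=..., args=..., **{trainer_tokenizer_kwarg(): tokenizer})
```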

Verification Complete

  • Created and ran custom test scripts
    • Verified tokenization works without as_target_tokenizer
    • Verified backward compatibility with version-based parameter selection
    • Verified fallback mechanism works correctly
    • All tests passed successfully
  • Pre-commit checks passed (all hooks pass)
  • Code formatted with black (line-length 120)
  • Ruff checks passed (no new issues)
  • Addressed code review feedback
  • CodeQL security scan passed (no vulnerabilities)

Files Changed

  • flaml/automl/nlp/huggingface/utils.py - Removed deprecated context manager
  • flaml/automl/model.py - Added version-aware parameter selection for backward compatibility

Impact

Minimal changes that fix compatibility with transformers v5.0+ while maintaining backward compatibility with v4.26+. Expected to resolve all failing NLP tests across different transformers versions.

Original prompt

Fix CI test failures caused by recent Hugging Face transformers API changes in microsoft/FLAML.

Context:

  • Workflow job: https://github.com/microsoft/FLAML/actions/runs/21382596194/job/61552346711#step:14:1
  • Failing tests:
    • test/automl/test_custom_hp.py::test_custom_hp_nlp
    • test/nlp/test_autohf_classificationhead.py::test_switch_* (multiple)
    • test/nlp/test_autohf_cv.py::test_cv
    • test/nlp/test_autohf_multichoice_classification.py::test_mcc
    • test/nlp/test_default.py::test_points_to_evaluate, test_zero_shot_nomodel
  • Errors:
    • AttributeError: BertTokenizer has no attribute as_target_tokenizer
    • TypeError: Seq2SeqTrainer.__init__() got an unexpected keyword argument 'tokenizer'

Task:

  1. Locate the code paths in FLAML that call tokenizer.as_target_tokenizer() and update them to be compatible with current transformers versions (likely using tokenizer context manager replacements such as tokenizer with text_target / target kwargs or tokenizer methods that supersede as_target_tokenizer).
  2. Locate where Seq2SeqTrainer is initialized with tokenizer= and update to current API (e.g., use processing_class, data_collator, or remove deprecated argument depending on current transformers best practice).
  3. Update any related tests or helper functions if needed to maintain compatibility across supported transformers versions.
  4. Ensure all failing tests above pass.

Deliverable:

  • Code changes in the appropriate FLAML modules plus any necessary test adjustments.
  • Keep changes minimal and focused on compatibility fixes for transformers API changes.

This pull request was created from Copilot chat.


Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix CI test failures due to Hugging Face transformers changes" to "Fix transformers v5 API compatibility: remove as_target_tokenizer, use processing_class" on Jan 27, 2026
Copilot AI requested a review from thinkall January 27, 2026 06:02
@thinkall

@copilot , will the new code work for Transformers v4?

Support both tokenizer (v4.26-4.43) and processing_class (v4.44+) parameters based on installed transformers version. Fallback to tokenizer if version check fails.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

Copilot AI commented Jan 27, 2026

@copilot , will the new code work for Transformers v4?

Yes, the updated code now works with both Transformers v4 and v5. It uses version detection:

  • v4.44.0+: Uses processing_class parameter
  • v4.26-4.43: Uses tokenizer parameter
  • Fallback: Uses tokenizer if version check fails

This maintains compatibility with FLAML's requirement of transformers>=4.26 while supporting the latest v5.0.
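
For reference, a quick way to check which parameter an installed version will use (a sketch; packaging is already a dependency of transformers):

```python
import transformers
from packaging import version

new_api = version.parse(transformers.__version__) >= version.parse("4.44.0")
print("processing_class" if new_api else "tokenizer")
```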

Commit: 0e276e8

Use specific exception types (ImportError, AttributeError, ValueError) instead of broad Exception catch for better error handling.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Copilot AI changed the title from "Fix transformers v5 API compatibility: remove as_target_tokenizer, use processing_class" to "Fix transformers API compatibility: support v4.26+ and v5.0+ with version-aware parameter selection" on Jan 27, 2026
@thinkall

@copilot , run pre-commit run --all-files to fix formatting.

Applied black formatting to fix code style across the repository.

Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>

Copilot AI commented Jan 27, 2026

@copilot , run pre-commit run --all-files to fix formatting.

Done. All pre-commit checks pass.

Commit: a6e5917

thinkall marked this pull request as ready for review January 27, 2026 13:22