
Conversation

@gautamvarmadatla

What does this PR do?

Fixes #42971

This PR adds a Claude Skill for the huggingface/transformers repository to help contributors navigate the codebase and common development workflows more efficiently.

What’s included

  • A repo-specific Claude Skill (SKILL.md and corresponding reference files) describing the following (a quick import sketch follows this list):
    • Key library entry points and directory map (models, configs, tokenizers, generation, pipelines, trainer, etc.)
    • Common contributor workflows
    • Conventions and gotchas that help Claude give higher-quality, repo-aligned guidance
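
For orientation, here is a quick sketch of the kind of entry points the directory map covers. All imports below are real public transformers APIs; the source paths in the comments are my reading of the repo layout.

```python
# Sketch: entry points the Skill's directory map covers.
from transformers import (
    AutoConfig,        # model configs  (src/transformers/models/*/configuration_*.py)
    AutoTokenizer,     # tokenizers     (src/transformers/models/*/tokenization_*.py)
    AutoModel,         # models         (src/transformers/models/*/modeling_*.py)
    GenerationConfig,  # generation     (src/transformers/generation/)
    pipeline,          # pipelines      (src/transformers/pipelines/)
    Trainer,           # training loop  (src/transformers/trainer.py)
)
```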

What’s not included

  • Claude Code plugin support is not implemented in this PR.
    The original issue mentions a plugin request as well, but this PR focuses on delivering the Skill first as a minimal, useful step. Plugin support can be handled in a follow-up PR.

How to test

  • Load the repository in Claude and verify the Skill is discovered.
  • Ask a few repo-navigation questions (e.g., “Where do model configs live?” / “What tests should I run after changing X?”) and confirm Claude follows the Skill’s structure and pointers.

A few of the many examples I tested include questions like the following (minimal code sketches of the real APIs these questions probe follow the list):

  • API existence / anti-hallucination check:
    “Does Transformers have a public argument called temperature_decay on generate()? If yes, show the exact signature location. If no, point to the closest real knobs and where they’re defined.”

  • Repo navigation / backend dispatch:
    “Where is the logic that decides which backend (PyTorch vs TensorFlow vs Flax) gets used when calling AutoModel.from_pretrained()? Point to the exact files and decision flow.”

  • Generation internals / repetition debugging:
    “I’m getting repetitive text in long generations even with repetition_penalty set. What knobs interact most strongly with repetition, and which files apply these penalties during decoding?”

  • Quantization & loading performance troubleshooting:
    “Loading a 7B causal LM with 4-bit quantization and device_map="auto" is causing slow CPU offload and high RAM usage. What are the likely causes in the loading path, what knobs should I change, and where are they handled in code?”

  • Serving/export reality check:
    “Is there a supported CLI command transformers serve for text-generation with batching? If not, what are the supported alternatives in the Transformers ecosystem, and where are the relevant docs/code in this repo?”
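
To make the first and third questions concrete, here is a minimal sketch of the real generate() knobs they probe (the model id is just a placeholder):

```python
# Sketch: the closest real knobs to the fictional "temperature_decay", plus
# the anti-repetition controls. All kwargs below are genuine
# GenerationConfig fields; the model id is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,          # flat sampling temperature (no decay schedule)
    top_p=0.9,                # nucleus sampling
    repetition_penalty=1.2,   # applied by a logits processor during decoding
    no_repeat_ngram_size=3,   # hard ban on repeating 3-grams
    max_new_tokens=50,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```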
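
For the backend question, my understanding is that framework choice happens by class selection rather than a single runtime dispatcher:

```python
# Sketch: each framework has its own auto class; picking a class picks
# the backend.
from transformers import AutoModel          # PyTorch auto class
# from transformers import TFAutoModel     # TensorFlow counterpart
# from transformers import FlaxAutoModel   # Flax counterpart

model = AutoModel.from_pretrained("bert-base-uncased")
# Cross-framework checkpoint conversion is opt-in via flags such as
# from_tf=True / from_flax=True on from_pretrained().
```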
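
For the quantization question, here is a sketch of the 4-bit loading path (requires bitsandbytes and a GPU; the model id and memory caps are illustrative):

```python
# Sketch: 4-bit loading with device_map="auto". BitsAndBytesConfig and
# max_memory are real APIs; the caps below are illustrative values for
# reining in unwanted CPU offload.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # placeholder 7B causal LM (gated; substitute any)
    quantization_config=bnb_config,
    device_map="auto",                        # accelerate decides device placement
    max_memory={0: "10GiB", "cpu": "8GiB"},   # illustrative per-device caps
)
```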
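
And for the serving question, one well-documented alternative the Skill should surface, regardless of what the CLI supports, is the text-generation pipeline:

```python
# Sketch: batteries-included text generation via the pipeline API.
from transformers import pipeline

pipe = pipeline("text-generation", model="gpt2")  # placeholder model
print(pipe("Hello, world", max_new_tokens=20)[0]["generated_text"])
```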

PS: This is just an initial draft I put together so that maintainers and other community folks can try it out first. Once people test it and share feedback, we can iterate on it and polish it.

For review: @Rocketknight1, @stevhliu, @ArthurZucker
CC: @Emasoft, @coolgalsandiego

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43340&sha=d33b06

@gautamvarmadatla
Author

gautamvarmadatla commented Jan 19, 2026

Looks like CI is failing for reasons unrelated to this PR. This PR only adds Claude Skill markdown files.

The failing tests involve dynamic or custom tokenizers where AutoTokenizer.from_pretrained(..., trust_remote_code=True) returns TokenizersBackend instead of CustomTokenizerFast. This matches a known upstream regression where auto_map is ignored in some cases. See issue #43202.
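
A minimal sketch of the failure mode (the repo id is a placeholder; the real failures come from the dynamic-tokenizer test fixtures):

```python
from transformers import AutoTokenizer

# Placeholder repo id; the failing tests use custom-tokenizer fixtures whose
# tokenizer_config.json registers a class via auto_map.
tok = AutoTokenizer.from_pretrained("some-org/custom-tokenizer", trust_remote_code=True)

# Expected: the auto_map class (e.g. CustomTokenizerFast);
# observed in the failing CI runs: TokenizersBackend.
print(type(tok).__name__)
```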

CI job link: https://app.circleci.com/pipelines/github/huggingface/transformers/160360/workflows/306aee43-9030-477f-919e-3d09752353dd/jobs/2110009/tests

I am happy to rebase and rerun CI once the upstream tokenizer fix lands in main. Alternatively, let me know and I can open a separate issue and PR to fix it.
