
Add hf-best-model skill#125

Merged
burtenshaw merged 13 commits into huggingface:main from NathanHB:feat/hf-best-model
Apr 23, 2026
Conversation

@NathanHB
Member

@NathanHB NathanHB commented Apr 20, 2026

Adds a skill to make it easier for agents to use benchmarks on the Hub and the leaderboard feature.
For example, when prompted:

What's the best model to parse my parking tickets locally?

The model should fetch leaderboards on the Hub, find the relevant benchmarks, get the top models according to the available hardware, and run one if the user wants to, all while using the Hugging Face Hub and leaderboards as the backend.

NathanHB and others added 10 commits April 20, 2026 16:30
Skill that finds the best HuggingFace model for a given task and device.
It queries official benchmark leaderboards via the HF REST API, enriches
results with model metadata (parameter count, license), filters by device
constraints (MacBook/RTX/CPU), and returns a ranked comparison table with
benchmark scores and how-to-run snippets (Ollama + transformers).
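The filter-and-rank step described above can be sketched as follows. This is a hypothetical illustration, not the skill's actual code: the model entries and the device budget are made-up examples, whereas the real skill pulls leaderboard results and model metadata from the HF REST API.

```python
# Hypothetical sketch of the filter-and-rank step: keep models that fit the
# device's parameter budget, then order them by benchmark score (best first).

def rank_models(models, max_params_b):
    """Return models fitting within max_params_b billion params, best score first."""
    fitting = [m for m in models if m["params_b"] <= max_params_b]
    return sorted(fitting, key=lambda m: m["score"], reverse=True)

# Made-up example entries standing in for real leaderboard/API results.
models = [
    {"id": "org/small-3b", "params_b": 3, "score": 61.2, "license": "apache-2.0"},
    {"id": "org/mid-8b", "params_b": 8, "score": 70.5, "license": "mit"},
    {"id": "org/big-70b", "params_b": 70, "score": 82.1, "license": "other"},
]

# e.g. a 16 GB machine running Q4 quantization fits roughly 16 * 2 = 32 B params
ranked = rank_models(models, max_params_b=32)
print([m["id"] for m in ranked])  # → ['org/mid-8b', 'org/small-3b']
```

The ranked list is what the skill would render as the comparison table, with the how-to-run snippets attached per entry.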

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Return highest-performing models from leaderboards unconditionally
when the user doesn't mention a device.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instead of a hardcoded lookup table, compute max params from
available memory using: fp16 = RAM/2 B, Q4 = RAM*2 B.
Works for any device without needing to enumerate them all.
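The rule in this commit message can be written down directly. A minimal sketch, assuming fp16 weights take about 2 bytes per parameter and Q4 about 0.5 bytes, with RAM measured in GB (so the result is in billions of parameters):

```python
# Memory-to-parameter-count rule: params (in billions) = RAM_GB / bytes_per_param.
# Assumed costs: fp16 ≈ 2 bytes/param, Q4 ≈ 0.5 bytes/param.

def max_params_billion(ram_gb, quant="fp16"):
    bytes_per_param = {"fp16": 2.0, "q4": 0.5}[quant]
    return ram_gb / bytes_per_param

print(max_params_billion(16, "fp16"))  # 16 / 2   → 8.0  B params
print(max_params_billion(16, "q4"))    # 16 * 2   → 32.0 B params
```

Because the rule is a simple ratio, it generalizes to any RAM size without the hardcoded per-device lookup table the commit removes.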

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Let the API results speak for themselves.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Renames the plugin to huggingface-best so users can install with
`hf skills add huggingface-best`, and the internal skill name to `best`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Collaborator

@burtenshaw burtenshaw left a comment


Looks good. Just two nits.

Comment thread docs/superpowers/specs/2026-04-21-whisper-small-fr-space-design.md Outdated
Comment thread skills/huggingface-best/SKILL.md Outdated
@evalstate
Collaborator

evalstate commented Apr 22, 2026

@NathanHB -- in the llm-trainer skill we have a benchmarks script (I've just pushed an update #125 here -- help text in the PR for easy review). Wondering if that is useful to include here too - or maybe as an hf plugin @hanouticelina ? @merveenoyan think we've also been discussing "best" model recently?

[edit] -- cool skill 😎

NathanHB and others added 2 commits April 22, 2026 15:01
- Remove whisper spec doc that doesn't belong in this PR
- Add REST API and CLI equivalents for hub_repo_details MCP tool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@NathanHB
Member Author

Yes, I think we should have this in the CLI too: first, to reduce the number of skills to install; second, to make it more natural and easier for agents to use.

@NathanHB
Member Author

Not a fan of having a tool with hardcoded categories and use cases, as this adds a bit of bloat. I would rather keep it super simple: point the agent at how and where to get the data, and let it infer what to do. For example, here it gets the list of benchmarks on the Hub and infers from the user prompt which ones to use.

@evalstate
Collaborator

I meant #128

@burtenshaw burtenshaw merged commit ddcf680 into huggingface:main Apr 23, 2026
1 check passed

3 participants