Release v0.5.0 · hud-evals/hud-python

New major version of the HUD SDK (v0.5.0)!

What's new:
Simpler environments: The new Environment class provides @env.tool() and @env.scenario() decorators, which let you define both tools and evaluations in one place with a cleaner syntax. This cuts around 30% of the lines of code in our sample environments and makes it easier to track, create, and run 100+ tasks through both the SDK and the platform.
Built-in A/B testing: The new hud.eval() lets you test multiple models/configs in one line of code with the variants and group parameters. Runs are tracked on the platform when you use hud.ai/models.
Unified model API: Call Claude, GPT, Gemini, or Grok through one OpenAI-compatible endpoint.
Other observability features: See all live jobs on the platform, store and track tasks on the new evalsets/environment scenario pages, and use the improved trace UI.
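
To make the variants and group idea concrete, here is a minimal standalone sketch of how a set of variants could expand into individual runs. It is not the hud.eval() implementation and does not import the SDK. It assumes that variants maps a parameter name to a list of candidate values and that group is the number of repeated runs per combination; the helper expand_variants, the "repeat" key, and the model names are hypothetical.

```python
from itertools import product


def expand_variants(variants, group=1):
    """Return one config dict per run (hypothetical helper, not part of the HUD SDK).

    Assumption: `variants` maps a parameter name to its candidate values,
    and `group` is how many times each combination is repeated.
    """
    keys = list(variants)
    runs = []
    # Cartesian product over all candidate values, in the order the keys were given.
    for combo in product(*(variants[key] for key in keys)):
        config = dict(zip(keys, combo))
        # Repeat each combination `group` times so results can be averaged.
        for repeat in range(group):
            runs.append({**config, "repeat": repeat})
    return runs


runs = expand_variants(
    {"model": ["model-a", "model-b"], "temperature": [0.0, 0.7]},
    group=3,
)
print(len(runs))  # 2 models x 2 temperatures x 3 repeats = 12
```

Under these assumptions, the number of runs grows multiplicatively with each added variant, so it is worth keeping candidate lists short when group is large.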

Migration/Backwards compatibility:
Existing environments and task configs work with all old (v4) commands!
If you wish to migrate to v5, all tasks work via Task.from_v4(), with the hud eval CLI command, and with --remote runs. To migrate to the new environments, see https://docs.hud.ai/migration
