lm-eval polishing and speed-up by 12010486 · Pull Request #2361 · huggingface/optimum-habana

12010486 · 2026-01-07T16:41:33Z

This pull request updates the Habana model adapter in examples/text-generation/model_adapter.py to improve device handling, logging, and padding logic for text generation on HPU devices.

More importantly, it speeds up both loglikelihood-based and generate_until-based tasks with no change in accuracy.

Some examples:

PT_HPU_LAZY_MODE=1 python run_lm_eval.py --model_name_or_path meta-llama/Llama-3.2-1B-instruct \
--attn_softmax_bf16 --use_hpu_graphs --limit_hpu_graphs --use_kv_cache --bf16 --sdp_on_bf16 --trim_logits \
--batch_size=32 --tasks gsm8k_cot_llama -o eval_gsm8k.json --num_fewshot=8 \
--fewshot_as_multiturn --apply_chat_template True

Improves throughput from 8.57 it/s to 20.83 it/s.

PT_HPU_LAZY_MODE=1 python run_lm_eval.py --model_name_or_path meta-llama/Llama-3.2-1B-instruct \
--attn_softmax_bf16 --use_hpu_graphs --limit_hpu_graphs --use_kv_cache --bf16 --sdp_on_bf16 --trim_logits \
--batch_size=16 --tasks ifeval -o eval_ifeval.json --num_fewshot=0 \
--fewshot_as_multiturn --apply_chat_template True

Improves throughput from 2.73 it/s to 3.39 it/s.

PT_HPU_LAZY_MODE=1 python run_lm_eval.py --model_name_or_path meta-llama/Llama-3.2-1B-instruct \
--attn_softmax_bf16 --use_hpu_graphs --limit_hpu_graphs --use_kv_cache --bf16 --sdp_on_bf16 --trim_logits \
--batch_size=32 --tasks hellaswag -o eval_hellaswag.json

Improves throughput from 46.36 it/s to 198.44 it/s.
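
For reference, the relative speed-ups implied by the three measurements above can be worked out with a few lines of Python. This is only a sketch over the numbers quoted in this description. The `throughput` dictionary and the `speedup` helper are illustrative names and are not part of run_lm_eval.py or model_adapter.py.

```python
# Throughput figures (it/s) quoted in this PR description: (before, after).
# The task names are the values passed to --tasks in the commands above.
throughput = {
    "gsm8k_cot_llama": (8.57, 20.83),
    "ifeval": (2.73, 3.39),
    "hellaswag": (46.36, 198.44),
}


def speedup(before: float, after: float) -> float:
    """Return the after/before throughput ratio, rounded to two decimals."""
    return round(after / before, 2)


# Hypothetical helper output: one ratio per task.
speedups = {task: speedup(b, a) for task, (b, a) in throughput.items()}

for task, ratio in speedups.items():
    print(f"{task}: {ratio:.2f}x")
```

With these figures the ratios come out to roughly 2.43x for gsm8k_cot_llama, 1.24x for ifeval, and 4.28x for hellaswag, which lines up with the "x4 speed-up of loglikelihood requests" commit.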

HuggingFaceDocBuilderDev · 2026-01-07T16:45:47Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

regisss

LGTM

12010486 added 9 commits October 20, 2025 15:09

Improve speed

7d48ab4

Merge branch 'huggingface:main' into lm_eval_impro

a20b5ae

Minor changes for readability

f3c233c

Fixes

0e5f376

Merge branch 'huggingface:main' into lm_eval_impro

a49c290

Merge branch 'huggingface:main' into lm_eval_impro

a9ca563

x4 speed-up of loglikelihood requests

2693eac

Unified with upstream

8a929b2

Merge branch 'huggingface:main' into lm_eval_impro

03e2138

12010486 requested a review from regisss as a code owner January 7, 2026 16:41

regisss approved these changes Jan 7, 2026

regisss merged commit 472e93d into huggingface:main Jan 7, 2026
2 of 4 checks passed

regisss merged 9 commits into huggingface:main from 12010486:lm_eval_impro
