@Wei-Lin-Intel Wei-Lin-Intel commented Oct 17, 2025

This PR fixes a corner case: when all the sequences in the running queue have been cleared, the indices into the Mamba cache table could go out of bounds.
This in turn fixes the accuracy issue seen when bs > 256. With bs=512, the gsm8k scores are now stable:

vllm (pretrained=/data/Qwen3-Next-80B-A3B-Instruct,trust_remote_code=True,enable_expert_parallel=True,tensor_parallel_size=4,distributed_executor_backend=mp,max_length=16384,max_gen_toks=2048,max_num_seqs=512,max_num_prefill_seqs=16), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 512

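| Tasks | Version | Filter           | n-shot | Metric      | Value  | Stderr   |
|-------|---------|------------------|--------|-------------|--------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match | 0.9386 | ± 0.0066 |
|       |         | strict-match     | 5      | exact_match | 0.8878 | ± 0.0087 |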
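
For readers unfamiliar with this class of bug, here is a minimal sketch of the kind of guard involved. It is not the actual change in this PR: the function name `gather_cache_slots`, its parameters, and the padding-slot convention are all hypothetical and chosen only to illustrate handling an empty running queue without producing out-of-range indices.

```python
from typing import Dict, List


def gather_cache_slots(
    running_seq_ids: List[str],
    seq_to_slot: Dict[str, int],
    table_size: int,
    pad_slot: int = 0,
) -> List[int]:
    """Return one cache-table slot per running sequence.

    Hypothetical illustration only; names and behaviour are assumptions,
    not the vLLM or HabanaAI implementation.
    """
    if not running_seq_ids:
        # Corner case: the running queue is empty. Return a single valid
        # padding slot so downstream indexing never sees an invalid index.
        return [pad_slot]

    slots = []
    for seq_id in running_seq_ids:
        slot = seq_to_slot[seq_id]
        # Fail loudly instead of silently reading the wrong cache row.
        if not 0 <= slot < table_size:
            raise IndexError(
                f"slot {slot} for sequence {seq_id!r} is outside "
                f"a cache table of size {table_size}"
            )
        slots.append(slot)
    return slots
```

With an empty queue the sketch returns `[pad_slot]`; with live sequences it returns their slots in order and raises `IndexError` if any slot falls outside the table.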

@czhu15 czhu15 left a comment

LGTM

@czhu15 czhu15 merged commit 5813d47 into HabanaAI:aice/v1.22.0 Oct 20, 2025
2 checks passed