Regarding the batch size issue I mentioned earlier (Issue 22), I have investigated further and found a critical performance discrepancy when increasing the batch size, specifically linked to the use of latent steps:
- When Latent Step = 0: Batch size does not have a significant impact on performance.
- When Latent Step > 0: A noticeable performance decrease is observed as the batch size increases.
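In case it helps triage, here is one *possible* mechanism (a guess, not a confirmed diagnosis): with right-padded batches in HF-style generation, naively taking `hidden_states[:, -1]` reads a pad position for shorter sequences, so a latent-step loop that feeds the "last" hidden state back in would recycle pad-derived states only when `bs > 1`. A toy NumPy sketch of the indexing difference (all names here are illustrative, not from this repo):

```python
import numpy as np

def last_hidden_naive(hidden, attention_mask):
    # Naive: assume every sequence ends at the final padded position.
    return hidden[:, -1, :]

def last_hidden_masked(hidden, attention_mask):
    # Correct for right-padding: index the last non-pad token per row.
    last_idx = attention_mask.sum(axis=1) - 1          # (batch,)
    return hidden[np.arange(hidden.shape[0]), last_idx, :]

# Toy batch: 2 sequences padded to length 4; row 1 has only 2 real tokens.
hidden = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
attention_mask = np.array([[1, 1, 1, 1],
                           [1, 1, 0, 0]])

naive = last_hidden_naive(hidden, attention_mask)
masked = last_hidden_masked(hidden, attention_mask)
# Row 0 (full-length) agrees; row 1 differs because naive reads padding.
```

If the latent-step state extraction uses mask-aware indexing already, please disregard this hypothesis.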
Settings
```shell
# MBPP+
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task mbppplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 1
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task mbppplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 20

# HumanEval+
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task humanevalplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 1
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task humanevalplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 20
```
Results
| | LatentMAS (latent_steps=10, bs=1, HF) | LatentMAS (latent_steps=10, bs=20, HF) |
|---|---|---|
| MBPP+ | 0.621 | 0.219 |
| HumanEval+ | 0.719 | 0.189 |
Due to resource constraints, I haven't been able to test this on larger models yet; so far the issue is confirmed on the Qwen3-4B model.
I would be grateful if you could look into whether the latent step logic is fully compatible with batched generation. Thank you for your time and for maintaining this great project!