Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

Multiple Bangla Benchmark datasets added
#3454 opened Dec 9, 2025 by Ismail-Hossain-1
Add Uncheatable Eval
#3442 opened Dec 2, 2025 by ziqing-huang
Refactor TaskManager
#3432 opened Nov 26, 2025 by baberabb
Adding pisa task
#3412 opened Nov 17, 2025 by HallerPatrick
Fix gsm8k_platinum description
#3411 opened Nov 17, 2025 by fxmarty-amd
feat: refine Chain-of-Thought removal logic
#3386 opened Nov 6, 2025 by Co-Cl2
[feat] Add Countdown Task
#3384 opened Nov 4, 2025 by StephenXie
Math 500
#3381 opened Nov 1, 2025 by seldereyy
Add gsm_symbolic and gsm_symbolic_cot tasks
#3354 opened Oct 19, 2025 by MengAiDev
Added ULQA benchmark
#3340 opened Oct 13, 2025 by keramjan
Support torchrun vllm DP
#3304 opened Sep 19, 2025 by luccafong
Gemini evaluation support
#3300 opened Sep 15, 2025 by IsraelAbebe
Adding SPaRC to lm eval harness
#3262 opened Aug 25, 2025 by lkaesberg
fix gsm8k normalization
#3254 opened Aug 20, 2025 by huaanrui
Main
#3250 opened Aug 20, 2025 by seongtaehong
Adding 3LM to lm eval harness
#3241 opened Aug 14, 2025 by GeorgeSherif