Actions: Aleph-Alpha-Research/eval-framework

Build & Deploy Sphinx Docs

38 workflow runs

fix: Flores200 data reading issue (#179)
Build & Deploy Sphinx Docs #38: Commit 9bf3155 pushed by tfburns
12m 33s main
feat: add GoldenSwag task (#175)
Build & Deploy Sphinx Docs #37: Commit a05e032 pushed by tfburns
12m 19s main
feat: BalancedCOPA dataset (#177)
Build & Deploy Sphinx Docs #36: Commit 25161aa pushed by tfburns
11m 55s main
feat: COPA uses appropriate dataset splits (#176)
Build & Deploy Sphinx Docs #35: Commit 55ebe44 pushed by tfburns
11m 41s main
feat: add Global MMLU task (#174)
Build & Deploy Sphinx Docs #34: Commit 0d0b227 pushed by tfburns
12m 25s main
fix: Change to more complete revision of zeroscrolls (#173)
Build & Deploy Sphinx Docs #33: Commit a84286e pushed by mys007
11m 42s main
chore(main): release 0.2.12 (#167)
Build & Deploy Sphinx Docs #31: Commit 5983e24 pushed by MaxHam
11m 49s main
feat: add "top_p" param to AlephAlphaAPIModel (#168)
Build & Deploy Sphinx Docs #29: Commit e52c927 pushed by tfburns
11m 20s main
chore(main): release 0.2.11 (#164)
Build & Deploy Sphinx Docs #27: Commit 5244b02 pushed by MaxHam
12m 44s main
chore(main): release 0.2.10 (#161)
Build & Deploy Sphinx Docs #23: Commit abc4aa6 pushed by MaxHam
11m 30s main
chore(main): release 0.2.9 (#155)
Build & Deploy Sphinx Docs #21: Commit 7456916 pushed by MaxHam
12m 24s main
feat: add repeats to eval-config (#150)
Build & Deploy Sphinx Docs #20: Commit cb9f860 pushed by MaxHam
11m 42s main
feat: add AIME25 benchmark task (#152)
Build & Deploy Sphinx Docs #18: Commit 3ef01fc pushed by MaxHam
12m 18s main
chore(main): release 0.2.8 (#149)
Build & Deploy Sphinx Docs #16: Commit c67338c pushed by MaxHam
13m 32s main
fix: normalize math reasoning (#148)
Build & Deploy Sphinx Docs #14: Commit 73a8843 pushed by MaxHam
11m 13s main