Releases: NousResearch/atropos
Releases · NousResearch/atropos
v0.4.0
Highlights
New example trainer
Weights are shared between vLLM and the trainer, no comms needed to sync weights, and memory saved by using only one copy of the weights!
On Policy/Self Distillation Support
Now support logprobs from a teacher/prompted endpoint, fully supporting on policy distillation/self distillation!
OpenAI Endpoint for managed server
Launch an openai endpoint and collect rollouts from any program that takes in an openai endpoint!
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #215
- Interleaved Tool-Use Within Reasoning Blocks by @interstellarninja in #195
- Pairwise Judgement Environment - improve dataloading, ctx len by @teknium1 in #218
- Add Word Hunt environment by @Aboozle1 in #220
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #222
- qwen tokenizer wrapper & fixed jinja template for tool handling by @shannonsands in #224
- Add arena-hard v1 environment by @teknium1 in #219
- Textworld minimal by @shannonsands in #225
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #228
- Diplomacy trainer env by @shannonsands in #227
- build: update checkout action to v5 by @rejected-l in #233
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #231
- fix: division-by-zero in gradient calculation by @brawncode in #236
- add error logging to collect_trajectories so they don't fail silently by @dmahan93 in #237
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #238
- Update bibtex by @hjc-puro in #235
- Refusalbench v2 by @J-SUPHA in #239
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #241
- Refusalbench v2 by @J-SUPHA in #242
- Fix multiple scored data groups by @shannonsands in #223
- Revert "Fix multiple scored data groups" by @shannonsands in #243
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #246
- fix typo in variable name by @prestoalvarez in #245
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #248
- Multi-Turn Tool-Use RL Environment by @interstellarninja in #160
- WIP: Environments/bleuberi by @aniemerg in #175
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #249
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #251
- refactor(api): improve attribute checking and remove hardcoded values by @DeVikingMark in #250
- fix: correct typos in documentation and comments by @viktorking7 in #254
- [Environment]: smolagents by @aniemerg in #104
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #255
- SmolAgent Env Linting Fixes by @ropresearch in #256
- group temps, sample temps, and logprob api params by @ropresearch in #253
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #257
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #259
- docs: minor fixes to follow code standards by @andrewshab3 in #261
- GZip Compression by @ropresearch in #263
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #266
- docs: few minor fixes by @letmehateu in #265
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #268
- add sglang specific token level logprob handling and server manager/b… by @dmahan93 in #264
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #269
- fix: correct typo and improve code quality by @bobtajson in #267
- add managed vllm server by @dmahan93 in #273
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #275
- refactor: Refactor scored data handling into reusable helper by @ninastef in #272
- feat: dump evaluate subcommand config to YAML in env save dir by @dhyaneesh in #274
- fix some issues by @teknium1 in #279
- docs: fix dead links by @kseniaeremekno in #277
- Convert Environments to ManagedServer for Tinker Integrations by @teknium1 in #278
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #281
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #284
- README updates for Tinker Integration by @samherring99 in #286
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #287
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #288
- docs: fix dead links by @juleennn in #283
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #291
- fix: fix broken links to files by @tonnycro in #292
- Port many benchmarks into atropos by @teknium1 in #294
- Olympiad Coding Environment and LCB Eval by @JoeLi12345 in #296
- big update for letter counting by @teknium1 in #298
- chore: bump license year to 2026 by @rejected-l in #299
- MT-GRPO Turn-Level Advantage Environment by @interstellarninja in #162
- Fix missing logprob by @JustKitting in #293
- Add reversed text environment by @teknium1 in #234
- add eval runner by @dmahan93 in #290
- Feat/sql query env by @PLippmann in #301
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #315
- fix: multiple typos of different importance by @crStiv in #318
- Add support for reasoning models and their variety of providers/endpo… by @teknium1 in #297
- fix: handle nested message format in jsonl2html.py by @Savage890 in #317
- Prevent hangs in kernel evaluation by bounding worker waits by @GHOryy5 in #289
- fix: typo in max_token_length by @windlgrass in #327
- Verifiers Integration by @alt-glitch in #305
- fix: correct typos in instructions.py by @windlgrass in #329
- fix: multiple typos of different importance by @crStiv in #325
- fix: use correct prefix for gradient quantiles with NaN/Inf by @DeVikingMark in #324
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #323
- fix: remove duplicate code in instruction files by @windlgrass in #330
- Fix typos in SLURM.md by @HusseinAdeiza in #334
- fix: initialize current_item in init to prevent AttributeError by @windlgrass in #338
- chore: fix typos by @VolodymyrBg in #339
- Add dummy openai managed server by @dmahan93 in #359
- fix duplicate code + add safety checks by @alireza78a in #370
- add tokenizer name config to set the vllm/sglang tokenizer by @dmahan93 in #373
- [docs] Clarify prerequisites...
v0.3.0
What's Changed
- updated llms.txt, contribution guide and added community folder with … by @shannonsands in #105
- Toolcalling server process fix by @shannonsands in #63
- Update CONTRIBUTING.md by @Hack666r in #107
- Unified gabinfay contribs by @shannonsands in #109
- add: environment for deep philosophical thinking by @gabinfay in #98
- Router env by @gabinfay in #101
- Lean prover environment by @gabinfay in #102
- Merge erikqu contributions by @shannonsands in #110
- [Hackathon] WebVoyager Finetune - Playwright Environment Configuration by @neverSettles in #100
- Merge vivek100 contributions by @shannonsands in #111
- Add hack0 metric card generator environment with artifacts and docume… by @vivek100 in #97
- Merge edmundman contributions by @shannonsands in #112
- UFC ENV ( creative track please) by @edmundman in #93
- loop check by @shannonsands in #108
- Bugfix default factories cli args by @shannonsands in #113
- Merge joshgarza contributions by @shannonsands in #115
- Accessibility Environment by @joshgarza in #96
- Merge roshansanjeev contributions by @shannonsands in #116
- Hackathon: ExamCraft - Adaptive LLM Teacher Environment by @RoshanSanjeev in #95
- Merge krishpop contributions by @shannonsands in #117
- [Hackathon] Catbot Arena by @steven4354 in #94
- Improve API Server Documentation and Update UFC Prediction Output Format by @leopardracer in #114
- Merge jakeboggs contributions by @shannonsands in #119
- Punchline Completion with VR-CLI by @JakeBoggs in #64
- Merge joshuajerin contributions by @shannonsands in #120
- Selcube - Nous hackathon by @joshuajerin in #92
- Merge iyaja contributions by @shannonsands in #121
- Pokemon Showndown Environment (Hackathon Submission) by @iyaja in #90
- Merge karthik contributions by @shannonsands in #122
- dev - push for submission by @Karthik-Ragunath in #89
- Add Solitaire Winning Probability Environment - Mathematical probabil… by @shannonsands in #123
- Infinimath env by @shannonsands in #39
- Merge justin5764 contributions by @shannonsands in #124
- Complete LeanRL environment by @justin5764 in #86
- Merge metonym contributions by @shannonsands in #125
- Merge yoniebans contributions by @shannonsands in #126
- [Hackathon] Caput Mundi: Six-Seat No-Limit Hold'em Poker Environment by @yoniebans in #84
- Merge jeannemtl contributions by @shannonsands in #127
- Hack/env quant by @jeannemtl in #67
- Merge arihanv contributions by @shannonsands in #128
- optimizer environment by @arihanv in #82
- Merge tsadpbb contributions by @shannonsands in #129
- [Hackathon] Helpful Doctors by @tsadpbb in #79
- Merge slyracoon23 contributions by @shannonsands in #130
- DynastAI: Medieval Kingdom Management RL Environment with Adaptive Rewards by @Slyracoon23 in #81
- Feat/swe rl environment v2 by @teknium1 in #106
- Merge ecsbeats contributions by @shannonsands in #132
- Add physical space/STL CAD RL environment by @ecsbeats in #76
- Merge hallerite contributions by @shannonsands in #133
- Protein Design env by @hallerite in #70
- Integrate odancona mcp env by @shannonsands in #134
- Submission by @ODAncona in #80
- Integrate khomeik sanskrit poetry by @shannonsands in #135
- Add Sanskrit poetry RL environment by @KhoomeiK in #71
- Integrate rahulschand openvla by @shannonsands in #136
- Added openVLA environment for robotics by @RahulSChand in #65
- Integrate caradmico starmap compression by @shannonsands in #137
- Data Compression by @caradmico in #66
- Integrate basedlsg padres spatial by @shannonsands in #138
- Add Padres: Spatial RL Environment by @basedlsg in #75
- docs: update README.md in atroposlib/env/README.md by @leehanchung in #58
- Integrate kirilligum consumer journey by @shannonsands in #139
- consumer journey rl by @kirilligum in #87
- Integrate fahrenheitresearch meteorology by @shannonsands in #140
- Add MeteorologyForecastRL environment for Atropos hackathon submission by @FahrenheitResearch in #68
- Integrate subrahmanyam cybersecurity by @shannonsands in #142
- Integrate aniemerg wikipedia by @shannonsands in #143
- fix: typos in documentation file by @vtjl10 in #118
- Integrate michaelwaves options iv by @shannonsands in #144
- Integrate chinguun101 goofy math by @shannonsands in #145
- Intern bootcamp env by @shannonsands in #146
- Fix contribution guide source by @emmanuel-ferdman in #151
- Fix Typos in MCP Tool Calling Environment Documentation by @zeevick10 in #147
- Add max_n_completions parameter to ServerManager for load balancing by @dmahan93 in #154
- Fix Typos in Comments and Documentation by @kilavvy in #153
- Align ScoredData model between API and base.py by @dmahan93 in #155
- Remove process defaults and respect config_init by @hjc-puro in #156
- Add Pydantic Schema to Structured Output Environment by @teknium1 in #157
- Add pytest workflow for Python 3.10 and 3.12 by @dmahan93 in #159
- Add reasoning gym env by @teknium1 in #163
- Fix broken README links and minor typo in docs by @cypherpepe in #166
- add reasoning gym randomization for complexity as well as curriculum support by @teknium1 in #165
- API Message + SFT fix by @dmahan93 in #169
- Minor Fixes: Typo Correction in README and Message Clarification in Tasks by @maximevtush in #168
- switch to using precommit ci not action by @dmahan93 in #178
- Letter counting environment by @teknium1 in #177
- Letter counting environment - Update default config options by @teknium1 in #179
- add tasks_per_step arg to multiply by group_size for bs calculation by @teknium1 in #171
- add additional data dumping features by @teknium1 in #172
- Display cat behaviors file path on error by @emmanuel-ferdman in #176
- Fix Typos in Documentation and Code Comments by @vtjl10 in #180
- Fix Typos and Update Comments for Clarity by @zeevick10 in #182
- Add cycling curriculum, difficulty threshold, update datadumps by @teknium1 in https...
v0.2.1
What's Changed
- Make run api not reload by @dmahan93 in #43
- add code execution environment by @JoeLi12345 in #26
- Blackjack2 env by @shannonsands in #38
- fix validation errors by @hjc-puro in #45
- Llms txt update by @shannonsands in #47
- updated APIServerConfig and added requirements.txt and install instru… by @shannonsands in #46
- Instruction following algo environment by @teknium1 in #44
- Kernelbench env with parallel compilation by @sumo43 in #51
- Added new env info by @shannonsands in #50
- add an SFT data loading env by @dmahan93 in #21
- changed health check to chat completions since all oai models are com… by @shannonsands in #56
- version bump to 0.2.1 by @hjc-puro in #57
New Contributors
- @JoeLi12345 made their first contribution in #26
- @shannonsands made their first contribution in #38
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- Update README.md by @sukrucildirr in #2
- [README] Add offline SFT data gen docs by @hjc-puro in #4
- Add rejection sampling description to offline SFT docs. Also add
atropos-dpo-gento the pyproject.toml. by @hjc-puro in #5 - Add process subcommand by @hjc-puro in #9
- Add full wandb train/eval acc metrics, expand rollouts table with more information to Finance Prediction Environment by @teknium1 in #15
- Fix PR Template for GitHub Web by @teknium1 in #17
- Quick hotfix for better PR template by @teknium1 in #18
- Removed mentions of NousResearch/DeepHermes-3-Llama-3-1B-Preview and … by @edmundman in #20
- Update base env README with design philosophy by @hjc-puro in #25
- 24 keyerror on self state in base register env fail by @dmahan93 in #27
- fix multimodal envs. add view_run_multimodal by @sumo43 in #22
- Support args in
processcli inservesubcommand by @hjc-puro in #14 - fix olympiadbench due to upstream changes by @dmahan93 in #31
- run pre-commit on all files by @dmahan93 in #32
- add pre-commit workflow and readme.md changes to point to debugging tools by @dmahan93 in #33
- ⚡️ Speed up function
grab_exact_from_heterogeneous_queueby 1,680% by @aseembits93 in #7 - fix pre-commit by @dmahan93 in #37
- Improve error logging for HTTP requests by @hjc-puro in #13
- add gym taxi env by @dmahan93 in #36
- Remove dependency on torch for default installation by @dmahan93 in #40
- Add n kwarg being ignored workaround by @dmahan93 in #41
- add custom server support by @dmahan93 in #28
- Create upload_to_pypi.yml for releases by @dmahan93 in #42
New Contributors
- @sukrucildirr made their first contribution in #2
- @hjc-puro made their first contribution in #4
- @teknium1 made their first contribution in #15
- @edmundman made their first contribution in #20
- @dmahan93 made their first contribution in #27
- @sumo43 made their first contribution in #22
- @aseembits93 made their first contribution in #7
Full Changelog: https://github.com/NousResearch/atropos/commits/v0.2.0