Releases · NousResearch/atropos

10 Mar 04:20

dmahan93

v0.4.0

1d78069

v0.4.0 Latest

Latest

Highlights

New example trainer

Weights are shared between vLLM and the trainer, no comms needed to sync weights, and memory saved by using only one copy of the weights!

On Policy/Self Distillation Support

Now support logprobs from a teacher/prompted endpoint, fully supporting on policy distillation/self distillation!

OpenAI Endpoint for managed server

Launch an openai endpoint and collect rollouts from any program that takes in an openai endpoint!

What's Changed

[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #215
Interleaved Tool-Use Within Reasoning Blocks by @interstellarninja in #195
Pairwise Judgement Environment - improve dataloading, ctx len by @teknium1 in #218
Add Word Hunt environment by @Aboozle1 in #220
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #222
qwen tokenizer wrapper & fixed jinja template for tool handling by @shannonsands in #224
Add arena-hard v1 environment by @teknium1 in #219
Textworld minimal by @shannonsands in #225
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #228
Diplomacy trainer env by @shannonsands in #227
build: update checkout action to v5 by @rejected-l in #233
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #231
fix: division-by-zero in gradient calculation by @brawncode in #236
add error logging to collect_trajectories so they don't fail silently by @dmahan93 in #237
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #238
Update bibtex by @hjc-puro in #235
Refusalbench v2 by @J-SUPHA in #239
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #241
Refusalbench v2 by @J-SUPHA in #242
Fix multiple scored data groups by @shannonsands in #223
Revert "Fix multiple scored data groups" by @shannonsands in #243
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #246
fix typo in variable name by @prestoalvarez in #245
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #248
Multi-Turn Tool-Use RL Environment by @interstellarninja in #160
WIP: Environments/bleuberi by @aniemerg in #175
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #249
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #251
refactor(api): improve attribute checking and remove hardcoded values by @DeVikingMark in #250
fix: correct typos in documentation and comments by @viktorking7 in #254
[Environment]: smolagents by @aniemerg in #104
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #255
SmolAgent Env Linting Fixes by @ropresearch in #256
group temps, sample temps, and logprob api params by @ropresearch in #253
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #257
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #259
docs: minor fixes to follow code standards by @andrewshab3 in #261
GZip Compression by @ropresearch in #263
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #266
docs: few minor fixes by @letmehateu in #265
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #268
add sglang specific token level logprob handling and server manager/b… by @dmahan93 in #264
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #269
fix: correct typo and improve code quality by @bobtajson in #267
add managed vllm server by @dmahan93 in #273
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #275
refactor: Refactor scored data handling into reusable helper by @ninastef in #272
feat: dump evaluate subcommand config to YAML in env save dir by @dhyaneesh in #274
fix some issues by @teknium1 in #279
docs: fix dead links by @kseniaeremekno in #277
Convert Environments to ManagedServer for Tinker Integrations by @teknium1 in #278
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #281
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #284
README updates for Tinker Integration by @samherring99 in #286
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #287
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #288
docs: fix dead links by @juleennn in #283
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #291
fix: fix broken links to files by @tonnycro in #292
Port many benchmarks into atropos by @teknium1 in #294
Olympiad Coding Environment and LCB Eval by @JoeLi12345 in #296
big update for letter counting by @teknium1 in #298
chore: bump license year to 2026 by @rejected-l in #299
MT-GRPO Turn-Level Advantage Environment by @interstellarninja in #162
Fix missing logprob by @JustKitting in #293
Add reversed text environment by @teknium1 in #234
add eval runner by @dmahan93 in #290
Feat/sql query env by @PLippmann in #301
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #315
fix: multiple typos of different importance by @crStiv in #318
Add support for reasoning models and their variety of providers/endpo… by @teknium1 in #297
fix: handle nested message format in jsonl2html.py by @Savage890 in #317
Prevent hangs in kernel evaluation by bounding worker waits by @GHOryy5 in #289
fix: typo in max_token_length by @windlgrass in #327
Verifiers Integration by @alt-glitch in #305
fix: correct typos in instructions.py by @windlgrass in #329
fix: multiple typos of different importance by @crStiv in #325
fix: use correct prefix for gradient quantiles with NaN/Inf by @DeVikingMark in #324
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #323
fix: remove duplicate code in instruction files by @windlgrass in #330
Fix typos in SLURM.md by @HusseinAdeiza in #334
fix: initialize current_item in init to prevent AttributeError by @windlgrass in #338
chore: fix typos by @VolodymyrBg in #339
Add dummy openai managed server by @dmahan93 in #359
fix duplicate code + add safety checks by @alireza78a in #370
add tokenizer name config to set the vllm/sglang tokenizer by @dmahan93 in #373
[docs] Clarify prerequisites...

Contributors

aniemerg, shannonsands, and 41 other contributors

Assets 2

16 Jul 20:39

dmahan93

v0.3.0

8284a0b

v0.3.0

What's Changed

updated llms.txt, contribution guide and added community folder with … by @shannonsands in #105
Toolcalling server process fix by @shannonsands in #63
Update CONTRIBUTING.md by @Hack666r in #107
Unified gabinfay contribs by @shannonsands in #109
add: environment for deep philosophical thinking by @gabinfay in #98
Router env by @gabinfay in #101
Lean prover environment by @gabinfay in #102
Merge erikqu contributions by @shannonsands in #110
[Hackathon] WebVoyager Finetune - Playwright Environment Configuration by @neverSettles in #100
Merge vivek100 contributions by @shannonsands in #111
Add hack0 metric card generator environment with artifacts and docume… by @vivek100 in #97
Merge edmundman contributions by @shannonsands in #112
UFC ENV ( creative track please) by @edmundman in #93
loop check by @shannonsands in #108
Bugfix default factories cli args by @shannonsands in #113
Merge joshgarza contributions by @shannonsands in #115
Accessibility Environment by @joshgarza in #96
Merge roshansanjeev contributions by @shannonsands in #116
Hackathon: ExamCraft - Adaptive LLM Teacher Environment by @RoshanSanjeev in #95
Merge krishpop contributions by @shannonsands in #117
[Hackathon] Catbot Arena by @steven4354 in #94
Improve API Server Documentation and Update UFC Prediction Output Format by @leopardracer in #114
Merge jakeboggs contributions by @shannonsands in #119
Punchline Completion with VR-CLI by @JakeBoggs in #64
Merge joshuajerin contributions by @shannonsands in #120
Selcube - Nous hackathon by @joshuajerin in #92
Merge iyaja contributions by @shannonsands in #121
Pokemon Showndown Environment (Hackathon Submission) by @iyaja in #90
Merge karthik contributions by @shannonsands in #122
dev - push for submission by @Karthik-Ragunath in #89
Add Solitaire Winning Probability Environment - Mathematical probabil… by @shannonsands in #123
Infinimath env by @shannonsands in #39
Merge justin5764 contributions by @shannonsands in #124
Complete LeanRL environment by @justin5764 in #86
Merge metonym contributions by @shannonsands in #125
Merge yoniebans contributions by @shannonsands in #126
[Hackathon] Caput Mundi: Six-Seat No-Limit Hold'em Poker Environment by @yoniebans in #84
Merge jeannemtl contributions by @shannonsands in #127
Hack/env quant by @jeannemtl in #67
Merge arihanv contributions by @shannonsands in #128
optimizer environment by @arihanv in #82
Merge tsadpbb contributions by @shannonsands in #129
[Hackathon] Helpful Doctors by @tsadpbb in #79
Merge slyracoon23 contributions by @shannonsands in #130
DynastAI: Medieval Kingdom Management RL Environment with Adaptive Rewards by @Slyracoon23 in #81
Feat/swe rl environment v2 by @teknium1 in #106
Merge ecsbeats contributions by @shannonsands in #132
Add physical space/STL CAD RL environment by @ecsbeats in #76
Merge hallerite contributions by @shannonsands in #133
Protein Design env by @hallerite in #70
Integrate odancona mcp env by @shannonsands in #134
Submission by @ODAncona in #80
Integrate khomeik sanskrit poetry by @shannonsands in #135
Add Sanskrit poetry RL environment by @KhoomeiK in #71
Integrate rahulschand openvla by @shannonsands in #136
Added openVLA environment for robotics by @RahulSChand in #65
Integrate caradmico starmap compression by @shannonsands in #137
Data Compression by @caradmico in #66
Integrate basedlsg padres spatial by @shannonsands in #138
Add Padres: Spatial RL Environment by @basedlsg in #75
docs: update README.md in atroposlib/env/README.md by @leehanchung in #58
Integrate kirilligum consumer journey by @shannonsands in #139
consumer journey rl by @kirilligum in #87
Integrate fahrenheitresearch meteorology by @shannonsands in #140
Add MeteorologyForecastRL environment for Atropos hackathon submission by @FahrenheitResearch in #68
Integrate subrahmanyam cybersecurity by @shannonsands in #142
Integrate aniemerg wikipedia by @shannonsands in #143
fix: typos in documentation file by @vtjl10 in #118
Integrate michaelwaves options iv by @shannonsands in #144
Integrate chinguun101 goofy math by @shannonsands in #145
Intern bootcamp env by @shannonsands in #146
Fix contribution guide source by @emmanuel-ferdman in #151
Fix Typos in MCP Tool Calling Environment Documentation by @zeevick10 in #147
Add max_n_completions parameter to ServerManager for load balancing by @dmahan93 in #154
Fix Typos in Comments and Documentation by @kilavvy in #153
Align ScoredData model between API and base.py by @dmahan93 in #155
Remove process defaults and respect config_init by @hjc-puro in #156
Add Pydantic Schema to Structured Output Environment by @teknium1 in #157
Add pytest workflow for Python 3.10 and 3.12 by @dmahan93 in #159
Add reasoning gym env by @teknium1 in #163
Fix broken README links and minor typo in docs by @cypherpepe in #166
add reasoning gym randomization for complexity as well as curriculum support by @teknium1 in #165
API Message + SFT fix by @dmahan93 in #169
Minor Fixes: Typo Correction in README and Message Clarification in Tasks by @maximevtush in #168
switch to using precommit ci not action by @dmahan93 in #178
Letter counting environment by @teknium1 in #177
Letter counting environment - Update default config options by @teknium1 in #179
add tasks_per_step arg to multiply by group_size for bs calculation by @teknium1 in #171
add additional data dumping features by @teknium1 in #172
Display cat behaviors file path on error by @emmanuel-ferdman in #176
Fix Typos in Documentation and Code Comments by @vtjl10 in #180
Fix Typos and Update Comments for Clarity by @zeevick10 in #182
Add cycling curriculum, difficulty threshold, update datadumps by @teknium1 in https...

Contributors

jeannemtl, kirilligum, and 50 other contributors

Assets 2

18 May 14:58

hjc-puro

v0.2.1

c189fc3

v0.2.1

What's Changed

Make run api not reload by @dmahan93 in #43
add code execution environment by @JoeLi12345 in #26
Blackjack2 env by @shannonsands in #38
fix validation errors by @hjc-puro in #45
Llms txt update by @shannonsands in #47
updated APIServerConfig and added requirements.txt and install instru… by @shannonsands in #46
Instruction following algo environment by @teknium1 in #44
Kernelbench env with parallel compilation by @sumo43 in #51
Added new env info by @shannonsands in #50
add an SFT data loading env by @dmahan93 in #21
changed health check to chat completions since all oai models are com… by @shannonsands in #56
version bump to 0.2.1 by @hjc-puro in #57

New Contributors

@JoeLi12345 made their first contribution in #26
@shannonsands made their first contribution in #38

Full Changelog: v0.2.0...v0.2.1

Contributors

shannonsands, sumo43, and 4 other contributors

Assets 2

13 May 22:57

dmahan93

v0.2.0

4f0c464

v0.2.0

What's Changed

Update README.md by @sukrucildirr in #2
[README] Add offline SFT data gen docs by @hjc-puro in #4
Add rejection sampling description to offline SFT docs. Also add atropos-dpo-gen to the pyproject.toml. by @hjc-puro in #5
Add process subcommand by @hjc-puro in #9
Add full wandb train/eval acc metrics, expand rollouts table with more information to Finance Prediction Environment by @teknium1 in #15
Fix PR Template for GitHub Web by @teknium1 in #17
Quick hotfix for better PR template by @teknium1 in #18
Removed mentions of NousResearch/DeepHermes-3-Llama-3-1B-Preview and … by @edmundman in #20
Update base env README with design philosophy by @hjc-puro in #25
24 keyerror on self state in base register env fail by @dmahan93 in #27
fix multimodal envs. add view_run_multimodal by @sumo43 in #22
Support args in process cli in serve subcommand by @hjc-puro in #14
fix olympiadbench due to upstream changes by @dmahan93 in #31
run pre-commit on all files by @dmahan93 in #32
add pre-commit workflow and readme.md changes to point to debugging tools by @dmahan93 in #33
⚡️ Speed up function grab_exact_from_heterogeneous_queue by 1,680% by @aseembits93 in #7
fix pre-commit by @dmahan93 in #37
Improve error logging for HTTP requests by @hjc-puro in #13
add gym taxi env by @dmahan93 in #36
Remove dependency on torch for default installation by @dmahan93 in #40
Add n kwarg being ignored workaround by @dmahan93 in #41
add custom server support by @dmahan93 in #28
Create upload_to_pypi.yml for releases by @dmahan93 in #42

New Contributors

@sukrucildirr made their first contribution in #2
@hjc-puro made their first contribution in #4
@teknium1 made their first contribution in #15
@edmundman made their first contribution in #20
@dmahan93 made their first contribution in #27
@sumo43 made their first contribution in #22
@aseembits93 made their first contribution in #7

Full Changelog: https://github.com/NousResearch/atropos/commits/v0.2.0

Contributors

aseembits93, sukrucildirr, and 5 other contributors

Assets 2

Releases: NousResearch/atropos

v0.4.0

Highlights

New example trainer

On Policy/Self Distillation Support

OpenAI Endpoint for managed server

What's Changed

Contributors

Uh oh!

v0.3.0

What's Changed

Contributors

Uh oh!

v0.2.1

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.0

What's Changed

New Contributors

Contributors

Uh oh!