Overview
⭐️ The Trinity-RFT repository has moved to the agentscope-ai organization. You can find our new repository here🔗 and documentation here📚.
This is a minor release that includes several bug fixes, feature improvements, and dependency upgrades.
Explorer
- Add OpenAI API support for the Tinker backend, so users can now use the Tinker backend to run agentic RL examples.
- Enhance the AgentScope Workflow Adapter to support features of AgentScope Tuner.
Trainer
- Update veRL to v0.7.0, which includes various performance improvements and bug fixes.
- Fix bugs in multi-stage resume and last checkpoint saving.
- Stop preserving checkpoints that are saved only for weight synchronization, reducing storage usage.
Buffer
- Fix batch size mismatch issue in SQL buffer.
Others
- Introduce R3L, a systematic reflect-then-retry RL mechanism with efficient language-guided exploration and stable off-policy learning. [github repo], [paper]
- Improve documentation.
🚨 Breaking Changes
veRL has been upgraded to v0.7.0, and v0.5.0 is no longer supported.
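The change list also includes a rename of `warmup_style` to `lr_scheduler_type` (#479). These notes do not say where that key lives in a user config or whether its accepted values changed, so the following is only a minimal sketch of how an existing config could be migrated, assuming the config is loaded as a nested dict. The helper name `rename_key` and the `trainer`/`optim` layout in the example are hypothetical and not taken from Trinity-RFT; check the documentation for the real config structure.

```python
from typing import Any


def rename_key(node: Any, old: str, new: str) -> Any:
    """Return a copy of a nested dict/list structure with `old` keys renamed to `new`.

    Hypothetical helper for illustration; not part of Trinity-RFT or veRL.
    If both keys are present in the same dict, the existing `new` value wins.
    """
    if isinstance(node, dict):
        # Copy every entry except the old key, recursing into the values.
        out = {k: rename_key(v, old, new) for k, v in node.items() if k != old}
        if old in node:
            # Carry the old value over only if the new key is not already set.
            out.setdefault(new, rename_key(node[old], old, new))
        return out
    if isinstance(node, list):
        return [rename_key(item, old, new) for item in node]
    return node


# Hypothetical config layout, used only to exercise the helper.
config = {"trainer": {"optim": {"lr": 1e-6, "warmup_style": "constant"}}}
migrated = rename_key(config, "warmup_style", "lr_scheduler_type")
print(migrated)
```

The helper returns a new structure and leaves the input untouched, so the old and migrated configs can be compared before anything is written back to disk.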
What's Changed
- Bug fix in multi stage resume by @chenyushuo in #462
- Bug fix in last checkpoint save. by @chenyushuo in #460
- Add model_name for auxiliary models by @hiyuchang in #461
- Improve contributing.md by @yxdyc in #464
- Fix `learn_to_ask` and `tinker` example. by @chenyushuo in #466
- Add default batch size for `SQLReader` by @chenyushuo in #467
- Enhance AgentScope Workflow Adapter by @pan-x-c in #465
- Fix `process_messages_to_experience` in MultiTurnWorkflow by @hiyuchang in #468
- [Doc] Update docker install guide by @pan-x-c in #469
- Support dynamic lora updating by @pan-x-c in #472
- Add low-quality experience filter operator #470 by @AdnanQureshi3 in #473
- Add openai client support for tinker backend by @chenyushuo in #475
- Improve documentation for CPU-only users and first-time contributors by @P09s in #474
- Remove checkpoints saved for sync purpose by @pan-x-c in #476
- Add R3L to readme and doc by @yanxi-chen in #477
- Update veRL to 0.7.0 by @chenyushuo in #471
- Support add / remove lora adapter by @pan-x-c in #478
- Rename `warmup_style` to `lr_scheduler_type` by @chenyushuo in #479
New Contributors
- @AdnanQureshi3 made their first contribution in #473
- @P09s made their first contribution in #474
Full Changelog: v0.4.0...v0.4.1