docs: update features.md to reflect v0.5 release and v0.6 roadmap#1966
Conversation
📝 Walkthrough

Updated the feature roadmap documentation from v0.4 to v0.6, replacing and adding feature items including the Muon Optimizer, SGLang Inference, new models, expanded algorithms, and clarified integration paths with DTensor, Megatron, and Hugging Face.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ 5 of 5 passed
Actionable comments posted: 1
In `@docs/about/features.md`:
- Line 24: Add the missing period to the abbreviation "etc" in the Multi-Turn RL entry. Locate the string "**Multi-Turn RL** - Multi-turn generation and training for RL with tool use, games, etc" in docs/about/features.md and change it to end with "etc." to follow American English abbreviation style.
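The suggested fix can be sketched as a one-liner piped through `sed` (the pattern assumes the line ends exactly with "games, etc" as quoted above; on a real checkout you would run `sed -i` against docs/about/features.md instead of `echo`):

```shell
# Append the missing period to the "etc" abbreviation in the Multi-Turn RL entry.
echo '- **Multi-Turn RL** - Multi-turn generation and training for RL with tool use, games, etc' \
  | sed 's/games, etc$/games, etc./'
```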
🧹 Nitpick comments (1)
docs/about/features.md (1)
Line 15: Clarify the on-policy distillation status to avoid mixed signals. It appears in both "Available Now" and "Coming in v0.6," but the distinction (base support vs. multi-teacher/cross-tokenizer enhancements) isn't explicit. Consider wording that makes the incremental v0.6 scope clear.
✏️ Suggested wording tweak

```diff
-- **Learning Algorithms** - GRPO/GSPO/DAPO, SFT (with LoRA), DPO, and On-policy distillation
+- **Learning Algorithms** - GRPO/GSPO/DAPO, SFT (with LoRA), DPO, and on-policy distillation (baseline)
-- **On-Policy Distillation** - Multi-teacher and cross tokenizer distillation support
+- **On-Policy Distillation (enhancements)** - Multi-teacher and cross-tokenizer distillation support
```

Also applies to: 23-23
The features doc was outdated (still showing the v0.4 roadmap). Updated to match the current README.md feature section:

- Move delivered v0.4 items (DAPO, VLM, FP8, Async RL, Megatron Inference, GB200, etc.) to "Available Now"
- Add new v0.5 features: LoRA for SFT, NeMo-Gym integration, on-policy distillation, improved HF integration descriptions
- Update roadmap to v0.6: Muon Optimizer, SGLang Inference, Speculative Decoding, GDPO, Resiliency, new models
Force-pushed from 24435b7 to 634c08f
- "Speculaive" -> "Speculative"
- "GPRO" -> "GRPO"
- "suport" -> "support"
terrykong left a comment:
thanks for keeping our docs up to date!
@seonjinn can you resolve DCO?
Oh sure, I'll resolve it :)
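For reference, a DCO failure is usually resolved by re-signing the commits and force-pushing the branch (e.g. `git rebase --signoff <base>` then `git push --force-with-lease`). A minimal sketch in a throwaway repo, showing the `Signed-off-by` trailer that the DCO check looks for (the `dev` identity is a placeholder):

```shell
set -e
# Throwaway repo to demonstrate adding the Signed-off-by trailer.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "docs: update features.md"
# --signoff appends "Signed-off-by: Name <email>"; --no-edit keeps the message.
git -c user.name=dev -c user.email=dev@example.com \
    commit -q --amend --allow-empty --signoff --no-edit
git log -1 --format=%B   # last line: Signed-off-by: dev <dev@example.com>
```

On the actual PR branch, amending or rebasing with `--signoff` rewrites history, so the final step is a force-push of the signed-off commits.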
…IDIA-NeMo#1966) Signed-off-by: yuanhangs <yuanhangs@nvidia.com>