-
Notifications
You must be signed in to change notification settings - Fork 233
docs: Document Gym + RL integration design #1762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ananth Subramaniam <[email protected]>
9527953 to
9ea9f1a
Compare
📝 WalkthroughWalkthroughTwo documentation files were added: a new design document describing the NeMo Gym integration architecture, initialization sequence, training loop, data formats, and tokenization with Mermaid diagrams; and an update to the documentation index to include the new design document in the navigation structure. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@docs/design-docs/nemo-gym-integration.md`:
- Around line 22-23: Update the inline comments for the two config keys to
explicitly state their relationship: note that async_engine and
expose_http_server are independent settings but both must be enabled to support
the HTTP server; e.g., change the comment on async_engine to clarify it enables
the async worker/runtime and the comment on expose_http_server to state it
controls whether the HTTP server (exposing /v1/chat/completions) is started, and
add a combined comment line that both must be true to enable HTTP server
support.
🧹 Nitpick comments (1)
docs/design-docs/nemo-gym-integration.md (1)
184-184: Minor grammar refinement.For consistency with the formal tone used throughout the documentation, consider revising "Results return out of order" to "Results are returned out of order".
📝 Suggested revision
-1. **Results return out of order**: Rollouts complete at different times depending on conversation length and tool calls. Rather than waiting for all results, the actor processes each result as soon as it completes. +1. **Results are returned out of order**: Rollouts complete at different times depending on conversation length and tool calls. Rather than waiting for all results, the actor processes each result as soon as it completes.
| async_engine: true # Required for HTTP server support | ||
| expose_http_server: true # Exposes /v1/chat/completions endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
rg -n -A10 -B2 'expose_http_server|async_engine' --type=py -g '!**/test/**' | head -100Repository: NVIDIA-NeMo/RL
Length of output: 8349
🏁 Script executed:
rg -n 'expose_http_server' --type=py --type=md --type=yamlRepository: NVIDIA-NeMo/RL
Length of output: 1511
🏁 Script executed:
sed -n '165,185p' nemo_rl/models/generation/vllm/vllm_worker_async.pyRepository: NVIDIA-NeMo/RL
Length of output: 962
🏁 Script executed:
sed -n '1,170p' nemo_rl/models/generation/vllm/vllm_worker_async.py | grep -A20 -B5 'async_engine\|AsyncLLM'Repository: NVIDIA-NeMo/RL
Length of output: 1110
🏁 Script executed:
sed -n '1,50p' nemo_rl/models/generation/vllm/config.pyRepository: NVIDIA-NeMo/RL
Length of output: 2404
🏁 Script executed:
rg -n -B5 -A5 'expose_http_server.*async|async.*expose_http_server' --type=pyRepository: NVIDIA-NeMo/RL
Length of output: 1930
Clarify the relationship between async_engine and expose_http_server.
Both settings are required to enable HTTP server support, but they are independent: async_engine is inherent to the async worker, while expose_http_server controls whether the HTTP server is actually started. Consider updating the comments to make this clearer, e.g., "Both required for HTTP server support" rather than leaving the relationship ambiguous.
🤖 Prompt for AI Agents
In `@docs/design-docs/nemo-gym-integration.md` around lines 22 - 23, Update the
inline comments for the two config keys to explicitly state their relationship:
note that async_engine and expose_http_server are independent settings but both
must be enabled to support the HTTP server; e.g., change the comment on
async_engine to clarify it enables the async worker/runtime and the comment on
expose_http_server to state it controls whether the HTTP server (exposing
/v1/chat/completions) is started, and add a combined comment line that both must
be true to enable HTTP server support.
What does this PR do ?
Part of NVIDIA-NeMo/Gym#292
This PR documents the NeMo RL + Gym integration, which includes:
Issues
NVIDIA-NeMo/Gym#292
Usage
# Add a code snippet demonstrating how to use thisBefore your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit