[TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests #9939
Merged
shuyixiong merged 21 commits into NVIDIA:main from shuyixiong:user/shuyix/rl_perf_repro on Dec 24, 2025
cc2e33e
Add rl perf reproduce script
shuyixiong 8a48fb1
Add multi instance test
shuyixiong 863989f
Add assert info
shuyixiong 8ffa028
Change to 4 gpu tests
shuyixiong 23a276d
Add ray stage with 4 h100 gpus
shuyixiong f0ff339
Add more comment in code
shuyixiong 8d5e62f
Move rl_perf_reproduce.py to tests/ dir and add functional test
shuyixiong c4f90a6
Add rl_perf_repro functional test to CI and make script consistent wi…
shuyixiong 876d4f8
Fix test name
shuyixiong d830db7
Remove redundant code
shuyixiong c5fe408
Initialize ray sessions once for stability
shuyixiong 7740c9e
Minor comment changes
shuyixiong 112a250
Resolve review comment
shuyixiong d1d4694
Improve port allocation mechanism to prevent port conflicts
shuyixiong a4ac652
Change test name in test_lists
shuyixiong b43e7c5
Isolate Ray cluster state between tests to prevent cross-test pollution
shuyixiong 79a10b1
Pass run-ray config to unittest
shuyixiong 1d6ffe2
Use get_free_port_in_ci to avoid port conflict in CI
shuyixiong 7abfddd
Independently initialize and tear down the Ray cluster in tests
shuyixiong 91674f2
Use popen in trt_test_alternative for automatic process tree cleanup
shuyixiong 97e4e74
Apply changes to more ray tests
shuyixiong
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| # RL Framework Integration Tests | ||
|
|
||
| This directory contains integration tests for TensorRT-LLM with [Ray orchestrator](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/ray_orchestrator), specifically designed to cover usage patterns from various RL (Reinforcement Learning) frameworks such as VeRL and NeMo RL. | ||
|
|
||
| ## Available Scripts | ||
|
|
||
| | Script | Description | | ||
| |--------|-------------| | ||
| | `run_rl_perf_reproduce.py` | Emulates RL workload performance with multiple AsyncLLM instances distributed across GPUs using Ray placement groups | | ||
|
|
||
| ## Usage Examples | ||
|
|
||
| ### RL Performance Reproduction | ||
|
|
||
| The `run_rl_perf_reproduce.py` script creates multiple TensorRT-LLM instances in parallel to simulate RL rollout workloads. | ||
|
|
||
| **TP=4 with 2 instances (8 GPUs total):** | ||
|
|
||
| ```bash | ||
| python run_rl_perf_reproduce.py \ | ||
| --model_dir /path/to/model_dir \ | ||
| --data_path /path/to/prompts.json \ | ||
| --num_instances 2 \ | ||
| --tp_size 4 \ | ||
| --top_p 1 \ | ||
| --logprobs 1 \ | ||
| --max_batch_size 1024 \ | ||
| --enable_cuda_graph_padding | ||
| ``` | ||
|
|
||
| **TP=1 with 8 instances (8 GPUs total):** | ||
|
|
||
| ```bash | ||
| python run_rl_perf_reproduce.py \ | ||
| --model_dir /path/to/model_dir \ | ||
| --data_path /path/to/prompts.json \ | ||
| --num_instances 8 \ | ||
| --tp_size 1 \ | ||
| --top_p 1 \ | ||
| --logprobs 1 \ | ||
| --max_batch_size 384 \ | ||
| --enable_cuda_graph_padding | ||
| ``` | ||
|
|
||
| ## Data Format | ||
|
|
||
| The `--data_path` should point to a JSON file containing a list of prompts, where each prompt is a list of token IDs: | ||
|
|
||
| ```json | ||
| [ | ||
| [1, 2345, 6789, ...], | ||
| [1, 3456, 7890, ...], | ||
| ... | ||
| ] | ||
| ``` | ||
|
|
||
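As an illustration of the data format described above, the following is a minimal sketch of how a compatible prompts file might be produced and checked before a run. The helper names `validate_prompts` and `write_prompts` are hypothetical and not part of `run_rl_perf_reproduce.py`; the rules that prompts are non-empty and token IDs are non-negative integers are assumptions, and the token IDs shown are placeholders.

```python
import json
import tempfile
from pathlib import Path


def validate_prompts(prompts):
    """Return True if `prompts` is a list of non-empty lists of non-negative ints.

    Hypothetical helper: the non-empty and non-negative rules are assumptions,
    not requirements stated by the script.
    """
    if not isinstance(prompts, list):
        return False
    for prompt in prompts:
        if not isinstance(prompt, list) or not prompt:
            return False
        for token_id in prompt:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(token_id, bool) or not isinstance(token_id, int):
                return False
            if token_id < 0:
                return False
    return True


def write_prompts(path, prompts):
    """Write `prompts` as a JSON list of token-ID lists (hypothetical helper)."""
    if not validate_prompts(prompts):
        raise ValueError("prompts must be a list of lists of token IDs")
    Path(path).write_text(json.dumps(prompts))


# Placeholder token IDs; real prompts would come from a tokenizer.
example_prompts = [[1, 2345, 6789], [1, 3456, 7890]]
out_path = Path(tempfile.mkdtemp()) / "prompts.json"
write_prompts(out_path, example_prompts)

# Round-trip the file to confirm it parses back into the same structure.
loaded = json.loads(out_path.read_text())
```

The resulting file path could then be passed as `--data_path`.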
| ## Notes | ||
|
|
||
| - RL Perf reproduction scripts support single-node execution only (max 8 GPUs) |
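The two usage examples each occupy 8 GPUs, which suggests the total GPU count is `num_instances * tp_size`. The sketch below checks a configuration against the single-node limit from the note above. It assumes each instance uses exactly `tp_size` GPUs with no sharing; the function names are hypothetical and not part of the script.

```python
MAX_GPUS_PER_NODE = 8  # taken from the note: single-node only, max 8 GPUs


def total_gpus(num_instances, tp_size):
    """GPUs needed, assuming each instance occupies exactly tp_size GPUs."""
    if num_instances < 1 or tp_size < 1:
        raise ValueError("num_instances and tp_size must be positive")
    return num_instances * tp_size


def fits_single_node(num_instances, tp_size, max_gpus=MAX_GPUS_PER_NODE):
    """Return True if the configuration stays within one node's GPU budget."""
    return total_gpus(num_instances, tp_size) <= max_gpus


# The two configurations from the usage examples.
tp4_total = total_gpus(num_instances=2, tp_size=4)
tp1_total = total_gpus(num_instances=8, tp_size=1)
```

Under this assumption, both documented configurations use the full 8 GPUs, while a setup such as 4 instances at TP=4 would exceed the single-node limit.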