[Test] Add SGLang backend and wrapper tests #3433

Merged: vmoens merged 31 commits into gh/vmoens/213/base from gh/vmoens/213/head on Feb 3, 2026

vmoens commented Jan 31, 2026

Stack from ghstack (oldest at bottom):

Add comprehensive tests for SGLang integration:

TestAsyncSGLangIntegration:

  • test_connect_to_server: Verify service connection
  • test_get_tp_size: Tensor parallel size retrieval
  • test_get_dp_size: Data parallel size retrieval
  • test_get_model_metadata: Model metadata extraction
  • test_generate_text: Single prompt generation
  • test_generate_batch: Batch generation
  • test_flush_cache: Cache management

TestSGLangWrapper:

  • test_wrapper_creation_from_service: Wrapper initialization
  • test_history_mode: History-based input
  • test_text_mode: Text-based input
  • test_tokens_mode: Token-based input
  • test_log_probs: Log probability extraction
  • test_get_new_version: Policy version tracking

Tests use Qwen/Qwen2.5-0.5B for faster CI execution.
Markers: @pytest.mark.gpu, @pytest.mark.slow
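
For readers unfamiliar with custom pytest markers, here is a minimal sketch of how markers named `gpu` and `slow` could be registered in a `conftest.py`. Only the marker names come from the description above. The marker descriptions are assumptions, and so is the idea that this repository registers its markers this way.

```python
# conftest.py (illustrative sketch, not taken from this PR)


def pytest_configure(config):
    """Register custom markers so that `pytest --strict-markers` accepts them."""
    # The description strings after the colon are assumptions made for this example.
    config.addinivalue_line("markers", "gpu: test requires a GPU")
    config.addinivalue_line("markers", "slow: test is long-running")
```

With markers registered, `pytest -m "gpu and slow"` would select only tests carrying both markers, and `pytest -m "not slow"` would skip the slow ones.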

Co-authored-by: Cursor <cursoragent@cursor.com>

pytorch-bot bot commented Jan 31, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3433

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 27 Pending

As of commit 89678be with merge base 01413ca:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions bot commented Jan 31, 2026

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 153. Improved: 20. Worsened: 9.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_tensor_to_bytestream_speed[pickle] 82.8609μs 81.2663μs 12.3052 KOps/s 11.4456 KOps/s $\textbf{\color{#35bf28}+7.51\%}$
test_tensor_to_bytestream_speed[torch.save] 0.1398ms 0.1396ms 7.1613 KOps/s 6.7173 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_tensor_to_bytestream_speed[untyped_storage] 0.1274s 0.1266s 7.8975 Ops/s 7.2456 Ops/s $\textbf{\color{#35bf28}+9.00\%}$
test_tensor_to_bytestream_speed[numpy] 2.6881μs 2.6816μs 372.9157 KOps/s 362.7705 KOps/s $\color{#35bf28}+2.80\%$
test_tensor_to_bytestream_speed[safetensors] 39.0121μs 37.1999μs 26.8818 KOps/s 26.6522 KOps/s $\color{#35bf28}+0.86\%$
test_simple 0.6703s 0.5808s 1.7217 Ops/s 1.7247 Ops/s $\color{#d91a1a}-0.17\%$
test_transformed 1.2539s 1.1631s 0.8598 Ops/s 0.8510 Ops/s $\color{#35bf28}+1.03\%$
test_serial 1.6938s 1.6925s 0.5909 Ops/s 0.5799 Ops/s $\color{#35bf28}+1.89\%$
test_parallel 1.2344s 1.1542s 0.8664 Ops/s 0.8938 Ops/s $\color{#d91a1a}-3.07\%$
test_step_mdp_speed[True-True-True-True-True] 0.3576ms 45.7664μs 21.8501 KOps/s 22.5398 KOps/s $\color{#d91a1a}-3.06\%$
test_step_mdp_speed[True-True-True-True-False] 64.7320μs 25.5289μs 39.1713 KOps/s 40.5466 KOps/s $\color{#d91a1a}-3.39\%$
test_step_mdp_speed[True-True-True-False-True] 57.5110μs 25.2219μs 39.6480 KOps/s 40.0202 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[True-True-True-False-False] 50.5510μs 14.0806μs 71.0195 KOps/s 72.7588 KOps/s $\color{#d91a1a}-2.39\%$
test_step_mdp_speed[True-True-False-True-True] 82.9520μs 48.4968μs 20.6199 KOps/s 21.3538 KOps/s $\color{#d91a1a}-3.44\%$
test_step_mdp_speed[True-True-False-True-False] 52.6410μs 28.1200μs 35.5619 KOps/s 36.0045 KOps/s $\color{#d91a1a}-1.23\%$
test_step_mdp_speed[True-True-False-False-True] 0.1032ms 28.7122μs 34.8284 KOps/s 36.0516 KOps/s $\color{#d91a1a}-3.39\%$
test_step_mdp_speed[True-True-False-False-False] 46.2110μs 16.6838μs 59.9383 KOps/s 59.9201 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[True-False-True-True-True] 88.7820μs 51.0629μs 19.5837 KOps/s 19.9141 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[True-False-True-True-False] 62.5810μs 30.8626μs 32.4016 KOps/s 32.0353 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-False-True-False-True] 59.5410μs 27.9318μs 35.8015 KOps/s 35.3331 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[True-False-True-False-False] 95.7230μs 16.7259μs 59.7876 KOps/s 59.1477 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[True-False-False-True-True] 87.4520μs 53.4330μs 18.7150 KOps/s 18.9119 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[True-False-False-True-False] 66.9110μs 33.0586μs 30.2493 KOps/s 30.5815 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-False-False-False-True] 60.1410μs 30.1090μs 33.2126 KOps/s 33.1647 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[True-False-False-False-False] 51.2810μs 19.6393μs 50.9183 KOps/s 51.6925 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-True-True-True-True] 78.8610μs 50.2202μs 19.9123 KOps/s 20.0324 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-True-True-True-False] 0.1075ms 30.4714μs 32.8177 KOps/s 32.5939 KOps/s $\color{#35bf28}+0.69\%$
test_step_mdp_speed[False-True-True-False-True] 61.8520μs 31.8854μs 31.3623 KOps/s 31.1792 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-True-True-False-False] 47.2810μs 18.6593μs 53.5927 KOps/s 53.4525 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[False-True-False-True-True] 2.6848ms 53.9353μs 18.5407 KOps/s 18.9452 KOps/s $\color{#d91a1a}-2.14\%$
test_step_mdp_speed[False-True-False-True-False] 66.6810μs 33.9807μs 29.4285 KOps/s 29.9440 KOps/s $\color{#d91a1a}-1.72\%$
test_step_mdp_speed[False-True-False-False-True] 0.1063ms 34.5573μs 28.9375 KOps/s 29.5999 KOps/s $\color{#d91a1a}-2.24\%$
test_step_mdp_speed[False-True-False-False-False] 54.3210μs 21.1513μs 47.2783 KOps/s 47.4897 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[False-False-True-True-True] 97.4520μs 56.0983μs 17.8259 KOps/s 17.8685 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-False-True-True-False] 70.9720μs 36.4377μs 27.4441 KOps/s 27.6692 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-False-True-False-True] 64.5110μs 34.4380μs 29.0377 KOps/s 29.4190 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-False-True-False-False] 0.1019ms 20.7137μs 48.2772 KOps/s 46.9950 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[False-False-False-True-True] 97.4920μs 58.9093μs 16.9752 KOps/s 17.3309 KOps/s $\color{#d91a1a}-2.05\%$
test_step_mdp_speed[False-False-False-True-False] 66.6710μs 38.2746μs 26.1270 KOps/s 25.6890 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-False-False-True] 67.3210μs 36.4505μs 27.4344 KOps/s 27.4802 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-False-False-False-False] 51.2210μs 23.6267μs 42.3250 KOps/s 42.4195 KOps/s $\color{#d91a1a}-0.22\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8806s 0.7864s 1.2716 Ops/s 1.2955 Ops/s $\color{#d91a1a}-1.84\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7436s 0.6487s 1.5415 Ops/s 1.5612 Ops/s $\color{#d91a1a}-1.27\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7774s 1.7013s 0.5878 Ops/s 0.5927 Ops/s $\color{#d91a1a}-0.83\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5573s 1.4749s 0.6780 Ops/s 0.6841 Ops/s $\color{#d91a1a}-0.89\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0533s 1.9630s 0.5094 Ops/s 0.5165 Ops/s $\color{#d91a1a}-1.38\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.8072s 1.7308s 0.5778 Ops/s 0.5813 Ops/s $\color{#d91a1a}-0.61\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7647s 4.6559s 0.2148 Ops/s 0.2127 Ops/s $\color{#35bf28}+0.97\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6597s 4.5126s 0.2216 Ops/s 0.2265 Ops/s $\color{#d91a1a}-2.15\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0377s 1.9583s 0.5106 Ops/s 0.5026 Ops/s $\color{#35bf28}+1.60\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.8244s 1.6984s 0.5888 Ops/s 0.5983 Ops/s $\color{#d91a1a}-1.59\%$
test_values[generalized_advantage_estimate-True-True] 11.3327ms 10.9448ms 91.3674 Ops/s 86.4614 Ops/s $\textbf{\color{#35bf28}+5.67\%}$
test_values[vec_generalized_advantage_estimate-True-True] 21.4653ms 17.8685ms 55.9643 Ops/s 56.4176 Ops/s $\color{#d91a1a}-0.80\%$
test_values[td0_return_estimate-False-False] 0.2100ms 0.1277ms 7.8317 KOps/s 4.8511 KOps/s $\textbf{\color{#35bf28}+61.44\%}$
test_values[td1_return_estimate-False-False] 30.8353ms 29.9872ms 33.3476 Ops/s 32.1788 Ops/s $\color{#35bf28}+3.63\%$
test_values[vec_td1_return_estimate-False-False] 20.6717ms 17.6910ms 56.5260 Ops/s 56.4423 Ops/s $\color{#35bf28}+0.15\%$
test_values[td_lambda_return_estimate-True-False] 45.9331ms 44.6556ms 22.3936 Ops/s 21.6003 Ops/s $\color{#35bf28}+3.67\%$
test_values[vec_td_lambda_return_estimate-True-False] 21.3725ms 17.7979ms 56.1863 Ops/s 56.7002 Ops/s $\color{#d91a1a}-0.91\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.8559ms 9.7913ms 102.1314 Ops/s 96.4598 Ops/s $\textbf{\color{#35bf28}+5.88\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.0099ms 1.5335ms 652.0930 Ops/s 623.0237 Ops/s $\color{#35bf28}+4.67\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4782ms 0.4359ms 2.2940 KOps/s 2.2044 KOps/s $\color{#35bf28}+4.06\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 34.9222ms 34.3178ms 29.1394 Ops/s 28.1131 Ops/s $\color{#35bf28}+3.65\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.9107ms 1.7255ms 579.5325 Ops/s 578.9750 Ops/s $\color{#35bf28}+0.10\%$
test_dqn_speed[False-None] 1.5470ms 1.4285ms 700.0209 Ops/s 691.1623 Ops/s $\color{#35bf28}+1.28\%$
test_dqn_speed[False-backward] 2.0761ms 2.0238ms 494.1274 Ops/s 480.8082 Ops/s $\color{#35bf28}+2.77\%$
test_dqn_speed[True-None] 0.9262ms 0.5565ms 1.7969 KOps/s 1.8023 KOps/s $\color{#d91a1a}-0.30\%$
test_dqn_speed[True-backward] 1.0824ms 1.0146ms 985.6495 Ops/s 837.5307 Ops/s $\textbf{\color{#35bf28}+17.69\%}$
test_dqn_speed[reduce-overhead-None] 0.9419ms 0.5330ms 1.8762 KOps/s 1.8397 KOps/s $\color{#35bf28}+1.98\%$
test_ddpg_speed[False-None] 3.2915ms 2.9139ms 343.1880 Ops/s 350.2828 Ops/s $\color{#d91a1a}-2.03\%$
test_ddpg_speed[False-backward] 4.2726ms 4.1714ms 239.7275 Ops/s 242.4684 Ops/s $\color{#d91a1a}-1.13\%$
test_ddpg_speed[True-None] 1.8749ms 1.4366ms 696.0914 Ops/s 724.5355 Ops/s $\color{#d91a1a}-3.93\%$
test_ddpg_speed[True-backward] 2.8443ms 2.4677ms 405.2432 Ops/s 407.6059 Ops/s $\color{#d91a1a}-0.58\%$
test_ddpg_speed[reduce-overhead-None] 1.5804ms 1.4286ms 699.9754 Ops/s 731.9004 Ops/s $\color{#d91a1a}-4.36\%$
test_sac_speed[False-None] 8.5483ms 8.2438ms 121.3038 Ops/s 122.9931 Ops/s $\color{#d91a1a}-1.37\%$
test_sac_speed[False-backward] 11.8457ms 11.3696ms 87.9542 Ops/s 87.2661 Ops/s $\color{#35bf28}+0.79\%$
test_sac_speed[True-None] 2.6427ms 2.2112ms 452.2400 Ops/s 464.6993 Ops/s $\color{#d91a1a}-2.68\%$
test_sac_speed[True-backward] 4.2708ms 4.1256ms 242.3867 Ops/s 246.2249 Ops/s $\color{#d91a1a}-1.56\%$
test_sac_speed[reduce-overhead-None] 2.6234ms 2.1906ms 456.4888 Ops/s 459.8188 Ops/s $\color{#d91a1a}-0.72\%$
test_redq_speed[False-None] 15.1054ms 10.8974ms 91.7653 Ops/s 93.9376 Ops/s $\color{#d91a1a}-2.31\%$
test_redq_speed[False-backward] 18.9803ms 18.0934ms 55.2688 Ops/s 56.0912 Ops/s $\color{#d91a1a}-1.47\%$
test_redq_speed[True-None] 5.2390ms 4.4760ms 223.4141 Ops/s 222.0349 Ops/s $\color{#35bf28}+0.62\%$
test_redq_speed[True-backward] 10.6533ms 9.9676ms 100.3252 Ops/s 98.8334 Ops/s $\color{#35bf28}+1.51\%$
test_redq_speed[reduce-overhead-None] 4.8709ms 4.5241ms 221.0368 Ops/s 228.2830 Ops/s $\color{#d91a1a}-3.17\%$
test_redq_deprec_speed[False-None] 11.7462ms 11.2943ms 88.5401 Ops/s 91.7691 Ops/s $\color{#d91a1a}-3.52\%$
test_redq_deprec_speed[False-backward] 16.7185ms 16.2877ms 61.3961 Ops/s 64.3966 Ops/s $\color{#d91a1a}-4.66\%$
test_redq_deprec_speed[True-None] 4.1357ms 3.7311ms 268.0206 Ops/s 262.4586 Ops/s $\color{#35bf28}+2.12\%$
test_redq_deprec_speed[True-backward] 8.2398ms 7.8843ms 126.8337 Ops/s 127.5847 Ops/s $\color{#d91a1a}-0.59\%$
test_redq_deprec_speed[reduce-overhead-None] 4.1922ms 3.7184ms 268.9323 Ops/s 271.3259 Ops/s $\color{#d91a1a}-0.88\%$
test_td3_speed[False-None] 8.3764ms 8.1522ms 122.6659 Ops/s 120.1548 Ops/s $\color{#35bf28}+2.09\%$
test_td3_speed[False-backward] 11.5196ms 11.0780ms 90.2692 Ops/s 89.4382 Ops/s $\color{#35bf28}+0.93\%$
test_td3_speed[True-None] 2.0032ms 1.8917ms 528.6147 Ops/s 525.1706 Ops/s $\color{#35bf28}+0.66\%$
test_td3_speed[True-backward] 3.8347ms 3.7131ms 269.3194 Ops/s 243.3330 Ops/s $\textbf{\color{#35bf28}+10.68\%}$
test_td3_speed[reduce-overhead-None] 1.9349ms 1.8526ms 539.7762 Ops/s 523.9206 Ops/s $\color{#35bf28}+3.03\%$
test_cql_speed[False-None] 31.1991ms 26.9012ms 37.1730 Ops/s 37.5477 Ops/s $\color{#d91a1a}-1.00\%$
test_cql_speed[False-backward] 39.5654ms 35.8668ms 27.8809 Ops/s 27.3326 Ops/s $\color{#35bf28}+2.01\%$
test_cql_speed[True-None] 13.1272ms 12.5178ms 79.8859 Ops/s 79.2537 Ops/s $\color{#35bf28}+0.80\%$
test_cql_speed[True-backward] 18.3638ms 17.8651ms 55.9749 Ops/s 54.2774 Ops/s $\color{#35bf28}+3.13\%$
test_cql_speed[reduce-overhead-None] 13.1638ms 12.7657ms 78.3348 Ops/s 78.8647 Ops/s $\color{#d91a1a}-0.67\%$
test_a2c_speed[False-None] 5.7991ms 5.5531ms 180.0801 Ops/s 178.9462 Ops/s $\color{#35bf28}+0.63\%$
test_a2c_speed[False-backward] 12.6283ms 12.3364ms 81.0612 Ops/s 81.6016 Ops/s $\color{#d91a1a}-0.66\%$
test_a2c_speed[True-None] 4.2977ms 3.8599ms 259.0745 Ops/s 264.3792 Ops/s $\color{#d91a1a}-2.01\%$
test_a2c_speed[True-backward] 9.3637ms 8.7365ms 114.4624 Ops/s 104.5292 Ops/s $\textbf{\color{#35bf28}+9.50\%}$
test_a2c_speed[reduce-overhead-None] 3.9423ms 3.8149ms 262.1328 Ops/s 265.4276 Ops/s $\color{#d91a1a}-1.24\%$
test_ppo_speed[False-None] 6.5553ms 6.1122ms 163.6082 Ops/s 164.2145 Ops/s $\color{#d91a1a}-0.37\%$
test_ppo_speed[False-backward] 13.7022ms 12.9845ms 77.0149 Ops/s 78.4052 Ops/s $\color{#d91a1a}-1.77\%$
test_ppo_speed[True-None] 4.0352ms 3.7566ms 266.1972 Ops/s 271.3303 Ops/s $\color{#d91a1a}-1.89\%$
test_ppo_speed[True-backward] 8.9924ms 8.5871ms 116.4535 Ops/s 117.1928 Ops/s $\color{#d91a1a}-0.63\%$
test_ppo_speed[reduce-overhead-None] 4.0823ms 3.6951ms 270.6298 Ops/s 275.1564 Ops/s $\color{#d91a1a}-1.65\%$
test_reinforce_speed[False-None] 5.1115ms 4.6747ms 213.9179 Ops/s 214.6542 Ops/s $\color{#d91a1a}-0.34\%$
test_reinforce_speed[False-backward] 7.8026ms 7.5700ms 132.1001 Ops/s 131.5878 Ops/s $\color{#35bf28}+0.39\%$
test_reinforce_speed[True-None] 3.3730ms 2.9762ms 336.0010 Ops/s 323.8683 Ops/s $\color{#35bf28}+3.75\%$
test_reinforce_speed[True-backward] 8.3564ms 7.9527ms 125.7435 Ops/s 121.4864 Ops/s $\color{#35bf28}+3.50\%$
test_reinforce_speed[reduce-overhead-None] 3.3575ms 2.9736ms 336.2921 Ops/s 339.0240 Ops/s $\color{#d91a1a}-0.81\%$
test_iql_speed[False-None] 21.0829ms 20.4375ms 48.9296 Ops/s 48.3732 Ops/s $\color{#35bf28}+1.15\%$
test_iql_speed[False-backward] 31.8460ms 30.9492ms 32.3111 Ops/s 31.9476 Ops/s $\color{#35bf28}+1.14\%$
test_iql_speed[True-None] 8.9678ms 8.6451ms 115.6728 Ops/s 115.6263 Ops/s $\color{#35bf28}+0.04\%$
test_iql_speed[True-backward] 17.1862ms 16.8208ms 59.4501 Ops/s 58.6791 Ops/s $\color{#35bf28}+1.31\%$
test_iql_speed[reduce-overhead-None] 9.2281ms 8.7432ms 114.3746 Ops/s 112.2495 Ops/s $\color{#35bf28}+1.89\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2642ms 6.1276ms 163.1952 Ops/s 167.2428 Ops/s $\color{#d91a1a}-2.42\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.2034ms 0.3149ms 3.1751 KOps/s 2.7725 KOps/s $\textbf{\color{#35bf28}+14.52\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6983ms 0.3059ms 3.2690 KOps/s 2.9123 KOps/s $\textbf{\color{#35bf28}+12.25\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0727ms 5.8501ms 170.9379 Ops/s 172.9754 Ops/s $\color{#d91a1a}-1.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8958ms 0.3024ms 3.3073 KOps/s 2.8269 KOps/s $\textbf{\color{#35bf28}+16.99\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5472ms 0.2978ms 3.3579 KOps/s 2.9712 KOps/s $\textbf{\color{#35bf28}+13.02\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5159ms 1.2935ms 773.1014 Ops/s 699.5347 Ops/s $\textbf{\color{#35bf28}+10.52\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5195ms 1.2549ms 796.8823 Ops/s 732.4434 Ops/s $\textbf{\color{#35bf28}+8.80\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.7111ms 6.0802ms 164.4689 Ops/s 167.7308 Ops/s $\color{#d91a1a}-1.94\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8172ms 0.4427ms 2.2590 KOps/s 1.9970 KOps/s $\textbf{\color{#35bf28}+13.12\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0775ms 0.4373ms 2.2867 KOps/s 1.9851 KOps/s $\textbf{\color{#35bf28}+15.19\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9159ms 5.8089ms 172.1499 Ops/s 171.5373 Ops/s $\color{#35bf28}+0.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3753ms 0.3210ms 3.1152 KOps/s 3.1893 KOps/s $\color{#d91a1a}-2.32\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6578ms 0.3304ms 3.0267 KOps/s 3.4560 KOps/s $\textbf{\color{#d91a1a}-12.42\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0122ms 5.7561ms 173.7295 Ops/s 171.3018 Ops/s $\color{#35bf28}+1.42\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0932ms 0.3194ms 3.1309 KOps/s 3.4344 KOps/s $\textbf{\color{#d91a1a}-8.84\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7178ms 0.3038ms 3.2920 KOps/s 3.7510 KOps/s $\textbf{\color{#d91a1a}-12.24\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0650ms 5.9667ms 167.5961 Ops/s 166.0344 Ops/s $\color{#35bf28}+0.94\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7738ms 0.5162ms 1.9372 KOps/s 2.1784 KOps/s $\textbf{\color{#d91a1a}-11.08\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5840ms 0.4220ms 2.3696 KOps/s 1.8746 KOps/s $\textbf{\color{#35bf28}+26.41\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.4891ms 5.0062ms 199.7503 Ops/s 46.7797 Ops/s $\textbf{\color{#35bf28}+327.00\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 13.1468ms 2.0388ms 490.4950 Ops/s 489.2962 Ops/s $\color{#35bf28}+0.25\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 3.5056ms 0.9673ms 1.0338 KOps/s 1.1111 KOps/s $\textbf{\color{#d91a1a}-6.96\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5476s 15.9634ms 62.6435 Ops/s 197.2749 Ops/s $\textbf{\color{#d91a1a}-68.25\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9074ms 1.8904ms 528.9904 Ops/s 529.2649 Ops/s $\color{#d91a1a}-0.05\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 10.5284ms 1.3015ms 768.3607 Ops/s 1.1304 KOps/s $\textbf{\color{#d91a1a}-32.03\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.8184ms 5.2906ms 189.0133 Ops/s 189.6017 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.0323ms 1.8885ms 529.5298 Ops/s 502.8148 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.4502ms 1.1386ms 878.3058 Ops/s 953.2829 Ops/s $\textbf{\color{#d91a1a}-7.87\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.4690ms 36.6644ms 27.2744 Ops/s 27.2071 Ops/s $\color{#35bf28}+0.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.7097ms 18.3312ms 54.5519 Ops/s 54.0597 Ops/s $\color{#35bf28}+0.91\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 41.2461ms 37.5410ms 26.6375 Ops/s 26.4157 Ops/s $\color{#35bf28}+0.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.5760ms 18.8878ms 52.9441 Ops/s 53.2057 Ops/s $\color{#d91a1a}-0.49\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.7700ms 39.5865ms 25.2611 Ops/s 24.7868 Ops/s $\color{#35bf28}+1.91\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 0.4901s 29.5880ms 33.7975 Ops/s 49.4698 Ops/s $\textbf{\color{#d91a1a}-31.68\%}$
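
The columns of the table above appear to be related by simple arithmetic: Ops matches the reciprocal of Mean, and Change matches the relative difference between Ops and Ops on Repo HEAD. The sketch below reproduces both relations from the table's own strings. The helper names are hypothetical. The 5% cutoff in `classify` is an assumption inferred from which rows are bolded; it is consistent with the reported totals (20 improved, 9 worsened) but is not confirmed by the benchmark tooling.

```python
import re

# Throughput and time units as they appear in the table.
_OPS_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}
# Accept both the Greek mu (U+03BC) and the micro sign (U+00B5).
_TIME_SCALE = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6}


def parse_ops(text: str) -> float:
    """Convert a string such as '12.3052 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * _OPS_SCALE[unit]


def parse_seconds(text: str) -> float:
    """Convert a string such as '0.1398ms' to seconds."""
    match = re.fullmatch(r"([0-9.]+)([^0-9.\s]+)", text.strip())
    if match is None or match.group(2) not in _TIME_SCALE:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1)) * _TIME_SCALE[match.group(2)]


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return 100.0 * (new - base) / base


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% default is an assumption inferred from the table."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"
```

For the first row, `relative_change("12.3052 KOps/s", "11.4456 KOps/s")` rounds to the reported +7.51%. Rows that mix units, such as 768.3607 Ops/s against 1.1304 KOps/s, round to the reported -32.03% once both values are converted to the same unit.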

github-actions bot commented Jan 31, 2026

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 148. Improved: 16. Worsened: 11.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_tensor_to_bytestream_speed[pickle] 81.5581μs 80.5051μs 12.4216 KOps/s 12.1379 KOps/s $\color{#35bf28}+2.34\%$
test_tensor_to_bytestream_speed[torch.save] 0.1397ms 0.1393ms 7.1779 KOps/s 7.0467 KOps/s $\color{#35bf28}+1.86\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1256s 0.1250s 8.0008 Ops/s 7.8624 Ops/s $\color{#35bf28}+1.76\%$
test_tensor_to_bytestream_speed[numpy] 2.6464μs 2.6400μs 378.7912 KOps/s 379.0361 KOps/s $\color{#d91a1a}-0.06\%$
test_tensor_to_bytestream_speed[safetensors] 37.4231μs 36.8169μs 27.1615 KOps/s 26.6833 KOps/s $\color{#35bf28}+1.79\%$
test_simple 0.9166s 0.8217s 1.2169 Ops/s 1.2197 Ops/s $\color{#d91a1a}-0.23\%$
test_transformed 1.5547s 1.4700s 0.6803 Ops/s 0.6796 Ops/s $\color{#35bf28}+0.10\%$
test_serial 2.4359s 2.3437s 0.4267 Ops/s 0.4282 Ops/s $\color{#d91a1a}-0.35\%$
test_parallel 2.0410s 1.9564s 0.5111 Ops/s 0.5210 Ops/s $\color{#d91a1a}-1.88\%$
test_step_mdp_speed[True-True-True-True-True] 0.1715ms 44.0926μs 22.6796 KOps/s 21.7998 KOps/s $\color{#35bf28}+4.04\%$
test_step_mdp_speed[True-True-True-True-False] 0.4435ms 24.9434μs 40.0907 KOps/s 38.1271 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_step_mdp_speed[True-True-True-False-True] 0.4509ms 25.1979μs 39.6859 KOps/s 38.9472 KOps/s $\color{#35bf28}+1.90\%$
test_step_mdp_speed[True-True-True-False-False] 50.2110μs 13.8763μs 72.0652 KOps/s 69.9434 KOps/s $\color{#35bf28}+3.03\%$
test_step_mdp_speed[True-True-False-True-True] 0.4766ms 47.9839μs 20.8403 KOps/s 20.0561 KOps/s $\color{#35bf28}+3.91\%$
test_step_mdp_speed[True-True-False-True-False] 0.4454ms 27.6916μs 36.1120 KOps/s 34.2172 KOps/s $\textbf{\color{#35bf28}+5.54\%}$
test_step_mdp_speed[True-True-False-False-True] 0.1002ms 26.9061μs 37.1663 KOps/s 34.4791 KOps/s $\textbf{\color{#35bf28}+7.79\%}$
test_step_mdp_speed[True-True-False-False-False] 67.2320μs 16.7312μs 59.7687 KOps/s 57.9905 KOps/s $\color{#35bf28}+3.07\%$
test_step_mdp_speed[True-False-True-True-True] 93.2520μs 50.5535μs 19.7810 KOps/s 18.8420 KOps/s $\color{#35bf28}+4.98\%$
test_step_mdp_speed[True-False-True-True-False] 80.0820μs 30.8144μs 32.4524 KOps/s 30.8588 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_step_mdp_speed[True-False-True-False-True] 63.8420μs 28.1180μs 35.5645 KOps/s 34.3407 KOps/s $\color{#35bf28}+3.56\%$
test_step_mdp_speed[True-False-True-False-False] 54.1310μs 16.7410μs 59.7334 KOps/s 57.8458 KOps/s $\color{#35bf28}+3.26\%$
test_step_mdp_speed[True-False-False-True-True] 0.1303ms 53.7428μs 18.6072 KOps/s 17.9574 KOps/s $\color{#35bf28}+3.62\%$
test_step_mdp_speed[True-False-False-True-False] 70.5120μs 33.6382μs 29.7281 KOps/s 28.1852 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_step_mdp_speed[True-False-False-False-True] 78.8820μs 30.2769μs 33.0284 KOps/s 31.4437 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_step_mdp_speed[True-False-False-False-False] 70.0520μs 19.4166μs 51.5023 KOps/s 49.2761 KOps/s $\color{#35bf28}+4.52\%$
test_step_mdp_speed[False-True-True-True-True] 0.1144ms 50.8556μs 19.6635 KOps/s 19.0559 KOps/s $\color{#35bf28}+3.19\%$
test_step_mdp_speed[False-True-True-True-False] 73.1220μs 30.8639μs 32.4003 KOps/s 30.9556 KOps/s $\color{#35bf28}+4.67\%$
test_step_mdp_speed[False-True-True-False-True] 59.4020μs 32.0817μs 31.1704 KOps/s 30.5342 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[False-True-True-False-False] 50.5020μs 18.5957μs 53.7758 KOps/s 52.3437 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[False-True-False-True-True] 2.7794ms 54.1516μs 18.4667 KOps/s 17.8796 KOps/s $\color{#35bf28}+3.28\%$
test_step_mdp_speed[False-True-False-True-False] 0.1099ms 33.8120μs 29.5753 KOps/s 29.0776 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-True-False-False-True] 69.3810μs 34.5197μs 28.9690 KOps/s 28.4635 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-True-False-False-False] 55.1920μs 21.4143μs 46.6978 KOps/s 45.5346 KOps/s $\color{#35bf28}+2.55\%$
test_step_mdp_speed[False-False-True-True-True] 0.1072ms 56.5659μs 17.6785 KOps/s 17.0490 KOps/s $\color{#35bf28}+3.69\%$
test_step_mdp_speed[False-False-True-True-False] 89.6820μs 36.9483μs 27.0648 KOps/s 26.1920 KOps/s $\color{#35bf28}+3.33\%$
test_step_mdp_speed[False-False-True-False-True] 85.3120μs 34.5634μs 28.9323 KOps/s 28.0233 KOps/s $\color{#35bf28}+3.24\%$
test_step_mdp_speed[False-False-True-False-False] 55.8920μs 21.4635μs 46.5906 KOps/s 46.2184 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-False-False-True-True] 0.1392ms 59.2385μs 16.8809 KOps/s 16.7446 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-False-False-True-False] 77.8420μs 38.7351μs 25.8164 KOps/s 24.9004 KOps/s $\color{#35bf28}+3.68\%$
test_step_mdp_speed[False-False-False-False-True] 78.5120μs 36.6714μs 27.2692 KOps/s 26.6085 KOps/s $\color{#35bf28}+2.48\%$
test_step_mdp_speed[False-False-False-False-False] 47.7410μs 23.4709μs 42.6059 KOps/s 40.5042 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7540s 0.7517s 1.3303 Ops/s 1.2796 Ops/s $\color{#35bf28}+3.97\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7320s 0.6373s 1.5690 Ops/s 1.5563 Ops/s $\color{#35bf28}+0.82\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7610s 1.6869s 0.5928 Ops/s 0.5880 Ops/s $\color{#35bf28}+0.82\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5359s 1.4635s 0.6833 Ops/s 0.6811 Ops/s $\color{#35bf28}+0.32\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0157s 1.9411s 0.5152 Ops/s 0.5124 Ops/s $\color{#35bf28}+0.54\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7957s 1.7194s 0.5816 Ops/s 0.5797 Ops/s $\color{#35bf28}+0.33\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6929s 4.6383s 0.2156 Ops/s 0.2141 Ops/s $\color{#35bf28}+0.71\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5944s 4.5247s 0.2210 Ops/s 0.2227 Ops/s $\color{#d91a1a}-0.74\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0898s 1.9834s 0.5042 Ops/s 0.4870 Ops/s $\color{#35bf28}+3.54\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7474s 1.6941s 0.5903 Ops/s 0.5892 Ops/s $\color{#35bf28}+0.18\%$
test_values[generalized_advantage_estimate-True-True] 20.8397ms 20.3273ms 49.1949 Ops/s 49.6224 Ops/s $\color{#d91a1a}-0.86\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1388s 3.7027ms 270.0718 Ops/s 280.4881 Ops/s $\color{#d91a1a}-3.71\%$
test_values[td0_return_estimate-False-False] 0.1084ms 84.1502μs 11.8835 KOps/s 12.0002 KOps/s $\color{#d91a1a}-0.97\%$
test_values[td1_return_estimate-False-False] 49.1725ms 48.6364ms 20.5607 Ops/s 20.8497 Ops/s $\color{#d91a1a}-1.39\%$
test_values[vec_td1_return_estimate-False-False] 1.3679ms 1.0965ms 911.9669 Ops/s 913.6867 Ops/s $\color{#d91a1a}-0.19\%$
test_values[td_lambda_return_estimate-True-False] 79.6546ms 79.2439ms 12.6193 Ops/s 12.7516 Ops/s $\color{#d91a1a}-1.04\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2813ms 1.0919ms 915.8408 Ops/s 918.0536 Ops/s $\color{#d91a1a}-0.24\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.9000ms 20.7446ms 48.2052 Ops/s 49.1850 Ops/s $\color{#d91a1a}-1.99\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0644ms 0.7667ms 1.3043 KOps/s 1.3059 KOps/s $\color{#d91a1a}-0.13\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7493ms 0.6889ms 1.4516 KOps/s 1.4554 KOps/s $\color{#d91a1a}-0.26\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5671ms 1.5018ms 665.8673 Ops/s 667.7386 Ops/s $\color{#d91a1a}-0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7809ms 0.7087ms 1.4111 KOps/s 1.4207 KOps/s $\color{#d91a1a}-0.68\%$
test_dqn_speed[False-None] 1.9521ms 1.5606ms 640.7767 Ops/s 647.4834 Ops/s $\color{#d91a1a}-1.04\%$
test_dqn_speed[False-backward] 2.5843ms 2.2206ms 450.3322 Ops/s 456.4673 Ops/s $\color{#d91a1a}-1.34\%$
test_dqn_speed[True-None] 0.6269ms 0.5448ms 1.8354 KOps/s 1.7558 KOps/s $\color{#35bf28}+4.54\%$
test_dqn_speed[True-backward] 1.2515ms 1.1981ms 834.6836 Ops/s 915.8194 Ops/s $\textbf{\color{#d91a1a}-8.86\%}$
test_dqn_speed[reduce-overhead-None] 0.6535ms 0.5731ms 1.7449 KOps/s 1.7051 KOps/s $\color{#35bf28}+2.34\%$
test_ddpg_speed[False-None] 3.2955ms 2.9705ms 336.6380 Ops/s 344.5135 Ops/s $\color{#d91a1a}-2.29\%$
test_ddpg_speed[False-backward] 4.6741ms 4.4009ms 227.2272 Ops/s 237.5605 Ops/s $\color{#d91a1a}-4.35\%$
test_ddpg_speed[True-None] 1.3677ms 1.2947ms 772.3572 Ops/s 766.5809 Ops/s $\color{#35bf28}+0.75\%$
test_ddpg_speed[True-backward] 2.5425ms 2.4959ms 400.6491 Ops/s 421.3728 Ops/s $\color{#d91a1a}-4.92\%$
test_ddpg_speed[reduce-overhead-None] 1.4352ms 1.3273ms 753.4069 Ops/s 749.3274 Ops/s $\color{#35bf28}+0.54\%$
test_sac_speed[False-None] 9.0561ms 8.4870ms 117.8267 Ops/s 118.5227 Ops/s $\color{#d91a1a}-0.59\%$
test_sac_speed[False-backward] 12.1972ms 11.7926ms 84.7992 Ops/s 86.7730 Ops/s $\color{#d91a1a}-2.27\%$
test_sac_speed[True-None] 1.8430ms 1.7869ms 559.6426 Ops/s 554.0456 Ops/s $\color{#35bf28}+1.01\%$
test_sac_speed[True-backward] 4.0275ms 3.5924ms 278.3616 Ops/s 277.0079 Ops/s $\color{#35bf28}+0.49\%$
test_sac_speed[reduce-overhead-None] 18.5671ms 10.5085ms 95.1612 Ops/s 95.5687 Ops/s $\color{#d91a1a}-0.43\%$
test_redq_deprec_speed[False-None] 10.0538ms 9.4510ms 105.8093 Ops/s 106.5887 Ops/s $\color{#d91a1a}-0.73\%$
test_redq_deprec_speed[False-backward] 13.3060ms 12.8913ms 77.5720 Ops/s 78.1798 Ops/s $\color{#d91a1a}-0.78\%$
test_redq_deprec_speed[True-None] 2.6428ms 2.5213ms 396.6180 Ops/s 396.5830 Ops/s $+0.01\%$
test_redq_deprec_speed[True-backward] 4.6594ms 4.3029ms 232.3993 Ops/s 232.2511 Ops/s $\color{#35bf28}+0.06\%$
test_redq_deprec_speed[reduce-overhead-None] 15.3378ms 9.4796ms 105.4896 Ops/s 106.3668 Ops/s $\color{#d91a1a}-0.82\%$
test_td3_speed[False-None] 8.4302ms 8.2919ms 120.6003 Ops/s 113.2300 Ops/s $\textbf{\color{#35bf28}+6.51\%}$
test_td3_speed[False-backward] 11.5504ms 10.9791ms 91.0824 Ops/s 90.8724 Ops/s $\color{#35bf28}+0.23\%$
test_td3_speed[True-None] 1.6481ms 1.6190ms 617.6649 Ops/s 618.3034 Ops/s $\color{#d91a1a}-0.10\%$
test_td3_speed[True-backward] 3.3389ms 3.2376ms 308.8731 Ops/s 323.1558 Ops/s $\color{#d91a1a}-4.42\%$
test_td3_speed[reduce-overhead-None] 65.5215ms 23.4054ms 42.7252 Ops/s 42.3845 Ops/s $\color{#35bf28}+0.80\%$
test_cql_speed[False-None] 17.7181ms 17.4685ms 57.2459 Ops/s 57.3807 Ops/s $\color{#d91a1a}-0.23\%$
test_cql_speed[False-backward] 23.5981ms 23.1918ms 43.1187 Ops/s 43.9411 Ops/s $\color{#d91a1a}-1.87\%$
test_cql_speed[True-None] 3.3606ms 3.2206ms 310.5024 Ops/s 306.9750 Ops/s $\color{#35bf28}+1.15\%$
test_cql_speed[True-backward] 6.1648ms 5.4867ms 182.2573 Ops/s 187.1462 Ops/s $\color{#d91a1a}-2.61\%$
test_cql_speed[reduce-overhead-None] 18.6630ms 11.6534ms 85.8120 Ops/s 86.2143 Ops/s $\color{#d91a1a}-0.47\%$
test_a2c_speed[False-None] 4.2769ms 3.2874ms 304.1932 Ops/s 307.0480 Ops/s $\color{#d91a1a}-0.93\%$
test_a2c_speed[False-backward] 6.9635ms 6.5241ms 153.2776 Ops/s 161.8710 Ops/s $\textbf{\color{#d91a1a}-5.31\%}$
test_a2c_speed[True-None] 1.4107ms 1.3302ms 751.7591 Ops/s 750.3018 Ops/s $\color{#35bf28}+0.19\%$
test_a2c_speed[True-backward] 3.2012ms 3.0893ms 323.7018 Ops/s 325.0938 Ops/s $\color{#d91a1a}-0.43\%$
test_a2c_speed[reduce-overhead-None] 1.0637ms 0.9760ms 1.0246 KOps/s 1.0280 KOps/s $\color{#d91a1a}-0.33\%$
test_ppo_speed[False-None] 4.2054ms 3.9644ms 252.2451 Ops/s 255.8457 Ops/s $\color{#d91a1a}-1.41\%$
test_ppo_speed[False-backward] 7.6016ms 7.1263ms 140.3248 Ops/s 137.0989 Ops/s $\color{#35bf28}+2.35\%$
test_ppo_speed[True-None] 1.4688ms 1.4134ms 707.5246 Ops/s 706.0618 Ops/s $\color{#35bf28}+0.21\%$
test_ppo_speed[True-backward] 3.1343ms 3.0649ms 326.2764 Ops/s 310.3274 Ops/s $\textbf{\color{#35bf28}+5.14\%}$
test_ppo_speed[reduce-overhead-None] 1.1162ms 1.0376ms 963.7228 Ops/s 939.8713 Ops/s $\color{#35bf28}+2.54\%$
test_reinforce_speed[False-None] 2.4494ms 2.3160ms 431.7783 Ops/s 431.1735 Ops/s $\color{#35bf28}+0.14\%$
test_reinforce_speed[False-backward] 3.5923ms 3.4728ms 287.9510 Ops/s 298.7569 Ops/s $\color{#d91a1a}-3.62\%$
test_reinforce_speed[True-None] 1.4816ms 1.2570ms 795.5571 Ops/s 771.8023 Ops/s $\color{#35bf28}+3.08\%$
test_reinforce_speed[True-backward] 3.1029ms 3.0188ms 331.2551 Ops/s 345.0182 Ops/s $\color{#d91a1a}-3.99\%$
test_reinforce_speed[reduce-overhead-None] 16.4160ms 9.0390ms 110.6314 Ops/s 97.9703 Ops/s $\textbf{\color{#35bf28}+12.92\%}$
test_iql_speed[False-None] 10.1962ms 9.6368ms 103.7687 Ops/s 104.2507 Ops/s $\color{#d91a1a}-0.46\%$
test_iql_speed[False-backward] 13.6833ms 13.3728ms 74.7784 Ops/s 74.8733 Ops/s $\color{#d91a1a}-0.13\%$
test_iql_speed[True-None] 2.3490ms 2.1575ms 463.5068 Ops/s 465.0545 Ops/s $\color{#d91a1a}-0.33\%$
test_iql_speed[True-backward] 4.8211ms 4.7504ms 210.5084 Ops/s 211.8852 Ops/s $\color{#d91a1a}-0.65\%$
test_iql_speed[reduce-overhead-None] 17.4038ms 10.1819ms 98.2136 Ops/s 98.1124 Ops/s $\color{#35bf28}+0.10\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5835ms 6.1172ms 163.4725 Ops/s 165.0721 Ops/s $\color{#d91a1a}-0.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2785ms 0.3337ms 2.9966 KOps/s 2.9753 KOps/s $\color{#35bf28}+0.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5755ms 0.3202ms 3.1229 KOps/s 3.3074 KOps/s $\textbf{\color{#d91a1a}-5.58\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1327ms 5.8654ms 170.4925 Ops/s 170.1870 Ops/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9143ms 0.3330ms 3.0030 KOps/s 3.2854 KOps/s $\textbf{\color{#d91a1a}-8.59\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6141ms 0.3612ms 2.7688 KOps/s 3.0309 KOps/s $\textbf{\color{#d91a1a}-8.65\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6363ms 1.3166ms 759.5196 Ops/s 752.0115 Ops/s $\color{#35bf28}+1.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6272ms 1.2289ms 813.7035 Ops/s 808.8576 Ops/s $\color{#35bf28}+0.60\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1883ms 5.9995ms 166.6797 Ops/s 166.7788 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4303ms 0.4431ms 2.2567 KOps/s 2.2889 KOps/s $\color{#d91a1a}-1.41\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7648ms 0.4935ms 2.0262 KOps/s 2.3188 KOps/s $\textbf{\color{#d91a1a}-12.62\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1397ms 5.9126ms 169.1292 Ops/s 169.4785 Ops/s $\color{#d91a1a}-0.21\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2305ms 0.3398ms 2.9433 KOps/s 2.7758 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5443ms 0.3104ms 3.2216 KOps/s 2.9535 KOps/s $\textbf{\color{#35bf28}+9.08\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1992ms 5.8634ms 170.5503 Ops/s 172.1209 Ops/s $\color{#d91a1a}-0.91\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3376ms 0.3336ms 2.9979 KOps/s 2.8012 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5090ms 0.3005ms 3.3279 KOps/s 3.0249 KOps/s $\textbf{\color{#35bf28}+10.02\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2464ms 6.0718ms 164.6948 Ops/s 166.1208 Ops/s $\color{#d91a1a}-0.86\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3830ms 0.5206ms 1.9210 KOps/s 2.2443 KOps/s $\textbf{\color{#d91a1a}-14.41\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7498ms 0.4953ms 2.0191 KOps/s 2.1304 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.5139ms 4.9778ms 200.8919 Ops/s 48.5677 Ops/s $\textbf{\color{#35bf28}+313.63\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 5.2413ms 2.1845ms 457.7609 Ops/s 530.5629 Ops/s $\textbf{\color{#d91a1a}-13.72\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.1056ms 1.1566ms 864.6000 Ops/s 1.0740 KOps/s $\textbf{\color{#d91a1a}-19.49\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5893s 16.8078ms 59.4961 Ops/s 193.7643 Ops/s $\textbf{\color{#d91a1a}-69.29\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 4.1068ms 1.8537ms 539.4503 Ops/s 534.4449 Ops/s $\color{#35bf28}+0.94\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.1842ms 0.9387ms 1.0653 KOps/s 763.0568 Ops/s $\textbf{\color{#35bf28}+39.61\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.4684ms 5.2817ms 189.3328 Ops/s 187.4026 Ops/s $\color{#35bf28}+1.03\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.2260ms 2.1392ms 467.4563 Ops/s 490.4473 Ops/s $\color{#d91a1a}-4.69\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2825ms 1.2551ms 796.7641 Ops/s 829.5326 Ops/s $\color{#d91a1a}-3.95\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.8937ms 36.5597ms 27.3526 Ops/s 27.2394 Ops/s $\color{#35bf28}+0.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.5753ms 18.4692ms 54.1442 Ops/s 53.3880 Ops/s $\color{#35bf28}+1.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 41.2624ms 37.6256ms 26.5777 Ops/s 26.1776 Ops/s $\color{#35bf28}+1.53\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.6050ms 18.8010ms 53.1887 Ops/s 52.3327 Ops/s $\color{#35bf28}+1.64\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.9411ms 39.4098ms 25.3744 Ops/s 25.0405 Ops/s $\color{#35bf28}+1.33\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3657ms 20.1006ms 49.7497 Ops/s 49.3154 Ops/s $\color{#35bf28}+0.88\%$
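
For readers checking the rows above by hand, here is a minimal sketch of how the figures appear to relate. It assumes the numeric columns are, in order, max time, mean time, throughput on this PR, throughput on the base commit, and relative change. It also assumes that a row is bolded once the change exceeds roughly 5% in either direction, which is inferred from the rows (-4.92% is plain, +5.14% is bold) and is not a documented rule. The function names are hypothetical.

```python
def ops_per_second(mean_ms: float) -> float:
    """Throughput implied by a mean runtime given in milliseconds."""
    return 1000.0 / mean_ms


def relative_change(ops_pr: float, ops_base: float) -> float:
    """Percentage change of the PR throughput relative to the base throughput."""
    return (ops_pr - ops_base) / ops_base * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption read off the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # test_td3_speed[False-None]: 120.6003 Ops/s vs 113.2300 Ops/s
    change = relative_change(120.6003, 113.2300)
    print(f"{change:+.2f}% -> {classify(change)}")
```

Under these assumptions, the large swings in the `test_rb_populate` rows stand out from the sub-5% noise in most of the other rows.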

github-actions bot added the CI label (Has to do with CI setup, e.g. wheels & builds, tests...) on Feb 2, 2026
vmoens added 11 commits February 3, 2026 09:48
vmoens added a commit that referenced this pull request Feb 3, 2026
Add comprehensive tests for SGLang integration:

TestAsyncSGLangIntegration:
- test_connect_to_server: Verify service connection
- test_get_tp_size: Tensor parallel size retrieval
- test_get_dp_size: Data parallel size retrieval
- test_get_model_metadata: Model metadata extraction
- test_generate_text: Single prompt generation
- test_generate_batch: Batch generation
- test_flush_cache: Cache management

TestSGLangWrapper:
- test_wrapper_creation_from_service: Wrapper initialization
- test_history_mode: History-based input
- test_text_mode: Text-based input
- test_tokens_mode: Token-based input
- test_log_probs: Log probability extraction
- test_get_new_version: Policy version tracking

Tests use Qwen/Qwen2.5-0.5B for faster CI execution.
Markers: pytest.mark.gpu, pytest.mark.slow

ghstack-source-id: 8b8f37e
Co-authored-by: Cursor <cursoragent@cursor.com>
Pull-Request: #3433
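
The commit message names two markers, `pytest.mark.gpu` and `pytest.mark.slow`. As a minimal sketch of how such markers are typically attached to a test class, and not the PR's actual test code, the example below uses a hypothetical placeholder class and a hypothetical helper `marker_names`. It does not touch SGLang or a GPU.

```python
import pytest


# Hypothetical stand-in for a marked test class; the real tests in this PR
# talk to an SGLang server and are not reproduced here.
@pytest.mark.gpu
@pytest.mark.slow
class TestMarkedExample:
    def test_placeholder(self):
        # Placeholder body so the class is collectable by pytest.
        assert True


def marker_names(obj) -> set:
    """Return the names of the pytest marks attached directly to obj."""
    return {mark.name for mark in getattr(obj, "pytestmark", [])}


if __name__ == "__main__":
    print(sorted(marker_names(TestMarkedExample)))
```

With markers like these, a run can be narrowed using pytest's `-m` option, for example `pytest -m "not slow"` to leave out the slow tests. Custom markers are usually registered under the `markers` setting in the pytest configuration so that they do not trigger unknown-marker warnings.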
vmoens merged commit 89678be into gh/vmoens/213/base on Feb 3, 2026
114 of 116 checks passed
vmoens deleted the gh/vmoens/213/head branch on February 3, 2026 15:37
Labels

- Benchmarks: rl/benchmark changes
- CI: Has to do with CI setup (e.g. wheels & builds, tests...)
- CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
- Collectors
- Documentation: Improvements or additions to documentation
- Examples
- llm/: LLM-related PR, triggers LLM CI tests
- Modules
- ReplayBuffers
- sota-implementations/
- Tests: Incomplete or broken unit tests
- WeightUpdate
