[Feature] Auto-batching inference server: multiprocessing transport #3494

Merged
vmoens merged 5 commits into gh/vmoens/236/base from gh/vmoens/236/head
Feb 21, 2026

[ghstack-poisoned]

pytorch-bot bot commented Feb 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3494

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 235ae20 with merge base 266e4aa:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions bot commented Feb 11, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 81.6984μs 81.0913μs 12.3318 KOps/s 12.2990 KOps/s $\color{#35bf28}+0.27\%$
test_tensor_to_bytestream_speed[torch.save] 0.1395ms 0.1393ms 7.1812 KOps/s 6.9913 KOps/s $\color{#35bf28}+2.72\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1070s 0.1069s 9.3503 Ops/s 8.8363 Ops/s $\textbf{\color{#35bf28}+5.82\%}$
test_tensor_to_bytestream_speed[numpy] 2.6738μs 2.6678μs 374.8423 KOps/s 362.5885 KOps/s $\color{#35bf28}+3.38\%$
test_tensor_to_bytestream_speed[safetensors] 37.8537μs 37.4329μs 26.7145 KOps/s 26.3300 KOps/s $\color{#35bf28}+1.46\%$
test_simple 0.7992s 0.7908s 1.2645 Ops/s 1.2209 Ops/s $\color{#35bf28}+3.58\%$
test_transformed 1.3803s 1.3777s 0.7259 Ops/s 0.7059 Ops/s $\color{#35bf28}+2.83\%$
test_serial 2.2807s 2.2790s 0.4388 Ops/s 0.4276 Ops/s $\color{#35bf28}+2.61\%$
test_parallel 1.9016s 1.8273s 0.5473 Ops/s 0.5501 Ops/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[True-True-True-True-True] 0.1126ms 42.4068μs 23.5811 KOps/s 23.6332 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-True-True-True-False] 61.5910μs 24.2606μs 41.2192 KOps/s 41.3529 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-True-False-True] 66.4610μs 24.2822μs 41.1824 KOps/s 41.2940 KOps/s $\color{#d91a1a}-0.27\%$
test_step_mdp_speed[True-True-True-False-False] 38.7300μs 13.1476μs 76.0595 KOps/s 76.1411 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[True-True-False-True-True] 78.3520μs 45.2448μs 22.1020 KOps/s 22.1064 KOps/s $\color{#d91a1a}-0.02\%$
test_step_mdp_speed[True-True-False-True-False] 62.5810μs 26.5234μs 37.7025 KOps/s 38.6149 KOps/s $\color{#d91a1a}-2.36\%$
test_step_mdp_speed[True-True-False-False-True] 56.9400μs 26.8436μs 37.2528 KOps/s 37.8926 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[True-True-False-False-False] 43.8500μs 16.1903μs 61.7652 KOps/s 63.2237 KOps/s $\color{#d91a1a}-2.31\%$
test_step_mdp_speed[True-False-True-True-True] 95.6110μs 49.3658μs 20.2569 KOps/s 20.7775 KOps/s $\color{#d91a1a}-2.51\%$
test_step_mdp_speed[True-False-True-True-False] 57.0810μs 29.3140μs 34.1134 KOps/s 34.0535 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[True-False-True-False-True] 0.2009ms 27.1710μs 36.8040 KOps/s 37.3135 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-False-True-False-False] 45.7310μs 15.8747μs 62.9935 KOps/s 63.3399 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[True-False-False-True-True] 79.2910μs 51.7448μs 19.3256 KOps/s 19.8618 KOps/s $\color{#d91a1a}-2.70\%$
test_step_mdp_speed[True-False-False-True-False] 80.6710μs 31.9638μs 31.2854 KOps/s 31.6967 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-False-False-False-True] 73.6010μs 29.5426μs 33.8495 KOps/s 34.7253 KOps/s $\color{#d91a1a}-2.52\%$
test_step_mdp_speed[True-False-False-False-False] 49.3510μs 18.5445μs 53.9243 KOps/s 54.2037 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[False-True-True-True-True] 0.1575ms 48.3972μs 20.6623 KOps/s 21.3431 KOps/s $\color{#d91a1a}-3.19\%$
test_step_mdp_speed[False-True-True-True-False] 60.1010μs 28.9159μs 34.5830 KOps/s 33.9676 KOps/s $\color{#35bf28}+1.81\%$
test_step_mdp_speed[False-True-True-False-True] 2.4868ms 31.3847μs 31.8627 KOps/s 32.2913 KOps/s $\color{#d91a1a}-1.33\%$
test_step_mdp_speed[False-True-True-False-False] 48.1500μs 18.0408μs 55.4299 KOps/s 56.5561 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-True-False-True-True] 0.1074ms 51.9709μs 19.2415 KOps/s 19.6934 KOps/s $\color{#d91a1a}-2.29\%$
test_step_mdp_speed[False-True-False-True-False] 64.7210μs 32.3724μs 30.8905 KOps/s 31.7629 KOps/s $\color{#d91a1a}-2.75\%$
test_step_mdp_speed[False-True-False-False-True] 61.6410μs 33.4883μs 29.8612 KOps/s 30.8123 KOps/s $\color{#d91a1a}-3.09\%$
test_step_mdp_speed[False-True-False-False-False] 55.9010μs 20.1289μs 49.6798 KOps/s 48.9737 KOps/s $\color{#35bf28}+1.44\%$
test_step_mdp_speed[False-False-True-True-True] 90.1210μs 53.9260μs 18.5439 KOps/s 18.8523 KOps/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[False-False-True-True-False] 62.4600μs 34.9874μs 28.5817 KOps/s 28.5726 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[False-False-True-False-True] 65.5210μs 33.5576μs 29.7995 KOps/s 30.0450 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[False-False-True-False-False] 52.9010μs 20.3624μs 49.1102 KOps/s 50.0320 KOps/s $\color{#d91a1a}-1.84\%$
test_step_mdp_speed[False-False-False-True-True] 91.1010μs 56.5037μs 17.6980 KOps/s 17.8307 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[False-False-False-True-False] 71.7110μs 37.4361μs 26.7122 KOps/s 26.7662 KOps/s $\color{#d91a1a}-0.20\%$
test_step_mdp_speed[False-False-False-False-True] 70.7910μs 35.8755μs 27.8742 KOps/s 28.3485 KOps/s $\color{#d91a1a}-1.67\%$
test_step_mdp_speed[False-False-False-False-False] 49.7710μs 22.4757μs 44.4926 KOps/s 43.8435 KOps/s $\color{#35bf28}+1.48\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8432s 0.7410s 1.3495 Ops/s 1.3512 Ops/s $\color{#d91a1a}-0.13\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7057s 0.6085s 1.6435 Ops/s 1.6589 Ops/s $\color{#d91a1a}-0.93\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7129s 1.6345s 0.6118 Ops/s 0.6125 Ops/s $\color{#d91a1a}-0.10\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4888s 1.4115s 0.7085 Ops/s 0.7124 Ops/s $\color{#d91a1a}-0.55\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9535s 1.8720s 0.5342 Ops/s 0.5334 Ops/s $\color{#35bf28}+0.16\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7318s 1.6531s 0.6049 Ops/s 0.6047 Ops/s $\color{#35bf28}+0.04\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7842s 4.6436s 0.2154 Ops/s 0.2161 Ops/s $\color{#d91a1a}-0.37\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6791s 4.4628s 0.2241 Ops/s 0.2221 Ops/s $\color{#35bf28}+0.90\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9359s 1.8571s 0.5385 Ops/s 0.5369 Ops/s $\color{#35bf28}+0.29\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6892s 1.6248s 0.6155 Ops/s 0.6290 Ops/s $\color{#d91a1a}-2.15\%$
test_values[generalized_advantage_estimate-True-True] 21.2310ms 20.7822ms 48.1180 Ops/s 45.2804 Ops/s $\textbf{\color{#35bf28}+6.27\%}$
test_values[vec_generalized_advantage_estimate-True-True] 0.1242s 3.4107ms 293.1976 Ops/s 271.2696 Ops/s $\textbf{\color{#35bf28}+8.08\%}$
test_values[td0_return_estimate-False-False] 0.1087ms 84.4659μs 11.8391 KOps/s 11.5978 KOps/s $\color{#35bf28}+2.08\%$
test_values[td1_return_estimate-False-False] 49.6761ms 48.8731ms 20.4611 Ops/s 19.5519 Ops/s $\color{#35bf28}+4.65\%$
test_values[vec_td1_return_estimate-False-False] 1.2742ms 1.0922ms 915.6036 Ops/s 907.0584 Ops/s $\color{#35bf28}+0.94\%$
test_values[td_lambda_return_estimate-True-False] 80.5352ms 79.8990ms 12.5158 Ops/s 11.9313 Ops/s $\color{#35bf28}+4.90\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2889ms 1.0897ms 917.6876 Ops/s 908.5494 Ops/s $\color{#35bf28}+1.01\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 21.8258ms 21.0778ms 47.4432 Ops/s 44.2474 Ops/s $\textbf{\color{#35bf28}+7.22\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0301ms 0.7615ms 1.3131 KOps/s 1.2847 KOps/s $\color{#35bf28}+2.21\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8565ms 0.7092ms 1.4100 KOps/s 1.4502 KOps/s $\color{#d91a1a}-2.77\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6170ms 1.4983ms 667.4186 Ops/s 663.4487 Ops/s $\color{#35bf28}+0.60\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8347ms 0.6975ms 1.4338 KOps/s 1.4137 KOps/s $\color{#35bf28}+1.42\%$
test_dqn_speed[False-None] 1.6666ms 1.5349ms 651.4936 Ops/s 645.7258 Ops/s $\color{#35bf28}+0.89\%$
test_dqn_speed[False-backward] 2.2351ms 2.1734ms 460.1153 Ops/s 461.1144 Ops/s $\color{#d91a1a}-0.22\%$
test_dqn_speed[True-None] 0.7194ms 0.5513ms 1.8140 KOps/s 1.6949 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_dqn_speed[True-backward] 1.1714ms 1.0967ms 911.8108 Ops/s 810.0615 Ops/s $\textbf{\color{#35bf28}+12.56\%}$
test_dqn_speed[reduce-overhead-None] 0.7894ms 0.5875ms 1.7023 KOps/s 1.6859 KOps/s $\color{#35bf28}+0.97\%$
test_ddpg_speed[False-None] 3.2301ms 2.8702ms 348.4051 Ops/s 346.9338 Ops/s $\color{#35bf28}+0.42\%$
test_ddpg_speed[False-backward] 4.5901ms 4.1489ms 241.0263 Ops/s 233.3855 Ops/s $\color{#35bf28}+3.27\%$
test_ddpg_speed[True-None] 1.4500ms 1.2949ms 772.2843 Ops/s 746.4764 Ops/s $\color{#35bf28}+3.46\%$
test_ddpg_speed[True-backward] 2.4182ms 2.3362ms 428.0436 Ops/s 394.0161 Ops/s $\textbf{\color{#35bf28}+8.64\%}$
test_ddpg_speed[reduce-overhead-None] 1.4615ms 1.3230ms 755.8470 Ops/s 743.2141 Ops/s $\color{#35bf28}+1.70\%$
test_sac_speed[False-None] 8.8295ms 8.2942ms 120.5655 Ops/s 118.6128 Ops/s $\color{#35bf28}+1.65\%$
test_sac_speed[False-backward] 11.8167ms 11.3387ms 88.1938 Ops/s 84.6822 Ops/s $\color{#35bf28}+4.15\%$
test_sac_speed[True-None] 2.0926ms 1.7832ms 560.7826 Ops/s 550.7390 Ops/s $\color{#35bf28}+1.82\%$
test_sac_speed[True-backward] 3.4647ms 3.3501ms 298.4980 Ops/s 275.8160 Ops/s $\textbf{\color{#35bf28}+8.22\%}$
test_sac_speed[reduce-overhead-None] 19.0589ms 10.8482ms 92.1811 Ops/s 82.8451 Ops/s $\textbf{\color{#35bf28}+11.27\%}$
test_redq_deprec_speed[False-None] 9.8250ms 9.2880ms 107.6652 Ops/s 106.0853 Ops/s $\color{#35bf28}+1.49\%$
test_redq_deprec_speed[False-backward] 12.9581ms 12.4122ms 80.5656 Ops/s 77.7848 Ops/s $\color{#35bf28}+3.58\%$
test_redq_deprec_speed[True-None] 2.9168ms 2.4833ms 402.6965 Ops/s 393.9392 Ops/s $\color{#35bf28}+2.22\%$
test_redq_deprec_speed[True-backward] 4.4607ms 4.0392ms 247.5741 Ops/s 228.7261 Ops/s $\textbf{\color{#35bf28}+8.24\%}$
test_redq_deprec_speed[reduce-overhead-None] 15.9650ms 9.7929ms 102.1147 Ops/s 101.8258 Ops/s $\color{#35bf28}+0.28\%$
test_td3_speed[False-None] 8.3325ms 8.1532ms 122.6506 Ops/s 121.4664 Ops/s $\color{#35bf28}+0.97\%$
test_td3_speed[False-backward] 11.1788ms 10.6048ms 94.2973 Ops/s 91.6575 Ops/s $\color{#35bf28}+2.88\%$
test_td3_speed[True-None] 1.6644ms 1.5878ms 629.7990 Ops/s 611.2631 Ops/s $\color{#35bf28}+3.03\%$
test_td3_speed[True-backward] 3.4084ms 3.0215ms 330.9642 Ops/s 305.3905 Ops/s $\textbf{\color{#35bf28}+8.37\%}$
test_td3_speed[reduce-overhead-None] 47.4224ms 24.2471ms 41.2420 Ops/s 40.3922 Ops/s $\color{#35bf28}+2.10\%$
test_cql_speed[False-None] 18.3372ms 17.2097ms 58.1068 Ops/s 56.4948 Ops/s $\color{#35bf28}+2.85\%$
test_cql_speed[False-backward] 23.0319ms 22.4821ms 44.4799 Ops/s 43.1346 Ops/s $\color{#35bf28}+3.12\%$
test_cql_speed[True-None] 3.2917ms 3.1788ms 314.5842 Ops/s 300.6073 Ops/s $\color{#35bf28}+4.65\%$
test_cql_speed[True-backward] 5.7018ms 5.2414ms 190.7896 Ops/s 181.9183 Ops/s $\color{#35bf28}+4.88\%$
test_cql_speed[reduce-overhead-None] 18.9424ms 11.8304ms 84.5283 Ops/s 84.3903 Ops/s $\color{#35bf28}+0.16\%$
test_a2c_speed[False-None] 3.9105ms 3.2310ms 309.5005 Ops/s 305.1504 Ops/s $\color{#35bf28}+1.43\%$
test_a2c_speed[False-backward] 6.6364ms 6.1362ms 162.9686 Ops/s 153.8129 Ops/s $\textbf{\color{#35bf28}+5.95\%}$
test_a2c_speed[True-None] 1.4611ms 1.2987ms 770.0092 Ops/s 757.9516 Ops/s $\color{#35bf28}+1.59\%$
test_a2c_speed[True-backward] 3.0154ms 2.9065ms 344.0518 Ops/s 322.6143 Ops/s $\textbf{\color{#35bf28}+6.64\%}$
test_a2c_speed[reduce-overhead-None] 1.0474ms 0.9696ms 1.0314 KOps/s 1.0367 KOps/s $\color{#d91a1a}-0.51\%$
test_ppo_speed[False-None] 3.9523ms 3.8560ms 259.3343 Ops/s 257.8064 Ops/s $\color{#35bf28}+0.59\%$
test_ppo_speed[False-backward] 7.4396ms 7.0184ms 142.4822 Ops/s 135.9550 Ops/s $\color{#35bf28}+4.80\%$
test_ppo_speed[True-None] 1.4587ms 1.4024ms 713.0784 Ops/s 713.9899 Ops/s $\color{#d91a1a}-0.13\%$
test_ppo_speed[True-backward] 3.3418ms 3.2291ms 309.6834 Ops/s 302.7290 Ops/s $\color{#35bf28}+2.30\%$
test_ppo_speed[reduce-overhead-None] 1.5877ms 1.0176ms 982.7131 Ops/s 949.1433 Ops/s $\color{#35bf28}+3.54\%$
test_reinforce_speed[False-None] 2.6029ms 2.2661ms 441.2786 Ops/s 434.2835 Ops/s $\color{#35bf28}+1.61\%$
test_reinforce_speed[False-backward] 3.5081ms 3.4100ms 293.2589 Ops/s 298.4721 Ops/s $\color{#d91a1a}-1.75\%$
test_reinforce_speed[True-None] 1.3400ms 1.2468ms 802.0761 Ops/s 794.7850 Ops/s $\color{#35bf28}+0.92\%$
test_reinforce_speed[True-backward] 3.1456ms 3.0155ms 331.6151 Ops/s 337.3217 Ops/s $\color{#d91a1a}-1.69\%$
test_reinforce_speed[reduce-overhead-None] 17.4291ms 9.4700ms 105.5968 Ops/s 104.6324 Ops/s $\color{#35bf28}+0.92\%$
test_iql_speed[False-None] 10.0060ms 9.4121ms 106.2462 Ops/s 105.0294 Ops/s $\color{#35bf28}+1.16\%$
test_iql_speed[False-backward] 13.4931ms 13.2482ms 75.4821 Ops/s 75.5221 Ops/s $\color{#d91a1a}-0.05\%$
test_iql_speed[True-None] 2.2663ms 2.1268ms 470.1859 Ops/s 460.1049 Ops/s $\color{#35bf28}+2.19\%$
test_iql_speed[True-backward] 4.9378ms 4.6166ms 216.6086 Ops/s 202.4598 Ops/s $\textbf{\color{#35bf28}+6.99\%}$
test_iql_speed[reduce-overhead-None] 17.9755ms 10.5386ms 94.8890 Ops/s 95.5233 Ops/s $\color{#d91a1a}-0.66\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3203ms 5.9778ms 167.2858 Ops/s 169.1819 Ops/s $\color{#d91a1a}-1.12\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1836ms 0.3651ms 2.7387 KOps/s 2.7872 KOps/s $\color{#d91a1a}-1.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5628ms 0.3502ms 2.8558 KOps/s 2.9177 KOps/s $\color{#d91a1a}-2.12\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1970ms 5.8018ms 172.3599 Ops/s 172.4648 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7985ms 0.3456ms 2.8938 KOps/s 2.9623 KOps/s $\color{#d91a1a}-2.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5679ms 0.3287ms 3.0419 KOps/s 3.1117 KOps/s $\color{#d91a1a}-2.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6256ms 1.3602ms 735.1595 Ops/s 736.1489 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4964ms 1.2835ms 779.1405 Ops/s 783.7594 Ops/s $\color{#d91a1a}-0.59\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1290ms 5.9914ms 166.9054 Ops/s 168.4886 Ops/s $\color{#d91a1a}-0.94\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0507ms 0.4799ms 2.0837 KOps/s 2.0690 KOps/s $\color{#35bf28}+0.71\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7829ms 0.4642ms 2.1542 KOps/s 2.1451 KOps/s $\color{#35bf28}+0.42\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0848ms 5.9006ms 169.4731 Ops/s 172.0383 Ops/s $\color{#d91a1a}-1.49\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.2367ms 0.3480ms 2.8735 KOps/s 2.8774 KOps/s $\color{#d91a1a}-0.14\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6979ms 0.3317ms 3.0147 KOps/s 3.0889 KOps/s $\color{#d91a1a}-2.40\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0596ms 5.7804ms 172.9993 Ops/s 174.0693 Ops/s $\color{#d91a1a}-0.61\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7195ms 0.3099ms 3.2265 KOps/s 2.7620 KOps/s $\textbf{\color{#35bf28}+16.82\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6385ms 0.3388ms 2.9519 KOps/s 2.8961 KOps/s $\color{#35bf28}+1.93\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3252ms 6.0052ms 166.5214 Ops/s 167.5695 Ops/s $\color{#d91a1a}-0.63\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9213ms 0.4520ms 2.2122 KOps/s 2.0349 KOps/s $\textbf{\color{#35bf28}+8.71\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8195ms 0.4270ms 2.3418 KOps/s 2.1654 KOps/s $\textbf{\color{#35bf28}+8.15\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5774s 16.5467ms 60.4350 Ops/s 51.5123 Ops/s $\textbf{\color{#35bf28}+17.32\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.0457ms 1.8408ms 543.2568 Ops/s 517.0295 Ops/s $\textbf{\color{#35bf28}+5.07\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.1160ms 0.9402ms 1.0637 KOps/s 865.5459 Ops/s $\textbf{\color{#35bf28}+22.89\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1298ms 5.1108ms 195.6657 Ops/s 196.2082 Ops/s $\color{#d91a1a}-0.28\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.8943ms 1.9726ms 506.9374 Ops/s 534.7437 Ops/s $\textbf{\color{#d91a1a}-5.20\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.0595ms 0.9402ms 1.0636 KOps/s 738.5827 Ops/s $\textbf{\color{#35bf28}+44.01\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.9143ms 5.2983ms 188.7401 Ops/s 188.7085 Ops/s $\color{#35bf28}+0.02\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 12.9334ms 2.1547ms 464.1005 Ops/s 462.6309 Ops/s $\color{#35bf28}+0.32\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.0073ms 1.1681ms 856.0807 Ops/s 878.6049 Ops/s $\color{#d91a1a}-2.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.9639ms 36.2877ms 27.5576 Ops/s 26.9928 Ops/s $\color{#35bf28}+2.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.2353ms 18.2610ms 54.7616 Ops/s 53.3689 Ops/s $\color{#35bf28}+2.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.4267ms 37.2859ms 26.8198 Ops/s 26.3246 Ops/s $\color{#35bf28}+1.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.5821ms 18.5273ms 53.9745 Ops/s 52.9846 Ops/s $\color{#35bf28}+1.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.8141ms 38.9836ms 25.6518 Ops/s 25.2190 Ops/s $\color{#35bf28}+1.72\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.4127ms 19.9976ms 50.0060 Ops/s 49.3510 Ops/s $\color{#35bf28}+1.33\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8739ms 0.2157ms 4.6361 KOps/s 4.4845 KOps/s $\color{#35bf28}+3.38\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7062ms 1.4086ms 709.9301 Ops/s 689.7287 Ops/s $\color{#35bf28}+2.93\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.5106ms 2.3214ms 430.7715 Ops/s 430.9876 Ops/s $\color{#d91a1a}-0.05\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0709ms 2.8825ms 346.9197 Ops/s 339.2092 Ops/s $\color{#35bf28}+2.27\%$
test_storage_write_contiguous[50-img_shape0-small] 0.4545ms 0.1643ms 6.0871 KOps/s 6.0532 KOps/s $\color{#35bf28}+0.56\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3820ms 0.2231ms 4.4827 KOps/s 4.1168 KOps/s $\textbf{\color{#35bf28}+8.89\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 2.2220ms 1.7733ms 563.9297 Ops/s 558.5752 Ops/s $\color{#35bf28}+0.96\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5885ms 1.3840ms 722.5394 Ops/s 722.8864 Ops/s $\color{#d91a1a}-0.05\%$
test_collector_stack_then_write[50-img_shape0-small] 1.3708ms 1.1640ms 859.0720 Ops/s 860.6045 Ops/s $\color{#d91a1a}-0.18\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7770ms 3.5892ms 278.6141 Ops/s 274.7593 Ops/s $\color{#35bf28}+1.40\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.0755ms 5.7296ms 174.5324 Ops/s 170.1227 Ops/s $\color{#35bf28}+2.59\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.3431ms 7.0726ms 141.3904 Ops/s 141.5212 Ops/s $\color{#d91a1a}-0.09\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4263ms 0.2724ms 3.6713 KOps/s 3.6139 KOps/s $\color{#35bf28}+1.59\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7474ms 1.5476ms 646.1655 Ops/s 640.3497 Ops/s $\color{#35bf28}+0.91\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.7287ms 2.4251ms 412.3494 Ops/s 408.9371 Ops/s $\color{#35bf28}+0.83\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3418ms 3.1040ms 322.1694 Ops/s 317.4099 Ops/s $\color{#35bf28}+1.50\%$
test_collector_without_rb[100-img_shape0-atari] 33.6624ms 32.7233ms 30.5592 Ops/s 29.9950 Ops/s $\color{#35bf28}+1.88\%$
test_collector_without_rb[200-img_shape1-large_batch] 64.6228ms 64.1721ms 15.5831 Ops/s 15.1113 Ops/s $\color{#35bf28}+3.12\%$
test_collector_with_rb[100-img_shape0-atari] 37.5541ms 37.1449ms 26.9216 Ops/s 26.2504 Ops/s $\color{#35bf28}+2.56\%$
test_collector_with_rb[200-img_shape1-large_batch] 73.4791ms 73.0237ms 13.6942 Ops/s 13.4067 Ops/s $\color{#35bf28}+2.14\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 57.0483ms 55.3406ms 18.0699 Ops/s 17.7549 Ops/s $\color{#35bf28}+1.77\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1122s 0.1101s 9.0810 Ops/s 8.9438 Ops/s $\color{#35bf28}+1.53\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 58.6898ms 57.1280ms 17.5045 Ops/s 17.1510 Ops/s $\color{#35bf28}+2.06\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1136s 0.1130s 8.8457 Ops/s 8.6264 Ops/s $\color{#35bf28}+2.54\%$
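
For anyone checking these numbers by hand: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column, after converting both to the same unit. The bold entries, and the Improved/Worsened totals, look consistent with a cutoff of 5% in either direction, but that cutoff is inferred from the rows above and is not stated by the bot. The sketch below uses hypothetical helper names (`to_ops_per_sec`, `percent_change`, `classify`) and is not the benchmark workflow's actual code.

```python
# Scale factors for the two throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_sec(value: float, unit: str) -> float:
    """Convert a (value, unit) reading from the table to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def percent_change(pr: tuple, head: tuple) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent."""
    pr_ops = to_ops_per_sec(*pr)
    head_ops = to_ops_per_sec(*head)
    return (pr_ops - head_ops) / head_ops * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Bucket a change; the 5% threshold is an assumption inferred from the table."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# Readings copied from three rows of the GPU table above.
pickle_row = percent_change((12.3318, "KOps/s"), (12.2990, "KOps/s"))    # table: +0.27%
mixed_units_row = percent_change((1.0637, "KOps/s"), (865.5459, "Ops/s"))  # table: +22.89%
regressed_row = percent_change((506.9374, "Ops/s"), (534.7437, "Ops/s"))   # table: -5.20%

for change in (pickle_row, mixed_units_row, regressed_row):
    print(f"{change:+.2f}% -> {classify(change)}")
```

Note the second row mixes KOps/s and Ops/s, so the unit conversion has to happen before the ratio is taken.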

github-actions bot commented Feb 11, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}21$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 79.4537μs 78.7098μs 12.7049 KOps/s 12.4578 KOps/s $\color{#35bf28}+1.98\%$
test_tensor_to_bytestream_speed[torch.save] 0.1456ms 0.1414ms 7.0735 KOps/s 7.2264 KOps/s $\color{#d91a1a}-2.12\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1044s 0.1042s 9.5987 Ops/s 9.1446 Ops/s $\color{#35bf28}+4.97\%$
test_tensor_to_bytestream_speed[numpy] 2.4594μs 2.4533μs 407.6207 KOps/s 393.9030 KOps/s $\color{#35bf28}+3.48\%$
test_tensor_to_bytestream_speed[safetensors] 36.8559μs 36.6201μs 27.3074 KOps/s 27.5922 KOps/s $\color{#d91a1a}-1.03\%$
test_simple 0.5309s 0.5304s 1.8852 Ops/s 1.7407 Ops/s $\textbf{\color{#35bf28}+8.30\%}$
test_transformed 1.0601s 1.0571s 0.9460 Ops/s 0.8954 Ops/s $\textbf{\color{#35bf28}+5.64\%}$
test_serial 1.6238s 1.6216s 0.6167 Ops/s 0.5924 Ops/s $\color{#35bf28}+4.10\%$
test_parallel 1.1109s 1.0172s 0.9831 Ops/s 0.9667 Ops/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[True-True-True-True-True] 0.2449ms 40.0677μs 24.9578 KOps/s 24.0462 KOps/s $\color{#35bf28}+3.79\%$
test_step_mdp_speed[True-True-True-True-False] 57.4610μs 23.0839μs 43.3202 KOps/s 43.0798 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-True-True-False-True] 64.4120μs 23.0641μs 43.3575 KOps/s 42.7293 KOps/s $\color{#35bf28}+1.47\%$
test_step_mdp_speed[True-True-True-False-False] 43.8010μs 12.7154μs 78.6445 KOps/s 78.1224 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-True-False-True-True] 94.2130μs 43.5474μs 22.9635 KOps/s 22.8279 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[True-True-False-True-False] 94.1620μs 24.8670μs 40.2139 KOps/s 38.8631 KOps/s $\color{#35bf28}+3.48\%$
test_step_mdp_speed[True-True-False-False-True] 61.2010μs 25.5838μs 39.0873 KOps/s 38.2441 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[True-True-False-False-False] 45.9620μs 15.2931μs 65.3891 KOps/s 64.9731 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[True-False-True-True-True] 84.1420μs 47.4816μs 21.0608 KOps/s 21.1243 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-False-True-True-False] 63.7820μs 28.7555μs 34.7760 KOps/s 34.9259 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-False-True-False-True] 60.1420μs 26.1355μs 38.2622 KOps/s 38.6993 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[True-False-True-False-False] 50.2720μs 15.3804μs 65.0176 KOps/s 64.7492 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[True-False-False-True-True] 0.1011ms 49.3376μs 20.2685 KOps/s 20.1343 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-False-True-False] 67.8610μs 30.7163μs 32.5560 KOps/s 32.4292 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[True-False-False-False-True] 67.4010μs 28.2137μs 35.4437 KOps/s 35.5089 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[True-False-False-False-False] 51.0010μs 17.8103μs 56.1473 KOps/s 55.8573 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-True-True-True-True] 81.1320μs 46.8793μs 21.3314 KOps/s 21.2434 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[False-True-True-True-False] 55.9310μs 27.8373μs 35.9231 KOps/s 35.3696 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[False-True-True-False-True] 2.5195ms 29.8785μs 33.4689 KOps/s 33.4171 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-True-True-False-False] 46.5310μs 16.9333μs 59.0552 KOps/s 58.1335 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[False-True-False-True-True] 89.8620μs 49.6522μs 20.1401 KOps/s 20.4243 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[False-True-False-True-False] 59.1410μs 30.8738μs 32.3899 KOps/s 32.3171 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-True-False-False-True] 69.7420μs 31.3016μs 31.9472 KOps/s 31.4185 KOps/s $\color{#35bf28}+1.68\%$
test_step_mdp_speed[False-True-False-False-False] 54.3220μs 19.3997μs 51.5472 KOps/s 51.0130 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-False-True-True-True] 86.5120μs 51.6481μs 19.3618 KOps/s 18.9566 KOps/s $\color{#35bf28}+2.14\%$
test_step_mdp_speed[False-False-True-True-False] 67.1310μs 33.1309μs 30.1833 KOps/s 29.2220 KOps/s $\color{#35bf28}+3.29\%$
test_step_mdp_speed[False-False-True-False-True] 68.8720μs 31.5735μs 31.6721 KOps/s 30.8757 KOps/s $\color{#35bf28}+2.58\%$
test_step_mdp_speed[False-False-True-False-False] 50.5810μs 19.4377μs 51.4465 KOps/s 50.2721 KOps/s $\color{#35bf28}+2.34\%$
test_step_mdp_speed[False-False-False-True-True] 90.8720μs 53.2019μs 18.7963 KOps/s 18.1409 KOps/s $\color{#35bf28}+3.61\%$
test_step_mdp_speed[False-False-False-True-False] 67.9520μs 35.4903μs 28.1767 KOps/s 27.3291 KOps/s $\color{#35bf28}+3.10\%$
test_step_mdp_speed[False-False-False-False-True] 61.6120μs 33.9032μs 29.4958 KOps/s 29.2441 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-False-False-False-False] 64.5320μs 21.5835μs 46.3317 KOps/s 46.5725 KOps/s $\color{#d91a1a}-0.52\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7043s 0.7029s 1.4226 Ops/s 1.3677 Ops/s $\color{#35bf28}+4.01\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.6902s 0.5919s 1.6894 Ops/s 1.6707 Ops/s $\color{#35bf28}+1.12\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.6779s 1.5977s 0.6259 Ops/s 0.6253 Ops/s $\color{#35bf28}+0.10\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4568s 1.3767s 0.7264 Ops/s 0.7195 Ops/s $\color{#35bf28}+0.95\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9192s 1.8389s 0.5438 Ops/s 0.5431 Ops/s $\color{#35bf28}+0.14\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.6950s 1.6157s 0.6189 Ops/s 0.6148 Ops/s $\color{#35bf28}+0.67\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6232s 4.5588s 0.2194 Ops/s 0.2197 Ops/s $\color{#d91a1a}-0.14\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5350s 4.3751s 0.2286 Ops/s 0.2236 Ops/s $\color{#35bf28}+2.21\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9384s 1.8396s 0.5436 Ops/s 0.5418 Ops/s $\color{#35bf28}+0.34\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6843s 1.5668s 0.6382 Ops/s 0.6525 Ops/s $\color{#d91a1a}-2.19\%$
test_values[generalized_advantage_estimate-True-True] 10.1145ms 9.9321ms 100.6834 Ops/s 97.8628 Ops/s $\color{#35bf28}+2.88\%$
test_values[vec_generalized_advantage_estimate-True-True] 19.5011ms 17.5525ms 56.9719 Ops/s 89.3210 Ops/s $\textbf{\color{#d91a1a}-36.22\%}$
test_values[td0_return_estimate-False-False] 0.2276ms 0.1292ms 7.7394 KOps/s 7.7419 KOps/s $\color{#d91a1a}-0.03\%$
test_values[td1_return_estimate-False-False] 27.6417ms 26.9589ms 37.0934 Ops/s 34.3744 Ops/s $\textbf{\color{#35bf28}+7.91\%}$
test_values[vec_td1_return_estimate-False-False] 18.2146ms 17.6117ms 56.7804 Ops/s 89.8150 Ops/s $\textbf{\color{#d91a1a}-36.78\%}$
test_values[td_lambda_return_estimate-True-False] 41.3000ms 39.6560ms 25.2169 Ops/s 23.0402 Ops/s $\textbf{\color{#35bf28}+9.45\%}$
test_values[vec_td_lambda_return_estimate-True-False] 18.1755ms 17.5577ms 56.9549 Ops/s 89.8084 Ops/s $\textbf{\color{#d91a1a}-36.58\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.0823ms 8.7738ms 113.9751 Ops/s 109.4389 Ops/s $\color{#35bf28}+4.14\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8276ms 1.4739ms 678.4766 Ops/s 640.7755 Ops/s $\textbf{\color{#35bf28}+5.88\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4703ms 0.4235ms 2.3612 KOps/s 2.2989 KOps/s $\color{#35bf28}+2.71\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 34.9691ms 34.5802ms 28.9183 Ops/s 33.5624 Ops/s $\textbf{\color{#d91a1a}-13.84\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.9882ms 1.6838ms 593.8915 Ops/s 585.6380 Ops/s $\color{#35bf28}+1.41\%$
test_dqn_speed[False-None] 1.5090ms 1.3764ms 726.5256 Ops/s 722.2693 Ops/s $\color{#35bf28}+0.59\%$
test_dqn_speed[False-backward] 2.0016ms 1.9006ms 526.1535 Ops/s 530.9298 Ops/s $\color{#d91a1a}-0.90\%$
test_dqn_speed[True-None] 0.7979ms 0.5337ms 1.8739 KOps/s 1.8533 KOps/s $\color{#35bf28}+1.11\%$
test_dqn_speed[True-backward] 1.0278ms 0.9897ms 1.0104 KOps/s 965.1011 Ops/s $\color{#35bf28}+4.69\%$
test_dqn_speed[reduce-overhead-None] 0.7660ms 0.5204ms 1.9217 KOps/s 1.8193 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_ddpg_speed[False-None] 3.1341ms 2.7856ms 358.9894 Ops/s 347.9638 Ops/s $\color{#35bf28}+3.17\%$
test_ddpg_speed[False-backward] 4.0906ms 3.9963ms 250.2296 Ops/s 243.3362 Ops/s $\color{#35bf28}+2.83\%$
test_ddpg_speed[True-None] 1.7405ms 1.3787ms 725.3179 Ops/s 686.7600 Ops/s $\textbf{\color{#35bf28}+5.61\%}$
test_ddpg_speed[True-backward] 2.4270ms 2.3615ms 423.4580 Ops/s 353.1400 Ops/s $\textbf{\color{#35bf28}+19.91\%}$
test_ddpg_speed[reduce-overhead-None] 1.5560ms 1.3804ms 724.4246 Ops/s 722.8298 Ops/s $\color{#35bf28}+0.22\%$
test_sac_speed[False-None] 8.4465ms 7.8531ms 127.3382 Ops/s 126.1877 Ops/s $\color{#35bf28}+0.91\%$
test_sac_speed[False-backward] 11.6764ms 11.1074ms 90.0302 Ops/s 89.5636 Ops/s $\color{#35bf28}+0.52\%$
test_sac_speed[True-None] 2.2487ms 2.1029ms 475.5259 Ops/s 469.4437 Ops/s $\color{#35bf28}+1.30\%$
test_sac_speed[True-backward] 3.9984ms 3.9070ms 255.9538 Ops/s 227.0853 Ops/s $\textbf{\color{#35bf28}+12.71\%}$
test_sac_speed[reduce-overhead-None] 2.3334ms 2.0791ms 480.9873 Ops/s 472.6477 Ops/s $\color{#35bf28}+1.76\%$
test_redq_speed[False-None] 14.9457ms 10.3273ms 96.8303 Ops/s 96.1515 Ops/s $\color{#35bf28}+0.71\%$
test_redq_speed[False-backward] 18.2409ms 17.5713ms 56.9109 Ops/s 56.5855 Ops/s $\color{#35bf28}+0.57\%$
test_redq_speed[True-None] 4.4481ms 4.2061ms 237.7474 Ops/s 227.7007 Ops/s $\color{#35bf28}+4.41\%$
test_redq_speed[True-backward] 9.8877ms 9.4560ms 105.7532 Ops/s 107.0839 Ops/s $\color{#d91a1a}-1.24\%$
test_redq_speed[reduce-overhead-None] 4.4788ms 4.1771ms 239.4006 Ops/s 240.9211 Ops/s $\color{#d91a1a}-0.63\%$
test_redq_deprec_speed[False-None] 11.4010ms 10.8795ms 91.9157 Ops/s 93.1682 Ops/s $\color{#d91a1a}-1.34\%$
test_redq_deprec_speed[False-backward] 16.4032ms 15.6134ms 64.0477 Ops/s 65.6625 Ops/s $\color{#d91a1a}-2.46\%$
test_redq_deprec_speed[True-None] 3.8586ms 3.4774ms 287.5693 Ops/s 266.8726 Ops/s $\textbf{\color{#35bf28}+7.76\%}$
test_redq_deprec_speed[True-backward] 7.6198ms 7.4040ms 135.0629 Ops/s 117.7838 Ops/s $\textbf{\color{#35bf28}+14.67\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.9075ms 3.4622ms 288.8365 Ops/s 286.4729 Ops/s $\color{#35bf28}+0.83\%$
test_td3_speed[False-None] 8.1060ms 7.9082ms 126.4510 Ops/s 126.1058 Ops/s $\color{#35bf28}+0.27\%$
test_td3_speed[False-backward] 11.1351ms 10.7369ms 93.1365 Ops/s 92.9193 Ops/s $\color{#35bf28}+0.23\%$
test_td3_speed[True-None] 1.8067ms 1.7715ms 564.5048 Ops/s 567.1051 Ops/s $\color{#d91a1a}-0.46\%$
test_td3_speed[True-backward] 3.6393ms 3.4992ms 285.7788 Ops/s 279.3081 Ops/s $\color{#35bf28}+2.32\%$
test_td3_speed[reduce-overhead-None] 1.7731ms 1.7262ms 579.3135 Ops/s 569.4841 Ops/s $\color{#35bf28}+1.73\%$
test_cql_speed[False-None] 29.8726ms 26.0320ms 38.4143 Ops/s 38.6140 Ops/s $\color{#d91a1a}-0.52\%$
test_cql_speed[False-backward] 39.4569ms 35.3331ms 28.3020 Ops/s 28.4336 Ops/s $\color{#d91a1a}-0.46\%$
test_cql_speed[True-None] 12.8743ms 12.1436ms 82.3481 Ops/s 79.4448 Ops/s $\color{#35bf28}+3.65\%$
test_cql_speed[True-backward] 18.4161ms 17.9690ms 55.6514 Ops/s 55.7357 Ops/s $\color{#d91a1a}-0.15\%$
test_cql_speed[reduce-overhead-None] 12.7110ms 12.1802ms 82.1005 Ops/s 82.4286 Ops/s $\color{#d91a1a}-0.40\%$
test_a2c_speed[False-None] 5.7716ms 5.4705ms 182.7992 Ops/s 184.1291 Ops/s $\color{#d91a1a}-0.72\%$
test_a2c_speed[False-backward] 12.5435ms 11.8677ms 84.2624 Ops/s 84.7085 Ops/s $\color{#d91a1a}-0.53\%$
test_a2c_speed[True-None] 4.0260ms 3.6829ms 271.5252 Ops/s 267.0515 Ops/s $\color{#35bf28}+1.68\%$
test_a2c_speed[True-backward] 8.7271ms 8.4747ms 117.9984 Ops/s 118.6877 Ops/s $\color{#d91a1a}-0.58\%$
test_a2c_speed[reduce-overhead-None] 4.0619ms 3.6616ms 273.1021 Ops/s 276.9223 Ops/s $\color{#d91a1a}-1.38\%$
test_ppo_speed[False-None] 6.1727ms 5.9003ms 169.4838 Ops/s 171.1153 Ops/s $\color{#d91a1a}-0.95\%$
test_ppo_speed[False-backward] 12.8825ms 12.4948ms 80.0333 Ops/s 82.1090 Ops/s $\color{#d91a1a}-2.53\%$
test_ppo_speed[True-None] 4.0537ms 3.5853ms 278.9142 Ops/s 274.1622 Ops/s $\color{#35bf28}+1.73\%$
test_ppo_speed[True-backward] 8.6017ms 8.2965ms 120.5329 Ops/s 120.7641 Ops/s $\color{#d91a1a}-0.19\%$
test_ppo_speed[reduce-overhead-None] 3.9333ms 3.5718ms 279.9731 Ops/s 277.5062 Ops/s $\color{#35bf28}+0.89\%$
test_reinforce_speed[False-None] 4.7887ms 4.5222ms 221.1318 Ops/s 221.8858 Ops/s $\color{#d91a1a}-0.34\%$
test_reinforce_speed[False-backward] 7.5979ms 7.3411ms 136.2200 Ops/s 137.5379 Ops/s $\color{#d91a1a}-0.96\%$
test_reinforce_speed[True-None] 3.2099ms 2.8181ms 354.8464 Ops/s 352.7657 Ops/s $\color{#35bf28}+0.59\%$
test_reinforce_speed[True-backward] 7.8552ms 7.5960ms 131.6476 Ops/s 126.9480 Ops/s $\color{#35bf28}+3.70\%$
test_reinforce_speed[reduce-overhead-None] 3.0063ms 2.7869ms 358.8201 Ops/s 358.0973 Ops/s $\color{#35bf28}+0.20\%$
test_iql_speed[False-None] 25.4600ms 20.0116ms 49.9710 Ops/s 50.5001 Ops/s $\color{#d91a1a}-1.05\%$
test_iql_speed[False-backward] 36.2424ms 30.1424ms 33.1758 Ops/s 33.2433 Ops/s $\color{#d91a1a}-0.20\%$
test_iql_speed[True-None] 8.5773ms 8.2596ms 121.0712 Ops/s 119.3831 Ops/s $\color{#35bf28}+1.41\%$
test_iql_speed[True-backward] 16.7534ms 16.4354ms 60.8441 Ops/s 61.4842 Ops/s $\color{#d91a1a}-1.04\%$
test_iql_speed[reduce-overhead-None] 9.0243ms 8.3150ms 120.2646 Ops/s 119.9340 Ops/s $\color{#35bf28}+0.28\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9831ms 5.8463ms 171.0470 Ops/s 169.8722 Ops/s $\color{#35bf28}+0.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8895ms 0.3336ms 2.9977 KOps/s 2.7155 KOps/s $\textbf{\color{#35bf28}+10.39\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6954ms 0.3165ms 3.1596 KOps/s 2.8190 KOps/s $\textbf{\color{#35bf28}+12.08\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.8383ms 5.5871ms 178.9840 Ops/s 177.9821 Ops/s $\color{#35bf28}+0.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7086ms 0.3458ms 2.8922 KOps/s 3.0367 KOps/s $\color{#d91a1a}-4.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7720ms 0.3505ms 2.8529 KOps/s 3.4299 KOps/s $\textbf{\color{#d91a1a}-16.82\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7186ms 1.3823ms 723.4522 Ops/s 707.4495 Ops/s $\color{#35bf28}+2.26\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5617ms 1.2949ms 772.2598 Ops/s 746.5567 Ops/s $\color{#35bf28}+3.44\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.9639ms 5.8878ms 169.8424 Ops/s 175.2442 Ops/s $\color{#d91a1a}-3.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0468ms 0.5246ms 1.9060 KOps/s 2.0921 KOps/s $\textbf{\color{#d91a1a}-8.89\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7669ms 0.4900ms 2.0408 KOps/s 2.1702 KOps/s $\textbf{\color{#d91a1a}-5.96\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9091ms 5.6375ms 177.3826 Ops/s 178.6654 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9413ms 0.2776ms 3.6029 KOps/s 2.6802 KOps/s $\textbf{\color{#35bf28}+34.43\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4595ms 0.2582ms 3.8724 KOps/s 2.8771 KOps/s $\textbf{\color{#35bf28}+34.59\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0138ms 5.5979ms 178.6397 Ops/s 176.8577 Ops/s $\color{#35bf28}+1.01\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0198ms 0.2713ms 3.6859 KOps/s 2.7258 KOps/s $\textbf{\color{#35bf28}+35.22\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4583ms 0.2541ms 3.9354 KOps/s 2.8962 KOps/s $\textbf{\color{#35bf28}+35.88\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.9482ms 5.7555ms 173.7468 Ops/s 172.1764 Ops/s $\color{#35bf28}+0.91\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9518ms 0.4421ms 2.2620 KOps/s 1.9278 KOps/s $\textbf{\color{#35bf28}+17.33\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8557ms 0.4207ms 2.3771 KOps/s 1.9938 KOps/s $\textbf{\color{#35bf28}+19.22\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.2755ms 4.8943ms 204.3205 Ops/s 204.1810 Ops/s $\color{#35bf28}+0.07\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 11.1383ms 2.2471ms 445.0190 Ops/s 483.3223 Ops/s $\textbf{\color{#d91a1a}-7.93\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.9897ms 0.8620ms 1.1601 KOps/s 1.2004 KOps/s $\color{#d91a1a}-3.36\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5445s 15.7702ms 63.4106 Ops/s 59.3966 Ops/s $\textbf{\color{#35bf28}+6.76\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9097ms 1.8140ms 551.2544 Ops/s 550.0501 Ops/s $\color{#35bf28}+0.22\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.9921ms 1.0513ms 951.1864 Ops/s 966.7398 Ops/s $\color{#d91a1a}-1.61\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.0545ms 5.1482ms 194.2435 Ops/s 191.7802 Ops/s $\color{#35bf28}+1.28\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.1016ms 1.9309ms 517.8840 Ops/s 526.6383 Ops/s $\color{#d91a1a}-1.66\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.1714ms 1.0055ms 994.5207 Ops/s 1.0049 KOps/s $\color{#d91a1a}-1.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.7362ms 35.9828ms 27.7910 Ops/s 27.8242 Ops/s $\color{#d91a1a}-0.12\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.1261ms 18.4292ms 54.2616 Ops/s 55.7082 Ops/s $\color{#d91a1a}-2.60\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 39.5290ms 36.8718ms 27.1210 Ops/s 26.7850 Ops/s $\color{#35bf28}+1.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.3472ms 18.6693ms 53.5639 Ops/s 53.7309 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 41.1749ms 38.4001ms 26.0416 Ops/s 25.6042 Ops/s $\color{#35bf28}+1.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.8379ms 20.2860ms 49.2952 Ops/s 49.4992 Ops/s $\color{#d91a1a}-0.41\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8553ms 0.2222ms 4.5005 KOps/s 4.5109 KOps/s $\color{#d91a1a}-0.23\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.6950ms 1.3995ms 714.5305 Ops/s 700.4506 Ops/s $\color{#35bf28}+2.01\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.5630ms 2.3104ms 432.8235 Ops/s 413.3654 Ops/s $\color{#35bf28}+4.71\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0738ms 2.9173ms 342.7854 Ops/s 337.2190 Ops/s $\color{#35bf28}+1.65\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2150ms 0.1366ms 7.3189 KOps/s 7.5683 KOps/s $\color{#d91a1a}-3.30\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3422ms 0.1753ms 5.7052 KOps/s 5.2431 KOps/s $\textbf{\color{#35bf28}+8.81\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9025ms 1.7510ms 571.1053 Ops/s 568.0457 Ops/s $\color{#35bf28}+0.54\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.4484ms 1.3215ms 756.7333 Ops/s 778.3963 Ops/s $\color{#d91a1a}-2.78\%$
test_collector_stack_then_write[50-img_shape0-small] 1.3036ms 1.0829ms 923.4410 Ops/s 919.5926 Ops/s $\color{#35bf28}+0.42\%$
test_collector_stack_then_write[100-img_shape1-atari] 7.4386ms 3.5327ms 283.0731 Ops/s 277.2688 Ops/s $\color{#35bf28}+2.09\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.0216ms 5.6387ms 177.3462 Ops/s 177.3223 Ops/s $\color{#35bf28}+0.01\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 14.8775ms 6.9089ms 144.7407 Ops/s 143.8556 Ops/s $\color{#35bf28}+0.62\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4356ms 0.2701ms 3.7030 KOps/s 3.6372 KOps/s $\color{#35bf28}+1.81\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6427ms 1.4997ms 666.7905 Ops/s 659.1662 Ops/s $\color{#35bf28}+1.16\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.6018ms 2.4333ms 410.9697 Ops/s 395.0385 Ops/s $\color{#35bf28}+4.03\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3096ms 3.1326ms 319.2267 Ops/s 315.1896 Ops/s $\color{#35bf28}+1.28\%$
test_collector_without_rb[100-img_shape0-atari] 32.4169ms 31.9569ms 31.2922 Ops/s 30.1646 Ops/s $\color{#35bf28}+3.74\%$
test_collector_without_rb[200-img_shape1-large_batch] 63.5657ms 63.2231ms 15.8170 Ops/s 15.5658 Ops/s $\color{#35bf28}+1.61\%$
test_collector_with_rb[100-img_shape0-atari] 37.5730ms 36.4809ms 27.4116 Ops/s 26.6444 Ops/s $\color{#35bf28}+2.88\%$
test_collector_with_rb[200-img_shape1-large_batch] 71.6802ms 70.9959ms 14.0853 Ops/s 13.6700 Ops/s $\color{#35bf28}+3.04\%$
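
The Change column is not defined in the bot comment. Judging from the rows above, it appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces one row under that assumption; `percent_change` is a hypothetical helper, not part of the benchmark tooling.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent.

    Both arguments must be in the same unit (convert KOps/s to Ops/s first).
    """
    return (ops_pr / ops_head - 1.0) * 100.0


# Row test_values[vec_td_lambda_return_estimate-True-False]:
# 56.9549 Ops/s on the PR versus 89.8084 Ops/s on HEAD.
change = percent_change(56.9549, 89.8084)
print(f"{change:+.2f}%")  # -36.58%
```

Rows that mix units, such as `test_dqn_speed[True-backward]` (1.0104 KOps/s versus 965.1011 Ops/s), need the KOps/s value multiplied by 1000 before the comparison.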

vmoens added a commit that referenced this pull request Feb 21, 2026
…3494)

Adds MPTransport using per-actor response queues for cross-process
communication. Clients must be created before spawning child processes
so that mp.Queue objects are inherited, not serialised.

Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: b22e6e7
Pull-Request: #3494
Co-authored-by: Cursor <cursoragent@cursor.com>
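
The commit message above describes per-actor response queues that are created before the child processes are started. The following is a minimal sketch of that pattern using only the standard library. It does not use the actual `MPTransport` API; the names `actor`, `serve` and `run_demo`, the batching parameters, and the arithmetic stand-in for a model are all hypothetical. It also assumes a POSIX platform where the `fork` start method is available, so that children inherit the queues.

```python
import multiprocessing as mp
import queue


def actor(actor_id, request_q, response_q, done_q, n_requests):
    """Child process: send requests, wait on this actor's own response queue."""
    results = []
    for step in range(n_requests):
        request_q.put((actor_id, step))
        results.append(response_q.get(timeout=10))
    done_q.put((actor_id, results))


def serve(request_q, response_qs, total, max_batch=4, timeout=0.05):
    """Parent-side loop: gather a batch, process it, route each reply."""
    handled = 0
    while handled < total:
        batch = [request_q.get(timeout=10)]  # block for the first request
        while len(batch) < max_batch:
            try:
                batch.append(request_q.get(timeout=timeout))
            except queue.Empty:
                break  # no more requests arrived in time; run a partial batch
        # Stand-in for one batched forward pass of a model (assumption).
        outputs = [actor_id * 100 + payload for actor_id, payload in batch]
        for (actor_id, _), out in zip(batch, outputs):
            response_qs[actor_id].put(out)  # reply only to the requester
        handled += len(batch)


def run_demo(n_actors=3, n_requests=2):
    ctx = mp.get_context("fork")  # assumption: POSIX, children inherit queues
    # All queues exist before any child process is started.
    request_q = ctx.Queue()
    done_q = ctx.Queue()
    response_qs = {i: ctx.Queue() for i in range(n_actors)}
    procs = [
        ctx.Process(
            target=actor,
            args=(i, request_q, response_qs[i], done_q, n_requests),
        )
        for i in range(n_actors)
    ]
    for p in procs:
        p.start()
    serve(request_q, response_qs, total=n_actors * n_requests)
    results = dict(done_q.get(timeout=10) for _ in procs)
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

A single shared request queue lets the server form batches across actors, while one response queue per actor means a reply can never be picked up by the wrong process. Creating the queues first matters because `multiprocessing.Queue` objects can only be handed to a child at process creation time; putting one onto another queue afterwards raises a `RuntimeError`.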
@vmoens vmoens merged commit 235ae20 into gh/vmoens/236/base Feb 21, 2026
115 of 116 checks passed
@vmoens vmoens deleted the gh/vmoens/236/head branch February 21, 2026 21:12

Labels

CLA Signed, Feature, Modules
