[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Feb 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3480

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Pending

As of commit 51cbfe7 with merge base 0bc6d20:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}21$. Worsened: $\large\color{#d91a1a}15$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 86.6972μs 83.0081μs 12.0470 KOps/s 12.5204 KOps/s $\color{#d91a1a}-3.78\%$
test_tensor_to_bytestream_speed[torch.save] 0.1480ms 0.1472ms 6.7943 KOps/s 7.1902 KOps/s $\textbf{\color{#d91a1a}-5.51\%}$
test_tensor_to_bytestream_speed[untyped_storage] 0.1179s 0.1177s 8.4990 Ops/s 9.1309 Ops/s $\textbf{\color{#d91a1a}-6.92\%}$
test_tensor_to_bytestream_speed[numpy] 2.7393μs 2.7263μs 366.7936 KOps/s 380.9443 KOps/s $\color{#d91a1a}-3.71\%$
test_tensor_to_bytestream_speed[safetensors] 39.2286μs 38.9393μs 25.6810 KOps/s 26.2937 KOps/s $\color{#d91a1a}-2.33\%$
test_simple 0.5582s 0.5553s 1.8008 Ops/s 1.7421 Ops/s $\color{#35bf28}+3.37\%$
test_transformed 1.1481s 1.1457s 0.8728 Ops/s 0.8598 Ops/s $\color{#35bf28}+1.51\%$
test_serial 1.6922s 1.6898s 0.5918 Ops/s 0.5873 Ops/s $\color{#35bf28}+0.76\%$
test_parallel 1.1547s 1.0538s 0.9489 Ops/s 0.9343 Ops/s $\color{#35bf28}+1.57\%$
test_step_mdp_speed[True-True-True-True-True] 0.3248ms 45.7484μs 21.8587 KOps/s 22.3432 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[True-True-True-True-False] 51.4110μs 25.2140μs 39.6605 KOps/s 38.3783 KOps/s $\color{#35bf28}+3.34\%$
test_step_mdp_speed[True-True-True-False-True] 57.2110μs 24.9776μs 40.0358 KOps/s 38.9636 KOps/s $\color{#35bf28}+2.75\%$
test_step_mdp_speed[True-True-True-False-False] 45.6910μs 13.8866μs 72.0121 KOps/s 69.4159 KOps/s $\color{#35bf28}+3.74\%$
test_step_mdp_speed[True-True-False-True-True] 74.5820μs 47.4879μs 21.0580 KOps/s 20.7677 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-False-True-False] 67.7710μs 28.1148μs 35.5684 KOps/s 34.9442 KOps/s $\color{#35bf28}+1.79\%$
test_step_mdp_speed[True-True-False-False-True] 52.9510μs 28.0695μs 35.6259 KOps/s 35.2678 KOps/s $\color{#35bf28}+1.02\%$
test_step_mdp_speed[True-True-False-False-False] 55.0710μs 16.9947μs 58.8419 KOps/s 58.1243 KOps/s $\color{#35bf28}+1.23\%$
test_step_mdp_speed[True-False-True-True-True] 95.1310μs 51.3408μs 19.4777 KOps/s 19.7538 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-False-True-True-False] 60.0210μs 31.1020μs 32.1522 KOps/s 32.2782 KOps/s $\color{#d91a1a}-0.39\%$
test_step_mdp_speed[True-False-True-False-True] 70.3110μs 27.7882μs 35.9865 KOps/s 34.8094 KOps/s $\color{#35bf28}+3.38\%$
test_step_mdp_speed[True-False-True-False-False] 43.0400μs 16.8294μs 59.4199 KOps/s 58.0265 KOps/s $\color{#35bf28}+2.40\%$
test_step_mdp_speed[True-False-False-True-True] 84.9620μs 53.6094μs 18.6535 KOps/s 18.8690 KOps/s $\color{#d91a1a}-1.14\%$
test_step_mdp_speed[True-False-False-True-False] 72.1210μs 33.8610μs 29.5325 KOps/s 29.7441 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-False-False-False-True] 61.4510μs 31.0706μs 32.1847 KOps/s 32.4090 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[True-False-False-False-False] 45.9900μs 19.4231μs 51.4850 KOps/s 51.2012 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[False-True-True-True-True] 84.0010μs 51.3786μs 19.4633 KOps/s 19.8582 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-True-True-True-False] 59.6210μs 30.4674μs 32.8220 KOps/s 32.1640 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[False-True-True-False-True] 2.2874ms 32.2172μs 31.0393 KOps/s 31.4243 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[False-True-True-False-False] 51.1810μs 18.4232μs 54.2795 KOps/s 53.2502 KOps/s $\color{#35bf28}+1.93\%$
test_step_mdp_speed[False-True-False-True-True] 0.1192ms 54.1981μs 18.4508 KOps/s 19.0030 KOps/s $\color{#d91a1a}-2.91\%$
test_step_mdp_speed[False-True-False-True-False] 63.1410μs 34.0743μs 29.3476 KOps/s 29.8648 KOps/s $\color{#d91a1a}-1.73\%$
test_step_mdp_speed[False-True-False-False-True] 62.2510μs 34.8591μs 28.6869 KOps/s 29.6101 KOps/s $\color{#d91a1a}-3.12\%$
test_step_mdp_speed[False-True-False-False-False] 74.8310μs 21.4711μs 46.5742 KOps/s 47.7447 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[False-False-True-True-True] 88.7020μs 56.8198μs 17.5995 KOps/s 17.8693 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[False-False-True-True-False] 60.3110μs 37.0267μs 27.0075 KOps/s 27.7540 KOps/s $\color{#d91a1a}-2.69\%$
test_step_mdp_speed[False-False-True-False-True] 70.4620μs 34.4706μs 29.0102 KOps/s 29.2343 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[False-False-True-False-False] 55.3100μs 21.2599μs 47.0370 KOps/s 47.5300 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[False-False-False-True-True] 0.1200ms 58.5345μs 17.0840 KOps/s 17.3105 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[False-False-False-True-False] 78.1620μs 38.9070μs 25.7023 KOps/s 25.9352 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-False-False-False-True] 76.2710μs 36.6326μs 27.2981 KOps/s 27.7600 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-False-False-False-False] 79.3310μs 23.5538μs 42.4560 KOps/s 42.1700 KOps/s $\color{#35bf28}+0.68\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7545s 0.7483s 1.3364 Ops/s 1.2944 Ops/s $\color{#35bf28}+3.24\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7333s 0.6378s 1.5678 Ops/s 1.5599 Ops/s $\color{#35bf28}+0.50\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7679s 1.6910s 0.5914 Ops/s 0.5900 Ops/s $\color{#35bf28}+0.24\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5548s 1.4690s 0.6807 Ops/s 0.6802 Ops/s $\color{#35bf28}+0.09\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0294s 1.9513s 0.5125 Ops/s 0.5155 Ops/s $\color{#d91a1a}-0.59\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.8068s 1.7209s 0.5811 Ops/s 0.5835 Ops/s $\color{#d91a1a}-0.40\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.8519s 4.7244s 0.2117 Ops/s 0.2155 Ops/s $\color{#d91a1a}-1.79\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5701s 4.4846s 0.2230 Ops/s 0.2233 Ops/s $\color{#d91a1a}-0.14\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9897s 1.9085s 0.5240 Ops/s 0.5116 Ops/s $\color{#35bf28}+2.41\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7819s 1.6487s 0.6065 Ops/s 0.6162 Ops/s $\color{#d91a1a}-1.57\%$
test_values[generalized_advantage_estimate-True-True] 10.6531ms 10.4604ms 95.5983 Ops/s 97.2197 Ops/s $\color{#d91a1a}-1.67\%$
test_values[vec_generalized_advantage_estimate-True-True] 13.0453ms 11.1587ms 89.6165 Ops/s 55.9112 Ops/s $\textbf{\color{#35bf28}+60.28\%}$
test_values[td0_return_estimate-False-False] 0.2217ms 0.1276ms 7.8395 KOps/s 7.6841 KOps/s $\color{#35bf28}+2.02\%$
test_values[td1_return_estimate-False-False] 30.3489ms 29.4286ms 33.9805 Ops/s 35.3386 Ops/s $\color{#d91a1a}-3.84\%$
test_values[vec_td1_return_estimate-False-False] 12.1414ms 11.2655ms 88.7664 Ops/s 55.9840 Ops/s $\textbf{\color{#35bf28}+58.56\%}$
test_values[td_lambda_return_estimate-True-False] 44.8760ms 43.2097ms 23.1429 Ops/s 24.2244 Ops/s $\color{#d91a1a}-4.46\%$
test_values[vec_td_lambda_return_estimate-True-False] 11.9628ms 11.2357ms 89.0019 Ops/s 56.1035 Ops/s $\textbf{\color{#35bf28}+58.64\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.4325ms 9.3428ms 107.0339 Ops/s 109.2945 Ops/s $\color{#d91a1a}-2.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8105ms 1.5164ms 659.4671 Ops/s 674.1109 Ops/s $\color{#d91a1a}-2.17\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5174ms 0.4287ms 2.3324 KOps/s 2.3363 KOps/s $\color{#d91a1a}-0.16\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 24.7798ms 23.9699ms 41.7190 Ops/s 30.9793 Ops/s $\textbf{\color{#35bf28}+34.67\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.1435ms 1.7230ms 580.3848 Ops/s 585.9244 Ops/s $\color{#d91a1a}-0.95\%$
test_dqn_speed[False-None] 1.7872ms 1.4116ms 708.4226 Ops/s 686.5831 Ops/s $\color{#35bf28}+3.18\%$
test_dqn_speed[False-backward] 1.9801ms 1.9339ms 517.0941 Ops/s 512.9135 Ops/s $\color{#35bf28}+0.82\%$
test_dqn_speed[True-None] 0.9772ms 0.5660ms 1.7666 KOps/s 1.7946 KOps/s $\color{#d91a1a}-1.56\%$
test_dqn_speed[True-backward] 1.0787ms 1.0293ms 971.5316 Ops/s 848.9590 Ops/s $\textbf{\color{#35bf28}+14.44\%}$
test_dqn_speed[reduce-overhead-None] 0.9714ms 0.5531ms 1.8081 KOps/s 1.6944 KOps/s $\textbf{\color{#35bf28}+6.71\%}$
test_ddpg_speed[False-None] 3.2462ms 2.9346ms 340.7593 Ops/s 334.0674 Ops/s $\color{#35bf28}+2.00\%$
test_ddpg_speed[False-backward] 4.5367ms 4.1686ms 239.8865 Ops/s 235.4722 Ops/s $\color{#35bf28}+1.87\%$
test_ddpg_speed[True-None] 1.8613ms 1.4567ms 686.4965 Ops/s 685.5544 Ops/s $\color{#35bf28}+0.14\%$
test_ddpg_speed[True-backward] 2.5687ms 2.4540ms 407.4989 Ops/s 405.9026 Ops/s $\color{#35bf28}+0.39\%$
test_ddpg_speed[reduce-overhead-None] 1.8570ms 1.4484ms 690.4128 Ops/s 688.4348 Ops/s $\color{#35bf28}+0.29\%$
test_sac_speed[False-None] 8.9740ms 8.1465ms 122.7523 Ops/s 120.9006 Ops/s $\color{#35bf28}+1.53\%$
test_sac_speed[False-backward] 11.9028ms 11.4263ms 87.5176 Ops/s 87.1475 Ops/s $\color{#35bf28}+0.42\%$
test_sac_speed[True-None] 2.6211ms 2.2135ms 451.7827 Ops/s 433.6314 Ops/s $\color{#35bf28}+4.19\%$
test_sac_speed[True-backward] 4.3275ms 4.1468ms 241.1518 Ops/s 238.5868 Ops/s $\color{#35bf28}+1.08\%$
test_sac_speed[reduce-overhead-None] 2.5487ms 2.2032ms 453.8949 Ops/s 435.2368 Ops/s $\color{#35bf28}+4.29\%$
test_redq_speed[False-None] 11.1074ms 10.5201ms 95.0560 Ops/s 94.4078 Ops/s $\color{#35bf28}+0.69\%$
test_redq_speed[False-backward] 19.3281ms 18.1959ms 54.9574 Ops/s 55.3207 Ops/s $\color{#d91a1a}-0.66\%$
test_redq_speed[True-None] 4.9329ms 4.5778ms 218.4433 Ops/s 210.7325 Ops/s $\color{#35bf28}+3.66\%$
test_redq_speed[True-backward] 10.5776ms 10.0249ms 99.7519 Ops/s 98.3935 Ops/s $\color{#35bf28}+1.38\%$
test_redq_speed[reduce-overhead-None] 4.7882ms 4.4740ms 223.5127 Ops/s 222.7765 Ops/s $\color{#35bf28}+0.33\%$
test_redq_deprec_speed[False-None] 11.9824ms 11.3678ms 87.9679 Ops/s 87.2916 Ops/s $\color{#35bf28}+0.77\%$
test_redq_deprec_speed[False-backward] 0.3889s 23.6246ms 42.3287 Ops/s 61.2939 Ops/s $\textbf{\color{#d91a1a}-30.94\%}$
test_redq_deprec_speed[True-None] 3.9550ms 3.7566ms 266.2006 Ops/s 262.5457 Ops/s $\color{#35bf28}+1.39\%$
test_redq_deprec_speed[True-backward] 8.2051ms 7.8081ms 128.0725 Ops/s 125.2333 Ops/s $\color{#35bf28}+2.27\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9981ms 3.7168ms 269.0454 Ops/s 267.5868 Ops/s $\color{#35bf28}+0.55\%$
test_td3_speed[False-None] 48.5759ms 8.5222ms 117.3407 Ops/s 123.7443 Ops/s $\textbf{\color{#d91a1a}-5.17\%}$
test_td3_speed[False-backward] 11.3605ms 11.0726ms 90.3130 Ops/s 91.0909 Ops/s $\color{#d91a1a}-0.85\%$
test_td3_speed[True-None] 1.9814ms 1.8888ms 529.4403 Ops/s 530.9289 Ops/s $\color{#d91a1a}-0.28\%$
test_td3_speed[True-backward] 3.9196ms 3.7637ms 265.6947 Ops/s 266.4685 Ops/s $\color{#d91a1a}-0.29\%$
test_td3_speed[reduce-overhead-None] 1.9022ms 1.8599ms 537.6700 Ops/s 533.9028 Ops/s $\color{#35bf28}+0.71\%$
test_cql_speed[False-None] 30.1695ms 26.8670ms 37.2203 Ops/s 36.3417 Ops/s $\color{#35bf28}+2.42\%$
test_cql_speed[False-backward] 36.6392ms 35.6380ms 28.0599 Ops/s 26.7327 Ops/s $\color{#35bf28}+4.96\%$
test_cql_speed[True-None] 14.2586ms 12.8475ms 77.8361 Ops/s 79.1328 Ops/s $\color{#d91a1a}-1.64\%$
test_cql_speed[True-backward] 19.4097ms 18.9184ms 52.8586 Ops/s 54.6722 Ops/s $\color{#d91a1a}-3.32\%$
test_cql_speed[reduce-overhead-None] 12.9707ms 12.7317ms 78.5438 Ops/s 79.9946 Ops/s $\color{#d91a1a}-1.81\%$
test_a2c_speed[False-None] 5.7216ms 5.5612ms 179.8164 Ops/s 164.6318 Ops/s $\textbf{\color{#35bf28}+9.22\%}$
test_a2c_speed[False-backward] 12.3676ms 12.0271ms 83.1454 Ops/s 79.4961 Ops/s $\color{#35bf28}+4.59\%$
test_a2c_speed[True-None] 3.9528ms 3.8073ms 262.6504 Ops/s 263.7072 Ops/s $\color{#d91a1a}-0.40\%$
test_a2c_speed[True-backward] 8.9411ms 8.7735ms 113.9791 Ops/s 115.3296 Ops/s $\color{#d91a1a}-1.17\%$
test_a2c_speed[reduce-overhead-None] 4.1040ms 3.7732ms 265.0296 Ops/s 264.0896 Ops/s $\color{#35bf28}+0.36\%$
test_ppo_speed[False-None] 6.1471ms 5.9604ms 167.7737 Ops/s 153.5717 Ops/s $\textbf{\color{#35bf28}+9.25\%}$
test_ppo_speed[False-backward] 12.9481ms 12.5928ms 79.4103 Ops/s 75.8574 Ops/s $\color{#35bf28}+4.68\%$
test_ppo_speed[True-None] 3.8604ms 3.7236ms 268.5584 Ops/s 272.5681 Ops/s $\color{#d91a1a}-1.47\%$
test_ppo_speed[True-backward] 8.9161ms 8.5220ms 117.3437 Ops/s 115.7873 Ops/s $\color{#35bf28}+1.34\%$
test_ppo_speed[reduce-overhead-None] 3.8432ms 3.6926ms 270.8082 Ops/s 275.0729 Ops/s $\color{#d91a1a}-1.55\%$
test_reinforce_speed[False-None] 4.7664ms 4.6076ms 217.0305 Ops/s 194.0479 Ops/s $\textbf{\color{#35bf28}+11.84\%}$
test_reinforce_speed[False-backward] 7.8671ms 7.5204ms 132.9717 Ops/s 126.4669 Ops/s $\textbf{\color{#35bf28}+5.14\%}$
test_reinforce_speed[True-None] 3.1206ms 2.9654ms 337.2228 Ops/s 334.6937 Ops/s $\color{#35bf28}+0.76\%$
test_reinforce_speed[True-backward] 8.1440ms 7.8652ms 127.1426 Ops/s 128.2989 Ops/s $\color{#d91a1a}-0.90\%$
test_reinforce_speed[reduce-overhead-None] 3.0699ms 2.9102ms 343.6148 Ops/s 324.3303 Ops/s $\textbf{\color{#35bf28}+5.95\%}$
test_iql_speed[False-None] 21.3278ms 20.3007ms 49.2595 Ops/s 46.8687 Ops/s $\textbf{\color{#35bf28}+5.10\%}$
test_iql_speed[False-backward] 31.3725ms 30.7809ms 32.4877 Ops/s 31.5231 Ops/s $\color{#35bf28}+3.06\%$
test_iql_speed[True-None] 11.2578ms 8.8580ms 112.8928 Ops/s 114.7903 Ops/s $\color{#d91a1a}-1.65\%$
test_iql_speed[True-backward] 17.8322ms 17.1307ms 58.3749 Ops/s 59.2475 Ops/s $\color{#d91a1a}-1.47\%$
test_iql_speed[reduce-overhead-None] 9.0159ms 8.7264ms 114.5946 Ops/s 112.3190 Ops/s $\color{#35bf28}+2.03\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2749ms 6.1201ms 163.3952 Ops/s 167.6064 Ops/s $\color{#d91a1a}-2.51\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.9194ms 0.3413ms 2.9297 KOps/s 3.1243 KOps/s $\textbf{\color{#d91a1a}-6.23\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5907ms 0.3416ms 2.9273 KOps/s 3.6582 KOps/s $\textbf{\color{#d91a1a}-19.98\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0908ms 5.8377ms 171.3004 Ops/s 172.2946 Ops/s $\color{#d91a1a}-0.58\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5025ms 0.3440ms 2.9070 KOps/s 3.5822 KOps/s $\textbf{\color{#d91a1a}-18.85\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5735ms 0.3063ms 3.2650 KOps/s 3.8196 KOps/s $\textbf{\color{#d91a1a}-14.52\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6318ms 1.4393ms 694.7752 Ops/s 781.0688 Ops/s $\textbf{\color{#d91a1a}-11.05\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6455ms 1.3646ms 732.8019 Ops/s 834.2026 Ops/s $\textbf{\color{#d91a1a}-12.16\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7856ms 6.0048ms 166.5328 Ops/s 167.0886 Ops/s $\color{#d91a1a}-0.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2264ms 0.4607ms 2.1707 KOps/s 2.0511 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5997ms 0.4200ms 2.3810 KOps/s 2.1652 KOps/s $\textbf{\color{#35bf28}+9.97\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9737ms 5.8867ms 169.8749 Ops/s 170.6855 Ops/s $\color{#d91a1a}-0.47\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9253ms 0.2888ms 3.4626 KOps/s 3.0105 KOps/s $\textbf{\color{#35bf28}+15.02\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4773ms 0.2685ms 3.7247 KOps/s 3.1926 KOps/s $\textbf{\color{#35bf28}+16.67\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9689ms 5.7286ms 174.5617 Ops/s 173.1295 Ops/s $\color{#35bf28}+0.83\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1591ms 0.3184ms 3.1407 KOps/s 3.2959 KOps/s $\color{#d91a1a}-4.71\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4631ms 0.2658ms 3.7621 KOps/s 3.0107 KOps/s $\textbf{\color{#35bf28}+24.96\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3989ms 6.0206ms 166.0972 Ops/s 166.7823 Ops/s $\color{#d91a1a}-0.41\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2706ms 0.5562ms 1.7979 KOps/s 2.0923 KOps/s $\textbf{\color{#d91a1a}-14.07\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8661ms 0.5280ms 1.8940 KOps/s 2.1658 KOps/s $\textbf{\color{#d91a1a}-12.55\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5620s 16.2523ms 61.5299 Ops/s 57.9885 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9532ms 1.8098ms 552.5538 Ops/s 510.7506 Ops/s $\textbf{\color{#35bf28}+8.18\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.1331ms 1.2221ms 818.2757 Ops/s 785.6865 Ops/s $\color{#35bf28}+4.15\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.6431ms 5.1492ms 194.2058 Ops/s 199.1449 Ops/s $\color{#d91a1a}-2.48\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9517ms 1.7578ms 568.8807 Ops/s 574.4831 Ops/s $\color{#d91a1a}-0.98\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.8805ms 1.1384ms 878.4486 Ops/s 1.1131 KOps/s $\textbf{\color{#d91a1a}-21.08\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.3497ms 5.3188ms 188.0112 Ops/s 60.0695 Ops/s $\textbf{\color{#35bf28}+212.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.9416ms 2.0295ms 492.7240 Ops/s 528.9724 Ops/s $\textbf{\color{#d91a1a}-6.85\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2763ms 1.2600ms 793.6225 Ops/s 931.1287 Ops/s $\textbf{\color{#d91a1a}-14.77\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 39.2372ms 36.5514ms 27.3588 Ops/s 27.5714 Ops/s $\color{#d91a1a}-0.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.3439ms 18.5486ms 53.9123 Ops/s 55.3321 Ops/s $\color{#d91a1a}-2.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.1261ms 37.7010ms 26.5245 Ops/s 26.6456 Ops/s $\color{#d91a1a}-0.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.9722ms 18.9686ms 52.7188 Ops/s 53.9241 Ops/s $\color{#d91a1a}-2.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 41.9397ms 39.4140ms 25.3717 Ops/s 25.0309 Ops/s $\color{#35bf28}+1.36\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.9412ms 20.3973ms 49.0262 Ops/s 51.1328 Ops/s $\color{#d91a1a}-4.12\%$
test_storage_write_lazystack[50-img_shape0-small] 0.9114ms 0.2244ms 4.4558 KOps/s 4.6043 KOps/s $\color{#d91a1a}-3.22\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.9210ms 1.4112ms 708.6259 Ops/s 714.2631 Ops/s $\color{#d91a1a}-0.79\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7779ms 2.3536ms 424.8750 Ops/s 426.7512 Ops/s $\color{#d91a1a}-0.44\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.4475ms 2.9761ms 336.0091 Ops/s 338.7937 Ops/s $\color{#d91a1a}-0.82\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2130ms 0.1347ms 7.4255 KOps/s 7.4219 KOps/s $\color{#35bf28}+0.05\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3497ms 0.2024ms 4.9415 KOps/s 5.0582 KOps/s $\color{#d91a1a}-2.31\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9937ms 1.8074ms 553.2838 Ops/s 559.8062 Ops/s $\color{#d91a1a}-1.17\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.7469ms 1.2783ms 782.2989 Ops/s 742.4737 Ops/s $\textbf{\color{#35bf28}+5.36\%}$
test_collector_stack_then_write[50-img_shape0-small] 1.5672ms 1.1292ms 885.5783 Ops/s 886.9575 Ops/s $\color{#d91a1a}-0.16\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.9875ms 3.5597ms 280.9221 Ops/s 283.6345 Ops/s $\color{#d91a1a}-0.96\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.1435ms 5.8629ms 170.5644 Ops/s 175.5370 Ops/s $\color{#d91a1a}-2.83\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.6471ms 7.0523ms 141.7984 Ops/s 135.1771 Ops/s $\color{#35bf28}+4.90\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.6861ms 0.2744ms 3.6437 KOps/s 3.6821 KOps/s $\color{#d91a1a}-1.04\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 2.0761ms 1.5363ms 650.9348 Ops/s 656.6580 Ops/s $\color{#d91a1a}-0.87\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8766ms 2.4782ms 403.5168 Ops/s 405.1199 Ops/s $\color{#d91a1a}-0.40\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.6308ms 3.1859ms 313.8860 Ops/s 313.8385 Ops/s $\color{#35bf28}+0.02\%$
test_collector_without_rb[100-img_shape0-atari] 35.8520ms 34.5961ms 28.9050 Ops/s 29.1010 Ops/s $\color{#d91a1a}-0.67\%$
test_collector_without_rb[200-img_shape1-large_batch] 68.5900ms 68.0174ms 14.7021 Ops/s 14.7511 Ops/s $\color{#d91a1a}-0.33\%$
test_collector_with_rb[100-img_shape0-atari] 39.5588ms 39.0206ms 25.6275 Ops/s 25.6964 Ops/s $\color{#d91a1a}-0.27\%$
test_collector_with_rb[200-img_shape1-large_batch] 77.1993ms 76.7578ms 13.0280 Ops/s 13.1338 Ops/s $\color{#d91a1a}-0.81\%$
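The Change column above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column, after converting both to the same unit. The sketch below recomputes it for three rows copied from the CPU table. The helper names `parse_ops` and `percent_change` are hypothetical and are not part of the benchmark tooling; the formula itself is an assumption inferred from the reported numbers.

```python
# Hypothetical helpers for illustration only; the real benchmark
# workflow may compute the Change column differently.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a cell such as '12.0470 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative change of the PR throughput against repo HEAD, in percent."""
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head * 100.0


# (Ops, Ops on Repo HEAD) pairs taken from the CPU table above.
rows = {
    "test_tensor_to_bytestream_speed[pickle]": ("12.0470 KOps/s", "12.5204 KOps/s"),
    "test_values[vec_generalized_advantage_estimate-True-True]": ("89.6165 Ops/s", "55.9112 Ops/s"),
    # This row mixes units: Ops/s for the PR, KOps/s for HEAD.
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]": (
        "878.4486 Ops/s",
        "1.1131 KOps/s",
    ),
}

for name, (pr, head) in rows.items():
    print(f"{name}: {percent_change(pr, head):+.2f}%")
```

Under this assumption the three rows come out at -3.78%, +60.28%, and -21.08%, matching the table. The third row shows why the unit conversion matters: comparing 878.4486 against 1.1131 without scaling would give a nonsensical result.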

@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}8$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 81.5926μs 80.0143μs 12.4978 KOps/s 12.3131 KOps/s $\color{#35bf28}+1.50\%$
test_tensor_to_bytestream_speed[torch.save] 0.1395ms 0.1391ms 7.1891 KOps/s 7.1383 KOps/s $\color{#35bf28}+0.71\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1089s 0.1084s 9.2272 Ops/s 9.1217 Ops/s $\color{#35bf28}+1.16\%$
test_tensor_to_bytestream_speed[numpy] 2.7003μs 2.6930μs 371.3398 KOps/s 388.2940 KOps/s $\color{#d91a1a}-4.37\%$
test_tensor_to_bytestream_speed[safetensors] 38.1829μs 36.6638μs 27.2749 KOps/s 25.7951 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_simple 0.7980s 0.7977s 1.2537 Ops/s 1.2159 Ops/s $\color{#35bf28}+3.11\%$
test_transformed 1.5472s 1.4533s 0.6881 Ops/s 0.6862 Ops/s $\color{#35bf28}+0.28\%$
test_serial 2.4126s 2.3200s 0.4310 Ops/s 0.4286 Ops/s $\color{#35bf28}+0.56\%$
test_parallel 1.9097s 1.8186s 0.5499 Ops/s 0.5522 Ops/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-True-True-True-True] 0.3596ms 43.2534μs 23.1196 KOps/s 22.6079 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[True-True-True-True-False] 0.1170ms 24.2885μs 41.1718 KOps/s 40.9892 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[True-True-True-False-True] 0.1079ms 23.9078μs 41.8274 KOps/s 40.5059 KOps/s $\color{#35bf28}+3.26\%$
test_step_mdp_speed[True-True-True-False-False] 43.3020μs 13.4739μs 74.2173 KOps/s 74.6334 KOps/s $\color{#d91a1a}-0.56\%$
test_step_mdp_speed[True-True-False-True-True] 0.1350ms 46.1714μs 21.6584 KOps/s 21.4453 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-True-False-True-False] 0.1008ms 26.9704μs 37.0777 KOps/s 36.8451 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[True-True-False-False-True] 65.7430μs 27.1293μs 36.8605 KOps/s 36.9313 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-True-False-False-False] 93.4040μs 16.3620μs 61.1172 KOps/s 61.6889 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[True-False-True-True-True] 0.1434ms 49.1161μs 20.3599 KOps/s 19.9779 KOps/s $\color{#35bf28}+1.91\%$
test_step_mdp_speed[True-False-True-True-False] 0.1170ms 29.7664μs 33.5949 KOps/s 33.1224 KOps/s $\color{#35bf28}+1.43\%$
test_step_mdp_speed[True-False-True-False-True] 56.8530μs 27.0508μs 36.9674 KOps/s 36.6035 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-False-True-False-False] 88.5240μs 16.1697μs 61.8441 KOps/s 61.2479 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-False-False-True-True] 0.1363ms 51.7741μs 19.3147 KOps/s 19.0308 KOps/s $\color{#35bf28}+1.49\%$
test_step_mdp_speed[True-False-False-True-False] 0.1110ms 32.3793μs 30.8839 KOps/s 30.5377 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[True-False-False-False-True] 63.9920μs 29.4706μs 33.9321 KOps/s 33.6672 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[True-False-False-False-False] 0.1035ms 18.6889μs 53.5076 KOps/s 53.4022 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[False-True-True-True-True] 0.1320ms 49.3946μs 20.2451 KOps/s 20.1391 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-True-True-True-False] 64.2030μs 29.9396μs 33.4006 KOps/s 33.0824 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[False-True-True-False-True] 2.4523ms 30.9368μs 32.3240 KOps/s 32.4854 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[False-True-True-False-False] 96.6440μs 17.6276μs 56.7292 KOps/s 55.3391 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[False-True-False-True-True] 0.1332ms 51.5229μs 19.4088 KOps/s 19.1631 KOps/s $\color{#35bf28}+1.28\%$
test_step_mdp_speed[False-True-False-True-False] 62.9730μs 32.4203μs 30.8448 KOps/s 30.5278 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[False-True-False-False-True] 0.1083ms 32.9521μs 30.3471 KOps/s 29.7210 KOps/s $\color{#35bf28}+2.11\%$
test_step_mdp_speed[False-True-False-False-False] 98.6540μs 20.3871μs 49.0507 KOps/s 48.8621 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-False-True-True-True] 0.1385ms 54.3112μs 18.4124 KOps/s 18.2186 KOps/s $\color{#35bf28}+1.06\%$
test_step_mdp_speed[False-False-True-True-False] 66.9830μs 35.2499μs 28.3689 KOps/s 28.0913 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-False-True-False-True] 0.1093ms 33.0429μs 30.2637 KOps/s 29.9680 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-False-True-False-False] 94.9440μs 20.5507μs 48.6602 KOps/s 48.0216 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[False-False-False-True-True] 0.1391ms 56.6049μs 17.6663 KOps/s 17.6261 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-False-False-True-False] 70.5330μs 37.5909μs 26.6022 KOps/s 26.4313 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[False-False-False-False-True] 0.1114ms 35.3134μs 28.3179 KOps/s 28.2205 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-False-False-False-False] 98.7440μs 22.8217μs 43.8180 KOps/s 43.4640 KOps/s $\color{#35bf28}+0.81\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8472s 0.7506s 1.3322 Ops/s 1.3312 Ops/s $\color{#35bf28}+0.08\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7139s 0.6173s 1.6200 Ops/s 1.6159 Ops/s $\color{#35bf28}+0.25\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7115s 1.6317s 0.6128 Ops/s 0.6109 Ops/s $\color{#35bf28}+0.32\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4929s 1.4170s 0.7057 Ops/s 0.7042 Ops/s $\color{#35bf28}+0.22\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9585s 1.8806s 0.5318 Ops/s 0.5319 Ops/s $\color{#d91a1a}-0.02\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7403s 1.6602s 0.6023 Ops/s 0.6017 Ops/s $\color{#35bf28}+0.10\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7246s 4.5995s 0.2174 Ops/s 0.2140 Ops/s $\color{#35bf28}+1.61\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5151s 4.4608s 0.2242 Ops/s 0.2245 Ops/s $\color{#d91a1a}-0.12\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0148s 1.9141s 0.5225 Ops/s 0.5278 Ops/s $\color{#d91a1a}-1.02\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6767s 1.5905s 0.6287 Ops/s 0.6207 Ops/s $\color{#35bf28}+1.30\%$
test_values[generalized_advantage_estimate-True-True] 21.1030ms 20.5484ms 48.6657 Ops/s 49.6906 Ops/s $\color{#d91a1a}-2.06\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1366s 3.6566ms 273.4760 Ops/s 260.6406 Ops/s $\color{#35bf28}+4.92\%$
test_values[td0_return_estimate-False-False] 0.1088ms 84.1897μs 11.8779 KOps/s 12.1307 KOps/s $\color{#d91a1a}-2.08\%$
test_values[td1_return_estimate-False-False] 49.2689ms 48.5557ms 20.5949 Ops/s 21.0010 Ops/s $\color{#d91a1a}-1.93\%$
test_values[vec_td1_return_estimate-False-False] 1.3063ms 1.0935ms 914.4940 Ops/s 920.4291 Ops/s $\color{#d91a1a}-0.64\%$
test_values[td_lambda_return_estimate-True-False] 80.1688ms 79.6061ms 12.5619 Ops/s 12.7517 Ops/s $\color{#d91a1a}-1.49\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2342ms 1.0870ms 919.9747 Ops/s 922.0918 Ops/s $\color{#d91a1a}-0.23\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.5389ms 20.3547ms 49.1288 Ops/s 49.5287 Ops/s $\color{#d91a1a}-0.81\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0578ms 0.7684ms 1.3014 KOps/s 1.3123 KOps/s $\color{#d91a1a}-0.83\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7217ms 0.6794ms 1.4720 KOps/s 1.4803 KOps/s $\color{#d91a1a}-0.56\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5498ms 1.4913ms 670.5676 Ops/s 671.5832 Ops/s $\color{#d91a1a}-0.15\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7590ms 0.6970ms 1.4346 KOps/s 1.4426 KOps/s $\color{#d91a1a}-0.55\%$
test_dqn_speed[False-None] 1.6088ms 1.5254ms 655.5573 Ops/s 652.5203 Ops/s $\color{#35bf28}+0.47\%$
test_dqn_speed[False-backward] 2.3860ms 2.1902ms 456.5794 Ops/s 457.2234 Ops/s $\color{#d91a1a}-0.14\%$
test_dqn_speed[True-None] 0.6585ms 0.5565ms 1.7970 KOps/s 1.7716 KOps/s $\color{#35bf28}+1.43\%$
test_dqn_speed[True-backward] 1.2603ms 1.2134ms 824.1331 Ops/s 825.4278 Ops/s $\color{#d91a1a}-0.16\%$
test_dqn_speed[reduce-overhead-None] 0.6309ms 0.5786ms 1.7283 KOps/s 1.6739 KOps/s $\color{#35bf28}+3.25\%$
test_ddpg_speed[False-None] 3.2703ms 2.8793ms 347.3068 Ops/s 346.0601 Ops/s $\color{#35bf28}+0.36\%$
test_ddpg_speed[False-backward] 4.6252ms 4.2882ms 233.1992 Ops/s 231.3387 Ops/s $\color{#35bf28}+0.80\%$
test_ddpg_speed[True-None] 1.3347ms 1.2927ms 773.6006 Ops/s 768.1129 Ops/s $\color{#35bf28}+0.71\%$
test_ddpg_speed[True-backward] 2.5378ms 2.4977ms 400.3729 Ops/s 397.6719 Ops/s $\color{#35bf28}+0.68\%$
test_ddpg_speed[reduce-overhead-None] 1.4004ms 1.3175ms 759.0088 Ops/s 751.7818 Ops/s $\color{#35bf28}+0.96\%$
test_sac_speed[False-None] 8.8765ms 8.3070ms 120.3811 Ops/s 119.7879 Ops/s $\color{#35bf28}+0.50\%$
test_sac_speed[False-backward] 12.2700ms 11.5844ms 86.3230 Ops/s 85.7299 Ops/s $\color{#35bf28}+0.69\%$
test_sac_speed[True-None] 1.8502ms 1.7875ms 559.4378 Ops/s 546.4485 Ops/s $\color{#35bf28}+2.38\%$
test_sac_speed[True-backward] 3.6195ms 3.5691ms 280.1844 Ops/s 295.8229 Ops/s $\textbf{\color{#d91a1a}-5.29\%}$
test_sac_speed[reduce-overhead-None] 0.3678s 11.7878ms 84.8335 Ops/s 93.4140 Ops/s $\textbf{\color{#d91a1a}-9.19\%}$
test_redq_deprec_speed[False-None] 9.7881ms 9.2394ms 108.2322 Ops/s 107.7293 Ops/s $\color{#35bf28}+0.47\%$
test_redq_deprec_speed[False-backward] 13.2187ms 12.6868ms 78.8221 Ops/s 80.7383 Ops/s $\color{#d91a1a}-2.37\%$
test_redq_deprec_speed[True-None] 3.5127ms 2.5485ms 392.3804 Ops/s 392.3285 Ops/s $\color{#35bf28}+0.01\%$
test_redq_deprec_speed[True-backward] 4.6073ms 4.2325ms 236.2673 Ops/s 228.7489 Ops/s $\color{#35bf28}+3.29\%$
test_redq_deprec_speed[reduce-overhead-None] 16.1031ms 9.8158ms 101.8767 Ops/s 101.7734 Ops/s $\color{#35bf28}+0.10\%$
test_td3_speed[False-None] 8.4350ms 8.2705ms 120.9119 Ops/s 120.8214 Ops/s $\color{#35bf28}+0.07\%$
test_td3_speed[False-backward] 11.0203ms 10.6987ms 93.4693 Ops/s 91.2263 Ops/s $\color{#35bf28}+2.46\%$
test_td3_speed[True-None] 1.7083ms 1.6441ms 608.2408 Ops/s 616.7081 Ops/s $\color{#d91a1a}-1.37\%$
test_td3_speed[True-backward] 3.2777ms 3.1029ms 322.2758 Ops/s 314.0345 Ops/s $\color{#35bf28}+2.62\%$
test_td3_speed[reduce-overhead-None] 46.4209ms 23.7952ms 42.0252 Ops/s 40.6388 Ops/s $\color{#35bf28}+3.41\%$
test_cql_speed[False-None] 17.6665ms 17.2813ms 57.8661 Ops/s 58.3444 Ops/s $\color{#d91a1a}-0.82\%$
test_cql_speed[False-backward] 23.2128ms 22.6620ms 44.1267 Ops/s 44.1527 Ops/s $\color{#d91a1a}-0.06\%$
test_cql_speed[True-None] 3.4260ms 3.2462ms 308.0499 Ops/s 312.6702 Ops/s $\color{#d91a1a}-1.48\%$
test_cql_speed[True-backward] 5.8559ms 5.3613ms 186.5229 Ops/s 190.8055 Ops/s $\color{#d91a1a}-2.24\%$
test_cql_speed[reduce-overhead-None] 0.6871s 15.3760ms 65.0363 Ops/s 83.8567 Ops/s $\textbf{\color{#d91a1a}-22.44\%}$
test_a2c_speed[False-None] 3.9548ms 3.2783ms 305.0320 Ops/s 303.7790 Ops/s $\color{#35bf28}+0.41\%$
test_a2c_speed[False-backward] 6.7288ms 6.2897ms 158.9891 Ops/s 152.9718 Ops/s $\color{#35bf28}+3.93\%$
test_a2c_speed[True-None] 1.5335ms 1.3278ms 753.1515 Ops/s 734.6654 Ops/s $\color{#35bf28}+2.52\%$
test_a2c_speed[True-backward] 3.2067ms 2.9624ms 337.5591 Ops/s 339.5649 Ops/s $\color{#d91a1a}-0.59\%$
test_a2c_speed[reduce-overhead-None] 1.2024ms 0.9881ms 1.0121 KOps/s 1.0208 KOps/s $\color{#d91a1a}-0.85\%$
test_ppo_speed[False-None] 4.0425ms 3.9068ms 255.9624 Ops/s 260.7333 Ops/s $\color{#d91a1a}-1.83\%$
test_ppo_speed[False-backward] 7.6636ms 7.1017ms 140.8120 Ops/s 144.2268 Ops/s $\color{#d91a1a}-2.37\%$
test_ppo_speed[True-None] 1.6549ms 1.4173ms 705.5828 Ops/s 702.7747 Ops/s $\color{#35bf28}+0.40\%$
test_ppo_speed[True-backward] 3.0864ms 3.0510ms 327.7600 Ops/s 307.9649 Ops/s $\textbf{\color{#35bf28}+6.43\%}$
test_ppo_speed[reduce-overhead-None] 1.1106ms 1.0487ms 953.5794 Ops/s 914.2238 Ops/s $\color{#35bf28}+4.30\%$
test_reinforce_speed[False-None] 2.4050ms 2.3088ms 433.1235 Ops/s 431.7163 Ops/s $\color{#35bf28}+0.33\%$
test_reinforce_speed[False-backward] 3.4184ms 3.3476ms 298.7254 Ops/s 298.5431 Ops/s $\color{#35bf28}+0.06\%$
test_reinforce_speed[True-None] 1.5101ms 1.2842ms 778.6658 Ops/s 779.5397 Ops/s $\color{#d91a1a}-0.11\%$
test_reinforce_speed[True-backward] 2.9599ms 2.8824ms 346.9309 Ops/s 324.4374 Ops/s $\textbf{\color{#35bf28}+6.93\%}$
test_reinforce_speed[reduce-overhead-None] 17.5432ms 9.5854ms 104.3249 Ops/s 105.0080 Ops/s $\color{#d91a1a}-0.65\%$
test_iql_speed[False-None] 10.0591ms 9.4830ms 105.4520 Ops/s 106.1949 Ops/s $\color{#d91a1a}-0.70\%$
test_iql_speed[False-backward] 13.7331ms 13.2559ms 75.4384 Ops/s 77.0311 Ops/s $\color{#d91a1a}-2.07\%$
test_iql_speed[True-None] 2.4769ms 2.1718ms 460.4519 Ops/s 455.9638 Ops/s $\color{#35bf28}+0.98\%$
test_iql_speed[True-backward] 4.8996ms 4.7220ms 211.7739 Ops/s 205.3635 Ops/s $\color{#35bf28}+3.12\%$
test_iql_speed[reduce-overhead-None] 18.0106ms 10.6459ms 93.9327 Ops/s 73.5631 Ops/s $\textbf{\color{#35bf28}+27.69\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3734ms 6.0108ms 166.3681 Ops/s 163.7765 Ops/s $\color{#35bf28}+1.58\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0609ms 0.3314ms 3.0174 KOps/s 2.8157 KOps/s $\textbf{\color{#35bf28}+7.16\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7189ms 0.2821ms 3.5453 KOps/s 3.0124 KOps/s $\textbf{\color{#35bf28}+17.69\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9621ms 5.7387ms 174.2551 Ops/s 168.7051 Ops/s $\color{#35bf28}+3.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5952ms 0.3559ms 2.8095 KOps/s 2.5673 KOps/s $\textbf{\color{#35bf28}+9.43\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6154ms 0.3378ms 2.9607 KOps/s 2.8677 KOps/s $\color{#35bf28}+3.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7605ms 1.3977ms 715.4457 Ops/s 696.3536 Ops/s $\color{#35bf28}+2.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5665ms 1.3150ms 760.4389 Ops/s 710.1871 Ops/s $\textbf{\color{#35bf28}+7.08\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3888ms 5.9864ms 167.0457 Ops/s 164.0427 Ops/s $\color{#35bf28}+1.83\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2503ms 0.4361ms 2.2928 KOps/s 2.3219 KOps/s $\color{#d91a1a}-1.25\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6652ms 0.4400ms 2.2726 KOps/s 2.1069 KOps/s $\textbf{\color{#35bf28}+7.86\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9692ms 5.8775ms 170.1402 Ops/s 166.5311 Ops/s $\color{#35bf28}+2.17\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0895ms 0.3418ms 2.9260 KOps/s 3.4693 KOps/s $\textbf{\color{#d91a1a}-15.66\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5423ms 0.3265ms 3.0630 KOps/s 3.6890 KOps/s $\textbf{\color{#d91a1a}-16.97\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0589ms 5.8208ms 171.7981 Ops/s 168.3408 Ops/s $\color{#35bf28}+2.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8165ms 0.3758ms 2.6613 KOps/s 3.1919 KOps/s $\textbf{\color{#d91a1a}-16.62\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5226ms 0.3192ms 3.1325 KOps/s 3.7083 KOps/s $\textbf{\color{#d91a1a}-15.53\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1850ms 5.9788ms 167.2563 Ops/s 163.0718 Ops/s $\color{#35bf28}+2.57\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0089ms 0.4963ms 2.0150 KOps/s 1.9744 KOps/s $\color{#35bf28}+2.05\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6116ms 0.4193ms 2.3847 KOps/s 2.1256 KOps/s $\textbf{\color{#35bf28}+12.19\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5880s 16.7276ms 59.7813 Ops/s 50.8491 Ops/s $\textbf{\color{#35bf28}+17.57\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.1434ms 1.8948ms 527.7602 Ops/s 493.4286 Ops/s $\textbf{\color{#35bf28}+6.96\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.9520ms 1.2886ms 776.0075 Ops/s 790.1595 Ops/s $\color{#d91a1a}-1.79\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.7367ms 5.1265ms 195.0659 Ops/s 194.7735 Ops/s $\color{#35bf28}+0.15\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.2216ms 1.9986ms 500.3419 Ops/s 507.3703 Ops/s $\color{#d91a1a}-1.39\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.1895ms 1.2873ms 776.8031 Ops/s 1.0848 KOps/s $\textbf{\color{#d91a1a}-28.39\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.5992ms 5.2243ms 191.4117 Ops/s 185.7216 Ops/s $\color{#35bf28}+3.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.1584ms 2.0012ms 499.7001 Ops/s 518.5096 Ops/s $\color{#d91a1a}-3.63\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.8089ms 1.1419ms 875.7589 Ops/s 841.7692 Ops/s $\color{#35bf28}+4.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 39.3911ms 35.8293ms 27.9101 Ops/s 27.4143 Ops/s $\color{#35bf28}+1.81\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.5338ms 18.0037ms 55.5441 Ops/s 54.8191 Ops/s $\color{#35bf28}+1.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.6289ms 37.4267ms 26.7189 Ops/s 26.3601 Ops/s $\color{#35bf28}+1.36\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.5497ms 18.5997ms 53.7643 Ops/s 53.3084 Ops/s $\color{#35bf28}+0.86\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.7482ms 39.1346ms 25.5528 Ops/s 25.2061 Ops/s $\color{#35bf28}+1.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.0649ms 20.3195ms 49.2138 Ops/s 49.3016 Ops/s $\color{#d91a1a}-0.18\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8892ms 0.2205ms 4.5355 KOps/s 4.5885 KOps/s $\color{#d91a1a}-1.16\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.6169ms 1.4150ms 706.7357 Ops/s 708.1132 Ops/s $\color{#d91a1a}-0.19\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.6025ms 2.2903ms 436.6299 Ops/s 432.6899 Ops/s $\color{#35bf28}+0.91\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1628ms 2.9269ms 341.6623 Ops/s 341.9737 Ops/s $\color{#d91a1a}-0.09\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2387ms 0.1618ms 6.1810 KOps/s 6.2261 KOps/s $\color{#d91a1a}-0.72\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3701ms 0.2237ms 4.4703 KOps/s 4.5376 KOps/s $\color{#d91a1a}-1.48\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9276ms 1.8092ms 552.7194 Ops/s 536.5891 Ops/s $\color{#35bf28}+3.01\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.4499ms 1.3310ms 751.2917 Ops/s 726.9620 Ops/s $\color{#35bf28}+3.35\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2635ms 1.1562ms 864.9097 Ops/s 867.8736 Ops/s $\color{#d91a1a}-0.34\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.9092ms 3.7417ms 267.2598 Ops/s 277.5280 Ops/s $\color{#d91a1a}-3.70\%$
test_collector_stack_then_write[100-img_shape2-large_img] 10.9096ms 5.8460ms 171.0571 Ops/s 174.6251 Ops/s $\color{#d91a1a}-2.04\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 15.0787ms 7.0360ms 142.1261 Ops/s 141.5573 Ops/s $\color{#35bf28}+0.40\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.5058ms 0.2720ms 3.6761 KOps/s 3.6723 KOps/s $\color{#35bf28}+0.10\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6726ms 1.5151ms 660.0204 Ops/s 656.1176 Ops/s $\color{#35bf28}+0.59\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.5757ms 2.4560ms 407.1643 Ops/s 413.3153 Ops/s $\color{#d91a1a}-1.49\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3211ms 3.1495ms 317.5125 Ops/s 320.3369 Ops/s $\color{#d91a1a}-0.88\%$
test_collector_without_rb[100-img_shape0-atari] 35.0682ms 34.3942ms 29.0747 Ops/s 29.0946 Ops/s $\color{#d91a1a}-0.07\%$
test_collector_without_rb[200-img_shape1-large_batch] 68.1071ms 67.4734ms 14.8206 Ops/s 14.7822 Ops/s $\color{#35bf28}+0.26\%$
test_collector_with_rb[100-img_shape0-atari] 39.0764ms 38.6240ms 25.8907 Ops/s 25.7270 Ops/s $\color{#35bf28}+0.64\%$
test_collector_with_rb[200-img_shape1-large_batch] 76.6019ms 75.9051ms 13.1743 Ops/s 13.1270 Ops/s $\color{#35bf28}+0.36\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 59.0903ms 57.8849ms 17.2757 Ops/s 17.5751 Ops/s $\color{#d91a1a}-1.70\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1169s 0.1149s 8.7032 Ops/s 8.8493 Ops/s $\color{#d91a1a}-1.65\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 60.3938ms 59.5358ms 16.7966 Ops/s 17.0184 Ops/s $\color{#d91a1a}-1.30\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1201s 0.1183s 8.4517 Ops/s 8.5326 Ops/s $\color{#d91a1a}-0.95\%$

@vmoens
Collaborator Author

vmoens commented Feb 10, 2026

Superseded by rebuilt stack (cleanup folded into the right commits)

@vmoens vmoens closed this Feb 10, 2026

Labels

CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Documentation: Improvements or additions to documentation

1 participant