vmoens (Collaborator) commented Feb 10, 2026

[ghstack-poisoned]

pytorch-bot bot commented Feb 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3482

Note: Links to docs will display an error until the docs builds have been completed.

❌ 11 New Failures

As of commit 7844461 with merge base 0bc6d20:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 10, 2026
Move importlib, os, Bounded, and Unbounded imports to module level.
Keep local imports only where strictly necessary:
- isaaclab/gymnasium: optional deps, imported inside isaaclab branch
- Ray closure bodies: serialised by cloudpickle, run in fresh processes
- GRUCell: conditional on config flag
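
The split described above (standard-library imports at module level, optional dependencies imported only inside the branch that needs them) could look roughly like the following. This is a minimal sketch, not the code from the commit: `optional_import` and `describe_backend` are hypothetical names introduced for illustration, and only the `isaaclab`/`gymnasium` pairing comes from the message.

```python
import importlib
import os  # standard-library imports live at module level


def optional_import(name: str):
    """Import `name` lazily; return None when the package is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def describe_backend(kind: str) -> str:
    """Hypothetical helper: report which backend would be used for `kind`."""
    if kind == "isaaclab":
        # Optional dependency: resolved only inside the branch that needs it,
        # so importing this module never fails when gymnasium is absent.
        gymnasium = optional_import("gymnasium")
        if gymnasium is None:
            return "isaaclab backend unavailable (gymnasium not installed)"
        return f"isaaclab backend using {gymnasium.__name__}"
    # The default path relies only on module-level imports.
    return f"default backend (pid {os.getpid()})"


print(describe_backend("default"))
print(describe_backend("isaaclab"))
```

Keeping the optional import inside the branch means users without the extra package pay no cost and see no error unless they actually request that backend.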

Co-authored-by: Cursor <[email protected]>
ghstack-source-id: 208baaf
Pull-Request: #3482
meta-cla bot added the CLA Signed label Feb 10, 2026
github-actions bot added the Refactoring and sota-implementations/ labels and removed the Refactoring label Feb 10, 2026
github-actions bot added the Refactoring label Feb 10, 2026
github-actions (Contributor)

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 81.8493μs 80.3857μs 12.4400 KOps/s 12.3252 KOps/s $\color{#35bf28}+0.93\%$
test_tensor_to_bytestream_speed[torch.save] 0.1404ms 0.1400ms 7.1453 KOps/s 7.2250 KOps/s $\color{#d91a1a}-1.10\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1228s 0.1225s 8.1631 Ops/s 9.0569 Ops/s $\textbf{\color{#d91a1a}-9.87\%}$
test_tensor_to_bytestream_speed[numpy] 2.4673μs 2.4473μs 408.6218 KOps/s 391.8299 KOps/s $\color{#35bf28}+4.29\%$
test_tensor_to_bytestream_speed[safetensors] 38.8287μs 38.3997μs 26.0419 KOps/s 25.8401 KOps/s $\color{#35bf28}+0.78\%$
test_simple 0.7943s 0.7930s 1.2610 Ops/s 1.2286 Ops/s $\color{#35bf28}+2.64\%$
test_transformed 1.5419s 1.4492s 0.6900 Ops/s 0.6936 Ops/s $\color{#d91a1a}-0.51\%$
test_serial 2.4095s 2.3149s 0.4320 Ops/s 0.4344 Ops/s $\color{#d91a1a}-0.56\%$
test_parallel 1.9002s 1.8310s 0.5461 Ops/s 0.5590 Ops/s $\color{#d91a1a}-2.30\%$
test_step_mdp_speed[True-True-True-True-True] 0.3679ms 44.5249μs 22.4593 KOps/s 21.8573 KOps/s $\color{#35bf28}+2.75\%$
test_step_mdp_speed[True-True-True-True-False] 54.2710μs 25.0428μs 39.9317 KOps/s 39.0749 KOps/s $\color{#35bf28}+2.19\%$
test_step_mdp_speed[True-True-True-False-True] 72.9020μs 25.1577μs 39.7493 KOps/s 39.5182 KOps/s $\color{#35bf28}+0.58\%$
test_step_mdp_speed[True-True-True-False-False] 46.9810μs 13.8718μs 72.0885 KOps/s 69.3818 KOps/s $\color{#35bf28}+3.90\%$
test_step_mdp_speed[True-True-False-True-True] 0.1004ms 47.9603μs 20.8506 KOps/s 20.9366 KOps/s $\color{#d91a1a}-0.41\%$
test_step_mdp_speed[True-True-False-True-False] 69.3710μs 27.6937μs 36.1092 KOps/s 36.0666 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-False-False-True] 80.4110μs 27.8782μs 35.8703 KOps/s 35.3679 KOps/s $\color{#35bf28}+1.42\%$
test_step_mdp_speed[True-True-False-False-False] 55.3210μs 16.4871μs 60.6535 KOps/s 59.3783 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-False-True-True-True] 94.0410μs 50.6387μs 19.7477 KOps/s 19.7678 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-False-True-True-False] 63.7310μs 30.5176μs 32.7679 KOps/s 32.0754 KOps/s $\color{#35bf28}+2.16\%$
test_step_mdp_speed[True-False-True-False-True] 74.8720μs 28.0140μs 35.6965 KOps/s 36.3298 KOps/s $\color{#d91a1a}-1.74\%$
test_step_mdp_speed[True-False-True-False-False] 54.8410μs 16.7674μs 59.6397 KOps/s 59.7877 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-False-False-True-True] 94.6810μs 53.1489μs 18.8151 KOps/s 19.0428 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-False-False-True-False] 74.4010μs 33.2256μs 30.0972 KOps/s 30.4477 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[True-False-False-False-True] 66.1110μs 30.1968μs 33.1160 KOps/s 33.0044 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[True-False-False-False-False] 62.1510μs 19.3711μs 51.6234 KOps/s 49.8990 KOps/s $\color{#35bf28}+3.46\%$
test_step_mdp_speed[False-True-True-True-True] 0.1158ms 49.4340μs 20.2290 KOps/s 19.8595 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[False-True-True-True-False] 60.4410μs 30.6551μs 32.6210 KOps/s 33.1221 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[False-True-True-False-True] 2.3127ms 31.3925μs 31.8548 KOps/s 31.2277 KOps/s $\color{#35bf28}+2.01\%$
test_step_mdp_speed[False-True-True-False-False] 50.6710μs 18.3892μs 54.3797 KOps/s 54.9999 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[False-True-False-True-True] 0.1044ms 53.3897μs 18.7302 KOps/s 18.8691 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[False-True-False-True-False] 65.7910μs 33.5077μs 29.8439 KOps/s 29.8842 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[False-True-False-False-True] 67.0010μs 34.3152μs 29.1416 KOps/s 29.3894 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[False-True-False-False-False] 50.3210μs 21.1006μs 47.3919 KOps/s 47.7083 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-False-True-True-True] 98.0720μs 55.8756μs 17.8969 KOps/s 18.0712 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[False-False-True-True-False] 68.5810μs 36.5561μs 27.3552 KOps/s 27.6821 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-False-True-False-True] 72.2020μs 34.6429μs 28.8659 KOps/s 28.4847 KOps/s $\color{#35bf28}+1.34\%$
test_step_mdp_speed[False-False-True-False-False] 50.2210μs 21.3964μs 46.7369 KOps/s 47.4768 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[False-False-False-True-True] 0.1004ms 57.8678μs 17.2808 KOps/s 17.3605 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-False-False-True-False] 73.8110μs 39.0388μs 25.6155 KOps/s 25.8827 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[False-False-False-False-True] 71.2810μs 36.6527μs 27.2831 KOps/s 28.2705 KOps/s $\color{#d91a1a}-3.49\%$
test_step_mdp_speed[False-False-False-False-False] 48.4510μs 23.7362μs 42.1297 KOps/s 42.5777 KOps/s $\color{#d91a1a}-1.05\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8646s 0.7738s 1.2924 Ops/s 1.3066 Ops/s $\color{#d91a1a}-1.09\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7286s 0.6329s 1.5801 Ops/s 1.5764 Ops/s $\color{#35bf28}+0.23\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7664s 1.6954s 0.5898 Ops/s 0.5830 Ops/s $\color{#35bf28}+1.18\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5283s 1.4609s 0.6845 Ops/s 0.6754 Ops/s $\color{#35bf28}+1.35\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0609s 1.9767s 0.5059 Ops/s 0.5067 Ops/s $\color{#d91a1a}-0.16\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7798s 1.7024s 0.5874 Ops/s 0.5824 Ops/s $\color{#35bf28}+0.85\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.8458s 4.6856s 0.2134 Ops/s 0.2158 Ops/s $\color{#d91a1a}-1.10\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6494s 4.5700s 0.2188 Ops/s 0.2269 Ops/s $\color{#d91a1a}-3.56\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0356s 1.9519s 0.5123 Ops/s 0.5278 Ops/s $\color{#d91a1a}-2.93\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6951s 1.6214s 0.6167 Ops/s 0.6010 Ops/s $\color{#35bf28}+2.61\%$
test_values[generalized_advantage_estimate-True-True] 20.6543ms 20.0934ms 49.7675 Ops/s 48.6551 Ops/s $\color{#35bf28}+2.29\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1424s 3.7694ms 265.2926 Ops/s 286.4900 Ops/s $\textbf{\color{#d91a1a}-7.40\%}$
test_values[td0_return_estimate-False-False] 0.1098ms 83.6944μs 11.9482 KOps/s 11.9345 KOps/s $\color{#35bf28}+0.11\%$
test_values[td1_return_estimate-False-False] 48.7350ms 48.3279ms 20.6920 Ops/s 20.5362 Ops/s $\color{#35bf28}+0.76\%$
test_values[vec_td1_return_estimate-False-False] 1.3084ms 1.0903ms 917.1944 Ops/s 915.8925 Ops/s $\color{#35bf28}+0.14\%$
test_values[td_lambda_return_estimate-True-False] 79.9082ms 79.2836ms 12.6129 Ops/s 12.5648 Ops/s $\color{#35bf28}+0.38\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2747ms 1.0851ms 921.6008 Ops/s 917.2946 Ops/s $\color{#35bf28}+0.47\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.5056ms 20.2910ms 49.2828 Ops/s 48.0354 Ops/s $\color{#35bf28}+2.60\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0627ms 0.7673ms 1.3033 KOps/s 1.3194 KOps/s $\color{#d91a1a}-1.21\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7350ms 0.6799ms 1.4709 KOps/s 1.4431 KOps/s $\color{#35bf28}+1.93\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5873ms 1.4942ms 669.2490 Ops/s 668.4803 Ops/s $\color{#35bf28}+0.11\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7702ms 0.6978ms 1.4331 KOps/s 1.4365 KOps/s $\color{#d91a1a}-0.24\%$
test_dqn_speed[False-None] 1.6370ms 1.5396ms 649.5214 Ops/s 641.2005 Ops/s $\color{#35bf28}+1.30\%$
test_dqn_speed[False-backward] 2.5691ms 2.2037ms 453.7793 Ops/s 454.2428 Ops/s $\color{#d91a1a}-0.10\%$
test_dqn_speed[True-None] 1.3050ms 0.5721ms 1.7480 KOps/s 1.7674 KOps/s $\color{#d91a1a}-1.10\%$
test_dqn_speed[True-backward] 1.1293ms 1.1005ms 908.6388 Ops/s 823.1539 Ops/s $\textbf{\color{#35bf28}+10.39\%}$
test_dqn_speed[reduce-overhead-None] 0.6490ms 0.5881ms 1.7005 KOps/s 1.6299 KOps/s $\color{#35bf28}+4.33\%$
test_ddpg_speed[False-None] 3.2952ms 2.9580ms 338.0643 Ops/s 341.7919 Ops/s $\color{#d91a1a}-1.09\%$
test_ddpg_speed[False-backward] 4.8369ms 4.2761ms 233.8597 Ops/s 228.1719 Ops/s $\color{#35bf28}+2.49\%$
test_ddpg_speed[True-None] 1.4430ms 1.3258ms 754.2369 Ops/s 748.0383 Ops/s $\color{#35bf28}+0.83\%$
test_ddpg_speed[True-backward] 2.5258ms 2.4039ms 415.9873 Ops/s 393.1547 Ops/s $\textbf{\color{#35bf28}+5.81\%}$
test_ddpg_speed[reduce-overhead-None] 1.5006ms 1.3649ms 732.6299 Ops/s 735.4838 Ops/s $\color{#d91a1a}-0.39\%$
test_sac_speed[False-None] 9.0804ms 8.5245ms 117.3091 Ops/s 117.8281 Ops/s $\color{#d91a1a}-0.44\%$
test_sac_speed[False-backward] 11.9958ms 11.5440ms 86.6252 Ops/s 84.5313 Ops/s $\color{#35bf28}+2.48\%$
test_sac_speed[True-None] 1.9377ms 1.8171ms 550.3301 Ops/s 542.1488 Ops/s $\color{#35bf28}+1.51\%$
test_sac_speed[True-backward] 3.5534ms 3.4449ms 290.2821 Ops/s 288.5868 Ops/s $\color{#35bf28}+0.59\%$
test_sac_speed[reduce-overhead-None] 0.3690s 12.0024ms 83.3163 Ops/s 91.3331 Ops/s $\textbf{\color{#d91a1a}-8.78\%}$
test_redq_deprec_speed[False-None] 9.9530ms 9.6446ms 103.6845 Ops/s 105.2216 Ops/s $\color{#d91a1a}-1.46\%$
test_redq_deprec_speed[False-backward] 13.6712ms 13.0889ms 76.4007 Ops/s 78.9574 Ops/s $\color{#d91a1a}-3.24\%$
test_redq_deprec_speed[True-None] 2.8076ms 2.5504ms 392.0901 Ops/s 373.8488 Ops/s $\color{#35bf28}+4.88\%$
test_redq_deprec_speed[True-backward] 4.3698ms 4.3125ms 231.8860 Ops/s 229.9125 Ops/s $\color{#35bf28}+0.86\%$
test_redq_deprec_speed[reduce-overhead-None] 16.1869ms 9.8949ms 101.0620 Ops/s 101.3882 Ops/s $\color{#d91a1a}-0.32\%$
test_td3_speed[False-None] 8.4263ms 8.2969ms 120.5270 Ops/s 119.0090 Ops/s $\color{#35bf28}+1.28\%$
test_td3_speed[False-backward] 11.6307ms 10.9352ms 91.4477 Ops/s 90.9731 Ops/s $\color{#35bf28}+0.52\%$
test_td3_speed[True-None] 1.6736ms 1.6430ms 608.6522 Ops/s 614.6504 Ops/s $\color{#d91a1a}-0.98\%$
test_td3_speed[True-backward] 3.3229ms 3.2694ms 305.8661 Ops/s 321.5764 Ops/s $\color{#d91a1a}-4.89\%$
test_td3_speed[reduce-overhead-None] 46.0545ms 23.8963ms 41.8476 Ops/s 40.2712 Ops/s $\color{#35bf28}+3.91\%$
test_cql_speed[False-None] 17.7091ms 17.3594ms 57.6056 Ops/s 57.4419 Ops/s $\color{#35bf28}+0.28\%$
test_cql_speed[False-backward] 23.7291ms 23.1370ms 43.2209 Ops/s 43.7415 Ops/s $\color{#d91a1a}-1.19\%$
test_cql_speed[True-None] 3.7366ms 3.2757ms 305.2772 Ops/s 306.7598 Ops/s $\color{#d91a1a}-0.48\%$
test_cql_speed[True-backward] 5.9535ms 5.5559ms 179.9877 Ops/s 179.0859 Ops/s $\color{#35bf28}+0.50\%$
test_cql_speed[reduce-overhead-None] 0.6917s 15.4881ms 64.5656 Ops/s 84.1365 Ops/s $\textbf{\color{#d91a1a}-23.26\%}$
test_a2c_speed[False-None] 3.9779ms 3.3065ms 302.4329 Ops/s 303.0337 Ops/s $\color{#d91a1a}-0.20\%$
test_a2c_speed[False-backward] 6.9025ms 6.5350ms 153.0227 Ops/s 152.3973 Ops/s $\color{#35bf28}+0.41\%$
test_a2c_speed[True-None] 1.5676ms 1.3528ms 739.1985 Ops/s 738.3568 Ops/s $\color{#35bf28}+0.11\%$
test_a2c_speed[True-backward] 3.1608ms 3.1219ms 320.3127 Ops/s 319.5733 Ops/s $\color{#35bf28}+0.23\%$
test_a2c_speed[reduce-overhead-None] 1.0843ms 0.9906ms 1.0094 KOps/s 1.0171 KOps/s $\color{#d91a1a}-0.76\%$
test_ppo_speed[False-None] 4.0620ms 3.9014ms 256.3184 Ops/s 256.2877 Ops/s $\color{#35bf28}+0.01\%$
test_ppo_speed[False-backward] 7.7569ms 7.3194ms 136.6224 Ops/s 135.3843 Ops/s $\color{#35bf28}+0.91\%$
test_ppo_speed[True-None] 1.5038ms 1.4256ms 701.4621 Ops/s 698.8051 Ops/s $\color{#35bf28}+0.38\%$
test_ppo_speed[True-backward] 3.3463ms 3.2687ms 305.9317 Ops/s 323.8102 Ops/s $\textbf{\color{#d91a1a}-5.52\%}$
test_ppo_speed[reduce-overhead-None] 1.1381ms 1.0487ms 953.5228 Ops/s 921.8580 Ops/s $\color{#35bf28}+3.43\%$
test_reinforce_speed[False-None] 2.4124ms 2.3097ms 432.9580 Ops/s 433.2143 Ops/s $\color{#d91a1a}-0.06\%$
test_reinforce_speed[False-backward] 4.0997ms 3.5218ms 283.9464 Ops/s 296.7607 Ops/s $\color{#d91a1a}-4.32\%$
test_reinforce_speed[True-None] 1.3419ms 1.2882ms 776.2636 Ops/s 781.2709 Ops/s $\color{#d91a1a}-0.64\%$
test_reinforce_speed[True-backward] 2.9588ms 2.9164ms 342.8843 Ops/s 323.2969 Ops/s $\textbf{\color{#35bf28}+6.06\%}$
test_reinforce_speed[reduce-overhead-None] 17.5266ms 9.5505ms 104.7061 Ops/s 105.4470 Ops/s $\color{#d91a1a}-0.70\%$
test_iql_speed[False-None] 10.0955ms 9.5298ms 104.9335 Ops/s 104.5943 Ops/s $\color{#35bf28}+0.32\%$
test_iql_speed[False-backward] 13.8294ms 13.2777ms 75.3142 Ops/s 73.0558 Ops/s $\color{#35bf28}+3.09\%$
test_iql_speed[True-None] 2.3784ms 2.1943ms 455.7223 Ops/s 458.0374 Ops/s $\color{#d91a1a}-0.51\%$
test_iql_speed[True-backward] 4.9261ms 4.7532ms 210.3863 Ops/s 206.9188 Ops/s $\color{#35bf28}+1.68\%$
test_iql_speed[reduce-overhead-None] 17.8162ms 10.4615ms 95.5884 Ops/s 74.6490 Ops/s $\textbf{\color{#35bf28}+28.05\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1566ms 5.9863ms 167.0482 Ops/s 166.9861 Ops/s $\color{#35bf28}+0.04\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7476ms 0.2823ms 3.5420 KOps/s 2.6023 KOps/s $\textbf{\color{#35bf28}+36.11\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5132ms 0.2643ms 3.7839 KOps/s 2.8129 KOps/s $\textbf{\color{#35bf28}+34.52\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9116ms 5.6837ms 175.9404 Ops/s 171.7683 Ops/s $\color{#35bf28}+2.43\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.6021ms 0.2757ms 3.6267 KOps/s 2.8423 KOps/s $\textbf{\color{#35bf28}+27.60\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4735ms 0.2604ms 3.8402 KOps/s 3.2565 KOps/s $\textbf{\color{#35bf28}+17.93\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6536ms 1.2662ms 789.7359 Ops/s 677.6686 Ops/s $\textbf{\color{#35bf28}+16.54\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3980ms 1.1824ms 845.7164 Ops/s 724.3746 Ops/s $\textbf{\color{#35bf28}+16.75\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0174ms 5.9100ms 169.2058 Ops/s 166.2313 Ops/s $\color{#35bf28}+1.79\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8703ms 0.4846ms 2.0637 KOps/s 2.1247 KOps/s $\color{#d91a1a}-2.87\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6849ms 0.4212ms 2.3744 KOps/s 2.2453 KOps/s $\textbf{\color{#35bf28}+5.75\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9233ms 5.8463ms 171.0492 Ops/s 170.7527 Ops/s $\color{#35bf28}+0.17\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.7251ms 0.3423ms 2.9218 KOps/s 2.9766 KOps/s $\color{#d91a1a}-1.84\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4945ms 0.3222ms 3.1040 KOps/s 3.7954 KOps/s $\textbf{\color{#d91a1a}-18.22\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9606ms 5.7378ms 174.2841 Ops/s 172.7559 Ops/s $\color{#35bf28}+0.88\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6322ms 0.3265ms 3.0631 KOps/s 3.6232 KOps/s $\textbf{\color{#d91a1a}-15.46\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4733ms 0.2680ms 3.7316 KOps/s 2.9136 KOps/s $\textbf{\color{#35bf28}+28.08\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0285ms 5.9010ms 169.4616 Ops/s 165.8530 Ops/s $\color{#35bf28}+2.18\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8760ms 0.4863ms 2.0564 KOps/s 1.9537 KOps/s $\textbf{\color{#35bf28}+5.26\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6378ms 0.4106ms 2.4356 KOps/s 2.2485 KOps/s $\textbf{\color{#35bf28}+8.32\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5897s 16.7827ms 59.5852 Ops/s 51.7094 Ops/s $\textbf{\color{#35bf28}+15.23\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.0571ms 1.8973ms 527.0704 Ops/s 512.1279 Ops/s $\color{#35bf28}+2.92\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0580ms 0.9398ms 1.0640 KOps/s 778.3786 Ops/s $\textbf{\color{#35bf28}+36.70\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.8517ms 5.1047ms 195.8988 Ops/s 196.6376 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 12.8365ms 2.0557ms 486.4460 Ops/s 558.5615 Ops/s $\textbf{\color{#d91a1a}-12.91\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.0373ms 1.2564ms 795.9409 Ops/s 750.2785 Ops/s $\textbf{\color{#35bf28}+6.09\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.7779ms 5.2268ms 191.3217 Ops/s 187.3221 Ops/s $\color{#35bf28}+2.14\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 5.2338ms 2.0563ms 486.3065 Ops/s 61.8748 Ops/s $\textbf{\color{#35bf28}+685.95\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2068ms 1.1256ms 888.4202 Ops/s 858.1885 Ops/s $\color{#35bf28}+3.52\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.5961ms 35.5796ms 28.1060 Ops/s 27.7201 Ops/s $\color{#35bf28}+1.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.7791ms 18.0794ms 55.3117 Ops/s 55.4109 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 39.7390ms 37.0296ms 27.0054 Ops/s 26.9257 Ops/s $\color{#35bf28}+0.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.0532ms 18.3695ms 54.4381 Ops/s 53.5373 Ops/s $\color{#35bf28}+1.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.1751ms 38.5851ms 25.9167 Ops/s 25.8081 Ops/s $\color{#35bf28}+0.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.5282ms 20.1389ms 49.6552 Ops/s 50.6234 Ops/s $\color{#d91a1a}-1.91\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8979ms 0.2242ms 4.4606 KOps/s 4.5941 KOps/s $\color{#d91a1a}-2.91\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.6112ms 1.3945ms 717.1269 Ops/s 712.6530 Ops/s $\color{#35bf28}+0.63\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.4674ms 2.3077ms 433.3351 Ops/s 439.9810 Ops/s $\color{#d91a1a}-1.51\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0640ms 2.9100ms 343.6448 Ops/s 344.4046 Ops/s $\color{#d91a1a}-0.22\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2434ms 0.1614ms 6.1971 KOps/s 6.1443 KOps/s $\color{#35bf28}+0.86\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.4221ms 0.2739ms 3.6510 KOps/s 3.3562 KOps/s $\textbf{\color{#35bf28}+8.78\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9037ms 1.7728ms 564.0824 Ops/s 548.0580 Ops/s $\color{#35bf28}+2.92\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.4490ms 1.2969ms 771.0741 Ops/s 751.8936 Ops/s $\color{#35bf28}+2.55\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2873ms 1.1519ms 868.1637 Ops/s 876.8895 Ops/s $\color{#d91a1a}-1.00\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.8554ms 3.6277ms 275.6551 Ops/s 269.4557 Ops/s $\color{#35bf28}+2.30\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.0064ms 5.8181ms 171.8762 Ops/s 175.3868 Ops/s $\color{#d91a1a}-2.00\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.5935ms 7.3645ms 135.7872 Ops/s 143.3166 Ops/s $\textbf{\color{#d91a1a}-5.25\%}$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4391ms 0.2743ms 3.6457 KOps/s 3.6629 KOps/s $\color{#d91a1a}-0.47\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7259ms 1.5392ms 649.6750 Ops/s 686.0864 Ops/s $\textbf{\color{#d91a1a}-5.31\%}$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8732ms 2.4317ms 411.2284 Ops/s 419.0369 Ops/s $\color{#d91a1a}-1.86\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4306ms 3.1303ms 319.4570 Ops/s 319.6687 Ops/s $\color{#d91a1a}-0.07\%$
test_collector_without_rb[100-img_shape0-atari] 34.4698ms 33.8537ms 29.5389 Ops/s 29.1730 Ops/s $\color{#35bf28}+1.25\%$
test_collector_without_rb[200-img_shape1-large_batch] 67.0496ms 66.7777ms 14.9751 Ops/s 14.8666 Ops/s $\color{#35bf28}+0.73\%$
test_collector_with_rb[100-img_shape0-atari] 39.2232ms 38.6237ms 25.8909 Ops/s 25.8801 Ops/s $\color{#35bf28}+0.04\%$
test_collector_with_rb[200-img_shape1-large_batch] 76.0673ms 75.6489ms 13.2190 Ops/s 13.2157 Ops/s $\color{#35bf28}+0.02\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 56.8210ms 56.6730ms 17.6451 Ops/s 17.5258 Ops/s $\color{#35bf28}+0.68\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1130s 0.1128s 8.8644 Ops/s 8.8243 Ops/s $\color{#35bf28}+0.45\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 58.9930ms 58.5470ms 17.0803 Ops/s 16.9744 Ops/s $\color{#35bf28}+0.62\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1173s 0.1167s 8.5722 Ops/s 8.5209 Ops/s $\color{#35bf28}+0.60\%$
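
For reading the table above: the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the bolded entries all exceed 5% in magnitude, which matches the Improved and Worsened counts in the summary. The sketch below reproduces that reading. The 5% threshold and the function names are assumptions inferred from the numbers, not taken from the benchmark tooling.

```python
# Assumed unit scaling for the Ops columns as they appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a table entry such as (1.0640, 'KOps/s') to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# Row test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]
ops = to_ops_per_second(1.0640, "KOps/s")
head = to_ops_per_second(778.3786, "Ops/s")
change = relative_change(ops, head)
print(f"{change:+.2f}% -> {classify(change)}")
```

Under this reading, a row such as test_td3_speed[True-backward] at -4.89% stays below the threshold and is not counted as worsened.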

github-actions (Contributor)

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 82.5594μs 80.8270μs 12.3721 KOps/s 12.3204 KOps/s $\color{#35bf28}+0.42\%$
test_tensor_to_bytestream_speed[torch.save] 0.1404ms 0.1398ms 7.1530 KOps/s 7.1041 KOps/s $\color{#35bf28}+0.69\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1166s 0.1158s 8.6362 Ops/s 9.1589 Ops/s $\textbf{\color{#d91a1a}-5.71\%}$
test_tensor_to_bytestream_speed[numpy] 2.7606μs 2.7546μs 363.0321 KOps/s 376.5797 KOps/s $\color{#d91a1a}-3.60\%$
test_tensor_to_bytestream_speed[safetensors] 36.7882μs 36.4968μs 27.3997 KOps/s 26.9943 KOps/s $\color{#35bf28}+1.50\%$
test_simple 0.5734s 0.5620s 1.7795 Ops/s 1.7421 Ops/s $\color{#35bf28}+2.15\%$
test_transformed 1.1635s 1.1574s 0.8640 Ops/s 0.8621 Ops/s $\color{#35bf28}+0.22\%$
test_serial 1.7009s 1.6940s 0.5903 Ops/s 0.5871 Ops/s $\color{#35bf28}+0.55\%$
test_parallel 1.1519s 1.0567s 0.9464 Ops/s 0.9438 Ops/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-True-True-True-True] 0.1479ms 43.7546μs 22.8547 KOps/s 22.1189 KOps/s $\color{#35bf28}+3.33\%$
test_step_mdp_speed[True-True-True-True-False] 59.0010μs 25.1790μs 39.7156 KOps/s 38.5090 KOps/s $\color{#35bf28}+3.13\%$
test_step_mdp_speed[True-True-True-False-True] 73.6720μs 25.0535μs 39.9146 KOps/s 38.8140 KOps/s $\color{#35bf28}+2.84\%$
test_step_mdp_speed[True-True-True-False-False] 54.4710μs 13.7561μs 72.6949 KOps/s 70.3532 KOps/s $\color{#35bf28}+3.33\%$
test_step_mdp_speed[True-True-False-True-True] 96.8910μs 47.7000μs 20.9644 KOps/s 20.4307 KOps/s $\color{#35bf28}+2.61\%$
test_step_mdp_speed[True-True-False-True-False] 67.0710μs 27.8961μs 35.8473 KOps/s 34.9676 KOps/s $\color{#35bf28}+2.52\%$
test_step_mdp_speed[True-True-False-False-True] 70.8010μs 27.7041μs 36.0957 KOps/s 35.2600 KOps/s $\color{#35bf28}+2.37\%$
test_step_mdp_speed[True-True-False-False-False] 53.6110μs 16.6848μs 59.9348 KOps/s 58.4321 KOps/s $\color{#35bf28}+2.57\%$
test_step_mdp_speed[True-False-True-True-True] 97.5610μs 50.1666μs 19.9336 KOps/s 19.2925 KOps/s $\color{#35bf28}+3.32\%$
test_step_mdp_speed[True-False-True-True-False] 67.2510μs 31.0731μs 32.1821 KOps/s 31.9078 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[True-False-True-False-True] 70.8420μs 28.7971μs 34.7257 KOps/s 34.8332 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[True-False-True-False-False] 1.0056ms 16.3487μs 61.1668 KOps/s 59.0506 KOps/s $\color{#35bf28}+3.58\%$
test_step_mdp_speed[True-False-False-True-True] 91.9710μs 52.4389μs 19.0698 KOps/s 18.8354 KOps/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[True-False-False-True-False] 65.0710μs 32.8589μs 30.4331 KOps/s 29.6174 KOps/s $\color{#35bf28}+2.75\%$
test_step_mdp_speed[True-False-False-False-True] 74.6010μs 29.7715μs 33.5892 KOps/s 32.8622 KOps/s $\color{#35bf28}+2.21\%$
test_step_mdp_speed[True-False-False-False-False] 67.7310μs 19.1828μs 52.1300 KOps/s 51.5671 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[False-True-True-True-True] 92.8810μs 50.5889μs 19.7672 KOps/s 19.6141 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[False-True-True-True-False] 72.9210μs 30.4407μs 32.8508 KOps/s 32.1332 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[False-True-True-False-True] 2.3324ms 31.7315μs 31.5145 KOps/s 31.8380 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[False-True-True-False-False] 46.8710μs 18.1087μs 55.2222 KOps/s 54.0762 KOps/s $\color{#35bf28}+2.12\%$
test_step_mdp_speed[False-True-False-True-True] 0.1042ms 52.7867μs 18.9442 KOps/s 18.8411 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[False-True-False-True-False] 72.6520μs 33.3264μs 30.0063 KOps/s 29.7312 KOps/s $\color{#35bf28}+0.93\%$
test_step_mdp_speed[False-True-False-False-True] 67.4610μs 34.1291μs 29.3005 KOps/s 29.6442 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-True-False-False-False] 55.4510μs 20.8820μs 47.8881 KOps/s 47.3491 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[False-False-True-True-True] 0.1012ms 55.6685μs 17.9635 KOps/s 17.8515 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[False-False-True-True-False] 77.8010μs 36.1884μs 27.6332 KOps/s 27.5641 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[False-False-True-False-True] 71.2410μs 34.1608μs 29.2733 KOps/s 29.2747 KOps/s $-0.00\%$
test_step_mdp_speed[False-False-True-False-False] 50.8610μs 20.8941μs 47.8603 KOps/s 47.0477 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[False-False-False-True-True] 97.8410μs 58.2371μs 17.1712 KOps/s 17.2304 KOps/s $\color{#d91a1a}-0.34\%$
test_step_mdp_speed[False-False-False-True-False] 80.4220μs 39.0543μs 25.6054 KOps/s 25.8735 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[False-False-False-False-True] 72.2210μs 36.4595μs 27.4277 KOps/s 27.3819 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[False-False-False-False-False] 59.5310μs 23.6422μs 42.2973 KOps/s 42.5607 KOps/s $\color{#d91a1a}-0.62\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8758s 0.7784s 1.2847 Ops/s 1.2913 Ops/s $\color{#d91a1a}-0.52\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7352s 0.6398s 1.5630 Ops/s 1.5658 Ops/s $\color{#d91a1a}-0.18\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7780s 1.6985s 0.5887 Ops/s 0.5927 Ops/s $\color{#d91a1a}-0.67\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5459s 1.4687s 0.6809 Ops/s 0.6804 Ops/s $\color{#35bf28}+0.07\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0285s 1.9493s 0.5130 Ops/s 0.5141 Ops/s $\color{#d91a1a}-0.22\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7977s 1.7176s 0.5822 Ops/s 0.5827 Ops/s $\color{#d91a1a}-0.08\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.8422s 4.6889s 0.2133 Ops/s 0.2108 Ops/s $\color{#35bf28}+1.17\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5533s 4.4892s 0.2228 Ops/s 0.2227 Ops/s $\color{#35bf28}+0.03\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9869s 1.9060s 0.5247 Ops/s 0.5242 Ops/s $\color{#35bf28}+0.09\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7228s 1.6235s 0.6160 Ops/s 0.6078 Ops/s $\color{#35bf28}+1.34\%$
test_values[generalized_advantage_estimate-True-True] 10.9043ms 10.4463ms 95.7280 Ops/s 94.0794 Ops/s $\color{#35bf28}+1.75\%$
test_values[vec_generalized_advantage_estimate-True-True] 19.4849ms 17.7748ms 56.2594 Ops/s 88.0391 Ops/s $\textbf{\color{#d91a1a}-36.10\%}$
test_values[td0_return_estimate-False-False] 0.2257ms 0.1284ms 7.7866 KOps/s 7.6452 KOps/s $\color{#35bf28}+1.85\%$
test_values[td1_return_estimate-False-False] 28.9402ms 28.7172ms 34.8223 Ops/s 34.5373 Ops/s $\color{#35bf28}+0.83\%$
test_values[vec_td1_return_estimate-False-False] 18.7557ms 17.8997ms 55.8670 Ops/s 88.1729 Ops/s $\textbf{\color{#d91a1a}-36.64\%}$
test_values[td_lambda_return_estimate-True-False] 43.3991ms 42.6427ms 23.4507 Ops/s 23.4173 Ops/s $\color{#35bf28}+0.14\%$
test_values[vec_td_lambda_return_estimate-True-False] 18.9268ms 17.9682ms 55.6539 Ops/s 87.2534 Ops/s $\textbf{\color{#d91a1a}-36.22\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.6465ms 9.5497ms 104.7156 Ops/s 104.8284 Ops/s $\color{#d91a1a}-0.11\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.9014ms 1.4816ms 674.9319 Ops/s 643.2367 Ops/s $\color{#35bf28}+4.93\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4877ms 0.4293ms 2.3293 KOps/s 2.3075 KOps/s $\color{#35bf28}+0.95\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.3946ms 32.5083ms 30.7613 Ops/s 40.7062 Ops/s $\textbf{\color{#d91a1a}-24.43\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.1791ms 1.7503ms 571.3272 Ops/s 562.5135 Ops/s $\color{#35bf28}+1.57\%$
test_dqn_speed[False-None] 1.5349ms 1.4078ms 710.3283 Ops/s 701.6386 Ops/s $\color{#35bf28}+1.24\%$
test_dqn_speed[False-backward] 1.9622ms 1.9240ms 519.7556 Ops/s 512.0909 Ops/s $\color{#35bf28}+1.50\%$
test_dqn_speed[True-None] 0.7221ms 0.5757ms 1.7371 KOps/s 1.7837 KOps/s $\color{#d91a1a}-2.61\%$
test_dqn_speed[True-backward] 1.0535ms 1.0162ms 984.0378 Ops/s 851.0294 Ops/s $\textbf{\color{#35bf28}+15.63\%}$
test_dqn_speed[reduce-overhead-None] 0.8258ms 0.5608ms 1.7831 KOps/s 1.7389 KOps/s $\color{#35bf28}+2.54\%$
test_ddpg_speed[False-None] 3.2908ms 2.8952ms 345.4038 Ops/s 348.8629 Ops/s $\color{#d91a1a}-0.99\%$
test_ddpg_speed[False-backward] 4.2205ms 4.1107ms 243.2684 Ops/s 239.4253 Ops/s $\color{#35bf28}+1.61\%$
test_ddpg_speed[True-None] 1.5407ms 1.4414ms 693.7480 Ops/s 687.0508 Ops/s $\color{#35bf28}+0.97\%$
test_ddpg_speed[True-backward] 2.4843ms 2.4247ms 412.4203 Ops/s 393.1987 Ops/s $\color{#35bf28}+4.89\%$
test_ddpg_speed[reduce-overhead-None] 1.5504ms 1.4287ms 699.9264 Ops/s 696.9695 Ops/s $\color{#35bf28}+0.42\%$
test_sac_speed[False-None] 8.7553ms 8.1090ms 123.3205 Ops/s 123.9545 Ops/s $\color{#d91a1a}-0.51\%$
test_sac_speed[False-backward] 11.7531ms 11.3321ms 88.2446 Ops/s 88.0963 Ops/s $\color{#35bf28}+0.17\%$
test_sac_speed[True-None] 2.3217ms 2.1890ms 456.8288 Ops/s 449.2251 Ops/s $\color{#35bf28}+1.69\%$
test_sac_speed[True-backward] 4.1884ms 4.0774ms 245.2562 Ops/s 215.8045 Ops/s $\textbf{\color{#35bf28}+13.65\%}$
test_sac_speed[reduce-overhead-None] 2.5645ms 2.1771ms 459.3270 Ops/s 452.6961 Ops/s $\color{#35bf28}+1.46\%$
test_redq_speed[False-None] 16.2911ms 11.3172ms 88.3610 Ops/s 95.8379 Ops/s $\textbf{\color{#d91a1a}-7.80\%}$
test_redq_speed[False-backward] 18.6342ms 17.8549ms 56.0070 Ops/s 56.7781 Ops/s $\color{#d91a1a}-1.36\%$
test_redq_speed[True-None] 4.9298ms 4.5533ms 219.6208 Ops/s 215.9785 Ops/s $\color{#35bf28}+1.69\%$
test_redq_speed[True-backward] 10.0930ms 9.8007ms 102.0332 Ops/s 97.4468 Ops/s $\color{#35bf28}+4.71\%$
test_redq_speed[reduce-overhead-None] 4.8797ms 4.5274ms 220.8751 Ops/s 219.7099 Ops/s $\color{#35bf28}+0.53\%$
test_redq_deprec_speed[False-None] 11.7678ms 11.2061ms 89.2368 Ops/s 90.9791 Ops/s $\color{#d91a1a}-1.92\%$
test_redq_deprec_speed[False-backward] 0.3908s 23.3411ms 42.8429 Ops/s 63.5119 Ops/s $\textbf{\color{#d91a1a}-32.54\%}$
test_redq_deprec_speed[True-None] 4.2189ms 3.7059ms 269.8384 Ops/s 253.5222 Ops/s $\textbf{\color{#35bf28}+6.44\%}$
test_redq_deprec_speed[True-backward] 7.9215ms 7.6834ms 130.1515 Ops/s 123.5305 Ops/s $\textbf{\color{#35bf28}+5.36\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.8849ms 3.6857ms 271.3222 Ops/s 267.1146 Ops/s $\color{#35bf28}+1.58\%$
test_td3_speed[False-None] 48.3867ms 8.4876ms 117.8195 Ops/s 123.9969 Ops/s $\color{#d91a1a}-4.98\%$
test_td3_speed[False-backward] 11.3204ms 10.9619ms 91.2250 Ops/s 91.0719 Ops/s $\color{#35bf28}+0.17\%$
test_td3_speed[True-None] 1.9439ms 1.8910ms 528.8131 Ops/s 533.9089 Ops/s $\color{#d91a1a}-0.95\%$
test_td3_speed[True-backward] 3.9073ms 3.6953ms 270.6120 Ops/s 239.3859 Ops/s $\textbf{\color{#35bf28}+13.04\%}$
test_td3_speed[reduce-overhead-None] 1.9232ms 1.8556ms 538.9194 Ops/s 544.2178 Ops/s $\color{#d91a1a}-0.97\%$
test_cql_speed[False-None] 29.2031ms 26.4324ms 37.8324 Ops/s 37.5318 Ops/s $\color{#35bf28}+0.80\%$
test_cql_speed[False-backward] 40.0109ms 36.3435ms 27.5153 Ops/s 27.4681 Ops/s $\color{#35bf28}+0.17\%$
test_cql_speed[True-None] 13.2185ms 12.5866ms 79.4498 Ops/s 80.2546 Ops/s $\color{#d91a1a}-1.00\%$
test_cql_speed[True-backward] 19.0904ms 18.4831ms 54.1036 Ops/s 55.1310 Ops/s $\color{#d91a1a}-1.86\%$
test_cql_speed[reduce-overhead-None] 13.1720ms 12.4756ms 80.1565 Ops/s 78.6823 Ops/s $\color{#35bf28}+1.87\%$
test_a2c_speed[False-None] 5.8789ms 5.3682ms 186.2821 Ops/s 181.2220 Ops/s $\color{#35bf28}+2.79\%$
test_a2c_speed[False-backward] 12.4000ms 11.9373ms 83.7707 Ops/s 83.7763 Ops/s $-0.01\%$
test_a2c_speed[True-None] 4.3768ms 3.7830ms 264.3421 Ops/s 257.3050 Ops/s $\color{#35bf28}+2.73\%$
test_a2c_speed[True-backward] 9.2085ms 8.6816ms 115.1863 Ops/s 107.2251 Ops/s $\textbf{\color{#35bf28}+7.42\%}$
test_a2c_speed[reduce-overhead-None] 4.8346ms 3.7718ms 265.1261 Ops/s 263.3318 Ops/s $\color{#35bf28}+0.68\%$
test_ppo_speed[False-None] 6.9091ms 5.9068ms 169.2954 Ops/s 166.3002 Ops/s $\color{#35bf28}+1.80\%$
test_ppo_speed[False-backward] 12.9893ms 12.5056ms 79.9642 Ops/s 79.0121 Ops/s $\color{#35bf28}+1.21\%$
test_ppo_speed[True-None] 4.1657ms 3.7077ms 269.7101 Ops/s 269.3869 Ops/s $\color{#35bf28}+0.12\%$
test_ppo_speed[True-backward] 8.7263ms 8.5084ms 117.5309 Ops/s 114.6065 Ops/s $\color{#35bf28}+2.55\%$
test_ppo_speed[reduce-overhead-None] 3.9121ms 3.6421ms 274.5693 Ops/s 272.3610 Ops/s $\color{#35bf28}+0.81\%$
test_reinforce_speed[False-None] 4.7716ms 4.5838ms 218.1588 Ops/s 215.0322 Ops/s $\color{#35bf28}+1.45\%$
test_reinforce_speed[False-backward] 7.9822ms 7.4432ms 134.3517 Ops/s 133.2798 Ops/s $\color{#35bf28}+0.80\%$
test_reinforce_speed[True-None] 3.4471ms 2.9414ms 339.9788 Ops/s 314.3435 Ops/s $\textbf{\color{#35bf28}+8.16\%}$
test_reinforce_speed[True-backward] 8.4441ms 7.8686ms 127.0867 Ops/s 127.6904 Ops/s $\color{#d91a1a}-0.47\%$
test_reinforce_speed[reduce-overhead-None] 3.1011ms 2.8941ms 345.5310 Ops/s 337.6866 Ops/s $\color{#35bf28}+2.32\%$
test_iql_speed[False-None] 26.2930ms 20.9386ms 47.7588 Ops/s 48.5084 Ops/s $\color{#d91a1a}-1.55\%$
test_iql_speed[False-backward] 36.4341ms 31.0621ms 32.1936 Ops/s 31.9604 Ops/s $\color{#35bf28}+0.73\%$
test_iql_speed[True-None] 11.5339ms 8.6332ms 115.8313 Ops/s 111.6269 Ops/s $\color{#35bf28}+3.77\%$
test_iql_speed[True-backward] 17.4581ms 16.7216ms 59.8027 Ops/s 59.3171 Ops/s $\color{#35bf28}+0.82\%$
test_iql_speed[reduce-overhead-None] 9.5175ms 8.7190ms 114.6922 Ops/s 114.7307 Ops/s $\color{#d91a1a}-0.03\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5518ms 6.0952ms 164.0638 Ops/s 166.9254 Ops/s $\color{#d91a1a}-1.71\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8453ms 0.3684ms 2.7143 KOps/s 2.9760 KOps/s $\textbf{\color{#d91a1a}-8.79\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7235ms 0.3569ms 2.8016 KOps/s 3.0547 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0849ms 5.7618ms 173.5555 Ops/s 171.6047 Ops/s $\color{#35bf28}+1.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6129ms 0.3414ms 2.9294 KOps/s 2.9049 KOps/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5640ms 0.3121ms 3.2037 KOps/s 3.0550 KOps/s $\color{#35bf28}+4.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9140ms 1.4507ms 689.3411 Ops/s 708.0024 Ops/s $\color{#d91a1a}-2.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7961ms 1.3766ms 726.4213 Ops/s 748.6594 Ops/s $\color{#d91a1a}-2.97\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1056ms 5.9601ms 167.7837 Ops/s 167.8300 Ops/s $\color{#d91a1a}-0.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0943ms 0.5123ms 1.9520 KOps/s 2.0606 KOps/s $\textbf{\color{#d91a1a}-5.27\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7184ms 0.4974ms 2.0103 KOps/s 2.1666 KOps/s $\textbf{\color{#d91a1a}-7.21\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0774ms 5.8595ms 170.6641 Ops/s 171.0626 Ops/s $\color{#d91a1a}-0.23\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7312ms 0.3866ms 2.5864 KOps/s 3.1263 KOps/s $\textbf{\color{#d91a1a}-17.27\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6446ms 0.3711ms 2.6947 KOps/s 3.2773 KOps/s $\textbf{\color{#d91a1a}-17.78\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1661ms 5.7835ms 172.9048 Ops/s 173.8892 Ops/s $\color{#d91a1a}-0.57\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1820ms 0.3234ms 3.0923 KOps/s 2.9172 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4954ms 0.2712ms 3.6876 KOps/s 2.7118 KOps/s $\textbf{\color{#35bf28}+35.98\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1984ms 5.9695ms 167.5189 Ops/s 167.3302 Ops/s $\color{#35bf28}+0.11\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0636ms 0.4504ms 2.2201 KOps/s 2.2081 KOps/s $\color{#35bf28}+0.54\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8220ms 0.5237ms 1.9096 KOps/s 2.3653 KOps/s $\textbf{\color{#d91a1a}-19.27\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5595s 16.1980ms 61.7361 Ops/s 58.0673 Ops/s $\textbf{\color{#35bf28}+6.32\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9620ms 1.8546ms 539.1929 Ops/s 504.5867 Ops/s $\textbf{\color{#35bf28}+6.86\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0626ms 0.8908ms 1.1226 KOps/s 1.1322 KOps/s $\color{#d91a1a}-0.84\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.2932ms 5.1622ms 193.7147 Ops/s 194.9161 Ops/s $\color{#d91a1a}-0.62\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 12.4803ms 1.9885ms 502.8802 Ops/s 521.8131 Ops/s $\color{#d91a1a}-3.63\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.3837ms 1.2304ms 812.7730 Ops/s 765.3894 Ops/s $\textbf{\color{#35bf28}+6.19\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.3890ms 5.3102ms 188.3179 Ops/s 59.5407 Ops/s $\textbf{\color{#35bf28}+216.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 5.1588ms 2.0356ms 491.2667 Ops/s 523.2323 Ops/s $\textbf{\color{#d91a1a}-6.11\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2591ms 1.0782ms 927.4623 Ops/s 952.6725 Ops/s $\color{#d91a1a}-2.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.4481ms 36.1974ms 27.6263 Ops/s 27.5061 Ops/s $\color{#35bf28}+0.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.2872ms 18.5848ms 53.8074 Ops/s 55.1253 Ops/s $\color{#d91a1a}-2.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.5544ms 37.4715ms 26.6870 Ops/s 26.6918 Ops/s $\color{#d91a1a}-0.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.4898ms 18.7097ms 53.4483 Ops/s 54.1455 Ops/s $\color{#d91a1a}-1.29\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 42.2531ms 40.3254ms 24.7983 Ops/s 25.6320 Ops/s $\color{#d91a1a}-3.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.7352ms 20.4233ms 48.9636 Ops/s 50.6123 Ops/s $\color{#d91a1a}-3.26\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8667ms 0.2300ms 4.3484 KOps/s 4.4385 KOps/s $\color{#d91a1a}-2.03\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7467ms 1.3974ms 715.6110 Ops/s 715.3169 Ops/s $\color{#35bf28}+0.04\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.8870ms 2.4480ms 408.4921 Ops/s 430.4449 Ops/s $\textbf{\color{#d91a1a}-5.10\%}$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1064ms 2.9597ms 337.8709 Ops/s 341.2121 Ops/s $\color{#d91a1a}-0.98\%$
test_storage_write_contiguous[50-img_shape0-small] 0.5750ms 0.1403ms 7.1251 KOps/s 7.3072 KOps/s $\color{#d91a1a}-2.49\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3885ms 0.2048ms 4.8832 KOps/s 4.9375 KOps/s $\color{#d91a1a}-1.10\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9588ms 1.8003ms 555.4707 Ops/s 560.3788 Ops/s $\color{#d91a1a}-0.88\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.7131ms 1.3091ms 763.8927 Ops/s 749.3512 Ops/s $\color{#35bf28}+1.94\%$
test_collector_stack_then_write[50-img_shape0-small] 1.5545ms 1.1252ms 888.7218 Ops/s 896.0448 Ops/s $\color{#d91a1a}-0.82\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.9639ms 3.5176ms 284.2875 Ops/s 276.4604 Ops/s $\color{#35bf28}+2.83\%$
test_collector_stack_then_write[100-img_shape2-large_img] 10.2863ms 5.9678ms 167.5668 Ops/s 175.4267 Ops/s $\color{#d91a1a}-4.48\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.7002ms 7.3322ms 136.3839 Ops/s 143.5989 Ops/s $\textbf{\color{#d91a1a}-5.02\%}$
test_collector_lazystack_then_write[50-img_shape0-small] 0.7241ms 0.2842ms 3.5191 KOps/s 3.6721 KOps/s $\color{#d91a1a}-4.17\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 2.0026ms 1.5286ms 654.2068 Ops/s 659.0575 Ops/s $\color{#d91a1a}-0.74\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 3.0451ms 2.5512ms 391.9772 Ops/s 409.6611 Ops/s $\color{#d91a1a}-4.32\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.5961ms 3.1746ms 314.9983 Ops/s 316.2428 Ops/s $\color{#d91a1a}-0.39\%$
test_collector_without_rb[100-img_shape0-atari] 35.6661ms 34.7913ms 28.7428 Ops/s 29.0782 Ops/s $\color{#d91a1a}-1.15\%$
test_collector_without_rb[200-img_shape1-large_batch] 69.8324ms 68.9863ms 14.4956 Ops/s 14.7222 Ops/s $\color{#d91a1a}-1.54\%$
test_collector_with_rb[100-img_shape0-atari] 40.1372ms 39.0441ms 25.6121 Ops/s 25.5053 Ops/s $\color{#35bf28}+0.42\%$
test_collector_with_rb[200-img_shape1-large_batch] 78.7595ms 77.1627ms 12.9596 Ops/s 13.0929 Ops/s $\color{#d91a1a}-1.02\%$

@vmoens
Collaborator Author

vmoens commented Feb 10, 2026

Superseded by rebuilt stack (cleanup folded into the right commits)

@vmoens vmoens closed this Feb 10, 2026

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Refactoring Refactoring of an existing feature sota-implementations/
