@vmoens
Collaborator

@vmoens vmoens commented Feb 10, 2026

Stack from ghstack (oldest at bottom):

Extend the Dreamer example to work with pre-vectorized environments
(e.g., IsaacLab, which natively provides batch_size=(4096,)):

  • TensorDictPrimer: set expand_specs=True for batched env compatibility
  • make_dreamer: accept optional test_env, handle batched env init
  • make_dreamer: configurable encoder_channels and out_channels
  • make_dreamer: dynamic observation_in_key based on backend
  • Model-based env: unbatch specs from pre-vectorized envs
  • make_replay_buffer: add gpu_storage option (LazyTensorStorage)
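To illustrate what "unbatch specs" could mean in practice, here is a minimal pure-Python sketch. It assumes that a pre-vectorized env reports spec shapes with the env batch size as leading dimensions, and that the model-based env wants the per-sample shape. The helper name `unbatch_shape` is hypothetical and is not part of TorchRL or of this PR.

```python
from typing import Tuple


def unbatch_shape(
    spec_shape: Tuple[int, ...], batch_size: Tuple[int, ...]
) -> Tuple[int, ...]:
    """Drop the leading env batch dims from a spec shape.

    Hypothetical helper for illustration only; it works on plain tuples,
    not on TorchRL spec objects.
    """
    n = len(batch_size)
    # The leading dims of the spec must match the env batch size exactly.
    if tuple(spec_shape[:n]) != tuple(batch_size):
        raise ValueError(
            f"spec shape {spec_shape} does not start with batch size {batch_size}"
        )
    return tuple(spec_shape[n:])


# A batched image observation spec from an env with batch_size=(4096,)
per_sample = unbatch_shape((4096, 3, 64, 64), (4096,))
print(per_sample)
```

An unbatched env (empty batch size) passes through unchanged, so the same code path can serve both cases.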

Co-authored-by: Cursor [email protected]

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Feb 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3475

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures

As of commit 09cb880 with merge base 0bc6d20 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions
Contributor

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 79.6943μs 78.7794μs 12.6937 KOps/s 12.5021 KOps/s $\color{#35bf28}+1.53\%$
test_tensor_to_bytestream_speed[torch.save] 0.1345ms 0.1341ms 7.4555 KOps/s 7.2929 KOps/s $\color{#35bf28}+2.23\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1103s 0.1096s 9.1255 Ops/s 8.5575 Ops/s $\textbf{\color{#35bf28}+6.64\%}$
test_tensor_to_bytestream_speed[numpy] 2.4289μs 2.4190μs 413.3950 KOps/s 364.2240 KOps/s $\textbf{\color{#35bf28}+13.50\%}$
test_tensor_to_bytestream_speed[safetensors] 37.4850μs 37.2411μs 26.8521 KOps/s 26.8704 KOps/s $\color{#d91a1a}-0.07\%$
test_simple 0.5417s 0.5403s 1.8507 Ops/s 1.7655 Ops/s $\color{#35bf28}+4.83\%$
test_transformed 1.2416s 1.1389s 0.8780 Ops/s 0.8763 Ops/s $\color{#35bf28}+0.20\%$
test_serial 1.7584s 1.6657s 0.6003 Ops/s 0.5920 Ops/s $\color{#35bf28}+1.41\%$
test_parallel 1.1242s 1.0292s 0.9716 Ops/s 0.9597 Ops/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[True-True-True-True-True] 0.3096ms 43.8617μs 22.7990 KOps/s 22.3233 KOps/s $\color{#35bf28}+2.13\%$
test_step_mdp_speed[True-True-True-True-False] 59.4510μs 24.8256μs 40.2809 KOps/s 40.0774 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-True-True-False-True] 54.6510μs 24.6921μs 40.4987 KOps/s 39.8048 KOps/s $\color{#35bf28}+1.74\%$
test_step_mdp_speed[True-True-True-False-False] 45.7700μs 13.5552μs 73.7723 KOps/s 72.8162 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[True-True-False-True-True] 89.7620μs 47.0144μs 21.2701 KOps/s 21.4223 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-True-False-True-False] 94.1410μs 27.2115μs 36.7491 KOps/s 36.1789 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[True-True-False-False-True] 60.0810μs 27.2893μs 36.6443 KOps/s 35.9326 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[True-True-False-False-False] 47.2800μs 16.2021μs 61.7205 KOps/s 60.8622 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[True-False-True-True-True] 83.3010μs 49.6576μs 20.1379 KOps/s 19.8750 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[True-False-True-True-False] 67.0610μs 30.0646μs 33.2617 KOps/s 32.4083 KOps/s $\color{#35bf28}+2.63\%$
test_step_mdp_speed[True-False-True-False-True] 90.8720μs 27.3047μs 36.6237 KOps/s 35.7322 KOps/s $\color{#35bf28}+2.50\%$
test_step_mdp_speed[True-False-True-False-False] 50.5010μs 15.9532μs 62.6835 KOps/s 60.5101 KOps/s $\color{#35bf28}+3.59\%$
test_step_mdp_speed[True-False-False-True-True] 94.3220μs 52.3712μs 19.0945 KOps/s 18.7960 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[True-False-False-True-False] 75.8210μs 32.4555μs 30.8114 KOps/s 30.3822 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[True-False-False-False-True] 55.8010μs 29.9519μs 33.3869 KOps/s 33.1043 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-False-False-False-False] 54.4710μs 18.8170μs 53.1434 KOps/s 52.9430 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[False-True-True-True-True] 0.1101ms 49.6525μs 20.1400 KOps/s 19.9649 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-True-True-True-False] 62.5510μs 30.1561μs 33.1608 KOps/s 33.2888 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-True-True-False-True] 2.4092ms 31.4913μs 31.7548 KOps/s 31.4669 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-True-True-False-False] 61.3710μs 17.6930μs 56.5196 KOps/s 54.9267 KOps/s $\color{#35bf28}+2.90\%$
test_step_mdp_speed[False-True-False-True-True] 95.4110μs 52.1729μs 19.1670 KOps/s 18.9125 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[False-True-False-True-False] 65.6610μs 32.2523μs 31.0055 KOps/s 30.4091 KOps/s $\color{#35bf28}+1.96\%$
test_step_mdp_speed[False-True-False-False-True] 67.9710μs 33.3887μs 29.9502 KOps/s 29.7210 KOps/s $\color{#35bf28}+0.77\%$
test_step_mdp_speed[False-True-False-False-False] 61.7420μs 20.2863μs 49.2944 KOps/s 47.9924 KOps/s $\color{#35bf28}+2.71\%$
test_step_mdp_speed[False-False-True-True-True] 93.9420μs 54.5677μs 18.3259 KOps/s 18.0190 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-False-True-True-False] 68.6010μs 35.4209μs 28.2319 KOps/s 27.7808 KOps/s $\color{#35bf28}+1.62\%$
test_step_mdp_speed[False-False-True-False-True] 67.4810μs 33.8755μs 29.5199 KOps/s 29.9918 KOps/s $\color{#d91a1a}-1.57\%$
test_step_mdp_speed[False-False-True-False-False] 62.3410μs 20.4357μs 48.9339 KOps/s 48.2535 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[False-False-False-True-True] 95.2120μs 56.0232μs 17.8498 KOps/s 17.6961 KOps/s $\color{#35bf28}+0.87\%$
test_step_mdp_speed[False-False-False-True-False] 68.5620μs 37.4899μs 26.6739 KOps/s 26.3521 KOps/s $\color{#35bf28}+1.22\%$
test_step_mdp_speed[False-False-False-False-True] 62.4710μs 35.4817μs 28.1835 KOps/s 27.5397 KOps/s $\color{#35bf28}+2.34\%$
test_step_mdp_speed[False-False-False-False-False] 57.5410μs 22.8044μs 43.8512 KOps/s 43.0542 KOps/s $\color{#35bf28}+1.85\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8469s 0.7503s 1.3328 Ops/s 1.3176 Ops/s $\color{#35bf28}+1.15\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7153s 0.6183s 1.6173 Ops/s 1.6059 Ops/s $\color{#35bf28}+0.71\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7081s 1.6330s 0.6124 Ops/s 0.6072 Ops/s $\color{#35bf28}+0.85\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4964s 1.4194s 0.7045 Ops/s 0.6988 Ops/s $\color{#35bf28}+0.82\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9595s 1.8826s 0.5312 Ops/s 0.5277 Ops/s $\color{#35bf28}+0.66\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7431s 1.6649s 0.6006 Ops/s 0.5876 Ops/s $\color{#35bf28}+2.23\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6811s 4.5607s 0.2193 Ops/s 0.2164 Ops/s $\color{#35bf28}+1.32\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5645s 4.4971s 0.2224 Ops/s 0.2234 Ops/s $\color{#d91a1a}-0.48\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9618s 1.8793s 0.5321 Ops/s 0.5282 Ops/s $\color{#35bf28}+0.74\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6820s 1.5899s 0.6290 Ops/s 0.6295 Ops/s $\color{#d91a1a}-0.08\%$
test_values[generalized_advantage_estimate-True-True] 10.6339ms 10.4015ms 96.1396 Ops/s 92.8611 Ops/s $\color{#35bf28}+3.53\%$
test_values[vec_generalized_advantage_estimate-True-True] 20.8235ms 17.8544ms 56.0087 Ops/s 88.1706 Ops/s $\textbf{\color{#d91a1a}-36.48\%}$
test_values[td0_return_estimate-False-False] 0.2320ms 0.1303ms 7.6762 KOps/s 7.5571 KOps/s $\color{#35bf28}+1.58\%$
test_values[td1_return_estimate-False-False] 29.1688ms 28.3848ms 35.2301 Ops/s 34.1848 Ops/s $\color{#35bf28}+3.06\%$
test_values[vec_td1_return_estimate-False-False] 18.3783ms 17.8332ms 56.0752 Ops/s 87.4935 Ops/s $\textbf{\color{#d91a1a}-35.91\%}$
test_values[td_lambda_return_estimate-True-False] 43.3379ms 42.0042ms 23.8071 Ops/s 23.1063 Ops/s $\color{#35bf28}+3.03\%$
test_values[vec_td_lambda_return_estimate-True-False] 18.9344ms 17.8593ms 55.9933 Ops/s 87.5979 Ops/s $\textbf{\color{#d91a1a}-36.08\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.2902ms 9.2011ms 108.6832 Ops/s 104.4067 Ops/s $\color{#35bf28}+4.10\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8776ms 1.5380ms 650.1789 Ops/s 638.3618 Ops/s $\color{#35bf28}+1.85\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4905ms 0.4354ms 2.2969 KOps/s 2.2981 KOps/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.6866ms 35.1580ms 28.4430 Ops/s 33.6323 Ops/s $\textbf{\color{#d91a1a}-15.43\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.1254ms 1.7399ms 574.7468 Ops/s 569.9874 Ops/s $\color{#35bf28}+0.84\%$
test_dqn_speed[False-None] 1.5406ms 1.3915ms 718.6336 Ops/s 709.1449 Ops/s $\color{#35bf28}+1.34\%$
test_dqn_speed[False-backward] 2.0629ms 1.9824ms 504.4417 Ops/s 511.7730 Ops/s $\color{#d91a1a}-1.43\%$
test_dqn_speed[True-None] 0.6727ms 0.5529ms 1.8088 KOps/s 1.7542 KOps/s $\color{#35bf28}+3.11\%$
test_dqn_speed[True-backward] 1.0694ms 1.0163ms 984.0078 Ops/s 943.9805 Ops/s $\color{#35bf28}+4.24\%$
test_dqn_speed[reduce-overhead-None] 0.7388ms 0.5475ms 1.8264 KOps/s 1.7575 KOps/s $\color{#35bf28}+3.92\%$
test_ddpg_speed[False-None] 3.1745ms 2.8574ms 349.9697 Ops/s 348.1609 Ops/s $\color{#35bf28}+0.52\%$
test_ddpg_speed[False-backward] 4.4438ms 4.1457ms 241.2132 Ops/s 239.0094 Ops/s $\color{#35bf28}+0.92\%$
test_ddpg_speed[True-None] 1.8342ms 1.4260ms 701.2386 Ops/s 685.2147 Ops/s $\color{#35bf28}+2.34\%$
test_ddpg_speed[True-backward] 2.5356ms 2.4602ms 406.4701 Ops/s 327.4864 Ops/s $\textbf{\color{#35bf28}+24.12\%}$
test_ddpg_speed[reduce-overhead-None] 1.8164ms 1.4230ms 702.7267 Ops/s 684.3766 Ops/s $\color{#35bf28}+2.68\%$
test_sac_speed[False-None] 8.6251ms 8.0100ms 124.8445 Ops/s 123.3654 Ops/s $\color{#35bf28}+1.20\%$
test_sac_speed[False-backward] 11.8592ms 11.3876ms 87.8150 Ops/s 86.8212 Ops/s $\color{#35bf28}+1.14\%$
test_sac_speed[True-None] 2.6371ms 2.2093ms 452.6419 Ops/s 445.7061 Ops/s $\color{#35bf28}+1.56\%$
test_sac_speed[True-backward] 4.3676ms 4.1868ms 238.8469 Ops/s 234.5505 Ops/s $\color{#35bf28}+1.83\%$
test_sac_speed[reduce-overhead-None] 2.3578ms 2.2039ms 453.7353 Ops/s 445.9617 Ops/s $\color{#35bf28}+1.74\%$
test_redq_speed[False-None] 14.0002ms 10.6834ms 93.6029 Ops/s 94.8106 Ops/s $\color{#d91a1a}-1.27\%$
test_redq_speed[False-backward] 19.3674ms 18.5636ms 53.8688 Ops/s 54.3777 Ops/s $\color{#d91a1a}-0.94\%$
test_redq_speed[True-None] 5.0453ms 4.6631ms 214.4489 Ops/s 210.3585 Ops/s $\color{#35bf28}+1.94\%$
test_redq_speed[True-backward] 10.4745ms 10.1815ms 98.2177 Ops/s 99.5746 Ops/s $\color{#d91a1a}-1.36\%$
test_redq_speed[reduce-overhead-None] 4.7855ms 4.5822ms 218.2373 Ops/s 219.8705 Ops/s $\color{#d91a1a}-0.74\%$
test_redq_deprec_speed[False-None] 11.9454ms 11.2946ms 88.5376 Ops/s 88.5732 Ops/s $\color{#d91a1a}-0.04\%$
test_redq_deprec_speed[False-backward] 16.7625ms 16.3366ms 61.2123 Ops/s 61.1134 Ops/s $\color{#35bf28}+0.16\%$
test_redq_deprec_speed[True-None] 4.7633ms 3.8210ms 261.7121 Ops/s 245.4797 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_redq_deprec_speed[True-backward] 8.3392ms 7.9025ms 126.5426 Ops/s 119.7127 Ops/s $\textbf{\color{#35bf28}+5.71\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.2687ms 3.7523ms 266.5064 Ops/s 260.1931 Ops/s $\color{#35bf28}+2.43\%$
test_td3_speed[False-None] 48.8358ms 8.4460ms 118.3988 Ops/s 123.5097 Ops/s $\color{#d91a1a}-4.14\%$
test_td3_speed[False-backward] 11.6561ms 11.1032ms 90.0640 Ops/s 90.0756 Ops/s $\color{#d91a1a}-0.01\%$
test_td3_speed[True-None] 2.0029ms 1.9244ms 519.6436 Ops/s 504.1248 Ops/s $\color{#35bf28}+3.08\%$
test_td3_speed[True-backward] 3.9623ms 3.8273ms 261.2810 Ops/s 256.4557 Ops/s $\color{#35bf28}+1.88\%$
test_td3_speed[reduce-overhead-None] 2.0089ms 1.8780ms 532.4764 Ops/s 517.2228 Ops/s $\color{#35bf28}+2.95\%$
test_cql_speed[False-None] 30.5348ms 26.7845ms 37.3351 Ops/s 37.5780 Ops/s $\color{#d91a1a}-0.65\%$
test_cql_speed[False-backward] 40.5640ms 36.3103ms 27.5404 Ops/s 27.2469 Ops/s $\color{#35bf28}+1.08\%$
test_cql_speed[True-None] 13.4717ms 12.8691ms 77.7052 Ops/s 76.9520 Ops/s $\color{#35bf28}+0.98\%$
test_cql_speed[True-backward] 19.2870ms 18.9233ms 52.8449 Ops/s 52.5355 Ops/s $\color{#35bf28}+0.59\%$
test_cql_speed[reduce-overhead-None] 13.2127ms 12.8958ms 77.5448 Ops/s 77.8745 Ops/s $\color{#d91a1a}-0.42\%$
test_a2c_speed[False-None] 5.8595ms 5.6045ms 178.4275 Ops/s 178.7254 Ops/s $\color{#d91a1a}-0.17\%$
test_a2c_speed[False-backward] 12.5257ms 12.2219ms 81.8201 Ops/s 82.5164 Ops/s $\color{#d91a1a}-0.84\%$
test_a2c_speed[True-None] 4.1320ms 3.8131ms 262.2540 Ops/s 257.7883 Ops/s $\color{#35bf28}+1.73\%$
test_a2c_speed[True-backward] 9.1061ms 8.8608ms 112.8565 Ops/s 107.0249 Ops/s $\textbf{\color{#35bf28}+5.45\%}$
test_a2c_speed[reduce-overhead-None] 4.2068ms 3.7654ms 265.5743 Ops/s 261.4892 Ops/s $\color{#35bf28}+1.56\%$
test_ppo_speed[False-None] 6.4513ms 6.0055ms 166.5148 Ops/s 165.5381 Ops/s $\color{#35bf28}+0.59\%$
test_ppo_speed[False-backward] 13.3677ms 12.9045ms 77.4922 Ops/s 77.7558 Ops/s $\color{#d91a1a}-0.34\%$
test_ppo_speed[True-None] 3.9213ms 3.7233ms 268.5772 Ops/s 264.3569 Ops/s $\color{#35bf28}+1.60\%$
test_ppo_speed[True-backward] 8.8920ms 8.6424ms 115.7093 Ops/s 115.0558 Ops/s $\color{#35bf28}+0.57\%$
test_ppo_speed[reduce-overhead-None] 4.8906ms 3.7283ms 268.2222 Ops/s 269.6617 Ops/s $\color{#d91a1a}-0.53\%$
test_reinforce_speed[False-None] 5.1267ms 4.6615ms 214.5236 Ops/s 216.7475 Ops/s $\color{#d91a1a}-1.03\%$
test_reinforce_speed[False-backward] 7.8428ms 7.5340ms 132.7324 Ops/s 134.0093 Ops/s $\color{#d91a1a}-0.95\%$
test_reinforce_speed[True-None] 3.1159ms 2.9499ms 339.0001 Ops/s 337.2955 Ops/s $\color{#35bf28}+0.51\%$
test_reinforce_speed[True-backward] 8.1804ms 7.8714ms 127.0427 Ops/s 115.7040 Ops/s $\textbf{\color{#35bf28}+9.80\%}$
test_reinforce_speed[reduce-overhead-None] 3.4388ms 2.9239ms 342.0070 Ops/s 329.4948 Ops/s $\color{#35bf28}+3.80\%$
test_iql_speed[False-None] 26.3424ms 21.1026ms 47.3875 Ops/s 48.9213 Ops/s $\color{#d91a1a}-3.14\%$
test_iql_speed[False-backward] 38.1119ms 31.8781ms 31.3695 Ops/s 32.1747 Ops/s $\color{#d91a1a}-2.50\%$
test_iql_speed[True-None] 9.0808ms 8.7463ms 114.3347 Ops/s 112.5022 Ops/s $\color{#35bf28}+1.63\%$
test_iql_speed[True-backward] 17.6895ms 17.2638ms 57.9248 Ops/s 56.2563 Ops/s $\color{#35bf28}+2.97\%$
test_iql_speed[reduce-overhead-None] 9.1078ms 8.8138ms 113.4586 Ops/s 111.8926 Ops/s $\color{#35bf28}+1.40\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0137ms 5.9036ms 169.3871 Ops/s 170.6582 Ops/s $\color{#d91a1a}-0.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.9333ms 0.3357ms 2.9788 KOps/s 3.1097 KOps/s $\color{#d91a1a}-4.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5819ms 0.3213ms 3.1123 KOps/s 3.1780 KOps/s $\color{#d91a1a}-2.07\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.8649ms 5.6414ms 177.2621 Ops/s 176.8538 Ops/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6265ms 0.3493ms 2.8631 KOps/s 2.6055 KOps/s $\textbf{\color{#35bf28}+9.89\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6857ms 0.3132ms 3.1927 KOps/s 2.5702 KOps/s $\textbf{\color{#35bf28}+24.22\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6262ms 1.3902ms 719.3205 Ops/s 737.7615 Ops/s $\color{#d91a1a}-2.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5519ms 1.3128ms 761.7087 Ops/s 757.0005 Ops/s $\color{#35bf28}+0.62\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.9062ms 5.7829ms 172.9246 Ops/s 171.8930 Ops/s $\color{#35bf28}+0.60\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8041ms 0.4450ms 2.2472 KOps/s 2.0031 KOps/s $\textbf{\color{#35bf28}+12.19\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8739ms 0.4293ms 2.3292 KOps/s 2.1637 KOps/s $\textbf{\color{#35bf28}+7.65\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8263ms 5.6912ms 175.7100 Ops/s 176.2774 Ops/s $\color{#d91a1a}-0.32\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1916ms 0.3720ms 2.6878 KOps/s 3.0070 KOps/s $\textbf{\color{#d91a1a}-10.61\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5190ms 0.3373ms 2.9645 KOps/s 2.6717 KOps/s $\textbf{\color{#35bf28}+10.96\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.8901ms 5.6262ms 177.7399 Ops/s 177.5304 Ops/s $\color{#35bf28}+0.12\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9412ms 0.3394ms 2.9461 KOps/s 3.3717 KOps/s $\textbf{\color{#d91a1a}-12.62\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6063ms 0.3208ms 3.1170 KOps/s 3.6850 KOps/s $\textbf{\color{#d91a1a}-15.41\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.9105ms 5.8281ms 171.5839 Ops/s 171.0691 Ops/s $\color{#35bf28}+0.30\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1337ms 0.4946ms 2.0219 KOps/s 2.2030 KOps/s $\textbf{\color{#d91a1a}-8.22\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7012ms 0.4825ms 2.0726 KOps/s 2.1458 KOps/s $\color{#d91a1a}-3.41\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5627s 16.1957ms 61.7450 Ops/s 57.1775 Ops/s $\textbf{\color{#35bf28}+7.99\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 12.7010ms 1.9770ms 505.8050 Ops/s 506.6916 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.2434ms 1.2331ms 810.9740 Ops/s 817.0461 Ops/s $\color{#d91a1a}-0.74\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.5115ms 5.0116ms 199.5362 Ops/s 202.5588 Ops/s $\color{#d91a1a}-1.49\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 12.2333ms 2.0640ms 484.4867 Ops/s 489.6760 Ops/s $\color{#d91a1a}-1.06\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.9818ms 1.1282ms 886.4002 Ops/s 908.8077 Ops/s $\color{#d91a1a}-2.47\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.5769ms 5.2042ms 192.1536 Ops/s 56.3161 Ops/s $\textbf{\color{#35bf28}+241.21\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.2990ms 1.9663ms 508.5629 Ops/s 488.7742 Ops/s $\color{#35bf28}+4.05\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.1719ms 1.0470ms 955.1199 Ops/s 930.4469 Ops/s $\color{#35bf28}+2.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.8874ms 35.4950ms 28.1730 Ops/s 27.6260 Ops/s $\color{#35bf28}+1.98\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.2045ms 18.1766ms 55.0158 Ops/s 55.6263 Ops/s $\color{#d91a1a}-1.10\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 39.3572ms 36.7244ms 27.2298 Ops/s 26.7660 Ops/s $\color{#35bf28}+1.73\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.2953ms 18.5262ms 53.9775 Ops/s 54.4079 Ops/s $\color{#d91a1a}-0.79\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.4946ms 38.5320ms 25.9525 Ops/s 25.4831 Ops/s $\color{#35bf28}+1.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3189ms 19.7920ms 50.5255 Ops/s 49.5753 Ops/s $\color{#35bf28}+1.92\%$
test_storage_write_lazystack[50-img_shape0-small] 0.9320ms 0.2175ms 4.5969 KOps/s 4.4192 KOps/s $\color{#35bf28}+4.02\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.6586ms 1.4481ms 690.5439 Ops/s 709.0813 Ops/s $\color{#d91a1a}-2.61\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.5564ms 2.3941ms 417.6866 Ops/s 400.5450 Ops/s $\color{#35bf28}+4.28\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1982ms 2.9848ms 335.0255 Ops/s 331.0177 Ops/s $\color{#35bf28}+1.21\%$
test_storage_write_contiguous[50-img_shape0-small] 0.5439ms 0.1324ms 7.5537 KOps/s 7.5127 KOps/s $\color{#35bf28}+0.55\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3392ms 0.1948ms 5.1323 KOps/s 5.3958 KOps/s $\color{#d91a1a}-4.88\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9764ms 1.7616ms 567.6767 Ops/s 562.6299 Ops/s $\color{#35bf28}+0.90\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5546ms 1.3380ms 747.3847 Ops/s 757.0667 Ops/s $\color{#d91a1a}-1.28\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2753ms 1.0898ms 917.6178 Ops/s 918.7505 Ops/s $\color{#d91a1a}-0.12\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.6639ms 3.4951ms 286.1115 Ops/s 278.1811 Ops/s $\color{#35bf28}+2.85\%$
test_collector_stack_then_write[100-img_shape2-large_img] 10.9989ms 5.8025ms 172.3391 Ops/s 173.6308 Ops/s $\color{#d91a1a}-0.74\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 13.7645ms 7.0933ms 140.9775 Ops/s 144.0732 Ops/s $\color{#d91a1a}-2.15\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4110ms 0.2686ms 3.7234 KOps/s 3.6961 KOps/s $\color{#35bf28}+0.74\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7603ms 1.5496ms 645.3121 Ops/s 647.7133 Ops/s $\color{#d91a1a}-0.37\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8758ms 2.4954ms 400.7427 Ops/s 384.0152 Ops/s $\color{#35bf28}+4.36\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.5328ms 3.1920ms 313.2860 Ops/s 311.4564 Ops/s $\color{#35bf28}+0.59\%$
test_collector_without_rb[100-img_shape0-atari] 34.3324ms 33.6937ms 29.6792 Ops/s 29.1779 Ops/s $\color{#35bf28}+1.72\%$
test_collector_without_rb[200-img_shape1-large_batch] 66.1060ms 65.8965ms 15.1753 Ops/s 14.8866 Ops/s $\color{#35bf28}+1.94\%$
test_collector_with_rb[100-img_shape0-atari] 38.3729ms 37.7315ms 26.5031 Ops/s 26.0848 Ops/s $\color{#35bf28}+1.60\%$
test_collector_with_rb[200-img_shape1-large_batch] 74.3394ms 73.9019ms 13.5315 Ops/s 12.9602 Ops/s $\color{#35bf28}+4.41\%$
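
The columns above appear to be related as follows: Ops is the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. This is inferred from the reported numbers, not from the benchmark bot's source, so treat it as a sketch; the function names are illustrative.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed relation: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput versus the repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


# First row of the CPU table: test_tensor_to_bytestream_speed[pickle]
mean_s = 78.7794e-6                      # Mean column, 78.7794 microseconds
print(ops_per_second(mean_s) / 1e3)      # roughly the reported 12.6937 KOps/s
print(round(percent_change(12.6937, 12.5021), 2))

# A regression row: test_values[vec_generalized_advantage_estimate-True-True]
print(round(percent_change(56.0087, 88.1706), 2))
```

Both values must be in the same unit (Ops/s or KOps/s) before the change is computed.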

@github-actions
Contributor

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 85.7573μs 81.9044μs 12.2094 KOps/s 12.6444 KOps/s $\color{#d91a1a}-3.44\%$
test_tensor_to_bytestream_speed[torch.save] 0.1451ms 0.1446ms 6.9180 KOps/s 7.2912 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_tensor_to_bytestream_speed[untyped_storage] 0.1060s 0.1056s 9.4738 Ops/s 9.0867 Ops/s $\color{#35bf28}+4.26\%$
test_tensor_to_bytestream_speed[numpy] 2.5196μs 2.5083μs 398.6772 KOps/s 377.4838 KOps/s $\textbf{\color{#35bf28}+5.61\%}$
test_tensor_to_bytestream_speed[safetensors] 38.4164μs 38.0793μs 26.2610 KOps/s 28.1934 KOps/s $\textbf{\color{#d91a1a}-6.85\%}$
test_simple 0.8014s 0.7948s 1.2582 Ops/s 1.2433 Ops/s $\color{#35bf28}+1.20\%$
test_transformed 1.5305s 1.4399s 0.6945 Ops/s 0.7060 Ops/s $\color{#d91a1a}-1.63\%$
test_serial 2.4281s 2.3279s 0.4296 Ops/s 0.4423 Ops/s $\color{#d91a1a}-2.87\%$
test_parallel 1.9023s 1.8162s 0.5506 Ops/s 0.5600 Ops/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[True-True-True-True-True] 0.3577ms 43.7456μs 22.8595 KOps/s 23.3179 KOps/s $\color{#d91a1a}-1.97\%$
test_step_mdp_speed[True-True-True-True-False] 64.0220μs 24.4339μs 40.9268 KOps/s 41.0928 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[True-True-True-False-True] 64.3220μs 24.3737μs 41.0279 KOps/s 41.6493 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[True-True-True-False-False] 49.6220μs 13.3667μs 74.8130 KOps/s 75.4721 KOps/s $\color{#d91a1a}-0.87\%$
test_step_mdp_speed[True-True-False-True-True] 87.4730μs 44.8179μs 22.3125 KOps/s 22.0816 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-True-False-True-False] 66.0330μs 26.1174μs 38.2887 KOps/s 37.4402 KOps/s $\color{#35bf28}+2.27\%$
test_step_mdp_speed[True-True-False-False-True] 57.8820μs 26.3626μs 37.9325 KOps/s 38.1534 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-False-False-False] 71.4130μs 15.8756μs 62.9897 KOps/s 63.4769 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[True-False-True-True-True] 0.1024ms 48.0864μs 20.7959 KOps/s 20.7177 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-False-True-True-False] 56.0720μs 29.0730μs 34.3962 KOps/s 33.3390 KOps/s $\color{#35bf28}+3.17\%$
test_step_mdp_speed[True-False-True-False-True] 66.6320μs 26.7587μs 37.3710 KOps/s 37.4630 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-False-True-False-False] 49.7020μs 16.1517μs 61.9130 KOps/s 62.2586 KOps/s $\color{#d91a1a}-0.56\%$
test_step_mdp_speed[True-False-False-True-True] 0.1155ms 51.1099μs 19.5657 KOps/s 19.8898 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[True-False-False-True-False] 70.1130μs 32.0868μs 31.1655 KOps/s 31.0519 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-False-False-False-True] 98.7140μs 29.6104μs 33.7719 KOps/s 34.5399 KOps/s $\color{#d91a1a}-2.22\%$
test_step_mdp_speed[True-False-False-False-False] 50.3320μs 18.6922μs 53.4981 KOps/s 52.7663 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-True-True-True] 83.1530μs 48.5710μs 20.5884 KOps/s 20.6063 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-True-True-False] 67.6820μs 29.3148μs 34.1125 KOps/s 33.6441 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-True-False-True] 2.4305ms 30.4952μs 32.7921 KOps/s 32.6701 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[False-True-True-False-False] 50.9020μs 17.4731μs 57.2307 KOps/s 55.7424 KOps/s $\color{#35bf28}+2.67\%$
test_step_mdp_speed[False-True-False-True-True] 85.4540μs 50.4491μs 19.8220 KOps/s 19.5201 KOps/s $\color{#35bf28}+1.55\%$
test_step_mdp_speed[False-True-False-True-False] 67.5830μs 31.3928μs 31.8544 KOps/s 30.8514 KOps/s $\color{#35bf28}+3.25\%$
test_step_mdp_speed[False-True-False-False-True] 65.5920μs 33.0369μs 30.2692 KOps/s 30.5008 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[False-True-False-False-False] 67.0230μs 20.2242μs 49.4457 KOps/s 48.6867 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[False-False-True-True-True] 94.4640μs 53.6481μs 18.6400 KOps/s 18.4283 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[False-False-True-True-False] 68.3030μs 34.3146μs 29.1421 KOps/s 28.0957 KOps/s $\color{#35bf28}+3.72\%$
test_step_mdp_speed[False-False-True-False-True] 72.3630μs 33.0831μs 30.2269 KOps/s 30.1711 KOps/s $\color{#35bf28}+0.19\%$
test_step_mdp_speed[False-False-True-False-False] 61.9630μs 19.8894μs 50.2780 KOps/s 49.8254 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-False-False-True-True] 88.8430μs 54.7911μs 18.2512 KOps/s 17.8754 KOps/s $\color{#35bf28}+2.10\%$
test_step_mdp_speed[False-False-False-True-False] 61.6230μs 36.6575μs 27.2795 KOps/s 26.5826 KOps/s $\color{#35bf28}+2.62\%$
test_step_mdp_speed[False-False-False-False-True] 0.1143ms 34.7376μs 28.7872 KOps/s 28.1975 KOps/s $\color{#35bf28}+2.09\%$
test_step_mdp_speed[False-False-False-False-False] 69.3030μs 22.7067μs 44.0399 KOps/s 44.2561 KOps/s $\color{#d91a1a}-0.49\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8725s 0.7775s 1.2862 Ops/s 1.3249 Ops/s $\color{#d91a1a}-2.92\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7080s 0.6161s 1.6232 Ops/s 1.6106 Ops/s $\color{#35bf28}+0.78\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7417s 1.6581s 0.6031 Ops/s 0.6079 Ops/s $\color{#d91a1a}-0.79\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5421s 1.4474s 0.6909 Ops/s 0.7009 Ops/s $\color{#d91a1a}-1.43\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0050s 1.9193s 0.5210 Ops/s 0.5291 Ops/s $\color{#d91a1a}-1.53\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7892s 1.7109s 0.5845 Ops/s 0.5998 Ops/s $\color{#d91a1a}-2.56\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7664s 4.6380s 0.2156 Ops/s 0.2151 Ops/s $\color{#35bf28}+0.26\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5670s 4.4323s 0.2256 Ops/s 0.2265 Ops/s $\color{#d91a1a}-0.38\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9191s 1.8750s 0.5333 Ops/s 0.5269 Ops/s $\color{#35bf28}+1.23\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6378s 1.5766s 0.6343 Ops/s 0.6345 Ops/s $\color{#d91a1a}-0.03\%$
test_values[generalized_advantage_estimate-True-True] 20.3837ms 19.9207ms 50.1990 Ops/s 50.4178 Ops/s $\color{#d91a1a}-0.43\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1538s 3.9970ms 250.1907 Ops/s 260.3168 Ops/s $\color{#d91a1a}-3.89\%$
test_values[td0_return_estimate-False-False] 0.1069ms 83.3147μs 12.0027 KOps/s 12.0585 KOps/s $\color{#d91a1a}-0.46\%$
test_values[td1_return_estimate-False-False] 48.0689ms 47.5495ms 21.0307 Ops/s 20.9980 Ops/s $\color{#35bf28}+0.16\%$
test_values[vec_td1_return_estimate-False-False] 1.3470ms 1.0928ms 915.0784 Ops/s 920.7815 Ops/s $\color{#d91a1a}-0.62\%$
test_values[td_lambda_return_estimate-True-False] 78.9260ms 78.3485ms 12.7635 Ops/s 12.7729 Ops/s $\color{#d91a1a}-0.07\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2940ms 1.0866ms 920.2620 Ops/s 924.0947 Ops/s $\color{#d91a1a}-0.41\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 21.5962ms 20.7789ms 48.1258 Ops/s 49.6047 Ops/s $\color{#d91a1a}-2.98\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0308ms 0.7587ms 1.3181 KOps/s 1.3291 KOps/s $\color{#d91a1a}-0.83\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7437ms 0.6988ms 1.4311 KOps/s 1.4784 KOps/s $\color{#d91a1a}-3.20\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5616ms 1.4939ms 669.3950 Ops/s 672.0869 Ops/s $\color{#d91a1a}-0.40\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7454ms 0.6935ms 1.4419 KOps/s 1.4470 KOps/s $\color{#d91a1a}-0.35\%$
test_dqn_speed[False-None] 1.6265ms 1.5306ms 653.3548 Ops/s 658.0998 Ops/s $\color{#d91a1a}-0.72\%$
test_dqn_speed[False-backward] 2.4318ms 2.1837ms 457.9423 Ops/s 459.2003 Ops/s $\color{#d91a1a}-0.27\%$
test_dqn_speed[True-None] 0.6226ms 0.5503ms 1.8173 KOps/s 1.7879 KOps/s $\color{#35bf28}+1.64\%$
test_dqn_speed[True-backward] 1.2356ms 1.1921ms 838.8368 Ops/s 838.3927 Ops/s $\color{#35bf28}+0.05\%$
test_dqn_speed[reduce-overhead-None] 0.6682ms 0.5909ms 1.6922 KOps/s 1.6157 KOps/s $\color{#35bf28}+4.74\%$
test_ddpg_speed[False-None] 3.4030ms 2.9108ms 343.5434 Ops/s 344.8548 Ops/s $\color{#d91a1a}-0.38\%$
test_ddpg_speed[False-backward] 4.6370ms 4.2938ms 232.8965 Ops/s 229.9865 Ops/s $\color{#35bf28}+1.27\%$
test_ddpg_speed[True-None] 1.3743ms 1.2883ms 776.2139 Ops/s 769.0242 Ops/s $\color{#35bf28}+0.93\%$
test_ddpg_speed[True-backward] 2.5648ms 2.4808ms 403.1013 Ops/s 400.5486 Ops/s $\color{#35bf28}+0.64\%$
test_ddpg_speed[reduce-overhead-None] 1.4946ms 1.3174ms 759.0508 Ops/s 751.9015 Ops/s $\color{#35bf28}+0.95\%$
test_sac_speed[False-None] 8.9555ms 8.3631ms 119.5733 Ops/s 119.0884 Ops/s $\color{#35bf28}+0.41\%$
test_sac_speed[False-backward] 12.1351ms 11.6814ms 85.6060 Ops/s 84.7034 Ops/s $\color{#35bf28}+1.07\%$
test_sac_speed[True-None] 1.9431ms 1.7990ms 555.8493 Ops/s 548.4960 Ops/s $\color{#35bf28}+1.34\%$
test_sac_speed[True-backward] 4.0586ms 3.5745ms 279.7586 Ops/s 276.6610 Ops/s $\color{#35bf28}+1.12\%$
test_sac_speed[reduce-overhead-None] 0.3654s 12.0837ms 82.7559 Ops/s 92.5845 Ops/s $\textbf{\color{#d91a1a}-10.62\%}$
test_redq_deprec_speed[False-None] 9.8687ms 9.3194ms 107.3033 Ops/s 107.1133 Ops/s $\color{#35bf28}+0.18\%$
test_redq_deprec_speed[False-backward] 13.4588ms 12.8861ms 77.6032 Ops/s 78.1797 Ops/s $\color{#d91a1a}-0.74\%$
test_redq_deprec_speed[True-None] 2.6444ms 2.5222ms 396.4727 Ops/s 400.1554 Ops/s $\color{#d91a1a}-0.92\%$
test_redq_deprec_speed[True-backward] 4.7219ms 4.2674ms 234.3328 Ops/s 241.4236 Ops/s $\color{#d91a1a}-2.94\%$
test_redq_deprec_speed[reduce-overhead-None] 16.5145ms 9.9502ms 100.5004 Ops/s 103.0633 Ops/s $\color{#d91a1a}-2.49\%$
test_td3_speed[False-None] 8.4672ms 8.2129ms 121.7595 Ops/s 121.1582 Ops/s $\color{#35bf28}+0.50\%$
test_td3_speed[False-backward] 11.3125ms 10.9021ms 91.7250 Ops/s 93.2151 Ops/s $\color{#d91a1a}-1.60\%$
test_td3_speed[True-None] 1.6484ms 1.6188ms 617.7426 Ops/s 620.3965 Ops/s $\color{#d91a1a}-0.43\%$
test_td3_speed[True-backward] 3.5682ms 3.0885ms 323.7849 Ops/s 304.4036 Ops/s $\textbf{\color{#35bf28}+6.37\%}$
test_td3_speed[reduce-overhead-None] 45.5808ms 23.6431ms 42.2957 Ops/s 42.0019 Ops/s $\color{#35bf28}+0.70\%$
test_cql_speed[False-None] 17.3804ms 17.0929ms 58.5039 Ops/s 57.8644 Ops/s $\color{#35bf28}+1.11\%$
test_cql_speed[False-backward] 23.8528ms 22.5789ms 44.2891 Ops/s 43.2003 Ops/s $\color{#35bf28}+2.52\%$
test_cql_speed[True-None] 3.3423ms 3.1969ms 312.8032 Ops/s 306.9239 Ops/s $\color{#35bf28}+1.92\%$
test_cql_speed[True-backward] 5.7364ms 5.2884ms 189.0941 Ops/s 178.2813 Ops/s $\textbf{\color{#35bf28}+6.06\%}$
test_cql_speed[reduce-overhead-None] 0.6888s 15.4170ms 64.8634 Ops/s 84.9400 Ops/s $\textbf{\color{#d91a1a}-23.64\%}$
test_a2c_speed[False-None] 4.0226ms 3.2182ms 310.7293 Ops/s 306.3985 Ops/s $\color{#35bf28}+1.41\%$
test_a2c_speed[False-backward] 6.7521ms 6.3242ms 158.1232 Ops/s 157.7763 Ops/s $\color{#35bf28}+0.22\%$
test_a2c_speed[True-None] 1.3752ms 1.3044ms 766.6258 Ops/s 757.7787 Ops/s $\color{#35bf28}+1.17\%$
test_a2c_speed[True-backward] 3.0335ms 2.9252ms 341.8599 Ops/s 335.4644 Ops/s $\color{#35bf28}+1.91\%$
test_a2c_speed[reduce-overhead-None] 1.0464ms 0.9730ms 1.0277 KOps/s 1.0255 KOps/s $\color{#35bf28}+0.22\%$
test_ppo_speed[False-None] 3.9912ms 3.8531ms 259.5283 Ops/s 252.5692 Ops/s $\color{#35bf28}+2.76\%$
test_ppo_speed[False-backward] 7.4578ms 7.0121ms 142.6111 Ops/s 137.2824 Ops/s $\color{#35bf28}+3.88\%$
test_ppo_speed[True-None] 1.4738ms 1.3894ms 719.7347 Ops/s 712.4149 Ops/s $\color{#35bf28}+1.03\%$
test_ppo_speed[True-backward] 3.1316ms 3.0385ms 329.1125 Ops/s 321.1260 Ops/s $\color{#35bf28}+2.49\%$
test_ppo_speed[reduce-overhead-None] 1.1094ms 1.0344ms 966.7520 Ops/s 947.2916 Ops/s $\color{#35bf28}+2.05\%$
test_reinforce_speed[False-None] 2.4758ms 2.2868ms 437.2907 Ops/s 439.2652 Ops/s $\color{#d91a1a}-0.45\%$
test_reinforce_speed[False-backward] 3.4025ms 3.3244ms 300.8075 Ops/s 288.5168 Ops/s $\color{#35bf28}+4.26\%$
test_reinforce_speed[True-None] 1.3506ms 1.2625ms 792.0988 Ops/s 796.1630 Ops/s $\color{#d91a1a}-0.51\%$
test_reinforce_speed[True-backward] 2.9450ms 2.8755ms 347.7606 Ops/s 327.4039 Ops/s $\textbf{\color{#35bf28}+6.22\%}$
test_reinforce_speed[reduce-overhead-None] 17.6402ms 9.7258ms 102.8192 Ops/s 107.9939 Ops/s $\color{#d91a1a}-4.79\%$
test_iql_speed[False-None] 10.3871ms 9.4658ms 105.6439 Ops/s 105.9230 Ops/s $\color{#d91a1a}-0.26\%$
test_iql_speed[False-backward] 13.7990ms 13.2530ms 75.4547 Ops/s 73.5560 Ops/s $\color{#35bf28}+2.58\%$
test_iql_speed[True-None] 2.2574ms 2.1442ms 466.3769 Ops/s 464.6749 Ops/s $\color{#35bf28}+0.37\%$
test_iql_speed[True-backward] 5.0821ms 4.6711ms 214.0802 Ops/s 209.9111 Ops/s $\color{#35bf28}+1.99\%$
test_iql_speed[reduce-overhead-None] 18.1209ms 10.5455ms 94.8270 Ops/s 97.0802 Ops/s $\color{#d91a1a}-2.32\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8873ms 5.7693ms 173.3315 Ops/s 170.0513 Ops/s $\color{#35bf28}+1.93\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1156ms 0.3612ms 2.7684 KOps/s 3.3264 KOps/s $\textbf{\color{#d91a1a}-16.77\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6774ms 0.2560ms 3.9065 KOps/s 3.3295 KOps/s $\textbf{\color{#35bf28}+17.33\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.7112ms 5.4808ms 182.4560 Ops/s 175.2138 Ops/s $\color{#35bf28}+4.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7435ms 0.3200ms 3.1246 KOps/s 2.4936 KOps/s $\textbf{\color{#35bf28}+25.30\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5049ms 0.3077ms 3.2503 KOps/s 3.8965 KOps/s $\textbf{\color{#d91a1a}-16.59\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6280ms 1.2443ms 803.6550 Ops/s 803.0927 Ops/s $\color{#35bf28}+0.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4186ms 1.1744ms 851.4831 Ops/s 860.6729 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.8409ms 5.7464ms 174.0216 Ops/s 169.7632 Ops/s $\color{#35bf28}+2.51\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0177ms 0.4225ms 2.3671 KOps/s 1.9228 KOps/s $\textbf{\color{#35bf28}+23.10\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7945ms 0.4027ms 2.4832 KOps/s 2.0327 KOps/s $\textbf{\color{#35bf28}+22.16\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8344ms 5.6447ms 177.1572 Ops/s 172.6720 Ops/s $\color{#35bf28}+2.60\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9726ms 0.2744ms 3.6437 KOps/s 2.7487 KOps/s $\textbf{\color{#35bf28}+32.56\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6359ms 0.3464ms 2.8867 KOps/s 2.8373 KOps/s $\color{#35bf28}+1.74\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.7956ms 5.5473ms 180.2681 Ops/s 175.2370 Ops/s $\color{#35bf28}+2.87\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6283ms 0.2997ms 3.3369 KOps/s 3.1011 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6289ms 0.2787ms 3.5881 KOps/s 3.2606 KOps/s $\textbf{\color{#35bf28}+10.05\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.8901ms 5.7276ms 174.5924 Ops/s 170.3418 Ops/s $\color{#35bf28}+2.50\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9949ms 0.4329ms 2.3099 KOps/s 2.3277 KOps/s $\color{#d91a1a}-0.77\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5929ms 0.3997ms 2.5018 KOps/s 2.4289 KOps/s $\color{#35bf28}+3.00\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5827s 16.5579ms 60.3941 Ops/s 50.7185 Ops/s $\textbf{\color{#35bf28}+19.08\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.4308ms 1.9954ms 501.1627 Ops/s 517.3086 Ops/s $\color{#d91a1a}-3.12\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.2104ms 0.9290ms 1.0764 KOps/s 874.7644 Ops/s $\textbf{\color{#35bf28}+23.05\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.0143ms 5.0849ms 196.6598 Ops/s 197.9684 Ops/s $\color{#d91a1a}-0.66\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.6969ms 1.9920ms 502.0138 Ops/s 553.5021 Ops/s $\textbf{\color{#d91a1a}-9.30\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.1018ms 1.2799ms 781.3395 Ops/s 739.3456 Ops/s $\textbf{\color{#35bf28}+5.68\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.5856ms 5.1695ms 193.4420 Ops/s 191.4307 Ops/s $\color{#35bf28}+1.05\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 12.3774ms 2.1342ms 468.5621 Ops/s 475.5260 Ops/s $\color{#d91a1a}-1.46\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.7207ms 1.1388ms 878.0979 Ops/s 930.3396 Ops/s $\textbf{\color{#d91a1a}-5.62\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.5739ms 35.0136ms 28.5604 Ops/s 27.8517 Ops/s $\color{#35bf28}+2.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.4429ms 17.8194ms 56.1187 Ops/s 55.7999 Ops/s $\color{#35bf28}+0.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 39.3783ms 36.5624ms 27.3505 Ops/s 27.0630 Ops/s $\color{#35bf28}+1.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6933ms 18.0884ms 55.2842 Ops/s 54.8429 Ops/s $\color{#35bf28}+0.80\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 41.4700ms 38.1284ms 26.2272 Ops/s 25.7219 Ops/s $\color{#35bf28}+1.96\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3070ms 19.6329ms 50.9348 Ops/s 50.3927 Ops/s $\color{#35bf28}+1.08\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8726ms 0.2121ms 4.7137 KOps/s 4.6319 KOps/s $\color{#35bf28}+1.77\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7040ms 1.3697ms 730.1068 Ops/s 716.4621 Ops/s $\color{#35bf28}+1.90\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.9988ms 2.2909ms 436.5049 Ops/s 436.6685 Ops/s $\color{#d91a1a}-0.04\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0492ms 2.9063ms 344.0821 Ops/s 345.2060 Ops/s $\color{#d91a1a}-0.33\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2374ms 0.1590ms 6.2893 KOps/s 6.3228 KOps/s $\color{#d91a1a}-0.53\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.4118ms 0.2097ms 4.7686 KOps/s 4.4258 KOps/s $\textbf{\color{#35bf28}+7.74\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9204ms 1.7763ms 562.9591 Ops/s 601.5735 Ops/s $\textbf{\color{#d91a1a}-6.42\%}$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5244ms 1.3587ms 736.0179 Ops/s 715.4121 Ops/s $\color{#35bf28}+2.88\%$
test_collector_stack_then_write[50-img_shape0-small] 1.3142ms 1.1181ms 894.4140 Ops/s 889.4943 Ops/s $\color{#35bf28}+0.55\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7017ms 3.4973ms 285.9333 Ops/s 281.3695 Ops/s $\color{#35bf28}+1.62\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.1231ms 5.7190ms 174.8554 Ops/s 175.2368 Ops/s $\color{#d91a1a}-0.22\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 14.9124ms 6.9330ms 144.2381 Ops/s 141.4607 Ops/s $\color{#35bf28}+1.96\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4288ms 0.2689ms 3.7185 KOps/s 3.7230 KOps/s $\color{#d91a1a}-0.12\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6604ms 1.5102ms 662.1517 Ops/s 662.3749 Ops/s $\color{#d91a1a}-0.03\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8001ms 2.4081ms 415.2628 Ops/s 419.0688 Ops/s $\color{#d91a1a}-0.91\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4087ms 3.1151ms 321.0217 Ops/s 318.7065 Ops/s $\color{#35bf28}+0.73\%$
test_collector_without_rb[100-img_shape0-atari] 34.5289ms 33.3961ms 29.9436 Ops/s 29.8167 Ops/s $\color{#35bf28}+0.43\%$
test_collector_without_rb[200-img_shape1-large_batch] 65.6326ms 65.3077ms 15.3121 Ops/s 15.1087 Ops/s $\color{#35bf28}+1.35\%$
test_collector_with_rb[100-img_shape0-atari] 38.2618ms 37.4794ms 26.6813 Ops/s 26.5051 Ops/s $\color{#35bf28}+0.66\%$
test_collector_with_rb[200-img_shape1-large_batch] 73.7158ms 73.1794ms 13.6650 Ops/s 13.4583 Ops/s $\color{#35bf28}+1.54\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 56.0195ms 55.7585ms 17.9345 Ops/s 17.9178 Ops/s $\color{#35bf28}+0.09\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1112s 0.1110s 9.0064 Ops/s 8.9839 Ops/s $\color{#35bf28}+0.25\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 58.4782ms 57.7823ms 17.3063 Ops/s 17.3068 Ops/s $-0.00\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1153s 0.1148s 8.7135 Ops/s 8.7479 Ops/s $\color{#d91a1a}-0.39\%$

@vmoens
Collaborator Author

vmoens commented Feb 10, 2026

Superseded by rebuilt stack (cleanup folded into the right commits)

@vmoens vmoens closed this Feb 10, 2026

Labels

CLA Signed (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed), Feature (new feature), sota-implementations/
