
@vmoens vmoens commented Feb 10, 2026

[ghstack-poisoned]

pytorch-bot bot commented Feb 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3486

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 38 Pending

As of commit 37de83c with merge base 0bc6d20:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 10, 2026
- knowledge_base/ISAACLAB.md: comprehensive guide covering import order,
  TiledCamera pixel observations, pre-vectorized env gotchas, 2-GPU
  async pipeline, replay buffer sizing, and 20+ documented pitfalls
- setup-and-run.sh: idempotent cluster setup script for running Dreamer
  with IsaacLab in Docker containers

Co-authored-by: Cursor <[email protected]>
ghstack-source-id: 6371f14
Pull-Request: #3486
@vmoens vmoens merged commit 37de83c into gh/vmoens/237/base Feb 10, 2026
102 of 104 checks passed
@vmoens vmoens deleted the gh/vmoens/237/head branch February 10, 2026 22:20
@github-actions

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}7$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 81.7102μs 80.7586μs 12.3826 KOps/s 11.8144 KOps/s $\color{#35bf28}+4.81\%$
test_tensor_to_bytestream_speed[torch.save] 0.1383ms 0.1378ms 7.2552 KOps/s 7.1364 KOps/s $\color{#35bf28}+1.66\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1140s 0.1135s 8.8095 Ops/s 9.5406 Ops/s $\textbf{\color{#d91a1a}-7.66\%}$
test_tensor_to_bytestream_speed[numpy] 2.5601μs 2.5539μs 391.5622 KOps/s 394.7141 KOps/s $\color{#d91a1a}-0.80\%$
test_tensor_to_bytestream_speed[safetensors] 36.4411μs 36.3192μs 27.5337 KOps/s 26.9738 KOps/s $\color{#35bf28}+2.08\%$
test_simple 0.5495s 0.5478s 1.8254 Ops/s 1.7519 Ops/s $\color{#35bf28}+4.20\%$
test_transformed 1.1271s 1.1244s 0.8893 Ops/s 0.8718 Ops/s $\color{#35bf28}+2.01\%$
test_serial 1.6536s 1.6482s 0.6067 Ops/s 0.5981 Ops/s $\color{#35bf28}+1.45\%$
test_parallel 1.1248s 1.0283s 0.9725 Ops/s 0.9616 Ops/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[True-True-True-True-True] 0.3537ms 44.6912μs 22.3758 KOps/s 22.5229 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[True-True-True-True-False] 56.7810μs 25.4219μs 39.3361 KOps/s 39.2961 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-True-True-False-True] 60.4210μs 24.8482μs 40.2443 KOps/s 39.5487 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[True-True-True-False-False] 40.3610μs 13.8858μs 72.0158 KOps/s 71.6996 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[True-True-False-True-True] 84.2020μs 47.4152μs 21.0903 KOps/s 21.0240 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[True-True-False-True-False] 65.7910μs 27.7598μs 36.0233 KOps/s 35.9978 KOps/s $\color{#35bf28}+0.07\%$
test_step_mdp_speed[True-True-False-False-True] 62.1120μs 26.9634μs 37.0873 KOps/s 35.6740 KOps/s $\color{#35bf28}+3.96\%$
test_step_mdp_speed[True-True-False-False-False] 55.4610μs 16.6131μs 60.1934 KOps/s 59.8719 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[True-False-True-True-True] 0.1228ms 49.5142μs 20.1962 KOps/s 19.8473 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[True-False-True-True-False] 58.5620μs 30.5970μs 32.6829 KOps/s 32.5809 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-False-True-False-True] 60.0720μs 27.0535μs 36.9637 KOps/s 35.9049 KOps/s $\color{#35bf28}+2.95\%$
test_step_mdp_speed[True-False-True-False-False] 45.6910μs 16.5736μs 60.3370 KOps/s 60.3746 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-False-False-True-True] 85.2020μs 52.2718μs 19.1308 KOps/s 18.7033 KOps/s $\color{#35bf28}+2.29\%$
test_step_mdp_speed[True-False-False-True-False] 62.6010μs 33.0324μs 30.2733 KOps/s 29.8896 KOps/s $\color{#35bf28}+1.28\%$
test_step_mdp_speed[True-False-False-False-True] 80.5720μs 29.8272μs 33.5265 KOps/s 32.5479 KOps/s $\color{#35bf28}+3.01\%$
test_step_mdp_speed[True-False-False-False-False] 46.1310μs 19.5779μs 51.0779 KOps/s 51.6485 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-True-True-True-True] 84.1620μs 50.4167μs 19.8347 KOps/s 19.6431 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[False-True-True-True-False] 60.4420μs 30.4565μs 32.8337 KOps/s 32.4960 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[False-True-True-False-True] 2.4015ms 31.6468μs 31.5988 KOps/s 31.3037 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[False-True-True-False-False] 51.4010μs 18.8241μs 53.1233 KOps/s 54.3645 KOps/s $\color{#d91a1a}-2.28\%$
test_step_mdp_speed[False-True-False-True-True] 90.6620μs 53.2875μs 18.7661 KOps/s 18.8007 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[False-True-False-True-False] 64.6620μs 33.1947μs 30.1253 KOps/s 29.4591 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[False-True-False-False-True] 63.1320μs 33.7381μs 29.6401 KOps/s 29.2847 KOps/s $\color{#35bf28}+1.21\%$
test_step_mdp_speed[False-True-False-False-False] 55.4010μs 20.8879μs 47.8747 KOps/s 47.5805 KOps/s $\color{#35bf28}+0.62\%$
test_step_mdp_speed[False-False-True-True-True] 97.1520μs 55.7028μs 17.9524 KOps/s 17.8600 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-False-True-True-False] 69.2420μs 36.5397μs 27.3675 KOps/s 27.6964 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[False-False-True-False-True] 74.1920μs 34.0248μs 29.3904 KOps/s 28.9079 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[False-False-True-False-False] 63.3120μs 20.6637μs 48.3940 KOps/s 47.1329 KOps/s $\color{#35bf28}+2.68\%$
test_step_mdp_speed[False-False-False-True-True] 0.1062ms 57.3165μs 17.4470 KOps/s 17.0465 KOps/s $\color{#35bf28}+2.35\%$
test_step_mdp_speed[False-False-False-True-False] 64.9420μs 38.6786μs 25.8541 KOps/s 25.6787 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-False-False-False-True] 74.0420μs 36.1753μs 27.6432 KOps/s 27.5801 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-False-False-False-False] 52.9910μs 23.3903μs 42.7528 KOps/s 42.2581 KOps/s $\color{#35bf28}+1.17\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7516s 0.7471s 1.3384 Ops/s 1.3068 Ops/s $\color{#35bf28}+2.42\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7272s 0.6331s 1.5797 Ops/s 1.5984 Ops/s $\color{#d91a1a}-1.17\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7635s 1.6867s 0.5929 Ops/s 0.5993 Ops/s $\color{#d91a1a}-1.07\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5409s 1.4637s 0.6832 Ops/s 0.6919 Ops/s $\color{#d91a1a}-1.26\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0073s 1.9302s 0.5181 Ops/s 0.5221 Ops/s $\color{#d91a1a}-0.77\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7867s 1.7033s 0.5871 Ops/s 0.5924 Ops/s $\color{#d91a1a}-0.89\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7624s 4.6893s 0.2133 Ops/s 0.2142 Ops/s $\color{#d91a1a}-0.42\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5388s 4.4652s 0.2240 Ops/s 0.2275 Ops/s $\color{#d91a1a}-1.56\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9893s 1.9105s 0.5234 Ops/s 0.5265 Ops/s $\color{#d91a1a}-0.58\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7693s 1.6505s 0.6059 Ops/s 0.6229 Ops/s $\color{#d91a1a}-2.73\%$
test_values[generalized_advantage_estimate-True-True] 10.2368ms 10.0723ms 99.2818 Ops/s 101.5958 Ops/s $\color{#d91a1a}-2.28\%$
test_values[vec_generalized_advantage_estimate-True-True] 15.1616ms 11.1928ms 89.3435 Ops/s 86.8529 Ops/s $\color{#35bf28}+2.87\%$
test_values[td0_return_estimate-False-False] 0.2266ms 0.1300ms 7.6921 KOps/s 7.8636 KOps/s $\color{#d91a1a}-2.18\%$
test_values[td1_return_estimate-False-False] 28.8612ms 27.1751ms 36.7983 Ops/s 38.2842 Ops/s $\color{#d91a1a}-3.88\%$
test_values[vec_td1_return_estimate-False-False] 12.2101ms 11.2487ms 88.8991 Ops/s 88.2283 Ops/s $\color{#35bf28}+0.76\%$
test_values[td_lambda_return_estimate-True-False] 40.9153ms 40.1745ms 24.8914 Ops/s 26.0500 Ops/s $\color{#d91a1a}-4.45\%$
test_values[vec_td_lambda_return_estimate-True-False] 12.1689ms 11.2033ms 89.2594 Ops/s 89.5975 Ops/s $\color{#d91a1a}-0.38\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.1703ms 8.9276ms 112.0119 Ops/s 113.3666 Ops/s $\color{#d91a1a}-1.19\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8419ms 1.5134ms 660.7837 Ops/s 662.9349 Ops/s $\color{#d91a1a}-0.32\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4724ms 0.4082ms 2.4501 KOps/s 2.4872 KOps/s $\color{#d91a1a}-1.49\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 23.3475ms 22.5240ms 44.3970 Ops/s 34.2143 Ops/s $\textbf{\color{#35bf28}+29.76\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.8566ms 1.7159ms 582.7767 Ops/s 587.0456 Ops/s $\color{#d91a1a}-0.73\%$
test_dqn_speed[False-None] 1.7815ms 1.3805ms 724.3534 Ops/s 728.7759 Ops/s $\color{#d91a1a}-0.61\%$
test_dqn_speed[False-backward] 1.9600ms 1.9005ms 526.1901 Ops/s 534.6627 Ops/s $\color{#d91a1a}-1.58\%$
test_dqn_speed[True-None] 0.6605ms 0.5512ms 1.8144 KOps/s 1.8276 KOps/s $\color{#d91a1a}-0.73\%$
test_dqn_speed[True-backward] 1.0427ms 0.9924ms 1.0076 KOps/s 893.2065 Ops/s $\textbf{\color{#35bf28}+12.81\%}$
test_dqn_speed[reduce-overhead-None] 0.9207ms 0.5316ms 1.8811 KOps/s 1.8105 KOps/s $\color{#35bf28}+3.90\%$
test_ddpg_speed[False-None] 3.2355ms 2.8199ms 354.6164 Ops/s 357.4313 Ops/s $\color{#d91a1a}-0.79\%$
test_ddpg_speed[False-backward] 4.1506ms 3.9847ms 250.9592 Ops/s 250.2503 Ops/s $\color{#35bf28}+0.28\%$
test_ddpg_speed[True-None] 1.8175ms 1.3994ms 714.5688 Ops/s 690.4837 Ops/s $\color{#35bf28}+3.49\%$
test_ddpg_speed[True-backward] 2.4274ms 2.3664ms 422.5800 Ops/s 349.2233 Ops/s $\textbf{\color{#35bf28}+21.01\%}$
test_ddpg_speed[reduce-overhead-None] 1.8113ms 1.3845ms 722.2858 Ops/s 708.5830 Ops/s $\color{#35bf28}+1.93\%$
test_sac_speed[False-None] 8.5499ms 7.9168ms 126.3131 Ops/s 127.7819 Ops/s $\color{#d91a1a}-1.15\%$
test_sac_speed[False-backward] 11.5802ms 11.0988ms 90.0995 Ops/s 90.8421 Ops/s $\color{#d91a1a}-0.82\%$
test_sac_speed[True-None] 2.2780ms 2.1339ms 468.6147 Ops/s 470.7036 Ops/s $\color{#d91a1a}-0.44\%$
test_sac_speed[True-backward] 4.2300ms 4.0451ms 247.2157 Ops/s 244.3309 Ops/s $\color{#35bf28}+1.18\%$
test_sac_speed[reduce-overhead-None] 2.3370ms 2.1390ms 467.5074 Ops/s 465.2547 Ops/s $\color{#35bf28}+0.48\%$
test_redq_speed[False-None] 11.2136ms 10.3450ms 96.6655 Ops/s 98.5387 Ops/s $\color{#d91a1a}-1.90\%$
test_redq_speed[False-backward] 18.5972ms 17.7885ms 56.2160 Ops/s 58.4277 Ops/s $\color{#d91a1a}-3.79\%$
test_redq_speed[True-None] 5.0563ms 4.4432ms 225.0624 Ops/s 223.9938 Ops/s $\color{#35bf28}+0.48\%$
test_redq_speed[True-backward] 10.0345ms 9.7952ms 102.0904 Ops/s 108.0486 Ops/s $\textbf{\color{#d91a1a}-5.51\%}$
test_redq_speed[reduce-overhead-None] 4.8359ms 4.3991ms 227.3169 Ops/s 232.4927 Ops/s $\color{#d91a1a}-2.23\%$
test_redq_deprec_speed[False-None] 11.4580ms 10.9917ms 90.9777 Ops/s 93.1402 Ops/s $\color{#d91a1a}-2.32\%$
test_redq_deprec_speed[False-backward] 0.3933s 23.2140ms 43.0775 Ops/s 64.5020 Ops/s $\textbf{\color{#d91a1a}-33.22\%}$
test_redq_deprec_speed[True-None] 3.9898ms 3.7028ms 270.0679 Ops/s 257.8642 Ops/s $\color{#35bf28}+4.73\%$
test_redq_deprec_speed[True-backward] 7.9329ms 7.6104ms 131.3986 Ops/s 118.4571 Ops/s $\textbf{\color{#35bf28}+10.93\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.0155ms 3.5818ms 279.1915 Ops/s 278.6344 Ops/s $\color{#35bf28}+0.20\%$
test_td3_speed[False-None] 48.5755ms 8.3308ms 120.0371 Ops/s 128.4914 Ops/s $\textbf{\color{#d91a1a}-6.58\%}$
test_td3_speed[False-backward] 11.1805ms 10.7478ms 93.0420 Ops/s 94.3859 Ops/s $\color{#d91a1a}-1.42\%$
test_td3_speed[True-None] 1.9258ms 1.8410ms 543.1806 Ops/s 548.3090 Ops/s $\color{#d91a1a}-0.94\%$
test_td3_speed[True-backward] 3.7991ms 3.6595ms 273.2597 Ops/s 251.7163 Ops/s $\textbf{\color{#35bf28}+8.56\%}$
test_td3_speed[reduce-overhead-None] 1.9059ms 1.7908ms 558.4072 Ops/s 550.9152 Ops/s $\color{#35bf28}+1.36\%$
test_cql_speed[False-None] 28.9719ms 25.9593ms 38.5219 Ops/s 39.2096 Ops/s $\color{#d91a1a}-1.75\%$
test_cql_speed[False-backward] 35.4132ms 34.6177ms 28.8869 Ops/s 28.9462 Ops/s $\color{#d91a1a}-0.20\%$
test_cql_speed[True-None] 12.6417ms 12.2760ms 81.4598 Ops/s 81.7506 Ops/s $\color{#d91a1a}-0.36\%$
test_cql_speed[True-backward] 18.8502ms 18.4802ms 54.1121 Ops/s 56.1393 Ops/s $\color{#d91a1a}-3.61\%$
test_cql_speed[reduce-overhead-None] 12.6714ms 12.2981ms 81.3132 Ops/s 78.7522 Ops/s $\color{#35bf28}+3.25\%$
test_a2c_speed[False-None] 5.8395ms 5.4364ms 183.9459 Ops/s 188.4617 Ops/s $\color{#d91a1a}-2.40\%$
test_a2c_speed[False-backward] 12.0936ms 11.7470ms 85.1282 Ops/s 85.9389 Ops/s $\color{#d91a1a}-0.94\%$
test_a2c_speed[True-None] 4.1338ms 3.7115ms 269.4297 Ops/s 270.4947 Ops/s $\color{#d91a1a}-0.39\%$
test_a2c_speed[True-backward] 8.9213ms 8.5978ms 116.3084 Ops/s 117.3462 Ops/s $\color{#d91a1a}-0.88\%$
test_a2c_speed[reduce-overhead-None] 4.1716ms 3.6981ms 270.4104 Ops/s 269.3869 Ops/s $\color{#35bf28}+0.38\%$
test_ppo_speed[False-None] 6.2592ms 5.8557ms 170.7729 Ops/s 170.5543 Ops/s $\color{#35bf28}+0.13\%$
test_ppo_speed[False-backward] 12.6559ms 12.3121ms 81.2211 Ops/s 80.8238 Ops/s $\color{#35bf28}+0.49\%$
test_ppo_speed[True-None] 4.0062ms 3.6247ms 275.8821 Ops/s 276.4883 Ops/s $\color{#d91a1a}-0.22\%$
test_ppo_speed[True-backward] 8.7191ms 8.4004ms 119.0415 Ops/s 118.6334 Ops/s $\color{#35bf28}+0.34\%$
test_ppo_speed[reduce-overhead-None] 4.0738ms 3.5738ms 279.8120 Ops/s 278.6424 Ops/s $\color{#35bf28}+0.42\%$
test_reinforce_speed[False-None] 4.9606ms 4.5185ms 221.3145 Ops/s 220.7336 Ops/s $\color{#35bf28}+0.26\%$
test_reinforce_speed[False-backward] 7.4482ms 7.2642ms 137.6621 Ops/s 135.6418 Ops/s $\color{#35bf28}+1.49\%$
test_reinforce_speed[True-None] 3.1616ms 2.8818ms 347.0062 Ops/s 348.1037 Ops/s $\color{#d91a1a}-0.32\%$
test_reinforce_speed[True-backward] 7.9687ms 7.7052ms 129.7820 Ops/s 129.5709 Ops/s $\color{#35bf28}+0.16\%$
test_reinforce_speed[reduce-overhead-None] 3.1620ms 2.8402ms 352.0928 Ops/s 340.7198 Ops/s $\color{#35bf28}+3.34\%$
test_iql_speed[False-None] 20.3397ms 19.6358ms 50.9273 Ops/s 48.9939 Ops/s $\color{#35bf28}+3.95\%$
test_iql_speed[False-backward] 34.9287ms 30.1974ms 33.1154 Ops/s 32.9890 Ops/s $\color{#35bf28}+0.38\%$
test_iql_speed[True-None] 8.7829ms 8.4692ms 118.0752 Ops/s 116.5924 Ops/s $\color{#35bf28}+1.27\%$
test_iql_speed[True-backward] 16.9419ms 16.6824ms 59.9435 Ops/s 59.5676 Ops/s $\color{#35bf28}+0.63\%$
test_iql_speed[reduce-overhead-None] 8.9593ms 8.5618ms 116.7975 Ops/s 114.4478 Ops/s $\color{#35bf28}+2.05\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3419ms 6.1131ms 163.5819 Ops/s 164.6074 Ops/s $\color{#d91a1a}-0.62\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.0416ms 0.3480ms 2.8735 KOps/s 2.8587 KOps/s $\color{#35bf28}+0.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5191ms 0.3055ms 3.2737 KOps/s 2.9935 KOps/s $\textbf{\color{#35bf28}+9.36\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0473ms 5.7719ms 173.2520 Ops/s 169.3361 Ops/s $\color{#35bf28}+2.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9416ms 0.2838ms 3.5233 KOps/s 3.4023 KOps/s $\color{#35bf28}+3.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5143ms 0.2910ms 3.4365 KOps/s 3.6214 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6081ms 1.3480ms 741.8278 Ops/s 809.6003 Ops/s $\textbf{\color{#d91a1a}-8.37\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5256ms 1.2402ms 806.3046 Ops/s 861.0816 Ops/s $\textbf{\color{#d91a1a}-6.36\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1744ms 5.9409ms 168.3253 Ops/s 163.5160 Ops/s $\color{#35bf28}+2.94\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8459ms 0.4433ms 2.2557 KOps/s 2.1263 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7375ms 0.4622ms 2.1635 KOps/s 2.1809 KOps/s $\color{#d91a1a}-0.80\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8154ms 5.8691ms 170.3852 Ops/s 168.1810 Ops/s $\color{#35bf28}+1.31\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9628ms 0.2810ms 3.5593 KOps/s 2.7252 KOps/s $\textbf{\color{#35bf28}+30.61\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4388ms 0.2617ms 3.8214 KOps/s 2.7083 KOps/s $\textbf{\color{#35bf28}+41.10\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1204ms 5.7688ms 173.3475 Ops/s 169.6903 Ops/s $\color{#35bf28}+2.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8036ms 0.2745ms 3.6424 KOps/s 3.3432 KOps/s $\textbf{\color{#35bf28}+8.95\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4321ms 0.2584ms 3.8703 KOps/s 3.8827 KOps/s $\color{#d91a1a}-0.32\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0892ms 5.9596ms 167.7971 Ops/s 165.8078 Ops/s $\color{#35bf28}+1.20\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9965ms 0.4739ms 2.1101 KOps/s 1.9260 KOps/s $\textbf{\color{#35bf28}+9.56\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6682ms 0.4322ms 2.3140 KOps/s 2.2878 KOps/s $\color{#35bf28}+1.15\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5727s 16.3547ms 61.1445 Ops/s 51.4863 Ops/s $\textbf{\color{#35bf28}+18.76\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8966ms 1.7969ms 556.5094 Ops/s 560.5723 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.1720ms 0.8772ms 1.1400 KOps/s 857.1612 Ops/s $\textbf{\color{#35bf28}+33.00\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1839ms 5.0690ms 197.2783 Ops/s 197.9837 Ops/s $\color{#d91a1a}-0.36\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.8978ms 1.7736ms 563.8366 Ops/s 572.5721 Ops/s $\color{#d91a1a}-1.53\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.1092ms 1.2700ms 787.4043 Ops/s 828.6175 Ops/s $\color{#d91a1a}-4.97\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.6922ms 5.1952ms 192.4840 Ops/s 192.8030 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.2540ms 1.9649ms 508.9369 Ops/s 503.1938 Ops/s $\color{#35bf28}+1.14\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.7224ms 1.1109ms 900.2095 Ops/s 942.9763 Ops/s $\color{#d91a1a}-4.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.8463ms 35.7101ms 28.0033 Ops/s 28.2161 Ops/s $\color{#d91a1a}-0.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.3438ms 18.4059ms 54.3305 Ops/s 56.6026 Ops/s $\color{#d91a1a}-4.01\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.0786ms 36.8738ms 27.1196 Ops/s 27.1260 Ops/s $\color{#d91a1a}-0.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.8076ms 18.1448ms 55.1121 Ops/s 54.8157 Ops/s $\color{#35bf28}+0.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 41.4274ms 38.3867ms 26.0507 Ops/s 25.9553 Ops/s $\color{#35bf28}+0.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.6244ms 19.7589ms 50.6100 Ops/s 50.9733 Ops/s $\color{#d91a1a}-0.71\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8651ms 0.2147ms 4.6579 KOps/s 4.5528 KOps/s $\color{#35bf28}+2.31\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7162ms 1.3921ms 718.3551 Ops/s 721.3452 Ops/s $\color{#d91a1a}-0.41\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7107ms 2.2934ms 436.0247 Ops/s 418.5437 Ops/s $\color{#35bf28}+4.18\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1868ms 2.9316ms 341.1057 Ops/s 342.0727 Ops/s $\color{#d91a1a}-0.28\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2352ms 0.1308ms 7.6467 KOps/s 7.6813 KOps/s $\color{#d91a1a}-0.45\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3432ms 0.1835ms 5.4492 KOps/s 5.5434 KOps/s $\color{#d91a1a}-1.70\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.1770ms 1.7358ms 576.0987 Ops/s 563.9671 Ops/s $\color{#35bf28}+2.15\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.4435ms 1.3209ms 757.0782 Ops/s 791.3899 Ops/s $\color{#d91a1a}-4.34\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2480ms 1.1196ms 893.1791 Ops/s 898.7946 Ops/s $\color{#d91a1a}-0.62\%$
test_collector_stack_then_write[100-img_shape1-atari] 7.0227ms 3.5530ms 281.4508 Ops/s 281.4091 Ops/s $\color{#35bf28}+0.01\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.2242ms 5.8221ms 171.7590 Ops/s 180.7267 Ops/s $\color{#d91a1a}-4.96\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.4925ms 7.3178ms 136.6532 Ops/s 143.6299 Ops/s $\color{#d91a1a}-4.86\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4362ms 0.2828ms 3.5356 KOps/s 3.7061 KOps/s $\color{#d91a1a}-4.60\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6977ms 1.4993ms 666.9908 Ops/s 665.7683 Ops/s $\color{#35bf28}+0.18\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8260ms 2.4189ms 413.4064 Ops/s 399.4476 Ops/s $\color{#35bf28}+3.49\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4457ms 3.1300ms 319.4895 Ops/s 318.8911 Ops/s $\color{#35bf28}+0.19\%$
test_collector_without_rb[100-img_shape0-atari] 34.9814ms 33.6804ms 29.6909 Ops/s 29.8148 Ops/s $\color{#d91a1a}-0.42\%$
test_collector_without_rb[200-img_shape1-large_batch] 66.3660ms 66.1590ms 15.1151 Ops/s 15.0925 Ops/s $\color{#35bf28}+0.15\%$
test_collector_with_rb[100-img_shape0-atari] 39.0067ms 38.1129ms 26.2379 Ops/s 26.1065 Ops/s $\color{#35bf28}+0.50\%$
test_collector_with_rb[200-img_shape1-large_batch] 97.6736ms 76.3807ms 13.0923 Ops/s 13.2801 Ops/s $\color{#d91a1a}-1.41\%$
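
The Change column in the table above is consistent with the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The bolded rows all exceed 5% in magnitude, and their counts match the 13 improved and 7 worsened benchmarks in the summary. The benchmark workflow's actual implementation is not shown here, so the following is only a minimal sketch under those assumptions; the function names and the 5% cutoff are illustrative and inferred from the reported numbers.

```python
# Unit multipliers for the throughput strings that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(value: str, unit: str) -> float:
    """Convert e.g. ("12.3826", "KOps/s") to operations per second."""
    return float(value) * UNITS[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def is_significant(change_pct: float, threshold: float = 5.0) -> bool:
    # Assumption: the 5% cutoff is inferred from which rows are bold,
    # not taken from the benchmark workflow itself.
    return abs(change_pct) > threshold


# Row: test_tensor_to_bytestream_speed[pickle]
pickle_change = relative_change(
    parse_ops("12.3826", "KOps/s"), parse_ops("11.8144", "KOps/s")
)
print(round(pickle_change, 2))  # 4.81

# Row: test_dqn_speed[True-backward], where the two columns use different units
dqn_change = relative_change(
    parse_ops("1.0076", "KOps/s"), parse_ops("893.2065", "Ops/s")
)
print(round(dqn_change, 2), is_significant(dqn_change))  # 12.81 True
```

Normalizing both columns to operations per second before dividing matters for rows such as `test_dqn_speed[True-backward]`, where the PR value is reported in KOps/s and the HEAD value in Ops/s.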

@github-actions

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}10$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 83.0261μs 82.2244μs 12.1618 KOps/s 12.3657 KOps/s $\color{#d91a1a}-1.65\%$
test_tensor_to_bytestream_speed[torch.save] 0.1409ms 0.1406ms 7.1115 KOps/s 7.1234 KOps/s $\color{#d91a1a}-0.17\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1194s 0.1190s 8.4034 Ops/s 8.2055 Ops/s $\color{#35bf28}+2.41\%$
test_tensor_to_bytestream_speed[numpy] 2.5104μs 2.4882μs 401.8984 KOps/s 401.1725 KOps/s $\color{#35bf28}+0.18\%$
test_tensor_to_bytestream_speed[safetensors] 36.9748μs 36.7492μs 27.2115 KOps/s 27.0354 KOps/s $\color{#35bf28}+0.65\%$
test_simple 0.7917s 0.7907s 1.2647 Ops/s 1.2328 Ops/s $\color{#35bf28}+2.59\%$
test_transformed 1.5403s 1.4470s 0.6911 Ops/s 0.6929 Ops/s $\color{#d91a1a}-0.26\%$
test_serial 2.4059s 2.3116s 0.4326 Ops/s 0.4327 Ops/s $\color{#d91a1a}-0.02\%$
test_parallel 1.9039s 1.8222s 0.5488 Ops/s 0.5412 Ops/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-True-True-True] 0.2619ms 45.1642μs 22.1414 KOps/s 22.7787 KOps/s $\color{#d91a1a}-2.80\%$
test_step_mdp_speed[True-True-True-True-False] 55.1710μs 25.6327μs 39.0127 KOps/s 40.0301 KOps/s $\color{#d91a1a}-2.54\%$
test_step_mdp_speed[True-True-True-False-True] 57.4610μs 25.1366μs 39.7826 KOps/s 40.6037 KOps/s $\color{#d91a1a}-2.02\%$
test_step_mdp_speed[True-True-True-False-False] 41.7600μs 13.9003μs 71.9408 KOps/s 72.1589 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-True-False-True-True] 79.6810μs 48.4170μs 20.6539 KOps/s 21.0876 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[True-True-False-True-False] 67.8610μs 27.6402μs 36.1792 KOps/s 36.6618 KOps/s $\color{#d91a1a}-1.32\%$
test_step_mdp_speed[True-True-False-False-True] 62.1710μs 27.6961μs 36.1062 KOps/s 36.4622 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-True-False-False-False] 62.1410μs 16.6385μs 60.1017 KOps/s 60.2185 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-False-True-True-True] 83.9920μs 50.7417μs 19.7076 KOps/s 20.1270 KOps/s $\color{#d91a1a}-2.08\%$
test_step_mdp_speed[True-False-True-True-False] 94.9720μs 29.4067μs 34.0058 KOps/s 33.1822 KOps/s $\color{#35bf28}+2.48\%$
test_step_mdp_speed[True-False-True-False-True] 53.6210μs 27.4152μs 36.4761 KOps/s 36.4140 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[True-False-True-False-False] 47.2400μs 16.7011μs 59.8764 KOps/s 59.6904 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-False-False-True-True] 84.6720μs 52.7246μs 18.9665 KOps/s 19.0876 KOps/s $\color{#d91a1a}-0.63\%$
test_step_mdp_speed[True-False-False-True-False] 61.1710μs 33.4043μs 29.9362 KOps/s 30.5321 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[True-False-False-False-True] 0.1041ms 30.3404μs 32.9593 KOps/s 32.9913 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-False-False-False-False] 47.4300μs 19.4027μs 51.5393 KOps/s 51.4353 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[False-True-True-True-True] 86.2420μs 50.1046μs 19.9583 KOps/s 19.9156 KOps/s $\color{#35bf28}+0.21\%$
test_step_mdp_speed[False-True-True-True-False] 57.3310μs 30.5992μs 32.6806 KOps/s 32.7230 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[False-True-True-False-True] 2.3032ms 32.2124μs 31.0440 KOps/s 31.7605 KOps/s $\color{#d91a1a}-2.26\%$
test_step_mdp_speed[False-True-True-False-False] 48.8510μs 18.3313μs 54.5514 KOps/s 54.6208 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[False-True-False-True-True] 84.5620μs 53.9537μs 18.5344 KOps/s 18.8511 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[False-True-False-True-False] 82.7110μs 33.4019μs 29.9385 KOps/s 30.0843 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-True-False-False-True] 59.7110μs 33.6592μs 29.7095 KOps/s 29.8418 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[False-True-False-False-False] 48.5710μs 21.1036μs 47.3853 KOps/s 48.8427 KOps/s $\color{#d91a1a}-2.98\%$
test_step_mdp_speed[False-False-True-True-True] 85.4620μs 56.6943μs 17.6385 KOps/s 18.4526 KOps/s $\color{#d91a1a}-4.41\%$
test_step_mdp_speed[False-False-True-True-False] 57.7510μs 36.1959μs 27.6275 KOps/s 28.1192 KOps/s $\color{#d91a1a}-1.75\%$
test_step_mdp_speed[False-False-True-False-True] 77.3020μs 33.9059μs 29.4934 KOps/s 29.8202 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-False-True-False-False] 73.0220μs 21.0200μs 47.5737 KOps/s 47.8047 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-False-False-True-True] 96.7810μs 57.9523μs 17.2556 KOps/s 17.6730 KOps/s $\color{#d91a1a}-2.36\%$
test_step_mdp_speed[False-False-False-True-False] 68.9110μs 38.9397μs 25.6807 KOps/s 26.7146 KOps/s $\color{#d91a1a}-3.87\%$
test_step_mdp_speed[False-False-False-False-True] 66.8720μs 36.9884μs 27.0355 KOps/s 27.7678 KOps/s $\color{#d91a1a}-2.64\%$
test_step_mdp_speed[False-False-False-False-False] 52.6500μs 23.8144μs 41.9914 KOps/s 43.5522 KOps/s $\color{#d91a1a}-3.58\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8746s 0.7746s 1.2910 Ops/s 1.2960 Ops/s $\color{#d91a1a}-0.38\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7318s 0.6382s 1.5670 Ops/s 1.5595 Ops/s $\color{#35bf28}+0.48\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7700s 1.6932s 0.5906 Ops/s 0.5928 Ops/s $\color{#d91a1a}-0.38\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5374s 1.4587s 0.6855 Ops/s 0.6882 Ops/s $\color{#d91a1a}-0.39\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0203s 1.9430s 0.5147 Ops/s 0.5151 Ops/s $\color{#d91a1a}-0.08\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7861s 1.7054s 0.5864 Ops/s 0.5838 Ops/s $\color{#35bf28}+0.45\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6856s 4.6551s 0.2148 Ops/s 0.2136 Ops/s $\color{#35bf28}+0.56\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6197s 4.4420s 0.2251 Ops/s 0.2249 Ops/s $\color{#35bf28}+0.11\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9802s 1.8951s 0.5277 Ops/s 0.5234 Ops/s $\color{#35bf28}+0.81\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6965s 1.6403s 0.6097 Ops/s 0.6229 Ops/s $\color{#d91a1a}-2.12\%$
test_values[generalized_advantage_estimate-True-True] 20.3172ms 19.9027ms 50.2444 Ops/s 50.3262 Ops/s $\color{#d91a1a}-0.16\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1327s 3.5736ms 279.8264 Ops/s 254.2095 Ops/s $\textbf{\color{#35bf28}+10.08\%}$
test_values[td0_return_estimate-False-False] 0.1068ms 82.1505μs 12.1728 KOps/s 12.1626 KOps/s $\color{#35bf28}+0.08\%$
test_values[td1_return_estimate-False-False] 48.0556ms 47.6375ms 20.9919 Ops/s 21.0884 Ops/s $\color{#d91a1a}-0.46\%$
test_values[vec_td1_return_estimate-False-False] 1.2821ms 1.0844ms 922.1751 Ops/s 920.7260 Ops/s $\color{#35bf28}+0.16\%$
test_values[td_lambda_return_estimate-True-False] 78.5905ms 78.0975ms 12.8045 Ops/s 12.8171 Ops/s $\color{#d91a1a}-0.10\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2700ms 1.0806ms 925.3731 Ops/s 922.6889 Ops/s $\color{#35bf28}+0.29\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 21.0917ms 20.3681ms 49.0963 Ops/s 49.7663 Ops/s $\color{#d91a1a}-1.35\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0190ms 0.7511ms 1.3313 KOps/s 1.3246 KOps/s $\color{#35bf28}+0.51\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7212ms 0.6734ms 1.4850 KOps/s 1.4798 KOps/s $\color{#35bf28}+0.35\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5410ms 1.4872ms 672.4082 Ops/s 671.7738 Ops/s $\color{#35bf28}+0.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8617ms 0.6904ms 1.4484 KOps/s 1.4434 KOps/s $\color{#35bf28}+0.34\%$
test_dqn_speed[False-None] 1.6136ms 1.5307ms 653.2787 Ops/s 647.0644 Ops/s $\color{#35bf28}+0.96\%$
test_dqn_speed[False-backward] 2.2406ms 2.1839ms 457.8915 Ops/s 457.3120 Ops/s $\color{#35bf28}+0.13\%$
test_dqn_speed[True-None] 1.2573ms 0.5597ms 1.7867 KOps/s 1.7701 KOps/s $\color{#35bf28}+0.94\%$
test_dqn_speed[True-backward] 1.1945ms 1.0929ms 914.9659 Ops/s 899.7096 Ops/s $\color{#35bf28}+1.70\%$
test_dqn_speed[reduce-overhead-None] 0.7224ms 0.5823ms 1.7172 KOps/s 1.6778 KOps/s $\color{#35bf28}+2.35\%$
test_ddpg_speed[False-None] 3.2987ms 2.9306ms 341.2215 Ops/s 341.1430 Ops/s $\color{#35bf28}+0.02\%$
test_ddpg_speed[False-backward] 4.3911ms 4.1882ms 238.7656 Ops/s 237.2956 Ops/s $\color{#35bf28}+0.62\%$
test_ddpg_speed[True-None] 1.7088ms 1.3158ms 760.0109 Ops/s 757.4475 Ops/s $\color{#35bf28}+0.34\%$
test_ddpg_speed[True-backward] 2.9983ms 2.3746ms 421.1323 Ops/s 419.5757 Ops/s $\color{#35bf28}+0.37\%$
test_ddpg_speed[reduce-overhead-None] 1.7664ms 1.3370ms 747.9278 Ops/s 745.2724 Ops/s $\color{#35bf28}+0.36\%$
test_sac_speed[False-None] 8.9497ms 8.4262ms 118.6768 Ops/s 118.5248 Ops/s $\color{#35bf28}+0.13\%$
test_sac_speed[False-backward] 11.8935ms 11.4254ms 87.5243 Ops/s 86.7806 Ops/s $\color{#35bf28}+0.86\%$
test_sac_speed[True-None] 2.2354ms 1.8021ms 554.9044 Ops/s 542.5944 Ops/s $\color{#35bf28}+2.27\%$
test_sac_speed[True-backward] 3.4610ms 3.4168ms 292.6736 Ops/s 286.3543 Ops/s $\color{#35bf28}+2.21\%$
test_sac_speed[reduce-overhead-None] 0.3605s 12.0819ms 82.7683 Ops/s 91.1615 Ops/s $\textbf{\color{#d91a1a}-9.21\%}$
test_redq_deprec_speed[False-None] 10.0963ms 9.3960ms 106.4283 Ops/s 106.4338 Ops/s $-0.01\%$
test_redq_deprec_speed[False-backward] 12.9798ms 12.5107ms 79.9318 Ops/s 80.2449 Ops/s $\color{#d91a1a}-0.39\%$
test_redq_deprec_speed[True-None] 3.0733ms 2.5276ms 395.6254 Ops/s 394.6995 Ops/s $\color{#35bf28}+0.23\%$
test_redq_deprec_speed[True-backward] 4.7611ms 4.1360ms 241.7817 Ops/s 229.9908 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_redq_deprec_speed[reduce-overhead-None] 16.3459ms 9.9661ms 100.3398 Ops/s 101.5940 Ops/s $\color{#d91a1a}-1.23\%$
test_td3_speed[False-None] 8.5072ms 8.2469ms 121.2577 Ops/s 120.9916 Ops/s $\color{#35bf28}+0.22\%$
test_td3_speed[False-backward] 11.2559ms 10.7094ms 93.3755 Ops/s 93.7989 Ops/s $\color{#d91a1a}-0.45\%$
test_td3_speed[True-None] 1.7847ms 1.7078ms 585.5539 Ops/s 614.3327 Ops/s $\color{#d91a1a}-4.68\%$
test_td3_speed[True-backward] 3.2661ms 3.0766ms 325.0319 Ops/s 319.5671 Ops/s $\color{#35bf28}+1.71\%$
test_td3_speed[reduce-overhead-None] 46.3183ms 24.1639ms 41.3841 Ops/s 40.6973 Ops/s $\color{#35bf28}+1.69\%$
test_cql_speed[False-None] 18.4479ms 17.4704ms 57.2395 Ops/s 57.2303 Ops/s $\color{#35bf28}+0.02\%$
test_cql_speed[False-backward] 23.0840ms 22.5698ms 44.3070 Ops/s 43.1504 Ops/s $\color{#35bf28}+2.68\%$
test_cql_speed[True-None] 3.4021ms 3.2250ms 310.0744 Ops/s 303.8887 Ops/s $\color{#35bf28}+2.04\%$
test_cql_speed[True-backward] 5.8015ms 5.3297ms 187.6293 Ops/s 177.6869 Ops/s $\textbf{\color{#35bf28}+5.60\%}$
test_cql_speed[reduce-overhead-None] 0.6913s 15.5423ms 64.3406 Ops/s 84.1474 Ops/s $\textbf{\color{#d91a1a}-23.54\%}$
test_a2c_speed[False-None] 4.0105ms 3.2640ms 306.3719 Ops/s 305.3259 Ops/s $\color{#35bf28}+0.34\%$
test_a2c_speed[False-backward] 6.6241ms 6.2370ms 160.3331 Ops/s 155.3418 Ops/s $\color{#35bf28}+3.21\%$
test_a2c_speed[True-None] 1.7673ms 1.3326ms 750.3880 Ops/s 749.7848 Ops/s $\color{#35bf28}+0.08\%$
test_a2c_speed[True-backward] 3.2066ms 2.9633ms 337.4649 Ops/s 334.9804 Ops/s $\color{#35bf28}+0.74\%$
test_a2c_speed[reduce-overhead-None] 1.4046ms 0.9891ms 1.0110 KOps/s 1.0226 KOps/s $\color{#d91a1a}-1.13\%$
test_ppo_speed[False-None] 4.2732ms 3.8689ms 258.4723 Ops/s 257.5786 Ops/s $\color{#35bf28}+0.35\%$
test_ppo_speed[False-backward] 7.4372ms 7.0044ms 142.7669 Ops/s 140.3636 Ops/s $\color{#35bf28}+1.71\%$
test_ppo_speed[True-None] 1.9176ms 1.4153ms 706.5608 Ops/s 707.8507 Ops/s $\color{#d91a1a}-0.18\%$
test_ppo_speed[True-backward] 3.1809ms 3.0734ms 325.3771 Ops/s 303.3800 Ops/s $\textbf{\color{#35bf28}+7.25\%}$
test_ppo_speed[reduce-overhead-None] 1.2403ms 1.0478ms 954.3358 Ops/s 928.6321 Ops/s $\color{#35bf28}+2.77\%$
test_reinforce_speed[False-None] 2.4093ms 2.3064ms 433.5704 Ops/s 432.8489 Ops/s $\color{#35bf28}+0.17\%$
test_reinforce_speed[False-backward] 3.5245ms 3.4408ms 290.6307 Ops/s 289.5590 Ops/s $\color{#35bf28}+0.37\%$
test_reinforce_speed[True-None] 1.3498ms 1.2767ms 783.2672 Ops/s 786.1074 Ops/s $\color{#d91a1a}-0.36\%$
test_reinforce_speed[True-backward] 3.1579ms 3.0350ms 329.4842 Ops/s 326.3899 Ops/s $\color{#35bf28}+0.95\%$
test_reinforce_speed[reduce-overhead-None] 17.9349ms 9.6999ms 103.0942 Ops/s 104.8484 Ops/s $\color{#d91a1a}-1.67\%$
test_iql_speed[False-None] 10.0674ms 9.4646ms 105.6573 Ops/s 104.6401 Ops/s $\color{#35bf28}+0.97\%$
test_iql_speed[False-backward] 13.5729ms 13.3492ms 74.9108 Ops/s 75.0681 Ops/s $\color{#d91a1a}-0.21\%$
test_iql_speed[True-None] 2.4617ms 2.1891ms 456.8140 Ops/s 455.1660 Ops/s $\color{#35bf28}+0.36\%$
test_iql_speed[True-backward] 4.9093ms 4.7217ms 211.7901 Ops/s 202.2992 Ops/s $\color{#35bf28}+4.69\%$
test_iql_speed[reduce-overhead-None] 18.3155ms 10.6912ms 93.5351 Ops/s 74.4123 Ops/s $\textbf{\color{#35bf28}+25.70\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3831ms 5.9290ms 168.6617 Ops/s 166.2242 Ops/s $\color{#35bf28}+1.47\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1599ms 0.3677ms 2.7197 KOps/s 3.2094 KOps/s $\textbf{\color{#d91a1a}-15.26\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6217ms 0.3491ms 2.8649 KOps/s 3.4360 KOps/s $\textbf{\color{#d91a1a}-16.62\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9434ms 5.7003ms 175.4298 Ops/s 170.1733 Ops/s $\color{#35bf28}+3.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6909ms 0.3326ms 3.0070 KOps/s 3.2690 KOps/s $\textbf{\color{#d91a1a}-8.01\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5649ms 0.3353ms 2.9826 KOps/s 2.8692 KOps/s $\color{#35bf28}+3.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6335ms 1.3486ms 741.5233 Ops/s 789.2935 Ops/s $\textbf{\color{#d91a1a}-6.05\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5851ms 1.2384ms 807.5021 Ops/s 812.7871 Ops/s $\color{#d91a1a}-0.65\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0328ms 5.9038ms 169.3812 Ops/s 165.6762 Ops/s $\color{#35bf28}+2.24\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1039ms 0.4285ms 2.3338 KOps/s 2.1153 KOps/s $\textbf{\color{#35bf28}+10.33\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6362ms 0.4638ms 2.1559 KOps/s 2.0071 KOps/s $\textbf{\color{#35bf28}+7.42\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8818ms 5.7520ms 173.8517 Ops/s 168.9052 Ops/s $\color{#35bf28}+2.93\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8829ms 0.3493ms 2.8633 KOps/s 3.1583 KOps/s $\textbf{\color{#d91a1a}-9.34\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5868ms 0.3578ms 2.7945 KOps/s 3.1009 KOps/s $\textbf{\color{#d91a1a}-9.88\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9576ms 5.7288ms 174.5576 Ops/s 169.9191 Ops/s $\color{#35bf28}+2.73\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9999ms 0.3672ms 2.7235 KOps/s 3.5110 KOps/s $\textbf{\color{#d91a1a}-22.43\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5538ms 0.3080ms 3.2465 KOps/s 3.5728 KOps/s $\textbf{\color{#d91a1a}-9.13\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3284ms 5.9256ms 168.7590 Ops/s 166.9488 Ops/s $\color{#35bf28}+1.08\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8194ms 0.4473ms 2.2355 KOps/s 2.0855 KOps/s $\textbf{\color{#35bf28}+7.19\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7316ms 0.4805ms 2.0810 KOps/s 2.0598 KOps/s $\color{#35bf28}+1.03\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5867s 16.6918ms 59.9097 Ops/s 51.0079 Ops/s $\textbf{\color{#35bf28}+17.45\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.8648ms 1.9785ms 505.4253 Ops/s 511.8408 Ops/s $\color{#d91a1a}-1.25\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 10.0296ms 1.2839ms 778.9042 Ops/s 769.1568 Ops/s $\color{#35bf28}+1.27\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.0279ms 5.1135ms 195.5605 Ops/s 193.3938 Ops/s $\color{#35bf28}+1.12\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 4.0779ms 1.8044ms 554.1961 Ops/s 544.2040 Ops/s $\color{#35bf28}+1.84\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.0805ms 0.9612ms 1.0403 KOps/s 763.0440 Ops/s $\textbf{\color{#35bf28}+36.34\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.9073ms 5.2435ms 190.7124 Ops/s 185.8729 Ops/s $\color{#35bf28}+2.60\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 12.1760ms 2.0934ms 477.6919 Ops/s 475.8261 Ops/s $\color{#35bf28}+0.39\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.7427ms 1.1589ms 862.8739 Ops/s 858.0124 Ops/s $\color{#35bf28}+0.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.9487ms 36.2911ms 27.5550 Ops/s 27.0557 Ops/s $\color{#35bf28}+1.85\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9842ms 18.0933ms 55.2690 Ops/s 53.8841 Ops/s $\color{#35bf28}+2.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.0938ms 37.2913ms 26.8159 Ops/s 26.4343 Ops/s $\color{#35bf28}+1.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.9401ms 18.4927ms 54.0754 Ops/s 53.0523 Ops/s $\color{#35bf28}+1.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.5990ms 38.8559ms 25.7361 Ops/s 25.2887 Ops/s $\color{#35bf28}+1.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.1681ms 19.9090ms 50.2285 Ops/s 47.5597 Ops/s $\textbf{\color{#35bf28}+5.61\%}$
test_storage_write_lazystack[50-img_shape0-small] 0.8682ms 0.2164ms 4.6206 KOps/s 4.6086 KOps/s $\color{#35bf28}+0.26\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.6881ms 1.3622ms 734.1297 Ops/s 748.3202 Ops/s $\color{#d91a1a}-1.90\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.4099ms 2.2395ms 446.5205 Ops/s 438.1674 Ops/s $\color{#35bf28}+1.91\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0508ms 2.8889ms 346.1548 Ops/s 345.4003 Ops/s $\color{#35bf28}+0.22\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2990ms 0.1620ms 6.1747 KOps/s 5.9827 KOps/s $\color{#35bf28}+3.21\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3890ms 0.2301ms 4.3454 KOps/s 3.9773 KOps/s $\textbf{\color{#35bf28}+9.25\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9803ms 1.7891ms 558.9433 Ops/s 543.9639 Ops/s $\color{#35bf28}+2.75\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5384ms 1.3634ms 733.4512 Ops/s 749.6782 Ops/s $\color{#d91a1a}-2.16\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2894ms 1.1330ms 882.6163 Ops/s 874.8045 Ops/s $\color{#35bf28}+0.89\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7778ms 3.5556ms 281.2486 Ops/s 279.6839 Ops/s $\color{#35bf28}+0.56\%$
test_collector_stack_then_write[100-img_shape2-large_img] 5.9587ms 5.7430ms 174.1259 Ops/s 172.6844 Ops/s $\color{#35bf28}+0.83\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.5001ms 7.2533ms 137.8688 Ops/s 141.9397 Ops/s $\color{#d91a1a}-2.87\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4263ms 0.2702ms 3.7016 KOps/s 3.6169 KOps/s $\color{#35bf28}+2.34\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6399ms 1.4743ms 678.2905 Ops/s 630.2407 Ops/s $\textbf{\color{#35bf28}+7.62\%}$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.5095ms 2.3664ms 422.5775 Ops/s 412.6322 Ops/s $\color{#35bf28}+2.41\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3732ms 3.1016ms 322.4093 Ops/s 322.6182 Ops/s $\color{#d91a1a}-0.06\%$
test_collector_without_rb[100-img_shape0-atari] 34.7153ms 34.0315ms 29.3845 Ops/s 29.6990 Ops/s $\color{#d91a1a}-1.06\%$
test_collector_without_rb[200-img_shape1-large_batch] 68.0992ms 66.9974ms 14.9259 Ops/s 14.9626 Ops/s $\color{#d91a1a}-0.24\%$
test_collector_with_rb[100-img_shape0-atari] 39.1290ms 38.6514ms 25.8723 Ops/s 26.1457 Ops/s $\color{#d91a1a}-1.05\%$
test_collector_with_rb[200-img_shape1-large_batch] 76.7974ms 76.3449ms 13.0985 Ops/s 13.0110 Ops/s $\color{#35bf28}+0.67\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 57.9411ms 56.5523ms 17.6827 Ops/s 17.7130 Ops/s $\color{#d91a1a}-0.17\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1157s 0.1135s 8.8112 Ops/s 8.8809 Ops/s $\color{#d91a1a}-0.79\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 60.0152ms 58.7074ms 17.0336 Ops/s 17.0945 Ops/s $\color{#d91a1a}-0.36\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1194s 0.1167s 8.5676 Ops/s 8.6028 Ops/s $\color{#d91a1a}-0.41\%$
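
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, with both converted to the same unit first. The sketch below is an assumption inferred from the rows above, not the benchmark bot's actual code; the helper names `to_ops` and `relative_change` are hypothetical.

```python
# Hypothetical helpers: reproduce the "Change" column from the two throughput columns.
# Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD, in percent.

# Only the units that occur in the table are handled.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops(value: str) -> float:
    """Convert a throughput string such as '1.0403 KOps/s' to operations per second."""
    number, unit = value.split()
    return float(number) * _UNIT_SCALE[unit]


def relative_change(current: str, head: str) -> float:
    """Percentage change of `current` relative to `head` (positive means faster)."""
    cur, base = to_ops(current), to_ops(head)
    return (cur - base) / base * 100.0


if __name__ == "__main__":
    # Values copied from the test_cql_speed[reduce-overhead-None] row.
    print(f"{relative_change('64.3406 Ops/s', '84.1474 Ops/s'):+.2f}%")
    # Mixed units, copied from the
    # test_rb_populate[...-LazyTensorStorage-SamplerWithoutReplacement-400] row.
    print(f"{relative_change('1.0403 KOps/s', '763.0440 Ops/s'):+.2f}%")
```

Converting units before dividing matters for rows where one column is reported in KOps/s and the other in Ops/s. In the rows shown, entries are bold when the absolute change exceeds roughly 5%; that threshold is an observation from the table, not a documented rule.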

Labels: CLA Signed, Documentation
