[Performance] Use log_metrics in sota-implementations #3454
Merged
Replace loops calling log_scalar with single log_metrics calls across all sota-implementations. This provides two benefits:

1. Efficiency: For loggers with batch APIs (wandb, mlflow), this uses a single API call instead of N calls for N metrics.
2. CUDA sync optimization: The new log_metrics method batches CUDA->CPU tensor transfers with non_blocking=True and syncs once, avoiding the overhead of multiple implicit synchronizations from calling .item() on each tensor individually.

Updated implementations:

- PPO (mujoco, atari)
- A2C (mujoco, atari)
- IMPALA (single_node, multi_node_submitit, multi_node_ray)
- DQN (cartpole, atari)
- SAC, TD3, TD3-BC, DDPG, CrossQ, CQL, IQL, Discrete SAC
- Decision Transformer, Dreamer, GAIL
- Expert-Iteration (sync, async)
- GRPO (sync, async)
- Multiagent logging utility

Co-authored-by: Cursor <[email protected]>
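The first benefit can be illustrated with a minimal, self-contained sketch. It does not use the real TorchRL loggers: `CountingBackend` and `SketchLogger` are hypothetical stand-ins invented for this example, and the `log_scalar` / `log_metrics` signatures shown are assumptions modeled on the description above. The sketch only shows how a per-metric loop turns into one backend call per step.

```python
# Hypothetical stand-ins; not the TorchRL logger classes.
class CountingBackend:
    """Pretends to be a batch-capable backend and counts API calls."""

    def __init__(self):
        self.calls = 0
        self.records = []

    def log(self, payload, step):
        # One invocation here models one round trip to the backend.
        self.calls += 1
        self.records.append((step, dict(payload)))


class SketchLogger:
    """Assumed interface: log_scalar for one value, log_metrics for many."""

    def __init__(self, backend):
        self.backend = backend

    def log_scalar(self, name, value, step=None):
        # One backend call per metric.
        self.backend.log({name: value}, step)

    def log_metrics(self, metrics, step=None):
        # One backend call for the whole dictionary.
        self.backend.log(metrics, step)


metrics = {"train/reward": 1.5, "train/loss": 0.25, "train/entropy": 0.9}

# Before: a loop issues N calls for N metrics.
loop_backend = CountingBackend()
loop_logger = SketchLogger(loop_backend)
for name, value in metrics.items():
    loop_logger.log_scalar(name, value, step=0)

# After: a single batched call carries all metrics.
batch_backend = CountingBackend()
batch_logger = SketchLogger(batch_backend)
batch_logger.log_metrics(metrics, step=0)

print(loop_backend.calls, batch_backend.calls)
```

The sketch leaves out the CUDA transfer batching, which depends on tensor inputs and on the actual log_metrics implementation in this PR.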
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3454
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 6592140 with merge base 190a43d.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 85.0430μs | 82.5966μs | 12.1070 KOps/s | 12.1626 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1402ms | 0.1392ms | 7.1836 KOps/s | 7.1155 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1141s | 0.1137s | 8.7951 Ops/s | 9.6974 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.7125μs | 2.6990μs | 370.5059 KOps/s | 374.8855 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.9159μs | 37.2543μs | 26.8425 KOps/s | 25.6223 KOps/s | |
| test_simple | 0.5477s | 0.5471s | 1.8278 Ops/s | 1.7467 Ops/s | |
| test_transformed | 1.1343s | 1.1324s | 0.8830 Ops/s | 0.8622 Ops/s | |
| test_serial | 1.6749s | 1.6718s | 0.5981 Ops/s | 0.5812 Ops/s | |
| test_parallel | 1.2043s | 1.1096s | 0.9012 Ops/s | 0.8449 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1695ms | 45.7086μs | 21.8777 KOps/s | 21.8114 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 59.4340μs | 26.1897μs | 38.1829 KOps/s | 39.4094 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 49.8830μs | 25.7900μs | 38.7747 KOps/s | 37.9571 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 48.2820μs | 14.5480μs | 68.7378 KOps/s | 69.9203 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 88.2940μs | 49.3703μs | 20.2551 KOps/s | 20.6893 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 56.6930μs | 28.4163μs | 35.1911 KOps/s | 34.8666 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 73.6240μs | 28.5782μs | 34.9917 KOps/s | 35.2180 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 48.8730μs | 17.4823μs | 57.2008 KOps/s | 57.4273 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 84.4940μs | 50.7998μs | 19.6851 KOps/s | 19.3276 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 71.6840μs | 31.0957μs | 32.1588 KOps/s | 32.9968 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 53.1520μs | 29.4706μs | 33.9322 KOps/s | 34.9494 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 46.2220μs | 17.4363μs | 57.3515 KOps/s | 58.7648 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 82.8240μs | 54.8045μs | 18.2467 KOps/s | 18.8138 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 80.7840μs | 33.6319μs | 29.7336 KOps/s | 29.9242 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 0.1019ms | 30.9694μs | 32.2900 KOps/s | 32.7260 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 43.7720μs | 20.0871μs | 49.7833 KOps/s | 51.2012 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 80.0740μs | 51.9699μs | 19.2419 KOps/s | 19.4163 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 61.0430μs | 31.1173μs | 32.1365 KOps/s | 32.2376 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.2866ms | 33.2377μs | 30.0863 KOps/s | 30.8905 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 45.1420μs | 19.3471μs | 51.6873 KOps/s | 54.3669 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 97.1750μs | 53.7714μs | 18.5972 KOps/s | 18.9215 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 74.2040μs | 33.6997μs | 29.6739 KOps/s | 29.7798 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 62.7930μs | 34.9401μs | 28.6204 KOps/s | 28.7776 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 50.6330μs | 22.0631μs | 45.3245 KOps/s | 47.0442 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 0.1029ms | 57.4482μs | 17.4070 KOps/s | 17.8708 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 63.8430μs | 36.5890μs | 27.3306 KOps/s | 27.5240 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 64.6140μs | 34.9227μs | 28.6347 KOps/s | 28.9269 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 66.1140μs | 21.8451μs | 45.7768 KOps/s | 46.8797 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 98.8050μs | 58.8740μs | 16.9854 KOps/s | 16.9976 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 68.5630μs | 39.0650μs | 25.5984 KOps/s | 25.6242 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 64.3130μs | 36.7796μs | 27.1890 KOps/s | 27.1437 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 47.9320μs | 23.9540μs | 41.7467 KOps/s | 41.5233 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7560s | 0.7486s | 1.3359 Ops/s | 1.2882 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7348s | 0.6394s | 1.5639 Ops/s | 1.5747 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7795s | 1.6990s | 0.5886 Ops/s | 0.5930 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5493s | 1.4668s | 0.6818 Ops/s | 0.6845 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0250s | 1.9461s | 0.5139 Ops/s | 0.5152 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7967s | 1.7163s | 0.5826 Ops/s | 0.5845 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6945s | 4.6094s | 0.2169 Ops/s | 0.2150 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.4669s | 4.3892s | 0.2278 Ops/s | 0.2259 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.1628s | 2.0105s | 0.4974 Ops/s | 0.5065 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7792s | 1.6843s | 0.5937 Ops/s | 0.6006 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.1337ms | 10.0173ms | 99.8270 Ops/s | 101.3051 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 20.0555ms | 17.4772ms | 57.2175 Ops/s | 56.8928 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2312ms | 0.1250ms | 8.0023 KOps/s | 8.0208 KOps/s | |
| test_values[td1_return_estimate-False-False] | 26.9427ms | 26.5917ms | 37.6058 Ops/s | 37.8137 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 17.9280ms | 17.5094ms | 57.1123 Ops/s | 56.1712 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 40.8512ms | 39.3363ms | 25.4218 Ops/s | 25.4613 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 17.8355ms | 17.5211ms | 57.0742 Ops/s | 55.8646 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 10.5703ms | 8.8658ms | 112.7929 Ops/s | 114.7183 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.6809ms | 1.5182ms | 658.6784 Ops/s | 672.6666 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4600ms | 0.4067ms | 2.4591 KOps/s | 2.3868 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.0793ms | 30.2490ms | 33.0590 Ops/s | 28.6700 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.0884ms | 1.6944ms | 590.1882 Ops/s | 583.6917 Ops/s | |
| test_dqn_speed[False-None] | 1.7872ms | 1.3796ms | 724.8277 Ops/s | 723.8121 Ops/s | |
| test_dqn_speed[False-backward] | 1.9518ms | 1.8933ms | 528.1908 Ops/s | 532.7579 Ops/s | |
| test_dqn_speed[True-None] | 0.9726ms | 0.5518ms | 1.8123 KOps/s | 1.8336 KOps/s | |
| test_dqn_speed[True-backward] | 1.0289ms | 0.9970ms | 1.0031 KOps/s | 838.3931 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.8152ms | 0.5393ms | 1.8541 KOps/s | 1.7810 KOps/s | |
| test_ddpg_speed[False-None] | 3.1599ms | 2.8208ms | 354.5121 Ops/s | 352.4031 Ops/s | |
| test_ddpg_speed[False-backward] | 4.0533ms | 3.9864ms | 250.8532 Ops/s | 249.0320 Ops/s | |
| test_ddpg_speed[True-None] | 1.7957ms | 1.4162ms | 706.1066 Ops/s | 689.7158 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4422ms | 2.3951ms | 417.5206 Ops/s | 406.1231 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 2.5099ms | 1.4503ms | 689.5354 Ops/s | 704.7657 Ops/s | |
| test_sac_speed[False-None] | 8.5491ms | 7.9026ms | 126.5403 Ops/s | 126.9746 Ops/s | |
| test_sac_speed[False-backward] | 11.6219ms | 11.0645ms | 90.3790 Ops/s | 90.6775 Ops/s | |
| test_sac_speed[True-None] | 2.3174ms | 2.1552ms | 463.9984 Ops/s | 456.7417 Ops/s | |
| test_sac_speed[True-backward] | 4.1502ms | 4.0325ms | 247.9859 Ops/s | 217.5160 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.4984ms | 2.1387ms | 467.5736 Ops/s | 456.0948 Ops/s | |
| test_redq_speed[False-None] | 15.7678ms | 10.6537ms | 93.8641 Ops/s | 96.2757 Ops/s | |
| test_redq_speed[False-backward] | 19.1545ms | 17.8029ms | 56.1706 Ops/s | 55.4614 Ops/s | |
| test_redq_speed[True-None] | 4.8960ms | 4.4868ms | 222.8767 Ops/s | 213.1646 Ops/s | |
| test_redq_speed[True-backward] | 10.1267ms | 9.8137ms | 101.8984 Ops/s | 100.6980 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.7420ms | 4.4939ms | 222.5243 Ops/s | 217.7400 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.4016ms | 10.9698ms | 91.1591 Ops/s | 93.1968 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.0237ms | 15.7491ms | 63.4955 Ops/s | 63.7081 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.4024ms | 3.6987ms | 270.3637 Ops/s | 272.3596 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.8586ms | 7.6550ms | 130.6342 Ops/s | 135.7647 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.0859ms | 3.6450ms | 274.3486 Ops/s | 282.7344 Ops/s | |
| test_td3_speed[False-None] | 8.2078ms | 8.0180ms | 124.7199 Ops/s | 126.8121 Ops/s | |
| test_td3_speed[False-backward] | 11.3144ms | 10.8319ms | 92.3203 Ops/s | 93.2823 Ops/s | |
| test_td3_speed[True-None] | 1.9169ms | 1.8761ms | 533.0302 Ops/s | 536.0122 Ops/s | |
| test_td3_speed[True-backward] | 4.0322ms | 3.6477ms | 274.1477 Ops/s | 223.3206 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8861ms | 1.8189ms | 549.7749 Ops/s | 543.7373 Ops/s | |
| test_cql_speed[False-None] | 28.9880ms | 25.9938ms | 38.4707 Ops/s | 38.9990 Ops/s | |
| test_cql_speed[False-backward] | 35.7227ms | 35.1475ms | 28.4515 Ops/s | 28.7727 Ops/s | |
| test_cql_speed[True-None] | 12.8378ms | 12.3898ms | 80.7118 Ops/s | 82.0499 Ops/s | |
| test_cql_speed[True-backward] | 19.6570ms | 18.5451ms | 53.9226 Ops/s | 56.4633 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 15.4813ms | 12.5364ms | 79.7675 Ops/s | 81.0537 Ops/s | |
| test_a2c_speed[False-None] | 5.8913ms | 5.4298ms | 184.1685 Ops/s | 187.5218 Ops/s | |
| test_a2c_speed[False-backward] | 12.2260ms | 11.7828ms | 84.8692 Ops/s | 85.4270 Ops/s | |
| test_a2c_speed[True-None] | 3.9516ms | 3.7272ms | 268.3005 Ops/s | 265.8301 Ops/s | |
| test_a2c_speed[True-backward] | 8.8816ms | 8.5896ms | 116.4201 Ops/s | 115.8520 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.1699ms | 3.7457ms | 266.9748 Ops/s | 268.8357 Ops/s | |
| test_ppo_speed[False-None] | 6.3735ms | 5.8533ms | 170.8426 Ops/s | 170.0460 Ops/s | |
| test_ppo_speed[False-backward] | 12.7946ms | 12.3976ms | 80.6607 Ops/s | 80.7476 Ops/s | |
| test_ppo_speed[True-None] | 3.7746ms | 3.6573ms | 273.4275 Ops/s | 275.0724 Ops/s | |
| test_ppo_speed[True-backward] | 8.7169ms | 8.5046ms | 117.5833 Ops/s | 117.7128 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.0064ms | 3.6283ms | 275.6079 Ops/s | 274.6750 Ops/s | |
| test_reinforce_speed[False-None] | 4.9082ms | 4.5357ms | 220.4731 Ops/s | 219.5673 Ops/s | |
| test_reinforce_speed[False-backward] | 7.6155ms | 7.3700ms | 135.6850 Ops/s | 136.8679 Ops/s | |
| test_reinforce_speed[True-None] | 3.3404ms | 2.9422ms | 339.8769 Ops/s | 334.8271 Ops/s | |
| test_reinforce_speed[True-backward] | 8.2779ms | 7.7507ms | 129.0207 Ops/s | 121.4286 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.3118ms | 2.9005ms | 344.7701 Ops/s | 336.4737 Ops/s | |
| test_iql_speed[False-None] | 26.1106ms | 20.7903ms | 48.0995 Ops/s | 50.1294 Ops/s | |
| test_iql_speed[False-backward] | 35.6334ms | 30.4081ms | 32.8860 Ops/s | 32.8840 Ops/s | |
| test_iql_speed[True-None] | 8.8503ms | 8.5591ms | 116.8346 Ops/s | 116.2994 Ops/s | |
| test_iql_speed[True-backward] | 17.2872ms | 16.8181ms | 59.4597 Ops/s | 59.5545 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.1433ms | 8.6201ms | 116.0085 Ops/s | 111.8361 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1708ms | 6.0666ms | 164.8360 Ops/s | 162.5528 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.2246ms | 0.3180ms | 3.1444 KOps/s | 2.8371 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7214ms | 0.2708ms | 3.6929 KOps/s | 2.9488 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0158ms | 5.7585ms | 173.6564 Ops/s | 168.7517 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9494ms | 0.3232ms | 3.0939 KOps/s | 2.7530 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6610ms | 0.3014ms | 3.3173 KOps/s | 2.8674 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5583ms | 1.2471ms | 801.8866 Ops/s | 740.3827 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4156ms | 1.1653ms | 858.1298 Ops/s | 786.3162 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 10.0129ms | 6.0691ms | 164.7683 Ops/s | 164.0964 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0609ms | 0.4917ms | 2.0339 KOps/s | 1.8962 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7739ms | 0.4173ms | 2.3964 KOps/s | 1.9773 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.8747ms | 5.7798ms | 173.0177 Ops/s | 166.9078 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8703ms | 0.3702ms | 2.7015 KOps/s | 2.7702 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6245ms | 0.3506ms | 2.8526 KOps/s | 2.8994 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0053ms | 5.7792ms | 173.0341 Ops/s | 169.7299 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5676ms | 0.2817ms | 3.5500 KOps/s | 2.8664 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4527ms | 0.2651ms | 3.7723 KOps/s | 3.1984 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1310ms | 6.0056ms | 166.5112 Ops/s | 163.5487 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.0973ms | 0.4932ms | 2.0278 KOps/s | 2.0896 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6890ms | 0.4623ms | 2.1629 KOps/s | 2.1655 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.6221ms | 5.1712ms | 193.3779 Ops/s | 198.4815 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 7.5886ms | 2.1554ms | 463.9435 Ops/s | 546.1681 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 10.8587ms | 1.2699ms | 787.4343 Ops/s | 1.1161 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.5690s | 16.4601ms | 60.7528 Ops/s | 197.6535 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9787ms | 1.7293ms | 578.2623 Ops/s | 503.7559 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 10.0326ms | 1.3286ms | 752.6773 Ops/s | 863.4206 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 8.3455ms | 5.4535ms | 183.3671 Ops/s | 56.5598 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.1764ms | 1.9577ms | 510.7974 Ops/s | 498.7510 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 3.4540ms | 1.1009ms | 908.3444 Ops/s | 682.8482 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 40.0165ms | 37.5821ms | 26.6084 Ops/s | 26.9737 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.5994ms | 18.0012ms | 55.5519 Ops/s | 54.7787 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 41.0527ms | 37.1311ms | 26.9316 Ops/s | 26.7272 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.0900ms | 18.5316ms | 53.9620 Ops/s | 54.8305 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 42.2963ms | 39.9061ms | 25.0588 Ops/s | 25.7349 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.4428ms | 19.9074ms | 50.2325 Ops/s | 50.9713 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8979ms | 0.2193ms | 4.5600 KOps/s | 4.5983 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7526ms | 1.4260ms | 701.2672 Ops/s | 715.0794 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7232ms | 2.3529ms | 425.0031 Ops/s | 427.8909 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1329ms | 2.9209ms | 342.3582 Ops/s | 343.1238 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.4688ms | 0.1411ms | 7.0876 KOps/s | 7.5339 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3678ms | 0.2040ms | 4.9014 KOps/s | 5.1754 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9903ms | 1.7711ms | 564.6076 Ops/s | 573.7236 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.4660ms | 1.3238ms | 755.4121 Ops/s | 771.8771 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2025ms | 1.1207ms | 892.2715 Ops/s | 895.9150 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.7717ms | 3.5816ms | 279.2053 Ops/s | 273.8470 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.0308ms | 5.6890ms | 175.7775 Ops/s | 178.0604 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.1794ms | 6.9301ms | 144.2977 Ops/s | 136.9742 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4525ms | 0.2790ms | 3.5841 KOps/s | 3.5054 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.7842ms | 1.5419ms | 648.5397 Ops/s | 651.3517 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.8929ms | 2.4289ms | 411.7170 Ops/s | 405.1632 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3933ms | 3.0961ms | 322.9855 Ops/s | 314.9180 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.2847ms | 33.6894ms | 29.6829 Ops/s | 29.7729 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 0.5761s | 0.1012s | 9.8856 Ops/s | 15.0986 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 39.4897ms | 38.9160ms | 25.6963 Ops/s | 26.1327 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.8399ms | 76.0352ms | 13.1518 Ops/s | 13.3420 Ops/s |
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 85.4610μs | 84.4386μs | 11.8429 KOps/s | 12.4221 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1452ms | 0.1420ms | 7.0398 KOps/s | 7.1426 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1138s | 0.1136s | 8.8051 Ops/s | 9.3829 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.6539μs | 2.6477μs | 377.6842 KOps/s | 375.1945 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.3998μs | 37.1105μs | 26.9466 KOps/s | 26.8916 KOps/s | |
| test_simple | 0.8078s | 0.8053s | 1.2417 Ops/s | 1.1872 Ops/s | |
| test_transformed | 1.5568s | 1.4645s | 0.6828 Ops/s | 0.6778 Ops/s | |
| test_serial | 2.4491s | 2.3555s | 0.4245 Ops/s | 0.4216 Ops/s | |
| test_parallel | 2.0364s | 1.9874s | 0.5032 Ops/s | 0.5103 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3319ms | 44.7768μs | 22.3330 KOps/s | 22.8173 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 56.9710μs | 25.6488μs | 38.9882 KOps/s | 40.2397 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 79.8420μs | 25.1051μs | 39.8325 KOps/s | 39.8204 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 82.6820μs | 13.9584μs | 71.6417 KOps/s | 71.8653 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.1044ms | 47.0279μs | 21.2640 KOps/s | 21.0782 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 76.6710μs | 27.7667μs | 36.0143 KOps/s | 36.5411 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 66.8010μs | 27.6932μs | 36.1100 KOps/s | 36.2730 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 51.2100μs | 16.6285μs | 60.1375 KOps/s | 61.3029 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.1008ms | 50.7087μs | 19.7205 KOps/s | 19.6842 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 94.0210μs | 30.9777μs | 32.2813 KOps/s | 32.8417 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 77.5810μs | 28.3792μs | 35.2370 KOps/s | 36.5981 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 46.1510μs | 17.0671μs | 58.5922 KOps/s | 60.5754 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 94.9020μs | 53.7547μs | 18.6030 KOps/s | 18.6603 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 72.8410μs | 33.6901μs | 29.6823 KOps/s | 30.1182 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 99.4720μs | 30.7539μs | 32.5162 KOps/s | 32.8472 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 50.0310μs | 19.6282μs | 50.9472 KOps/s | 52.7525 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 0.1112ms | 52.3025μs | 19.1196 KOps/s | 19.7965 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 0.1247ms | 30.6397μs | 32.6374 KOps/s | 32.1158 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4103ms | 31.7765μs | 31.4698 KOps/s | 30.6545 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 47.5310μs | 18.3011μs | 54.6415 KOps/s | 54.8725 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.1011ms | 53.7843μs | 18.5928 KOps/s | 18.9274 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 71.4210μs | 32.8530μs | 30.4386 KOps/s | 30.0422 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 69.2220μs | 33.9966μs | 29.4147 KOps/s | 29.2858 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 52.5210μs | 21.3244μs | 46.8945 KOps/s | 47.3296 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 97.8620μs | 56.2053μs | 17.7919 KOps/s | 17.5627 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 71.3510μs | 37.0344μs | 27.0020 KOps/s | 27.7586 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 76.7720μs | 34.6056μs | 28.8970 KOps/s | 28.2265 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 51.4810μs | 21.1988μs | 47.1725 KOps/s | 46.8393 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 92.6110μs | 58.2528μs | 17.1665 KOps/s | 17.0782 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 71.0220μs | 38.6147μs | 25.8969 KOps/s | 25.9231 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 81.4710μs | 36.5097μs | 27.3900 KOps/s | 26.9986 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 69.3720μs | 23.7819μs | 42.0487 KOps/s | 43.7083 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8763s | 0.7766s | 1.2877 Ops/s | 1.2879 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7340s | 0.6397s | 1.5632 Ops/s | 1.5670 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7690s | 1.6896s | 0.5919 Ops/s | 0.5906 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5453s | 1.4665s | 0.6819 Ops/s | 0.6799 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0254s | 1.9361s | 0.5165 Ops/s | 0.5076 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.8112s | 1.7277s | 0.5788 Ops/s | 0.5822 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.8881s | 4.7149s | 0.2121 Ops/s | 0.2139 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.6797s | 4.5332s | 0.2206 Ops/s | 0.2238 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0407s | 1.9670s | 0.5084 Ops/s | 0.5022 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7725s | 1.6753s | 0.5969 Ops/s | 0.5972 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 22.1914ms | 21.2384ms | 47.0846 Ops/s | 45.6630 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1283s | 3.5013ms | 285.6120 Ops/s | 270.2041 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1105ms | 85.5443μs | 11.6899 KOps/s | 11.1964 KOps/s | |
| test_values[td1_return_estimate-False-False] | 51.6589ms | 50.0778ms | 19.9689 Ops/s | 19.3251 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3543ms | 1.1129ms | 898.5295 Ops/s | 890.2228 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 85.6707ms | 83.5948ms | 11.9625 Ops/s | 11.7899 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3521ms | 1.1081ms | 902.4116 Ops/s | 894.9477 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.7317ms | 21.5209ms | 46.4665 Ops/s | 45.4149 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0538ms | 0.7806ms | 1.2810 KOps/s | 1.2644 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8114ms | 0.7240ms | 1.3813 KOps/s | 1.4158 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5924ms | 1.5142ms | 660.4085 Ops/s | 658.2046 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8039ms | 0.7411ms | 1.3494 KOps/s | 1.3755 KOps/s | |
| test_dqn_speed[False-None] | 1.6420ms | 1.5415ms | 648.7202 Ops/s | 636.4202 Ops/s | |
| test_dqn_speed[False-backward] | 2.2691ms | 2.2020ms | 454.1311 Ops/s | 449.4755 Ops/s | |
| test_dqn_speed[True-None] | 0.6485ms | 0.5654ms | 1.7688 KOps/s | 1.7422 KOps/s | |
| test_dqn_speed[True-backward] | 1.2638ms | 1.2127ms | 824.5810 Ops/s | 898.7284 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6370ms | 0.5868ms | 1.7042 KOps/s | 1.5773 KOps/s | |
| test_ddpg_speed[False-None] | 3.3053ms | 2.9190ms | 342.5867 Ops/s | 339.0740 Ops/s | |
| test_ddpg_speed[False-backward] | 4.8252ms | 4.3364ms | 230.6044 Ops/s | 233.9040 Ops/s | |
| test_ddpg_speed[True-None] | 1.3772ms | 1.3136ms | 761.2903 Ops/s | 745.3632 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4586ms | 2.3754ms | 420.9816 Ops/s | 414.7822 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4786ms | 1.3642ms | 733.0259 Ops/s | 728.3992 Ops/s | |
| test_sac_speed[False-None] | 9.2342ms | 8.5486ms | 116.9783 Ops/s | 116.6252 Ops/s | |
| test_sac_speed[False-backward] | 12.0005ms | 11.5636ms | 86.4786 Ops/s | 85.2801 Ops/s | |
| test_sac_speed[True-None] | 1.9990ms | 1.8166ms | 550.4659 Ops/s | 542.6964 Ops/s | |
| test_sac_speed[True-backward] | 3.8994ms | 3.4929ms | 286.2931 Ops/s | 284.6208 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 20.5804ms | 11.0437ms | 90.5491 Ops/s | 91.8493 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.0237ms | 9.4716ms | 105.5783 Ops/s | 104.0812 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.2463ms | 12.7013ms | 78.7318 Ops/s | 77.8087 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6892ms | 2.5602ms | 390.5962 Ops/s | 382.8610 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.2819ms | 4.1712ms | 239.7402 Ops/s | 227.6198 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 16.0802ms | 9.9734ms | 100.2668 Ops/s | 101.3794 Ops/s | |
| test_td3_speed[False-None] | 8.5714ms | 8.3786ms | 119.3519 Ops/s | 111.8803 Ops/s | |
| test_td3_speed[False-backward] | 11.6677ms | 11.0452ms | 90.5368 Ops/s | 88.8425 Ops/s | |
| test_td3_speed[True-None] | 1.6756ms | 1.6485ms | 606.6283 Ops/s | 571.3746 Ops/s | |
| test_td3_speed[True-backward] | 3.4785ms | 3.2891ms | 304.0322 Ops/s | 298.0984 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 73.5148ms | 25.1513ms | 39.7594 Ops/s | 40.0975 Ops/s | |
| test_cql_speed[False-None] | 17.7769ms | 17.5388ms | 57.0165 Ops/s | 56.2290 Ops/s | |
| test_cql_speed[False-backward] | 23.7602ms | 23.3008ms | 42.9170 Ops/s | 42.9228 Ops/s | |
| test_cql_speed[True-None] | 3.6497ms | 3.2927ms | 303.6989 Ops/s | 301.6390 Ops/s | |
| test_cql_speed[True-backward] | 5.8584ms | 5.4293ms | 184.1865 Ops/s | 180.9975 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 18.9862ms | 11.8801ms | 84.1746 Ops/s | 84.2162 Ops/s | |
| test_a2c_speed[False-None] | 3.9885ms | 3.2987ms | 303.1537 Ops/s | 300.2236 Ops/s | |
| test_a2c_speed[False-backward] | 6.7217ms | 6.2681ms | 159.5382 Ops/s | 156.4950 Ops/s | |
| test_a2c_speed[True-None] | 1.4652ms | 1.3136ms | 761.2415 Ops/s | 735.8464 Ops/s | |
| test_a2c_speed[True-backward] | 3.0568ms | 2.9714ms | 336.5443 Ops/s | 317.2283 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.0945ms | 0.9891ms | 1.0110 KOps/s | 1.0003 KOps/s | |
| test_ppo_speed[False-None] | 4.3294ms | 3.9200ms | 255.1014 Ops/s | 244.5701 Ops/s | |
| test_ppo_speed[False-backward] | 7.5267ms | 7.0918ms | 141.0087 Ops/s | 133.3757 Ops/s | |
| test_ppo_speed[True-None] | 1.6885ms | 1.4375ms | 695.6628 Ops/s | 690.6939 Ops/s | |
| test_ppo_speed[True-backward] | 3.1949ms | 3.0977ms | 322.8225 Ops/s | 301.1901 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1312ms | 1.0471ms | 955.0085 Ops/s | 917.8942 Ops/s | |
| test_reinforce_speed[False-None] | 2.5264ms | 2.3114ms | 432.6355 Ops/s | 426.1655 Ops/s | |
| test_reinforce_speed[False-backward] | 3.5510ms | 3.4508ms | 289.7920 Ops/s | 283.2611 Ops/s | |
| test_reinforce_speed[True-None] | 1.5508ms | 1.2971ms | 770.9449 Ops/s | 775.9069 Ops/s | |
| test_reinforce_speed[True-backward] | 3.0801ms | 2.9867ms | 334.8182 Ops/s | 318.4747 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 0.4502s | 10.4194ms | 95.9751 Ops/s | 104.3718 Ops/s | |
| test_iql_speed[False-None] | 10.0385ms | 9.5650ms | 104.5473 Ops/s | 103.0395 Ops/s | |
| test_iql_speed[False-backward] | 13.9201ms | 13.4059ms | 74.5942 Ops/s | 73.8393 Ops/s | |
| test_iql_speed[True-None] | 2.2680ms | 2.1776ms | 459.2167 Ops/s | 448.9127 Ops/s | |
| test_iql_speed[True-backward] | 4.9515ms | 4.7453ms | 210.7339 Ops/s | 204.1825 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 18.0570ms | 10.7202ms | 93.2815 Ops/s | 95.9188 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1129ms | 6.0060ms | 166.4991 Ops/s | 165.8804 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8051ms | 0.2848ms | 3.5107 KOps/s | 2.6315 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5688ms | 0.2666ms | 3.7503 KOps/s | 2.7705 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1364ms | 5.9121ms | 169.1445 Ops/s | 169.3323 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8175ms | 0.3014ms | 3.3176 KOps/s | 2.7068 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6176ms | 0.3046ms | 3.2826 KOps/s | 2.8157 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.7235ms | 1.2733ms | 785.3732 Ops/s | 687.7150 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4428ms | 1.1973ms | 835.2430 Ops/s | 724.9821 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.3213ms | 6.0827ms | 164.4000 Ops/s | 167.1202 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.5148ms | 0.4868ms | 2.0541 KOps/s | 2.2899 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7113ms | 0.4227ms | 2.3659 KOps/s | 2.3909 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0714ms | 5.9590ms | 167.8127 Ops/s | 170.9185 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.7943ms | 0.3819ms | 2.6186 KOps/s | 3.5444 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6344ms | 0.3629ms | 2.7559 KOps/s | 3.7460 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1203ms | 5.8518ms | 170.8874 Ops/s | 171.9424 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9496ms | 0.3734ms | 2.6779 KOps/s | 2.8049 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4989ms | 0.3205ms | 3.1200 KOps/s | 2.9113 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.2326ms | 6.1052ms | 163.7952 Ops/s | 165.4104 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2664ms | 0.5172ms | 1.9336 KOps/s | 627.3422 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7473ms | 0.5045ms | 1.9821 KOps/s | 2.3817 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.6656s | 18.3585ms | 54.4705 Ops/s | 192.7664 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 4.0196ms | 1.8219ms | 548.8892 Ops/s | 497.4540 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.0469ms | 0.9140ms | 1.0941 KOps/s | 1.0754 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 9.2167ms | 5.1826ms | 192.9527 Ops/s | 192.8816 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 8.7963ms | 1.9917ms | 502.0780 Ops/s | 513.9235 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.2657ms | 0.9115ms | 1.0971 KOps/s | 741.7503 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5792s | 16.7899ms | 59.5597 Ops/s | 50.8846 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.2832ms | 2.0095ms | 497.6243 Ops/s | 504.0210 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.9274ms | 1.1244ms | 889.3426 Ops/s | 892.9563 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 37.9394ms | 36.0610ms | 27.7308 Ops/s | 27.3604 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.1208ms | 18.3623ms | 54.4593 Ops/s | 54.0588 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.2862ms | 37.3324ms | 26.7864 Ops/s | 26.3511 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.2669ms | 18.6919ms | 53.4992 Ops/s | 51.9671 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.7337ms | 39.4212ms | 25.3671 Ops/s | 25.0070 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.8546ms | 20.4227ms | 48.9651 Ops/s | 49.5362 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8586ms | 0.2227ms | 4.4896 KOps/s | 4.5810 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.8139ms | 1.4385ms | 695.1558 Ops/s | 714.8760 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.5263ms | 2.3119ms | 432.5431 Ops/s | 437.5922 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.0938ms | 2.9177ms | 342.7402 Ops/s | 342.7706 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2482ms | 0.1541ms | 6.4878 KOps/s | 6.6900 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3569ms | 0.2059ms | 4.8574 KOps/s | 4.6955 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.0537ms | 1.8174ms | 550.2234 Ops/s | 561.7846 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5143ms | 1.3622ms | 734.0836 Ops/s | 763.3030 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2487ms | 1.1515ms | 868.4555 Ops/s | 877.0541 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.8873ms | 3.6153ms | 276.5990 Ops/s | 268.4215 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.2984ms | 5.7873ms | 172.7924 Ops/s | 171.3414 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.7509ms | 7.0350ms | 142.1469 Ops/s | 134.4644 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4525ms | 0.2815ms | 3.5528 KOps/s | 3.6312 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.7354ms | 1.5576ms | 642.0316 Ops/s | 666.2540 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.6365ms | 2.4470ms | 408.6719 Ops/s | 417.6131 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.4625ms | 3.1226ms | 320.2463 Ops/s | 317.3738 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 35.1010ms | 34.5319ms | 28.9588 Ops/s | 28.6659 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 67.8287ms | 67.5833ms | 14.7966 Ops/s | 14.6854 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 39.5291ms | 38.9741ms | 25.6580 Ops/s | 25.5277 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 77.4650ms | 75.8487ms | 13.1841 Ops/s | 12.8671 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 57.3388ms | 57.2381ms | 17.4709 Ops/s | 16.9012 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1145s | 0.1141s | 8.7608 Ops/s | 8.6620 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 60.1961ms | 59.3391ms | 16.8523 Ops/s | 16.6945 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1190s | 0.1181s | 8.4686 Ops/s | 8.3958 Ops/s |
Summary
Replace loops calling `log_scalar` with single `log_metrics` calls across all sota-implementations. This is a follow-up to #3452.
Benefits
1. Efficiency: For loggers with batch APIs (wandb, mlflow), this uses a single API call instead of N calls for N metrics.
2. CUDA sync optimization: The new `log_metrics` method batches CUDA→CPU tensor transfers with `non_blocking=True` and syncs once, avoiding the overhead of multiple implicit synchronizations from calling `.item()` on each tensor individually.
Updated implementations
Changes
27 files updated with simple refactoring:
- Replaced `for key, value in metrics.items(): logger.log_scalar(key, value, step)` with `logger.log_metrics(metrics, step)`
- Where a `log_metrics(logger, metrics, step)` helper existed, simplified the implementation to just call `logger.log_metrics(metrics, step)`
Test plan
Made with Cursor
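
To illustrate the refactoring pattern described under Changes, here is a minimal sketch of the before and after call shapes. `CountingLogger` is a hypothetical stand-in written for this example, not the TorchRL logger class. It only assumes the two call signatures quoted above, `log_scalar(key, value, step)` and `log_metrics(metrics, step)`, and counts simulated backend calls to show the N-versus-1 difference.

```python
from typing import Dict, List, Tuple


class CountingLogger:
    """Hypothetical logger used only for illustration (not a TorchRL class)."""

    def __init__(self) -> None:
        self.backend_calls = 0  # number of simulated backend API calls
        self.records: List[Tuple[str, float, int]] = []

    def log_scalar(self, name: str, value: float, step: int) -> None:
        # One simulated backend call per metric.
        self.backend_calls += 1
        self.records.append((name, float(value), step))

    def log_metrics(self, metrics: Dict[str, float], step: int) -> None:
        # One simulated backend call for the whole dictionary.
        self.backend_calls += 1
        self.records.extend((k, float(v), step) for k, v in metrics.items())


metrics = {"train/loss": 0.25, "train/reward": 1.5, "train/entropy": 0.75}
step = 10

# Before: a loop issuing one call per metric.
old_logger = CountingLogger()
for key, value in metrics.items():
    old_logger.log_scalar(key, value, step)

# After: a single call carrying all metrics.
new_logger = CountingLogger()
new_logger.log_metrics(metrics, step)

print(old_logger.backend_calls, new_logger.backend_calls)
```

Both loggers end up with the same records; only the number of backend calls differs. The sketch models the call-count benefit only and does not attempt to reproduce the CUDA transfer batching.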