[Feature] Add log_metrics method for efficient batch logging #3452
Merged
Add a log_metrics() method to the Logger base class, along with optimized implementations for WandbLogger and MLFlowLogger that use their native batch logging APIs.

The new _make_metrics_safe() utility batches CUDA->CPU tensor transfers using non_blocking=True and synchronizes once via a CUDA event. This avoids the repeated implicit synchronizations that occur when .item() is called on each CUDA tensor individually. It is particularly useful when logging to services that run in separate processes (e.g., Ray actors) and may not have GPU access.

Co-authored-by: Cursor <[email protected]>
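As a rough illustration of the batch-logging idea described above, the sketch below contrasts a base-class fallback that logs one scalar per backend call with an override that sends the whole dictionary in a single call. Everything here is hypothetical and stands in for the real classes: `BaseLogger`, `PerScalarLogger`, `BatchLogger`, `RecordingBackend`, and the `log_metrics(metrics, step)` signature are assumptions made for the example, not the API added by this PR. The CUDA transfer batching is not shown.

```python
from typing import Dict, List, Optional, Tuple


class RecordingBackend:
    """Fake logging service (hypothetical) that records every call it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[Dict[str, float], Optional[int]]] = []

    def log(self, payload: Dict[str, float], step: Optional[int] = None) -> None:
        self.calls.append((dict(payload), step))


class BaseLogger:
    """Hypothetical stand-in for a logger base class."""

    def log_scalar(self, name: str, value: float, step: Optional[int] = None) -> None:
        raise NotImplementedError

    def log_metrics(self, metrics: Dict[str, float], step: Optional[int] = None) -> None:
        # Default fallback: one backend call per metric.
        for name, value in metrics.items():
            self.log_scalar(name, value, step=step)


class PerScalarLogger(BaseLogger):
    """Relies on the base-class fallback, so N metrics cost N backend calls."""

    def __init__(self, backend: RecordingBackend) -> None:
        self.backend = backend

    def log_scalar(self, name: str, value: float, step: Optional[int] = None) -> None:
        self.backend.log({name: value}, step=step)


class BatchLogger(PerScalarLogger):
    """Overrides log_metrics to send all metrics in a single backend call."""

    def log_metrics(self, metrics: Dict[str, float], step: Optional[int] = None) -> None:
        self.backend.log(metrics, step=step)


metrics = {"loss": 0.25, "reward": 1.5, "entropy": 0.1}

slow_backend = RecordingBackend()
PerScalarLogger(slow_backend).log_metrics(metrics, step=10)

fast_backend = RecordingBackend()
BatchLogger(fast_backend).log_metrics(metrics, step=10)

print(len(slow_backend.calls), len(fast_backend.calls))  # prints: 3 1
```

When the backend lives in another process, each call is a round trip, so collapsing N calls into one is where a batch method would be expected to help.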
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3452
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Pending
As of commit 4a53323 with merge base 838410c:
NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 82.4614μs | 80.5780μs | 12.4103 KOps/s | 12.3770 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1386ms | 0.1379ms | 7.2498 KOps/s | 7.2268 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1080s | 0.1076s | 9.2910 Ops/s | 9.3205 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.7847μs | 2.7767μs | 360.1420 KOps/s | 376.9557 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.7969μs | 38.3029μs | 26.1077 KOps/s | 26.6963 KOps/s | |
| test_simple | 0.5569s | 0.5522s | 1.8110 Ops/s | 1.7527 Ops/s | |
| test_transformed | 1.1480s | 1.1400s | 0.8772 Ops/s | 0.8691 Ops/s | |
| test_serial | 1.6856s | 1.6781s | 0.5959 Ops/s | 0.5916 Ops/s | |
| test_parallel | 1.2034s | 1.1448s | 0.8735 Ops/s | 0.7993 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1967ms | 44.6157μs | 22.4136 KOps/s | 22.6675 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 61.8640μs | 25.1374μs | 39.7814 KOps/s | 39.5165 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 58.2330μs | 24.5395μs | 40.7505 KOps/s | 40.3453 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 35.2720μs | 13.6327μs | 73.3529 KOps/s | 71.9528 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 95.5760μs | 47.8565μs | 20.8958 KOps/s | 21.0221 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 52.1730μs | 27.3707μs | 36.5354 KOps/s | 35.8800 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 56.4340μs | 27.7415μs | 36.0470 KOps/s | 36.4015 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 50.4430μs | 16.4606μs | 60.7513 KOps/s | 60.0271 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.1273ms | 50.5206μs | 19.7939 KOps/s | 19.7768 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 56.3330μs | 30.4688μs | 32.8205 KOps/s | 32.1697 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 63.1640μs | 27.9480μs | 35.7807 KOps/s | 35.8689 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 43.3330μs | 16.5159μs | 60.5476 KOps/s | 60.5138 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.1014ms | 52.3939μs | 19.0862 KOps/s | 19.0308 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 68.8540μs | 33.1706μs | 30.1471 KOps/s | 29.7469 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 65.3140μs | 29.7930μs | 33.5649 KOps/s | 33.5257 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 53.6630μs | 19.0774μs | 52.4181 KOps/s | 52.1478 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 87.9250μs | 50.1736μs | 19.9308 KOps/s | 19.9724 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 62.3540μs | 30.4328μs | 32.8593 KOps/s | 33.0311 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.3136ms | 31.4851μs | 31.7610 KOps/s | 31.8409 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 47.6130μs | 18.1649μs | 55.0513 KOps/s | 54.5094 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 83.9550μs | 53.0728μs | 18.8420 KOps/s | 18.8730 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 65.4140μs | 33.3572μs | 29.9785 KOps/s | 29.7269 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 56.5030μs | 33.5438μs | 29.8118 KOps/s | 29.5509 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 50.2830μs | 20.8511μs | 47.9592 KOps/s | 47.7151 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 86.5950μs | 55.7126μs | 17.9493 KOps/s | 17.8845 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 64.5440μs | 36.5259μs | 27.3778 KOps/s | 27.8243 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 72.3440μs | 34.0954μs | 29.3295 KOps/s | 29.5709 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 53.3730μs | 21.0416μs | 47.5249 KOps/s | 47.5745 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 87.1660μs | 57.4048μs | 17.4201 KOps/s | 17.0730 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 77.4150μs | 38.3990μs | 26.0423 KOps/s | 25.4367 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 69.8240μs | 35.4924μs | 28.1750 KOps/s | 27.8671 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 56.3130μs | 23.2568μs | 42.9981 KOps/s | 42.2581 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7533s | 0.7509s | 1.3318 Ops/s | 1.3002 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7343s | 0.6389s | 1.5651 Ops/s | 1.5924 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7699s | 1.6953s | 0.5899 Ops/s | 0.5992 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5477s | 1.4639s | 0.6831 Ops/s | 0.6902 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0159s | 1.9386s | 0.5158 Ops/s | 0.5227 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7947s | 1.7123s | 0.5840 Ops/s | 0.5906 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6890s | 4.6419s | 0.2154 Ops/s | 0.2135 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5353s | 4.4560s | 0.2244 Ops/s | 0.2245 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.1580s | 1.9947s | 0.5013 Ops/s | 0.5135 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7294s | 1.6550s | 0.6042 Ops/s | 0.6048 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.4968ms | 10.2505ms | 97.5567 Ops/s | 98.7468 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 19.3543ms | 17.7197ms | 56.4345 Ops/s | 91.3509 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2393ms | 0.1290ms | 7.7518 KOps/s | 8.4044 KOps/s | |
| test_values[td1_return_estimate-False-False] | 27.6980ms | 27.3865ms | 36.5143 Ops/s | 37.7376 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 17.9285ms | 17.5528ms | 56.9708 Ops/s | 90.5211 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 40.5893ms | 39.9635ms | 25.0228 Ops/s | 25.5787 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.4887ms | 17.5774ms | 56.8913 Ops/s | 91.2741 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.2870ms | 9.1538ms | 109.2446 Ops/s | 111.3785 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.6645ms | 1.4628ms | 683.6061 Ops/s | 670.0724 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4863ms | 0.4121ms | 2.4266 KOps/s | 2.4718 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.0423ms | 34.3026ms | 29.1523 Ops/s | 32.4467 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1.8021ms | 1.6893ms | 591.9521 Ops/s | 589.4316 Ops/s | |
| test_dqn_speed[False-None] | 1.7853ms | 1.3888ms | 720.0696 Ops/s | 729.9943 Ops/s | |
| test_dqn_speed[False-backward] | 1.9728ms | 1.8847ms | 530.5887 Ops/s | 523.9962 Ops/s | |
| test_dqn_speed[True-None] | 0.9462ms | 0.5473ms | 1.8273 KOps/s | 1.8006 KOps/s | |
| test_dqn_speed[True-backward] | 1.0556ms | 0.9972ms | 1.0028 KOps/s | 846.0530 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9255ms | 0.5373ms | 1.8612 KOps/s | 1.8102 KOps/s | |
| test_ddpg_speed[False-None] | 3.1209ms | 2.8197ms | 354.6458 Ops/s | 358.9023 Ops/s | |
| test_ddpg_speed[False-backward] | 4.2750ms | 3.9903ms | 250.6108 Ops/s | 251.2818 Ops/s | |
| test_ddpg_speed[True-None] | 1.5133ms | 1.4000ms | 714.2749 Ops/s | 689.8635 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4876ms | 2.3805ms | 420.0845 Ops/s | 417.6643 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.7836ms | 1.3945ms | 717.0934 Ops/s | 676.4110 Ops/s | |
| test_sac_speed[False-None] | 8.5979ms | 7.9317ms | 126.0760 Ops/s | 125.9197 Ops/s | |
| test_sac_speed[False-backward] | 11.5769ms | 11.0922ms | 90.1538 Ops/s | 89.6411 Ops/s | |
| test_sac_speed[True-None] | 2.5247ms | 2.1566ms | 463.6845 Ops/s | 462.5521 Ops/s | |
| test_sac_speed[True-backward] | 4.1766ms | 4.0079ms | 249.5062 Ops/s | 216.1369 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.4300ms | 2.1530ms | 464.4720 Ops/s | 444.4382 Ops/s | |
| test_redq_speed[False-None] | 10.8376ms | 10.2834ms | 97.2439 Ops/s | 98.3959 Ops/s | |
| test_redq_speed[False-backward] | 21.9689ms | 17.7581ms | 56.3124 Ops/s | 57.4742 Ops/s | |
| test_redq_speed[True-None] | 4.8693ms | 4.4806ms | 223.1852 Ops/s | 222.0032 Ops/s | |
| test_redq_speed[True-backward] | 10.1359ms | 9.8033ms | 102.0069 Ops/s | 105.6767 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.6983ms | 4.4492ms | 224.7620 Ops/s | 225.9942 Ops/s | |
| test_redq_deprec_speed[False-None] | 13.9314ms | 11.0707ms | 90.3284 Ops/s | 94.0136 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.1514ms | 15.6208ms | 64.0173 Ops/s | 66.1503 Ops/s | |
| test_redq_deprec_speed[True-None] | 3.9153ms | 3.6917ms | 270.8756 Ops/s | 266.2453 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.7282ms | 7.5121ms | 133.1187 Ops/s | 131.9493 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.8220ms | 3.6116ms | 276.8826 Ops/s | 270.7538 Ops/s | |
| test_td3_speed[False-None] | 8.1069ms | 7.9204ms | 126.2556 Ops/s | 127.7908 Ops/s | |
| test_td3_speed[False-backward] | 11.4746ms | 10.7485ms | 93.0365 Ops/s | 94.2192 Ops/s | |
| test_td3_speed[True-None] | 1.9179ms | 1.8578ms | 538.2754 Ops/s | 532.8931 Ops/s | |
| test_td3_speed[True-backward] | 3.7703ms | 3.6636ms | 272.9555 Ops/s | 271.9285 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8481ms | 1.8145ms | 551.1104 Ops/s | 549.4502 Ops/s | |
| test_cql_speed[False-None] | 29.1989ms | 25.9199ms | 38.5804 Ops/s | 38.2811 Ops/s | |
| test_cql_speed[False-backward] | 38.2744ms | 35.5481ms | 28.1309 Ops/s | 28.5560 Ops/s | |
| test_cql_speed[True-None] | 12.6864ms | 12.3145ms | 81.2053 Ops/s | 81.2399 Ops/s | |
| test_cql_speed[True-backward] | 18.9089ms | 18.2759ms | 54.7169 Ops/s | 54.2314 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 12.7466ms | 12.4364ms | 80.4094 Ops/s | 79.2353 Ops/s | |
| test_a2c_speed[False-None] | 5.6860ms | 5.4162ms | 184.6330 Ops/s | 185.8040 Ops/s | |
| test_a2c_speed[False-backward] | 12.0080ms | 11.6674ms | 85.7086 Ops/s | 86.4557 Ops/s | |
| test_a2c_speed[True-None] | 4.2668ms | 3.7409ms | 267.3175 Ops/s | 259.5182 Ops/s | |
| test_a2c_speed[True-backward] | 8.7780ms | 8.5660ms | 116.7407 Ops/s | 117.4977 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.0043ms | 3.7006ms | 270.2284 Ops/s | 269.0018 Ops/s | |
| test_ppo_speed[False-None] | 6.1148ms | 5.9046ms | 169.3603 Ops/s | 169.6431 Ops/s | |
| test_ppo_speed[False-backward] | 12.5641ms | 12.2501ms | 81.6318 Ops/s | 81.7433 Ops/s | |
| test_ppo_speed[True-None] | 3.7966ms | 3.6325ms | 275.2929 Ops/s | 272.8051 Ops/s | |
| test_ppo_speed[True-backward] | 8.7423ms | 8.4057ms | 118.9667 Ops/s | 119.5534 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.0744ms | 3.5838ms | 279.0346 Ops/s | 275.0976 Ops/s | |
| test_reinforce_speed[False-None] | 4.9580ms | 4.5386ms | 220.3303 Ops/s | 220.2620 Ops/s | |
| test_reinforce_speed[False-backward] | 8.4428ms | 7.2968ms | 137.0464 Ops/s | 136.1560 Ops/s | |
| test_reinforce_speed[True-None] | 3.3501ms | 2.9122ms | 343.3813 Ops/s | 352.4960 Ops/s | |
| test_reinforce_speed[True-backward] | 7.9546ms | 7.7058ms | 129.7720 Ops/s | 131.7358 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.0449ms | 2.8624ms | 349.3517 Ops/s | 355.0034 Ops/s | |
| test_iql_speed[False-None] | 24.8385ms | 20.0813ms | 49.7975 Ops/s | 49.8244 Ops/s | |
| test_iql_speed[False-backward] | 35.1311ms | 30.2073ms | 33.1046 Ops/s | 33.4993 Ops/s | |
| test_iql_speed[True-None] | 8.8753ms | 8.5128ms | 117.4704 Ops/s | 117.3690 Ops/s | |
| test_iql_speed[True-backward] | 17.1279ms | 16.7536ms | 59.6885 Ops/s | 59.3274 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 8.8476ms | 8.5351ms | 117.1626 Ops/s | 117.8926 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.3568ms | 6.1225ms | 163.3310 Ops/s | 166.6480 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.0766ms | 0.3427ms | 2.9178 KOps/s | 3.2297 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6012ms | 0.3448ms | 2.9003 KOps/s | 3.4132 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.4581ms | 5.9474ms | 168.1410 Ops/s | 171.2648 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9598ms | 0.3211ms | 3.1139 KOps/s | 3.7038 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5921ms | 0.2936ms | 3.4062 KOps/s | 3.8135 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6696ms | 1.4183ms | 705.0547 Ops/s | 813.1752 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6260ms | 1.3372ms | 747.8057 Ops/s | 867.7748 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 10.0515ms | 6.2054ms | 161.1509 Ops/s | 166.7211 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9403ms | 0.4822ms | 2.0738 KOps/s | 2.1851 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6880ms | 0.4804ms | 2.0817 KOps/s | 2.2810 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1568ms | 5.9210ms | 168.8904 Ops/s | 170.2488 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.9476ms | 0.3483ms | 2.8708 KOps/s | 2.7157 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6880ms | 0.3305ms | 3.0260 KOps/s | 2.9835 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1476ms | 5.8816ms | 170.0207 Ops/s | 170.4947 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0077ms | 0.3359ms | 2.9770 KOps/s | 3.2938 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4771ms | 0.2876ms | 3.4775 KOps/s | 3.6308 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.3558ms | 6.1194ms | 163.4134 Ops/s | 167.0681 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2033ms | 0.4657ms | 2.1471 KOps/s | 682.7538 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7606ms | 0.4882ms | 2.0483 KOps/s | 2.4131 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.5729ms | 5.0586ms | 197.6835 Ops/s | 197.7497 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 3.8992ms | 1.7722ms | 564.2756 Ops/s | 528.3085 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.5655ms | 0.9457ms | 1.0574 KOps/s | 1.0812 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.9508ms | 5.0432ms | 198.2878 Ops/s | 196.0083 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9626ms | 1.7923ms | 557.9542 Ops/s | 569.8396 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.6279ms | 1.2197ms | 819.8633 Ops/s | 827.8539 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5601s | 16.3602ms | 61.1238 Ops/s | 59.8681 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.1092ms | 1.9615ms | 509.8046 Ops/s | 454.2062 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.6348ms | 1.1025ms | 907.0698 Ops/s | 797.8891 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 39.7336ms | 36.5293ms | 27.3752 Ops/s | 27.8717 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.2696ms | 18.2801ms | 54.7043 Ops/s | 55.7323 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.7569ms | 37.8433ms | 26.4248 Ops/s | 27.0789 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.2319ms | 18.6709ms | 53.5592 Ops/s | 55.0542 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.7783ms | 39.0086ms | 25.6354 Ops/s | 25.8317 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.3258ms | 19.9349ms | 50.1632 Ops/s | 50.7751 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8517ms | 0.2209ms | 4.5274 KOps/s | 4.6894 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.5620ms | 1.3803ms | 724.4869 Ops/s | 720.7326 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.5251ms | 2.3502ms | 425.4891 Ops/s | 422.6967 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.2807ms | 2.9161ms | 342.9293 Ops/s | 348.9971 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.4188ms | 0.1370ms | 7.3018 KOps/s | 7.6413 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3427ms | 0.1842ms | 5.4281 KOps/s | 5.6255 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9294ms | 1.7447ms | 573.1629 Ops/s | 583.4312 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5516ms | 1.2810ms | 780.6629 Ops/s | 800.4875 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.4661ms | 1.1172ms | 895.0589 Ops/s | 903.6684 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.7765ms | 3.6707ms | 272.4285 Ops/s | 277.7627 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 5.6804ms | 5.5499ms | 180.1841 Ops/s | 179.9117 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.0314ms | 6.8953ms | 145.0257 Ops/s | 138.6958 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4773ms | 0.2765ms | 3.6171 KOps/s | 3.6053 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6326ms | 1.4953ms | 668.7505 Ops/s | 658.9280 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.8106ms | 2.4008ms | 416.5246 Ops/s | 403.1834 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3874ms | 3.1051ms | 322.0558 Ops/s | 324.7921 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.8052ms | 33.9353ms | 29.4679 Ops/s | 29.9861 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.9799ms | 66.5235ms | 15.0323 Ops/s | 15.0743 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 39.7356ms | 38.4777ms | 25.9891 Ops/s | 26.2655 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 0.6688s | 0.1185s | 8.4421 Ops/s | 13.3530 Ops/s |
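The Change column is empty in this copy of the table. A minimal sketch of how a relative change could be recomputed from the two Ops columns follows, assuming Change is meant to be the percentage difference in throughput between this PR and repo HEAD (that definition is an assumption, as are the helper names `parse_ops` and `relative_change`). The two columns can use different units in the same row, so the values are normalized to Ops/s first. The sample values are taken from the test_dqn_speed[True-backward] row above.

```python
# Scale factors for the unit suffixes used in the Ops columns of the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '1.0028 KOps/s' into operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head * 100.0


# Row test_dqn_speed[True-backward]: the two columns use different units.
change = relative_change("1.0028 KOps/s", "846.0530 Ops/s")
print(f"{change:+.1f}%")  # prints: +18.5%
```

Under this definition a positive value means the PR branch measured higher throughput than HEAD. Single-run differences of a few percent are likely within benchmark noise.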
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 81.0003μs | 80.0950μs | 12.4852 KOps/s | 12.1906 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1386ms | 0.1382ms | 7.2379 KOps/s | 7.1847 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1164s | 0.1162s | 8.6084 Ops/s | 8.8163 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.4695μs | 2.4640μs | 405.8452 KOps/s | 402.2750 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 40.2926μs | 39.8557μs | 25.0905 KOps/s | 26.4724 KOps/s | |
| test_simple | 0.7982s | 0.7934s | 1.2604 Ops/s | 1.2210 Ops/s | |
| test_transformed | 1.5393s | 1.4459s | 0.6916 Ops/s | 0.6879 Ops/s | |
| test_serial | 2.3996s | 2.3096s | 0.4330 Ops/s | 0.4308 Ops/s | |
| test_parallel | 2.0329s | 1.9770s | 0.5058 Ops/s | 0.5232 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3661ms | 44.4186μs | 22.5131 KOps/s | 22.1219 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 59.9410μs | 24.7828μs | 40.3506 KOps/s | 39.8349 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 56.6110μs | 24.2152μs | 41.2964 KOps/s | 40.8638 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 48.8610μs | 13.6721μs | 73.1419 KOps/s | 72.0430 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 78.2410μs | 47.0469μs | 21.2554 KOps/s | 20.8049 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 60.5710μs | 27.2640μs | 36.6784 KOps/s | 35.8026 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 57.0710μs | 26.5971μs | 37.5981 KOps/s | 36.6255 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 41.0110μs | 16.1079μs | 62.0815 KOps/s | 60.5099 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 87.0220μs | 49.1969μs | 20.3265 KOps/s | 19.5645 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 60.8910μs | 29.9892μs | 33.3454 KOps/s | 32.7669 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 59.0810μs | 26.9200μs | 37.1472 KOps/s | 36.6480 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 46.8810μs | 16.2820μs | 61.4176 KOps/s | 60.1663 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.1286ms | 51.5475μs | 19.3996 KOps/s | 18.9607 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 68.7110μs | 32.6303μs | 30.6463 KOps/s | 30.5226 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 63.5910μs | 29.7297μs | 33.6364 KOps/s | 33.4321 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 41.3400μs | 18.8129μs | 53.1550 KOps/s | 51.7802 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 97.6920μs | 49.7595μs | 20.0967 KOps/s | 19.6140 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 61.6510μs | 30.4830μs | 32.8051 KOps/s | 32.8134 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.2717ms | 31.4815μs | 31.7646 KOps/s | 31.7717 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 45.1610μs | 18.5747μs | 53.8366 KOps/s | 56.6835 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 88.9810μs | 52.2829μs | 19.1267 KOps/s | 19.0149 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 59.2910μs | 33.2841μs | 30.0444 KOps/s | 30.2462 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 65.8610μs | 33.5140μs | 29.8383 KOps/s | 29.7078 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 51.7410μs | 20.8239μs | 48.0218 KOps/s | 47.4112 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 86.6720μs | 55.2736μs | 18.0918 KOps/s | 17.7863 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 66.2310μs | 36.1747μs | 27.6436 KOps/s | 27.9689 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 61.9810μs | 33.4990μs | 29.8517 KOps/s | 29.3568 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 52.1410μs | 20.8626μs | 47.9328 KOps/s | 47.5102 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 89.2610μs | 57.4984μs | 17.3918 KOps/s | 17.2062 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 79.6910μs | 38.7975μs | 25.7749 KOps/s | 26.0135 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 69.7610μs | 35.8969μs | 27.8576 KOps/s | 27.2672 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 55.9910μs | 23.2077μs | 43.0891 KOps/s | 42.5811 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8606s | 0.7676s | 1.3028 Ops/s | 1.2932 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7245s | 0.6325s | 1.5810 Ops/s | 1.5660 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7425s | 1.6601s | 0.6024 Ops/s | 0.5947 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5179s | 1.4384s | 0.6952 Ops/s | 0.6834 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9852s | 1.9042s | 0.5252 Ops/s | 0.5157 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7746s | 1.6933s | 0.5906 Ops/s | 0.5838 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.8696s | 4.7057s | 0.2125 Ops/s | 0.2153 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5929s | 4.4456s | 0.2249 Ops/s | 0.2247 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0824s | 1.9697s | 0.5077 Ops/s | 0.5062 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.8023s | 1.6866s | 0.5929 Ops/s | 0.5958 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 21.6118ms | 20.2421ms | 49.4020 Ops/s | 48.0454 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1309s | 3.5383ms | 282.6224 Ops/s | 282.8666 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1071ms | 82.7867μs | 12.0792 KOps/s | 11.7819 KOps/s | |
| test_values[td1_return_estimate-False-False] | 51.5248ms | 48.4629ms | 20.6343 Ops/s | 20.3187 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3009ms | 1.0893ms | 918.0590 Ops/s | 911.0753 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 84.6244ms | 79.5216ms | 12.5752 Ops/s | 12.4359 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.2643ms | 1.0824ms | 923.8556 Ops/s | 913.7560 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.1461ms | 21.0998ms | 47.3938 Ops/s | 46.5301 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0564ms | 0.7541ms | 1.3261 KOps/s | 1.3107 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7233ms | 0.6780ms | 1.4749 KOps/s | 1.4114 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5293ms | 1.4862ms | 672.8766 Ops/s | 666.2191 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7742ms | 0.7247ms | 1.3800 KOps/s | 1.4366 KOps/s | |
| test_dqn_speed[False-None] | 1.6247ms | 1.5284ms | 654.2658 Ops/s | 647.9954 Ops/s | |
| test_dqn_speed[False-backward] | 2.3048ms | 2.1829ms | 458.1108 Ops/s | 453.0176 Ops/s | |
| test_dqn_speed[True-None] | 0.6562ms | 0.5766ms | 1.7344 KOps/s | 1.7628 KOps/s | |
| test_dqn_speed[True-backward] | 1.2844ms | 1.1930ms | 838.1958 Ops/s | 834.1254 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6438ms | 0.5773ms | 1.7322 KOps/s | 1.6583 KOps/s | |
| test_ddpg_speed[False-None] | 3.2422ms | 2.8801ms | 347.2065 Ops/s | 343.9342 Ops/s | |
| test_ddpg_speed[False-backward] | 4.8048ms | 4.3344ms | 230.7108 Ops/s | 233.0162 Ops/s | |
| test_ddpg_speed[True-None] | 1.3644ms | 1.3000ms | 769.2418 Ops/s | 762.8725 Ops/s | |
| test_ddpg_speed[True-backward] | 2.5501ms | 2.4683ms | 405.1305 Ops/s | 398.5632 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4371ms | 1.3294ms | 752.2143 Ops/s | 746.9666 Ops/s | |
| test_sac_speed[False-None] | 8.9097ms | 8.3806ms | 119.3226 Ops/s | 118.2848 Ops/s | |
| test_sac_speed[False-backward] | 12.1259ms | 11.6777ms | 85.6334 Ops/s | 84.9555 Ops/s | |
| test_sac_speed[True-None] | 1.8788ms | 1.7885ms | 559.1241 Ops/s | 553.8273 Ops/s | |
| test_sac_speed[True-backward] | 3.6318ms | 3.5424ms | 282.2937 Ops/s | 279.6252 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 19.4597ms | 10.9691ms | 91.1652 Ops/s | 90.0749 Ops/s | |
| test_redq_deprec_speed[False-None] | 9.9457ms | 9.3565ms | 106.8774 Ops/s | 106.0164 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.1243ms | 12.7396ms | 78.4955 Ops/s | 78.0138 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6482ms | 2.4941ms | 400.9505 Ops/s | 400.0199 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.6441ms | 4.2417ms | 235.7570 Ops/s | 239.3974 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 16.5882ms | 10.1015ms | 98.9948 Ops/s | 100.1244 Ops/s | |
| test_td3_speed[False-None] | 8.5412ms | 8.2996ms | 120.4871 Ops/s | 113.6213 Ops/s | |
| test_td3_speed[False-backward] | 11.1766ms | 10.7294ms | 93.2018 Ops/s | 92.6403 Ops/s | |
| test_td3_speed[True-None] | 1.7006ms | 1.6453ms | 607.7872 Ops/s | 595.8577 Ops/s | |
| test_td3_speed[True-backward] | 3.1270ms | 3.0602ms | 326.7711 Ops/s | 309.3798 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 73.2894ms | 25.2829ms | 39.5524 Ops/s | 39.4165 Ops/s | |
| test_cql_speed[False-None] | 17.6278ms | 17.3647ms | 57.5880 Ops/s | 57.1692 Ops/s | |
| test_cql_speed[False-backward] | 23.3844ms | 22.7811ms | 43.8961 Ops/s | 42.8423 Ops/s | |
| test_cql_speed[True-None] | 3.3632ms | 3.2102ms | 311.5030 Ops/s | 310.7186 Ops/s | |
| test_cql_speed[True-backward] | 5.7424ms | 5.3031ms | 188.5672 Ops/s | 187.8279 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 19.2929ms | 11.9380ms | 83.7662 Ops/s | 83.1431 Ops/s | |
| test_a2c_speed[False-None] | 4.0038ms | 3.2549ms | 307.2266 Ops/s | 305.5579 Ops/s | |
| test_a2c_speed[False-backward] | 6.6021ms | 6.2171ms | 160.8477 Ops/s | 158.7648 Ops/s | |
| test_a2c_speed[True-None] | 1.5607ms | 1.3107ms | 762.9356 Ops/s | 742.8000 Ops/s | |
| test_a2c_speed[True-backward] | 3.0312ms | 2.9403ms | 340.0966 Ops/s | 321.4463 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.0560ms | 0.9846ms | 1.0157 KOps/s | 1.0091 KOps/s | |
| test_ppo_speed[False-None] | 4.2073ms | 3.8950ms | 256.7393 Ops/s | 254.7061 Ops/s | |
| test_ppo_speed[False-backward] | 7.4370ms | 7.0475ms | 141.8938 Ops/s | 142.8319 Ops/s | |
| test_ppo_speed[True-None] | 1.4811ms | 1.4226ms | 702.9268 Ops/s | 703.2675 Ops/s | |
| test_ppo_speed[True-backward] | 3.3448ms | 3.0714ms | 325.5846 Ops/s | 307.4965 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1143ms | 1.0406ms | 960.9493 Ops/s | 919.8490 Ops/s | |
| test_reinforce_speed[False-None] | 2.4513ms | 2.3349ms | 428.2927 Ops/s | 433.8249 Ops/s | |
| test_reinforce_speed[False-backward] | 3.7665ms | 3.3199ms | 301.2165 Ops/s | 299.8139 Ops/s | |
| test_reinforce_speed[True-None] | 1.3814ms | 1.2864ms | 777.3552 Ops/s | 789.0235 Ops/s | |
| test_reinforce_speed[True-backward] | 2.9270ms | 2.8651ms | 349.0288 Ops/s | 325.1733 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 0.4435s | 10.4988ms | 95.2491 Ops/s | 104.1610 Ops/s | |
| test_iql_speed[False-None] | 10.0806ms | 9.4862ms | 105.4168 Ops/s | 104.3061 Ops/s | |
| test_iql_speed[False-backward] | 13.6978ms | 13.2253ms | 75.6126 Ops/s | 72.4944 Ops/s | |
| test_iql_speed[True-None] | 2.2740ms | 2.1528ms | 464.5126 Ops/s | 458.3970 Ops/s | |
| test_iql_speed[True-backward] | 5.1298ms | 4.6580ms | 214.6833 Ops/s | 203.2246 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 18.1803ms | 10.6895ms | 93.5497 Ops/s | 95.0727 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.4731ms | 5.9510ms | 168.0386 Ops/s | 166.3218 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9858ms | 0.3974ms | 2.5166 KOps/s | 2.7337 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6093ms | 0.3766ms | 2.6553 KOps/s | 2.8194 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0564ms | 5.8086ms | 172.1573 Ops/s | 171.1457 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7060ms | 0.3166ms | 3.1589 KOps/s | 3.4200 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6122ms | 0.3395ms | 2.9451 KOps/s | 3.6365 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6136ms | 1.3577ms | 736.5563 Ops/s | 744.4566 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4926ms | 1.2900ms | 775.2063 Ops/s | 789.7126 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.2523ms | 5.9905ms | 166.9321 Ops/s | 168.1143 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.9971ms | 0.4360ms | 2.2938 KOps/s | 2.2735 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.5821ms | 0.4192ms | 2.3857 KOps/s | 2.1234 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0483ms | 5.8517ms | 170.8896 Ops/s | 170.5738 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.1374ms | 0.3611ms | 2.7694 KOps/s | 3.0089 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6484ms | 0.3160ms | 3.1645 KOps/s | 3.0023 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.4354ms | 5.8201ms | 171.8174 Ops/s | 172.6690 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9931ms | 0.3410ms | 2.9328 KOps/s | 2.5794 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5825ms | 0.3542ms | 2.8235 KOps/s | 3.4002 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1088ms | 5.9871ms | 167.0255 Ops/s | 166.1542 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2104ms | 0.4870ms | 2.0532 KOps/s | 2.0246 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7037ms | 0.4704ms | 2.1258 KOps/s | 2.1104 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.6393s | 17.8035ms | 56.1689 Ops/s | 193.5151 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 7.4583ms | 1.9505ms | 512.6825 Ops/s | 518.0612 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 11.6266ms | 1.3489ms | 741.3598 Ops/s | 768.5983 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 8.0121ms | 5.0916ms | 196.4028 Ops/s | 193.6694 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 8.8049ms | 1.9362ms | 516.4829 Ops/s | 535.2377 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.1127ms | 0.9419ms | 1.0617 KOps/s | 1.0248 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5883s | 16.9742ms | 58.9128 Ops/s | 48.6946 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 10.4247ms | 2.0936ms | 477.6358 Ops/s | 510.9299 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.2112ms | 1.1113ms | 899.8846 Ops/s | 913.2107 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 38.0903ms | 35.2401ms | 28.3768 Ops/s | 27.5509 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.7753ms | 18.0658ms | 55.3532 Ops/s | 56.0790 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 39.8770ms | 36.7745ms | 27.1928 Ops/s | 26.8959 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.3231ms | 18.3049ms | 54.6302 Ops/s | 54.9345 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.9024ms | 38.9647ms | 25.6643 Ops/s | 25.6018 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.9949ms | 20.7415ms | 48.2126 Ops/s | 50.6674 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8985ms | 0.2192ms | 4.5628 KOps/s | 4.4941 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.5590ms | 1.3888ms | 720.0299 Ops/s | 728.8695 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.6233ms | 2.2817ms | 438.2611 Ops/s | 440.9556 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1399ms | 2.9457ms | 339.4743 Ops/s | 341.5865 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2203ms | 0.1490ms | 6.7093 KOps/s | 6.7110 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3710ms | 0.2161ms | 4.6272 KOps/s | 4.3660 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.8031ms | 1.6444ms | 608.1356 Ops/s | 577.5729 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5089ms | 1.3671ms | 731.4504 Ops/s | 742.4487 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2638ms | 1.1368ms | 879.6427 Ops/s | 879.6855 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.8222ms | 3.6028ms | 277.5639 Ops/s | 274.1564 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.0982ms | 5.7299ms | 174.5236 Ops/s | 176.5821 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.2304ms | 6.9840ms | 143.1845 Ops/s | 141.1333 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4756ms | 0.2721ms | 3.6747 KOps/s | 3.6256 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.7208ms | 1.5484ms | 645.8263 Ops/s | 697.7407 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.5379ms | 2.4136ms | 414.3118 Ops/s | 414.7988 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3039ms | 3.1595ms | 316.5086 Ops/s | 318.4765 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 35.0446ms | 33.9789ms | 29.4301 Ops/s | 29.3365 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 67.4988ms | 66.6880ms | 14.9952 Ops/s | 14.9163 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 39.2256ms | 38.4274ms | 26.0231 Ops/s | 25.9833 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.7554ms | 75.0184ms | 13.3301 Ops/s | 13.1548 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 58.1531ms | 57.1462ms | 17.4990 Ops/s | 17.6262 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1157s | 0.1124s | 8.8996 Ops/s | 8.8509 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 60.0714ms | 58.8954ms | 16.9793 Ops/s | 16.9865 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.8050s | 0.1928s | 5.1865 Ops/s | 8.5329 Ops/s |
This was referenced Feb 6, 2026
Summary
- Add a `log_metrics()` method to the base `Logger` class for logging multiple scalar metrics at once
- Add optimized implementations for `WandbLogger` and `MLFlowLogger` that use their native batch logging APIs (`experiment.log()` and `mlflow.log_metrics()` respectively)
- Add a `_make_metrics_safe()` utility that efficiently converts CUDA tensors to Python types by batching transfers

Motivation
When logging multiple tensor metrics from CUDA, calling `.item()` on each tensor triggers an implicit CUDA synchronization. With N metrics, that's N separate syncs.

The new implementation:
- Batches the CUDA-to-CPU tensor transfers using `non_blocking=True`
- Synchronizes once via a CUDA event

This is particularly useful when logging to services running in separate processes (e.g., Ray actors for wandb/mlflow) that may not have GPU access.
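
To make the batching idea concrete, here is a minimal sketch of how such a conversion could look. It is not the code from this PR: `make_metrics_safe` is a hypothetical stand-in for `_make_metrics_safe()`, and the handling of multi-element tensors (converted with `.tolist()`) is an assumption made for illustration.

```python
from typing import Any, Dict

import torch


def make_metrics_safe(metrics: Dict[str, Any]) -> Dict[str, Any]:
    """Convert tensor values to plain Python values with at most one CUDA sync.

    Hypothetical stand-in for the PR's `_make_metrics_safe()`; the real
    helper may differ in signature, dtype support, and edge cases.
    """
    staged: Dict[str, Any] = {}
    needs_sync = False
    for key, value in metrics.items():
        if isinstance(value, torch.Tensor):
            value = value.detach()
            if value.is_cuda:
                # Queue an asynchronous device-to-host copy; do not read it yet.
                value = value.to("cpu", non_blocking=True)
                needs_sync = True
        staged[key] = value

    if needs_sync:
        # Record one event after all copies are queued, then wait on it once.
        event = torch.cuda.Event()
        event.record()
        event.synchronize()

    # Every tensor is now on the CPU, so reading it cannot trigger a device sync.
    safe: Dict[str, Any] = {}
    for key, value in staged.items():
        if isinstance(value, torch.Tensor):
            # Assumption: scalars become Python numbers, larger tensors become lists.
            value = value.item() if value.numel() == 1 else value.tolist()
        safe[key] = value
    return safe
```

The resulting dictionary holds only plain Python values, so it can be handed to a batch logging call or sent to a logger running in a process without GPU access. Reading the copied tensors before the event has completed would be unsafe, which is why the single wait happens before any `.item()` call.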
Test plan
Made with Cursor