[Feature] Add TensorDict support to log_metrics #3455
Merged
Extend log_metrics() to accept TensorDict in addition to dict inputs. This uses TensorDict's batched .to() method for CUDA->CPU transfers, which is more efficient than transferring tensors individually.

Changes:
- Add TensorDictBase to the metrics type signature (dict | TensorDict)
- Add _make_metrics_safe_tensordict() helper for TensorDict-specific handling
- Add keys_sep parameter to control how nested TensorDict keys are flattened (defaults to "/" for hierarchical metric names like "train/loss")
- Update WandbLogger and MLFlowLogger implementations accordingly

Co-authored-by: Cursor <[email protected]>
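To illustrate what the keys_sep flattening described above produces, here is a minimal sketch in plain Python. It is not the PR's implementation: `flatten_metric_keys` is a hypothetical helper, and nested dicts stand in for a nested TensorDict so the example runs without any dependencies.

```python
from typing import Any, Dict, Mapping


def flatten_metric_keys(
    metrics: Mapping[str, Any], keys_sep: str = "/", prefix: str = ""
) -> Dict[str, Any]:
    """Flatten nested metric mappings into a single level of joined names.

    Hypothetical helper for illustration only; nested dicts stand in for
    nested TensorDict entries.
    """
    flat: Dict[str, Any] = {}
    for key, value in metrics.items():
        # Join the parent name and the child key with the separator.
        name = f"{prefix}{keys_sep}{key}" if prefix else key
        if isinstance(value, Mapping):
            # Recurse into sub-mappings, carrying the joined name as prefix.
            flat.update(flatten_metric_keys(value, keys_sep, name))
        else:
            flat[name] = value
    return flat


nested = {"train": {"loss": 0.25, "grad_norm": 1.5}, "eval": {"reward": 10.0}, "step": 3}
flat = flatten_metric_keys(nested)
```

With the default separator, `flat` contains hierarchical names such as "train/loss" and "eval/reward", which is the naming scheme the description says loggers like WandbLogger should receive. Passing a different `keys_sep` (for example ".") changes only how the names are joined.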
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3455
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 35 Pending
As of commit fe9360a with merge base 1415062.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Check for CUDA tensors explicitly rather than relying on TensorDict.device,
which can be None even when individual tensors are on CUDA (e.g., mixed
devices or lazy structures).
Always call .to("cpu") but only sync if there were actually CUDA tensors.
Co-authored-by: Cursor <[email protected]>
Use torch.cuda.is_initialized() instead of iterating over all values.
The event sync is cheap if there's no pending CUDA work, so we can just
always sync when CUDA is in use rather than checking each tensor.
Co-authored-by: Cursor <[email protected]>
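The approach in this commit message could look roughly like the sketch below. It is an assumption-laden illustration rather than the merged code: `metrics_to_cpu` is a hypothetical name, a plain dict stands in for a TensorDict, and `torch.cuda.synchronize()` is used as a simpler stand-in for the event-based sync the message mentions.

```python
import torch


def metrics_to_cpu(metrics):
    """Move tensor values to CPU, then sync once if CUDA is in use.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    on_cpu = {
        # non_blocking=True lets device-to-host copies be queued without
        # waiting on each one; non-tensor values pass through unchanged.
        name: value.to("cpu", non_blocking=True)
        if isinstance(value, torch.Tensor)
        else value
        for name, value in metrics.items()
    }
    # is_initialized() is False until CUDA has been used in this process,
    # so CPU-only runs skip the sync without inspecting every tensor.
    if torch.cuda.is_initialized():
        # Stand-in for the event sync mentioned above: wait for pending
        # copies so the CPU values are safe to read.
        torch.cuda.synchronize()
    return on_cpu


metrics = {"loss": torch.tensor(0.5), "step": 7}
safe = metrics_to_cpu(metrics)
```

The design choice mirrored here is to replace a per-tensor device check with a single process-level check, accepting one possibly redundant sync in exchange for not walking every value.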
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 80.2823μs | 79.1601μs | 12.6326 KOps/s | 12.3425 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1402ms | 0.1396ms | 7.1637 KOps/s | 7.0944 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1120s | 0.1118s | 8.9481 Ops/s | 9.2452 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.7211μs | 2.7025μs | 370.0300 KOps/s | 380.1627 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 36.6988μs | 36.5152μs | 27.3859 KOps/s | 27.4227 KOps/s | |
| test_simple | 0.5467s | 0.5441s | 1.8380 Ops/s | 1.7542 Ops/s | |
| test_transformed | 1.1222s | 1.1179s | 0.8946 Ops/s | 0.8751 Ops/s | |
| test_serial | 1.6971s | 1.6806s | 0.5950 Ops/s | 0.5918 Ops/s | |
| test_parallel | 1.3675s | 1.1879s | 0.8419 Ops/s | 0.8766 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.4646ms | 43.0654μs | 23.2205 KOps/s | 22.1463 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 55.4930μs | 24.9496μs | 40.0808 KOps/s | 39.8033 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 0.4461ms | 24.8966μs | 40.1661 KOps/s | 38.6930 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 61.6330μs | 13.6371μs | 73.3296 KOps/s | 70.3181 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 81.0940μs | 46.8374μs | 21.3504 KOps/s | 20.2852 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 0.4634ms | 27.7892μs | 35.9852 KOps/s | 35.5803 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 0.4568ms | 27.1725μs | 36.8019 KOps/s | 34.5327 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 63.6230μs | 16.2055μs | 61.7075 KOps/s | 58.3208 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 78.1440μs | 49.1769μs | 20.3347 KOps/s | 19.0158 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 58.1830μs | 30.2398μs | 33.0690 KOps/s | 32.3266 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 55.8230μs | 27.2571μs | 36.6877 KOps/s | 34.6682 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 45.0520μs | 16.3743μs | 61.0712 KOps/s | 59.0159 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 90.4350μs | 51.9491μs | 19.2496 KOps/s | 18.5484 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 68.9740μs | 32.9997μs | 30.3033 KOps/s | 29.9858 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 65.2630μs | 30.0152μs | 33.3164 KOps/s | 32.3620 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 45.1320μs | 19.3221μs | 51.7543 KOps/s | 51.5031 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 79.9440μs | 49.6230μs | 20.1519 KOps/s | 19.6749 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 61.9630μs | 30.4408μs | 32.8507 KOps/s | 31.9921 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4137ms | 31.1158μs | 32.1381 KOps/s | 30.3009 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 87.3440μs | 17.5458μs | 56.9937 KOps/s | 53.9487 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.4708ms | 52.2313μs | 19.1456 KOps/s | 18.7462 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.4505ms | 32.6007μs | 30.6742 KOps/s | 29.4991 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 0.4502ms | 33.5533μs | 29.8033 KOps/s | 28.4287 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 49.5430μs | 20.8026μs | 48.0709 KOps/s | 46.6019 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 0.4691ms | 55.1385μs | 18.1361 KOps/s | 17.6221 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 0.4635ms | 35.9818μs | 27.7918 KOps/s | 27.4975 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 0.4599ms | 34.2762μs | 29.1748 KOps/s | 28.6636 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 55.5730μs | 21.1793μs | 47.2160 KOps/s | 47.3274 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.4815ms | 57.3716μs | 17.4302 KOps/s | 17.0482 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 0.4625ms | 38.4336μs | 26.0189 KOps/s | 25.5228 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.4553ms | 36.0198μs | 27.7625 KOps/s | 27.0694 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 60.6630μs | 23.3446μs | 42.8365 KOps/s | 41.5141 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7311s | 0.7309s | 1.3682 Ops/s | 1.2985 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7206s | 0.6224s | 1.6067 Ops/s | 1.5727 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7104s | 1.6335s | 0.6122 Ops/s | 0.5969 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4967s | 1.4183s | 0.7051 Ops/s | 0.6875 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9631s | 1.8823s | 0.5313 Ops/s | 0.5186 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7445s | 1.6658s | 0.6003 Ops/s | 0.5851 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7112s | 4.4952s | 0.2225 Ops/s | 0.2207 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.4470s | 4.3377s | 0.2305 Ops/s | 0.2268 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9969s | 1.9210s | 0.5206 Ops/s | 0.5182 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7568s | 1.6757s | 0.5968 Ops/s | 0.6031 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.4031ms | 9.9883ms | 100.1170 Ops/s | 101.6514 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 13.7530ms | 11.1090ms | 90.0168 Ops/s | 56.6003 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2432ms | 0.1217ms | 8.2197 KOps/s | 7.8371 KOps/s | |
| test_values[td1_return_estimate-False-False] | 27.1880ms | 26.5040ms | 37.7302 Ops/s | 38.5704 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 11.6854ms | 11.0565ms | 90.4444 Ops/s | 56.7078 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 39.4788ms | 38.9598ms | 25.6675 Ops/s | 26.0539 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 11.9185ms | 11.0530ms | 90.4735 Ops/s | 56.7154 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.8633ms | 8.7922ms | 113.7370 Ops/s | 113.8950 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.8450ms | 1.4297ms | 699.4319 Ops/s | 672.2800 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4733ms | 0.4035ms | 2.4785 KOps/s | 2.4792 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 34.6154ms | 30.1035ms | 33.2187 Ops/s | 28.6372 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.1959ms | 1.7105ms | 584.6232 Ops/s | 590.6720 Ops/s | |
| test_dqn_speed[False-None] | 1.7292ms | 1.3942ms | 717.2600 Ops/s | 725.2195 Ops/s | |
| test_dqn_speed[False-backward] | 1.9836ms | 1.8811ms | 531.5953 Ops/s | 534.1496 Ops/s | |
| test_dqn_speed[True-None] | 0.9537ms | 0.5428ms | 1.8422 KOps/s | 1.7401 KOps/s | |
| test_dqn_speed[True-backward] | 1.0420ms | 1.0030ms | 997.0087 Ops/s | 842.0833 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9323ms | 0.5406ms | 1.8499 KOps/s | 1.7690 KOps/s | |
| test_ddpg_speed[False-None] | 3.0915ms | 2.7970ms | 357.5204 Ops/s | 349.6309 Ops/s | |
| test_ddpg_speed[False-backward] | 4.1360ms | 3.9972ms | 250.1775 Ops/s | 246.9182 Ops/s | |
| test_ddpg_speed[True-None] | 1.8135ms | 1.4049ms | 711.8158 Ops/s | 713.7799 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4367ms | 2.3780ms | 420.5249 Ops/s | 415.3331 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.7348ms | 1.3936ms | 717.5409 Ops/s | 763.0544 Ops/s | |
| test_sac_speed[False-None] | 8.5613ms | 7.8844ms | 126.8334 Ops/s | 128.3913 Ops/s | |
| test_sac_speed[False-backward] | 11.4015ms | 10.9735ms | 91.1289 Ops/s | 91.4194 Ops/s | |
| test_sac_speed[True-None] | 2.2768ms | 2.1533ms | 464.4043 Ops/s | 482.3572 Ops/s | |
| test_sac_speed[True-backward] | 4.2683ms | 4.0321ms | 248.0067 Ops/s | 253.6614 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.5559ms | 2.1344ms | 468.5251 Ops/s | 479.5658 Ops/s | |
| test_redq_speed[False-None] | 10.6264ms | 10.3014ms | 97.0742 Ops/s | 97.9324 Ops/s | |
| test_redq_speed[False-backward] | 18.2642ms | 17.7079ms | 56.4720 Ops/s | 59.0843 Ops/s | |
| test_redq_speed[True-None] | 4.7189ms | 4.4309ms | 225.6885 Ops/s | 223.1018 Ops/s | |
| test_redq_speed[True-backward] | 10.0568ms | 9.7102ms | 102.9849 Ops/s | 84.3866 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.7751ms | 4.4031ms | 227.1109 Ops/s | 221.2234 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.3915ms | 10.8821ms | 91.8942 Ops/s | 90.6403 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.1681ms | 15.6266ms | 63.9934 Ops/s | 62.4784 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.1183ms | 3.6953ms | 270.6149 Ops/s | 272.6279 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.8799ms | 7.6283ms | 131.0914 Ops/s | 132.9356 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.9254ms | 3.6420ms | 274.5768 Ops/s | 279.1174 Ops/s | |
| test_td3_speed[False-None] | 8.1707ms | 7.9185ms | 126.2873 Ops/s | 126.5155 Ops/s | |
| test_td3_speed[False-backward] | 11.1690ms | 10.6985ms | 93.4713 Ops/s | 93.7247 Ops/s | |
| test_td3_speed[True-None] | 1.9169ms | 1.8445ms | 542.1402 Ops/s | 544.3172 Ops/s | |
| test_td3_speed[True-backward] | 3.6648ms | 3.5718ms | 279.9700 Ops/s | 258.2936 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8880ms | 1.8085ms | 552.9490 Ops/s | 547.7567 Ops/s | |
| test_cql_speed[False-None] | 28.8565ms | 26.0107ms | 38.4458 Ops/s | 37.3846 Ops/s | |
| test_cql_speed[False-backward] | 40.5767ms | 35.4496ms | 28.2091 Ops/s | 28.5090 Ops/s | |
| test_cql_speed[True-None] | 15.5452ms | 12.5128ms | 79.9182 Ops/s | 80.6385 Ops/s | |
| test_cql_speed[True-backward] | 21.5317ms | 18.5202ms | 53.9952 Ops/s | 55.4534 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 13.4276ms | 12.5311ms | 79.8015 Ops/s | 78.8970 Ops/s | |
| test_a2c_speed[False-None] | 5.8864ms | 5.4258ms | 184.3042 Ops/s | 184.5790 Ops/s | |
| test_a2c_speed[False-backward] | 12.2853ms | 11.8635ms | 84.2922 Ops/s | 84.5338 Ops/s | |
| test_a2c_speed[True-None] | 3.9067ms | 3.7630ms | 265.7484 Ops/s | 267.7160 Ops/s | |
| test_a2c_speed[True-backward] | 8.8427ms | 8.6310ms | 115.8612 Ops/s | 113.1858 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.0086ms | 3.7583ms | 266.0807 Ops/s | 268.1597 Ops/s | |
| test_ppo_speed[False-None] | 6.0741ms | 5.8088ms | 172.1532 Ops/s | 169.4951 Ops/s | |
| test_ppo_speed[False-backward] | 12.9933ms | 12.4123ms | 80.5650 Ops/s | 80.9993 Ops/s | |
| test_ppo_speed[True-None] | 3.9090ms | 3.7032ms | 270.0355 Ops/s | 266.3533 Ops/s | |
| test_ppo_speed[True-backward] | 8.9334ms | 8.4689ms | 118.0790 Ops/s | 103.9969 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 3.9079ms | 3.6936ms | 270.7378 Ops/s | 272.3397 Ops/s | |
| test_reinforce_speed[False-None] | 5.0916ms | 4.5821ms | 218.2393 Ops/s | 211.4258 Ops/s | |
| test_reinforce_speed[False-backward] | 7.5444ms | 7.3595ms | 135.8795 Ops/s | 132.5123 Ops/s | |
| test_reinforce_speed[True-None] | 3.1730ms | 2.9233ms | 342.0817 Ops/s | 330.6307 Ops/s | |
| test_reinforce_speed[True-backward] | 8.1174ms | 7.8130ms | 127.9921 Ops/s | 121.9059 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.3606ms | 2.8991ms | 344.9362 Ops/s | 343.9232 Ops/s | |
| test_iql_speed[False-None] | 24.8616ms | 20.1599ms | 49.6035 Ops/s | 49.2995 Ops/s | |
| test_iql_speed[False-backward] | 30.8986ms | 30.3949ms | 32.9002 Ops/s | 32.8970 Ops/s | |
| test_iql_speed[True-None] | 8.9276ms | 8.5874ms | 116.4498 Ops/s | 115.1988 Ops/s | |
| test_iql_speed[True-backward] | 17.3806ms | 16.8186ms | 59.4580 Ops/s | 59.3614 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.0435ms | 8.6330ms | 115.8343 Ops/s | 109.7635 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1347ms | 6.0466ms | 165.3825 Ops/s | 166.5681 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.0275ms | 0.2847ms | 3.5129 KOps/s | 3.4067 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5974ms | 0.2665ms | 3.7528 KOps/s | 3.8534 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1580ms | 5.8857ms | 169.9039 Ops/s | 170.6108 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9989ms | 0.2788ms | 3.5869 KOps/s | 3.3108 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4861ms | 0.2616ms | 3.8233 KOps/s | 3.2564 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.4904ms | 1.2392ms | 806.9531 Ops/s | 773.9040 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4257ms | 1.1549ms | 865.8807 Ops/s | 814.6201 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 10.1143ms | 6.2234ms | 160.6845 Ops/s | 164.4839 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1095ms | 0.4929ms | 2.0287 KOps/s | 2.1203 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6841ms | 0.4665ms | 2.1438 KOps/s | 2.3375 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0532ms | 5.9185ms | 168.9625 Ops/s | 168.5861 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6411ms | 0.3143ms | 3.1812 KOps/s | 2.9507 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5506ms | 0.3309ms | 3.0223 KOps/s | 3.4307 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1010ms | 5.8702ms | 170.3518 Ops/s | 169.4660 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.1479ms | 0.3638ms | 2.7488 KOps/s | 3.0044 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5746ms | 0.3408ms | 2.9341 KOps/s | 3.7177 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1743ms | 6.0531ms | 165.2047 Ops/s | 165.5249 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.9126ms | 0.4886ms | 2.0469 KOps/s | 2.1560 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7831ms | 0.4542ms | 2.2018 KOps/s | 2.1011 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.4841ms | 5.0389ms | 198.4559 Ops/s | 191.2443 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 9.6820ms | 1.9056ms | 524.7758 Ops/s | 521.4243 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.2060ms | 0.8616ms | 1.1606 KOps/s | 1.0609 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 7.5515ms | 5.0651ms | 197.4303 Ops/s | 195.1539 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 11.1691ms | 1.9067ms | 524.4531 Ops/s | 559.3732 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 8.6057ms | 1.2960ms | 771.6261 Ops/s | 827.7042 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5576s | 16.3150ms | 61.2932 Ops/s | 55.9730 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 3.9571ms | 1.9097ms | 523.6528 Ops/s | 440.3234 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.3260ms | 1.0417ms | 959.9371 Ops/s | 780.4266 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 38.4221ms | 35.5797ms | 28.1059 Ops/s | 27.5839 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.5840ms | 17.7082ms | 56.4712 Ops/s | 56.0355 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.4315ms | 36.5705ms | 27.3444 Ops/s | 26.3723 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.3448ms | 17.9013ms | 55.8617 Ops/s | 53.9859 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 40.3764ms | 38.3612ms | 26.0680 Ops/s | 25.3074 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 20.9202ms | 19.4425ms | 51.4337 Ops/s | 51.5505 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8452ms | 0.2155ms | 4.6411 KOps/s | 4.7195 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7368ms | 1.3861ms | 721.4274 Ops/s | 724.6623 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7994ms | 2.3448ms | 426.4792 Ops/s | 413.8647 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.0687ms | 2.8952ms | 345.3967 Ops/s | 343.6065 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2137ms | 0.1316ms | 7.6000 KOps/s | 7.4259 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3561ms | 0.1830ms | 5.4645 KOps/s | 4.9029 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.8780ms | 1.7664ms | 566.1367 Ops/s | 563.7529 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.4954ms | 1.3285ms | 752.7144 Ops/s | 768.6247 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2184ms | 1.1215ms | 891.6640 Ops/s | 885.9015 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.7026ms | 3.5710ms | 280.0312 Ops/s | 270.0251 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 6.0601ms | 5.7149ms | 174.9810 Ops/s | 176.7055 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.3833ms | 7.2376ms | 138.1680 Ops/s | 143.9503 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4155ms | 0.2706ms | 3.6958 KOps/s | 3.7283 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6807ms | 1.5062ms | 663.9028 Ops/s | 667.1550 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.6710ms | 2.4722ms | 404.4921 Ops/s | 393.0365 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3968ms | 3.0967ms | 322.9256 Ops/s | 319.3564 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 33.9046ms | 33.1796ms | 30.1390 Ops/s | 29.4891 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 67.5772ms | 65.7784ms | 15.2026 Ops/s | 15.0862 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.7823ms | 37.9987ms | 26.3167 Ops/s | 25.9691 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.1987ms | 74.6546ms | 13.3950 Ops/s | 13.2774 Ops/s |
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 80.2374μs | 79.5256μs | 12.5746 KOps/s | 12.3726 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1381ms | 0.1378ms | 7.2555 KOps/s | 7.1564 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1073s | 0.1071s | 9.3344 Ops/s | 8.9186 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.7730μs | 2.7662μs | 361.5103 KOps/s | 375.1809 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 39.0381μs | 38.4667μs | 25.9965 KOps/s | 26.3444 KOps/s | |
| test_simple | 0.8012s | 0.7982s | 1.2528 Ops/s | 1.2276 Ops/s | |
| test_transformed | 1.5526s | 1.4567s | 0.6865 Ops/s | 0.6929 Ops/s | |
| test_serial | 2.4243s | 2.3278s | 0.4296 Ops/s | 0.4331 Ops/s | |
| test_parallel | 2.1436s | 2.0114s | 0.4972 Ops/s | 0.5121 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3209ms | 45.2667μs | 22.0913 KOps/s | 22.4869 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 48.7300μs | 25.3539μs | 39.4416 KOps/s | 39.8812 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 60.1400μs | 25.3168μs | 39.4994 KOps/s | 39.8672 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 38.5510μs | 13.7666μs | 72.6395 KOps/s | 72.3160 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 74.1710μs | 47.8027μs | 20.9193 KOps/s | 21.2604 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 56.9110μs | 27.3404μs | 36.5759 KOps/s | 36.0930 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 61.6010μs | 28.4338μs | 35.1694 KOps/s | 35.9290 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 47.2910μs | 16.5645μs | 60.3700 KOps/s | 61.0538 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 79.5710μs | 50.0866μs | 19.9654 KOps/s | 19.8118 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 59.3910μs | 30.6428μs | 32.6340 KOps/s | 32.8596 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 59.5110μs | 28.0185μs | 35.6907 KOps/s | 36.5379 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 42.9110μs | 16.6621μs | 60.0164 KOps/s | 60.4730 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 80.9920μs | 52.5022μs | 19.0468 KOps/s | 18.9927 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 66.0610μs | 32.9190μs | 30.3776 KOps/s | 30.3005 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 60.1810μs | 30.2928μs | 33.0112 KOps/s | 33.5646 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 44.4300μs | 19.1690μs | 52.1676 KOps/s | 52.1793 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 96.6920μs | 49.9974μs | 20.0010 KOps/s | 19.8828 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 56.3910μs | 30.8202μs | 32.4462 KOps/s | 32.9688 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.3215ms | 31.8616μs | 31.3857 KOps/s | 31.3184 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 48.0910μs | 18.4443μs | 54.2172 KOps/s | 55.5583 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.1024ms | 53.4188μs | 18.7200 KOps/s | 18.7740 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 66.9710μs | 33.2932μs | 30.0362 KOps/s | 29.9688 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 64.1310μs | 33.9504μs | 29.4547 KOps/s | 29.9196 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 45.6910μs | 21.0222μs | 47.5688 KOps/s | 47.4508 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 89.6110μs | 55.7553μs | 17.9355 KOps/s | 17.9913 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 62.7310μs | 35.8876μs | 27.8648 KOps/s | 28.0301 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 66.2510μs | 33.7373μs | 29.6408 KOps/s | 29.5478 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 59.9410μs | 20.8974μs | 47.8529 KOps/s | 47.6163 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 91.9210μs | 57.9138μs | 17.2670 KOps/s | 17.5582 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 68.6010μs | 38.3666μs | 26.0643 KOps/s | 26.3860 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 63.7410μs | 36.9698μs | 27.0491 KOps/s | 28.2064 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 51.2910μs | 23.6853μs | 42.2203 KOps/s | 43.1127 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8709s | 0.7722s | 1.2949 Ops/s | 1.3020 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7320s | 0.6379s | 1.5678 Ops/s | 1.5819 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7590s | 1.6854s | 0.5933 Ops/s | 0.5993 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5406s | 1.4626s | 0.6837 Ops/s | 0.6900 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0085s | 1.9305s | 0.5180 Ops/s | 0.5204 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7912s | 1.7096s | 0.5849 Ops/s | 0.5889 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.8361s | 4.6781s | 0.2138 Ops/s | 0.2142 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5607s | 4.4457s | 0.2249 Ops/s | 0.2233 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0495s | 1.9590s | 0.5105 Ops/s | 0.5112 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7617s | 1.6632s | 0.6013 Ops/s | 0.5881 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 21.7109ms | 21.2575ms | 47.0421 Ops/s | 48.1157 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1462s | 3.8456ms | 260.0391 Ops/s | 287.6283 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1121ms | 83.0845μs | 12.0359 KOps/s | 11.9595 KOps/s | |
| test_values[td1_return_estimate-False-False] | 50.0913ms | 49.5331ms | 20.1885 Ops/s | 20.3139 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.4083ms | 1.0929ms | 914.9808 Ops/s | 916.3271 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 80.8827ms | 80.5066ms | 12.4213 Ops/s | 12.4058 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.2928ms | 1.0875ms | 919.5748 Ops/s | 916.8909 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.5863ms | 22.1304ms | 45.1867 Ops/s | 47.8005 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0308ms | 0.7617ms | 1.3128 KOps/s | 1.3229 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 1.1518ms | 0.6983ms | 1.4321 KOps/s | 1.4726 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5821ms | 1.4967ms | 668.1216 Ops/s | 670.3762 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7833ms | 0.7041ms | 1.4203 KOps/s | 1.4350 KOps/s | |
| test_dqn_speed[False-None] | 1.9574ms | 1.5326ms | 652.4957 Ops/s | 656.6249 Ops/s | |
| test_dqn_speed[False-backward] | 2.4280ms | 2.1675ms | 461.3600 Ops/s | 461.6170 Ops/s | |
| test_dqn_speed[True-None] | 1.0904ms | 0.5822ms | 1.7176 KOps/s | 1.7444 KOps/s | |
| test_dqn_speed[True-backward] | 1.1332ms | 1.0973ms | 911.3120 Ops/s | 832.6060 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7865ms | 0.5995ms | 1.6681 KOps/s | 1.6465 KOps/s | |
| test_ddpg_speed[False-None] | 3.2669ms | 2.8850ms | 346.6255 Ops/s | 342.7233 Ops/s | |
| test_ddpg_speed[False-backward] | 4.5445ms | 4.1235ms | 242.5132 Ops/s | 236.0376 Ops/s | |
| test_ddpg_speed[True-None] | 1.4620ms | 1.3249ms | 754.7531 Ops/s | 755.0080 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4245ms | 2.3653ms | 422.7875 Ops/s | 420.0320 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4602ms | 1.3524ms | 739.4432 Ops/s | 741.9298 Ops/s | |
| test_sac_speed[False-None] | 8.9031ms | 8.3219ms | 120.1642 Ops/s | 119.8099 Ops/s | |
| test_sac_speed[False-backward] | 11.6789ms | 11.2252ms | 89.0852 Ops/s | 88.6893 Ops/s | |
| test_sac_speed[True-None] | 1.9961ms | 1.8177ms | 550.1532 Ops/s | 548.2552 Ops/s | |
| test_sac_speed[True-backward] | 3.5792ms | 3.4657ms | 288.5389 Ops/s | 284.2179 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 19.5455ms | 11.0547ms | 90.4596 Ops/s | 90.1581 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.2845ms | 9.4396ms | 105.9372 Ops/s | 104.5606 Ops/s | |
| test_redq_deprec_speed[False-backward] | 12.9130ms | 12.3588ms | 80.9139 Ops/s | 79.4591 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6488ms | 2.5084ms | 398.6578 Ops/s | 391.1179 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.5462ms | 4.1075ms | 243.4592 Ops/s | 230.8535 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 16.2006ms | 9.9405ms | 100.5990 Ops/s | 100.3024 Ops/s | |
| test_td3_speed[False-None] | 8.5410ms | 8.2215ms | 121.6318 Ops/s | 113.7885 Ops/s | |
| test_td3_speed[False-backward] | 10.8822ms | 10.5038ms | 95.2033 Ops/s | 91.2152 Ops/s | |
| test_td3_speed[True-None] | 1.6699ms | 1.6442ms | 608.2035 Ops/s | 614.9039 Ops/s | |
| test_td3_speed[True-backward] | 3.1288ms | 3.0866ms | 323.9841 Ops/s | 307.3118 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 73.0274ms | 25.0199ms | 39.9681 Ops/s | 39.9102 Ops/s | |
| test_cql_speed[False-None] | 17.6884ms | 17.2358ms | 58.0187 Ops/s | 57.8161 Ops/s | |
| test_cql_speed[False-backward] | 22.8236ms | 22.3510ms | 44.7407 Ops/s | 43.8490 Ops/s | |
| test_cql_speed[True-None] | 3.5129ms | 3.2826ms | 304.6333 Ops/s | 304.6697 Ops/s | |
| test_cql_speed[True-backward] | 5.6400ms | 5.3379ms | 187.3391 Ops/s | 181.2282 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 18.8992ms | 11.9233ms | 83.8697 Ops/s | 84.4405 Ops/s | |
| test_a2c_speed[False-None] | 3.9368ms | 3.2377ms | 308.8598 Ops/s | 309.5894 Ops/s | |
| test_a2c_speed[False-backward] | 6.5496ms | 6.0653ms | 164.8710 Ops/s | 158.1680 Ops/s | |
| test_a2c_speed[True-None] | 1.3994ms | 1.3266ms | 753.8320 Ops/s | 744.2268 Ops/s | |
| test_a2c_speed[True-backward] | 3.1189ms | 2.9898ms | 334.4692 Ops/s | 320.2259 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.1388ms | 0.9924ms | 1.0077 KOps/s | 1.0166 KOps/s | |
| test_ppo_speed[False-None] | 3.9879ms | 3.8460ms | 260.0135 Ops/s | 260.9289 Ops/s | |
| test_ppo_speed[False-backward] | 7.3210ms | 6.8882ms | 145.1766 Ops/s | 141.2807 Ops/s | |
| test_ppo_speed[True-None] | 1.7743ms | 1.4439ms | 692.5665 Ops/s | 697.8306 Ops/s | |
| test_ppo_speed[True-backward] | 3.4697ms | 3.0986ms | 322.7229 Ops/s | 301.6606 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1364ms | 1.0464ms | 955.6418 Ops/s | 922.5488 Ops/s | |
| test_reinforce_speed[False-None] | 2.4616ms | 2.2787ms | 438.8492 Ops/s | 437.3071 Ops/s | |
| test_reinforce_speed[False-backward] | 3.7665ms | 3.3004ms | 302.9959 Ops/s | 306.6611 Ops/s | |
| test_reinforce_speed[True-None] | 1.3661ms | 1.3021ms | 768.0049 Ops/s | 779.0995 Ops/s | |
| test_reinforce_speed[True-backward] | 3.0165ms | 2.9189ms | 342.5951 Ops/s | 334.0483 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 0.4495s | 10.4038ms | 96.1190 Ops/s | 105.1991 Ops/s | |
| test_iql_speed[False-None] | 9.9153ms | 9.4222ms | 106.1324 Ops/s | 106.4492 Ops/s | |
| test_iql_speed[False-backward] | 13.4522ms | 13.0263ms | 76.7678 Ops/s | 76.7861 Ops/s | |
| test_iql_speed[True-None] | 2.3439ms | 2.1932ms | 455.9600 Ops/s | 454.2154 Ops/s | |
| test_iql_speed[True-backward] | 5.2153ms | 4.7256ms | 211.6154 Ops/s | 202.9520 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 18.0493ms | 10.6551ms | 93.8517 Ops/s | 95.8484 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1678ms | 5.9640ms | 167.6733 Ops/s | 164.3908 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6706ms | 0.3580ms | 2.7932 KOps/s | 2.9255 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6144ms | 0.3394ms | 2.9468 KOps/s | 3.4675 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1035ms | 5.8489ms | 170.9728 Ops/s | 170.9212 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9611ms | 0.3392ms | 2.9482 KOps/s | 3.1487 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5325ms | 0.3149ms | 3.1757 KOps/s | 3.1970 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6123ms | 1.3929ms | 717.9040 Ops/s | 701.5540 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6182ms | 1.3580ms | 736.3623 Ops/s | 740.4402 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.2620ms | 6.0443ms | 165.4454 Ops/s | 165.1610 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.1614ms | 0.4339ms | 2.3049 KOps/s | 2.1569 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6376ms | 0.4259ms | 2.3480 KOps/s | 2.4098 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.5349ms | 5.9593ms | 167.8062 Ops/s | 170.6044 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8285ms | 0.2958ms | 3.3811 KOps/s | 2.6766 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6003ms | 0.3686ms | 2.7132 KOps/s | 3.7825 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0955ms | 5.8280ms | 171.5866 Ops/s | 169.9176 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.6322ms | 0.3691ms | 2.7094 KOps/s | 2.7931 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5564ms | 0.3087ms | 3.2392 KOps/s | 3.4543 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1896ms | 6.0604ms | 165.0053 Ops/s | 164.2097 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.7859ms | 0.4850ms | 2.0619 KOps/s | 2.2363 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7483ms | 0.4543ms | 2.2014 KOps/s | 2.2924 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.6471s | 17.9532ms | 55.7004 Ops/s | 45.5248 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 8.4086ms | 2.0106ms | 497.3714 Ops/s | 480.0761 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 9.5092ms | 1.2887ms | 775.9905 Ops/s | 881.1802 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.9031ms | 5.0651ms | 197.4277 Ops/s | 194.0328 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 4.0521ms | 1.8014ms | 555.1236 Ops/s | 547.6284 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.0741ms | 0.9427ms | 1.0607 KOps/s | 754.6157 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5830s | 16.8482ms | 59.3536 Ops/s | 183.7769 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.0434ms | 1.9535ms | 511.9091 Ops/s | 471.6666 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.0980ms | 1.1179ms | 894.5477 Ops/s | 936.1466 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 39.0850ms | 36.5570ms | 27.3546 Ops/s | 27.2726 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.2790ms | 18.5952ms | 53.7774 Ops/s | 56.1309 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 41.9518ms | 38.2458ms | 26.1467 Ops/s | 26.6802 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.6261ms | 18.9494ms | 52.7722 Ops/s | 54.9574 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.8262ms | 39.5032ms | 25.3144 Ops/s | 25.7414 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 22.2125ms | 20.6756ms | 48.3661 Ops/s | 50.5554 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8988ms | 0.2275ms | 4.3964 KOps/s | 4.4413 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.6051ms | 1.4241ms | 702.1931 Ops/s | 695.4716 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.5232ms | 2.3238ms | 430.3246 Ops/s | 443.7768 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.0332ms | 2.8730ms | 348.0686 Ops/s | 343.9342 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2369ms | 0.1515ms | 6.6019 KOps/s | 6.7387 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3834ms | 0.2289ms | 4.3695 KOps/s | 4.8105 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.8739ms | 1.7112ms | 584.3739 Ops/s | 553.8458 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.4372ms | 1.2979ms | 770.5036 Ops/s | 742.7778 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2366ms | 1.1293ms | 885.4820 Ops/s | 876.9184 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.8202ms | 3.6273ms | 275.6876 Ops/s | 265.5702 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 5.7534ms | 5.6138ms | 178.1329 Ops/s | 175.8469 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.4796ms | 7.3433ms | 136.1785 Ops/s | 139.2154 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4804ms | 0.2853ms | 3.5050 KOps/s | 3.6836 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.5966ms | 1.4582ms | 685.7819 Ops/s | 652.7806 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.5538ms | 2.4259ms | 412.2105 Ops/s | 411.1285 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.4802ms | 3.0963ms | 322.9669 Ops/s | 320.3387 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 35.4567ms | 34.8759ms | 28.6731 Ops/s | 29.7289 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 68.7976ms | 67.1358ms | 14.8952 Ops/s | 15.0731 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 40.3648ms | 38.8413ms | 25.7458 Ops/s | 26.1545 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 77.5613ms | 75.7553ms | 13.2004 Ops/s | 13.3847 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 58.4168ms | 57.1454ms | 17.4992 Ops/s | 17.4924 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1166s | 0.1152s | 8.6777 Ops/s | 8.7698 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 60.8497ms | 59.9721ms | 16.6744 Ops/s | 16.9707 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1204s | 0.1188s | 8.4147 Ops/s | 8.4284 Ops/s |
Summary

Extend log_metrics() to accept TensorDict in addition to dict inputs. This is a follow-up to #3452.

Changes

- Add TensorDictBase to the metrics type signature (dict[str, Any] | TensorDictBase)
- Add _make_metrics_safe_tensordict() helper for TensorDict-specific handling
- Add keys_sep parameter to control how nested TensorDict keys are flattened (defaults to "/" for hierarchical metric names like "train/loss")
- Update WandbLogger and MLFlowLogger implementations accordingly

Benefits

This leverages TensorDict's efficient batch .to() method for CUDA→CPU transfers, which is more efficient than transferring tensors individually. TensorDict can transfer all its leaf tensors in a single optimized operation.

Example Usage
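
A minimal sketch of the key flattening that keys_sep controls, written in plain Python with nested dicts standing in for a TensorDict so that it runs without torch. The helper name flatten_metrics and the sample values are hypothetical and are not taken from the PR's implementation.

```python
from typing import Any, Dict, Mapping


def flatten_metrics(
    metrics: Mapping[str, Any], keys_sep: str = "/", prefix: str = ""
) -> Dict[str, Any]:
    """Flatten nested mappings into one level, joining keys with keys_sep.

    Hypothetical helper for illustration only; not the PR's code.
    """
    flat: Dict[str, Any] = {}
    for key, value in metrics.items():
        # Build the hierarchical name, e.g. "train" + "/" + "loss".
        name = f"{prefix}{keys_sep}{key}" if prefix else key
        if isinstance(value, Mapping):
            # Recurse into nested groups, carrying the joined prefix along.
            flat.update(flatten_metrics(value, keys_sep, name))
        else:
            flat[name] = value
    return flat


# Nested metrics, as they might be grouped before logging (made-up values).
nested = {"train": {"loss": 0.25, "reward": 1.5}, "step": 10}
flat = flatten_metrics(nested)
print(flat)
```

With an actual TensorDict, the Benefits section above suggests the CUDA→CPU move would happen as one batched .to("cpu") call on the whole container rather than once per leaf; this sketch covers only the naming step.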
Test plan
Made with Cursor