[Dev] Add prof instrumentation to collector and env pipeline #3460
vmoens wants to merge 2 commits into gh/vmoens/221/base
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3460
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures
As of commit f6255e8 with merge base ab49b59:
NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.

| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |

Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).
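
The prefix-to-label mapping in the table above can be sketched as a small lookup. This is only an illustration, not the labeling bot's actual implementation: the function name `label_for_title` and the case-insensitive matching are assumptions, and only the prefixes and labels themselves come from the table.

```python
import re

# Prefix -> label pairs copied from the table above.
# Keys are lowercased; matching case-insensitively is an assumption.
PREFIX_TO_LABEL = {
    "bugfix": "BugFix",
    "feature": "Feature",
    "doc": "Documentation",
    "docs": "Documentation",
    "refactor": "Refactoring",
    "ci": "CI",
    "test": "Tests",
    "tests": "Tests",
    "environment": "Environments",
    "environments": "Environments",
    "data": "Data",
    "performance": "Performance",
    "perf": "Performance",
    "bc-breaking": "bc breaking",
    "deprecation": "Deprecation",
}


def label_for_title(title):
    """Return the label for a PR title's leading [Prefix], or None.

    Hypothetical helper: it only looks at the first bracketed token.
    """
    match = re.match(r"\s*\[([^\]]+)\]", title)
    if match is None:
        return None
    return PREFIX_TO_LABEL.get(match.group(1).strip().lower())
```

Under these assumptions, a title such as `[Dev] Add prof instrumentation to collector and env pipeline` yields `None`, since `[Dev]` is not one of the listed prefixes.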

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 84.6248μs | 81.5414μs | 12.2637 KOps/s | 12.5764 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1352ms | 0.1341ms | 7.4575 KOps/s | 7.3637 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1017s | 0.1016s | 9.8473 Ops/s | 9.7982 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.4875μs | 2.4706μs | 404.7529 KOps/s | 402.9455 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 36.0586μs | 35.8398μs | 27.9019 KOps/s | 26.1796 KOps/s | |
| test_simple | 0.5318s | 0.5315s | 1.8816 Ops/s | 1.7890 Ops/s | |
| test_transformed | 1.1112s | 1.1057s | 0.9044 Ops/s | 0.8830 Ops/s | |
| test_serial | 1.6612s | 1.6457s | 0.6076 Ops/s | 0.6050 Ops/s | |
| test_parallel | 1.1190s | 1.0341s | 0.9670 Ops/s | 0.9663 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3230ms | 43.6430μs | 22.9132 KOps/s | 23.6818 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 71.2510μs | 24.4062μs | 40.9732 KOps/s | 41.2652 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 59.2110μs | 24.3700μs | 41.0341 KOps/s | 40.7934 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 57.0110μs | 13.4700μs | 74.2390 KOps/s | 74.6363 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 90.6620μs | 46.7076μs | 21.4098 KOps/s | 21.4293 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 0.4276ms | 26.3141μs | 38.0025 KOps/s | 36.4726 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 62.7710μs | 26.6723μs | 37.4921 KOps/s | 36.5225 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 50.2600μs | 16.0048μs | 62.4813 KOps/s | 61.4347 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.4565ms | 49.9474μs | 20.0211 KOps/s | 20.6962 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 0.4495ms | 29.5824μs | 33.8039 KOps/s | 33.2363 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 60.2510μs | 26.8238μs | 37.2804 KOps/s | 36.5359 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 0.4199ms | 16.0729μs | 62.2164 KOps/s | 61.4191 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.4580ms | 50.5517μs | 19.7817 KOps/s | 19.1460 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 76.1710μs | 32.0596μs | 31.1919 KOps/s | 30.7226 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 0.4471ms | 29.0867μs | 34.3799 KOps/s | 33.7895 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 0.4336ms | 18.4698μs | 54.1423 KOps/s | 52.7354 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 85.6620μs | 48.8218μs | 20.4827 KOps/s | 20.2628 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 0.4463ms | 28.8011μs | 34.7209 KOps/s | 33.0727 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4411ms | 30.7285μs | 32.5431 KOps/s | 31.6969 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 50.3700μs | 17.6694μs | 56.5950 KOps/s | 55.3241 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.4733ms | 50.9016μs | 19.6458 KOps/s | 19.0567 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.4501ms | 31.6306μs | 31.6150 KOps/s | 30.3970 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 0.4557ms | 32.7955μs | 30.4920 KOps/s | 29.6992 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 49.3210μs | 20.0445μs | 49.8889 KOps/s | 48.5451 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 0.1179ms | 54.3610μs | 18.3956 KOps/s | 18.4383 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 0.4645ms | 34.4826μs | 29.0001 KOps/s | 27.8861 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 0.4532ms | 33.5313μs | 29.8229 KOps/s | 29.4718 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 50.2610μs | 19.9332μs | 50.1675 KOps/s | 48.0809 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.4738ms | 55.0181μs | 18.1759 KOps/s | 17.7527 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 0.4528ms | 37.0375μs | 26.9997 KOps/s | 26.3465 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.4501ms | 35.4440μs | 28.2135 KOps/s | 28.0175 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 0.4378ms | 22.2736μs | 44.8961 KOps/s | 43.1458 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7243s | 0.7213s | 1.3864 Ops/s | 1.3366 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7109s | 0.6140s | 1.6286 Ops/s | 1.6200 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7083s | 1.6305s | 0.6133 Ops/s | 0.6102 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4956s | 1.4146s | 0.7069 Ops/s | 0.7056 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9592s | 1.8790s | 0.5322 Ops/s | 0.5326 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7429s | 1.6594s | 0.6026 Ops/s | 0.6030 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7606s | 4.6198s | 0.2165 Ops/s | 0.2190 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5119s | 4.3859s | 0.2280 Ops/s | 0.2290 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9224s | 1.8527s | 0.5398 Ops/s | 0.5355 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7095s | 1.5995s | 0.6252 Ops/s | 0.6193 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 9.7326ms | 9.5992ms | 104.1757 Ops/s | 103.9089 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 20.3402ms | 17.6689ms | 56.5968 Ops/s | 56.2144 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1960ms | 0.1226ms | 8.1566 KOps/s | 7.8790 KOps/s | |
| test_values[td1_return_estimate-False-False] | 25.7746ms | 25.4236ms | 39.3335 Ops/s | 37.9966 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.6860ms | 17.7122ms | 56.4584 Ops/s | 55.9354 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 38.3413ms | 37.7221ms | 26.5097 Ops/s | 25.9332 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 17.8889ms | 17.6163ms | 56.7656 Ops/s | 55.9142 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.5279ms | 8.4397ms | 118.4875 Ops/s | 116.9571 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.8586ms | 1.4823ms | 674.6249 Ops/s | 681.4200 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4895ms | 0.4088ms | 2.4464 KOps/s | 2.4760 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.0739ms | 34.5295ms | 28.9608 Ops/s | 28.8732 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1.8146ms | 1.6893ms | 591.9535 Ops/s | 586.3951 Ops/s | |
| test_dqn_speed[False-None] | 1.7768ms | 1.3628ms | 733.7944 Ops/s | 750.5548 Ops/s | |
| test_dqn_speed[False-backward] | 1.9295ms | 1.8700ms | 534.7686 Ops/s | 543.6064 Ops/s | |
| test_dqn_speed[True-None] | 0.9549ms | 0.5470ms | 1.8281 KOps/s | 1.8506 KOps/s | |
| test_dqn_speed[True-backward] | 1.1309ms | 0.9915ms | 1.0086 KOps/s | 1.0007 KOps/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9453ms | 0.5286ms | 1.8917 KOps/s | 1.8611 KOps/s | |
| test_ddpg_speed[False-None] | 3.1472ms | 2.7369ms | 365.3708 Ops/s | 359.7313 Ops/s | |
| test_ddpg_speed[False-backward] | 3.9784ms | 3.8829ms | 257.5424 Ops/s | 254.4025 Ops/s | |
| test_ddpg_speed[True-None] | 1.7651ms | 1.3876ms | 720.6633 Ops/s | 713.0627 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4188ms | 2.3563ms | 424.3965 Ops/s | 342.2545 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.5071ms | 1.3759ms | 726.7914 Ops/s | 715.0496 Ops/s | |
| test_sac_speed[False-None] | 8.4026ms | 7.7170ms | 129.5845 Ops/s | 129.1272 Ops/s | |
| test_sac_speed[False-backward] | 13.7330ms | 11.3633ms | 88.0025 Ops/s | 91.8738 Ops/s | |
| test_sac_speed[True-None] | 2.4864ms | 2.1288ms | 469.7570 Ops/s | 465.0575 Ops/s | |
| test_sac_speed[True-backward] | 4.1002ms | 4.0151ms | 249.0604 Ops/s | 225.9870 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.5274ms | 2.1224ms | 471.1573 Ops/s | 464.8601 Ops/s | |
| test_redq_speed[False-None] | 14.8271ms | 10.4647ms | 95.5591 Ops/s | 92.8562 Ops/s | |
| test_redq_speed[False-backward] | 21.6681ms | 17.8100ms | 56.1482 Ops/s | 57.3823 Ops/s | |
| test_redq_speed[True-None] | 5.4550ms | 4.4541ms | 224.5099 Ops/s | 220.9189 Ops/s | |
| test_redq_speed[True-backward] | 10.7969ms | 9.8929ms | 101.0822 Ops/s | 103.1178 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.8816ms | 4.4926ms | 222.5878 Ops/s | 221.7896 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.4786ms | 10.8596ms | 92.0845 Ops/s | 91.2367 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.2362ms | 15.5415ms | 64.3438 Ops/s | 63.0615 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.1240ms | 3.7339ms | 267.8193 Ops/s | 272.1730 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.8693ms | 7.6579ms | 130.5847 Ops/s | 128.0118 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.1236ms | 3.6556ms | 273.5517 Ops/s | 267.7118 Ops/s | |
| test_td3_speed[False-None] | 7.9966ms | 7.7419ms | 129.1671 Ops/s | 128.3804 Ops/s | |
| test_td3_speed[False-backward] | 11.2129ms | 10.5902ms | 94.4268 Ops/s | 93.7976 Ops/s | |
| test_td3_speed[True-None] | 1.8953ms | 1.8324ms | 545.7187 Ops/s | 543.7179 Ops/s | |
| test_td3_speed[True-backward] | 3.7492ms | 3.6529ms | 273.7559 Ops/s | 261.8990 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8953ms | 1.8177ms | 550.1487 Ops/s | 555.1957 Ops/s | |
| test_cql_speed[False-None] | 29.0409ms | 25.6144ms | 39.0405 Ops/s | 39.0357 Ops/s | |
| test_cql_speed[False-backward] | 35.0728ms | 34.4731ms | 29.0081 Ops/s | 28.9179 Ops/s | |
| test_cql_speed[True-None] | 12.7721ms | 12.4975ms | 80.0158 Ops/s | 78.9217 Ops/s | |
| test_cql_speed[True-backward] | 19.0844ms | 18.5666ms | 53.8602 Ops/s | 55.3680 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 13.1605ms | 12.5190ms | 79.8786 Ops/s | 80.3127 Ops/s | |
| test_a2c_speed[False-None] | 5.6753ms | 5.3324ms | 187.5327 Ops/s | 188.1961 Ops/s | |
| test_a2c_speed[False-backward] | 11.9718ms | 11.6655ms | 85.7232 Ops/s | 85.5932 Ops/s | |
| test_a2c_speed[True-None] | 4.1609ms | 3.7250ms | 268.4566 Ops/s | 264.8699 Ops/s | |
| test_a2c_speed[True-backward] | 9.1055ms | 8.5993ms | 116.2892 Ops/s | 111.6856 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.1237ms | 3.7155ms | 269.1425 Ops/s | 264.5837 Ops/s | |
| test_ppo_speed[False-None] | 6.2658ms | 5.8145ms | 171.9824 Ops/s | 169.4412 Ops/s | |
| test_ppo_speed[False-backward] | 12.8434ms | 12.2776ms | 81.4494 Ops/s | 79.9747 Ops/s | |
| test_ppo_speed[True-None] | 4.0411ms | 3.6285ms | 275.5965 Ops/s | 265.8247 Ops/s | |
| test_ppo_speed[True-backward] | 8.6326ms | 8.4239ms | 118.7105 Ops/s | 110.6979 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 3.7437ms | 3.6029ms | 277.5553 Ops/s | 267.1054 Ops/s | |
| test_reinforce_speed[False-None] | 4.8726ms | 4.3941ms | 227.5754 Ops/s | 218.2248 Ops/s | |
| test_reinforce_speed[False-backward] | 7.7238ms | 7.2798ms | 137.3659 Ops/s | 135.2741 Ops/s | |
| test_reinforce_speed[True-None] | 3.2769ms | 2.8702ms | 348.4030 Ops/s | 365.2633 Ops/s | |
| test_reinforce_speed[True-backward] | 8.3101ms | 7.8554ms | 127.3006 Ops/s | 117.1101 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.0777ms | 2.8908ms | 345.9258 Ops/s | 365.6211 Ops/s | |
| test_iql_speed[False-None] | 20.2183ms | 19.5943ms | 51.0354 Ops/s | 51.0153 Ops/s | |
| test_iql_speed[False-backward] | 30.9401ms | 30.1818ms | 33.1325 Ops/s | 34.0072 Ops/s | |
| test_iql_speed[True-None] | 8.9464ms | 8.5673ms | 116.7223 Ops/s | 122.9246 Ops/s | |
| test_iql_speed[True-backward] | 17.2520ms | 16.8471ms | 59.3573 Ops/s | 64.4971 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 11.5946ms | 8.8047ms | 113.5753 Ops/s | 116.3881 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0270ms | 5.9045ms | 169.3635 Ops/s | 168.7647 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.9302ms | 0.3177ms | 3.1478 KOps/s | 3.0917 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6954ms | 0.3227ms | 3.0992 KOps/s | 3.4521 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.8962ms | 5.6760ms | 176.1809 Ops/s | 176.6498 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1362ms | 0.3190ms | 3.1345 KOps/s | 3.2649 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7293ms | 0.3039ms | 3.2900 KOps/s | 3.4304 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6611ms | 1.3401ms | 746.2124 Ops/s | 821.6556 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5117ms | 1.2727ms | 785.7608 Ops/s | 884.8158 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 11.6427ms | 5.9407ms | 168.3300 Ops/s | 173.1900 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2187ms | 0.4461ms | 2.2417 KOps/s | 2.3645 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8325ms | 0.4289ms | 2.3313 KOps/s | 2.4705 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.8338ms | 5.6727ms | 176.2835 Ops/s | 175.1499 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.5517ms | 0.3095ms | 3.2313 KOps/s | 3.6141 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4897ms | 0.3052ms | 3.2769 KOps/s | 3.8199 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9929ms | 5.6619ms | 176.6184 Ops/s | 178.6517 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.7438ms | 0.3045ms | 3.2845 KOps/s | 3.4059 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5445ms | 0.3068ms | 3.2600 KOps/s | 3.6117 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1292ms | 5.8051ms | 172.2616 Ops/s | 173.5814 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8656ms | 0.4658ms | 2.1471 KOps/s | 1.8133 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6753ms | 0.4602ms | 2.1728 KOps/s | 2.3003 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.3353ms | 4.8498ms | 206.1921 Ops/s | 58.7136 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 4.8688ms | 2.1045ms | 475.1610 Ops/s | 552.5213 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 2.1006ms | 1.0983ms | 910.4591 Ops/s | 1.1438 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.5402s | 15.7294ms | 63.5754 Ops/s | 198.4478 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.8605ms | 1.7220ms | 580.7229 Ops/s | 556.1321 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.1339ms | 0.8621ms | 1.1599 KOps/s | 1.1991 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 9.4185ms | 5.2004ms | 192.2920 Ops/s | 60.6735 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.1307ms | 2.0124ms | 496.9072 Ops/s | 491.9592 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.2128ms | 1.1699ms | 854.7587 Ops/s | 971.3924 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 39.9036ms | 35.2482ms | 28.3703 Ops/s | 28.5568 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.4651ms | 17.9853ms | 55.6011 Ops/s | 58.1745 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.2742ms | 36.1097ms | 27.6934 Ops/s | 27.1873 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.2913ms | 18.0820ms | 55.3035 Ops/s | 56.9282 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 39.7742ms | 37.8184ms | 26.4422 Ops/s | 26.3306 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 20.9895ms | 19.5404ms | 51.1760 Ops/s | 52.8472 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.9415ms | 0.2250ms | 4.4441 KOps/s | 4.5637 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7068ms | 1.3981ms | 715.2430 Ops/s | 715.1571 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.4694ms | 2.3022ms | 434.3747 Ops/s | 412.8749 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1125ms | 2.9423ms | 339.8720 Ops/s | 339.8553 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2125ms | 0.1309ms | 7.6419 KOps/s | 7.6688 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.5239ms | 0.1832ms | 5.4591 KOps/s | 5.6259 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.8678ms | 1.7308ms | 577.7689 Ops/s | 567.1466 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.4312ms | 1.3058ms | 765.7959 Ops/s | 770.4438 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2049ms | 1.0906ms | 916.9170 Ops/s | 910.8727 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.7239ms | 3.4796ms | 287.3871 Ops/s | 280.4412 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.2282ms | 5.6958ms | 175.5666 Ops/s | 176.1628 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.5064ms | 7.1816ms | 139.2444 Ops/s | 140.1090 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4230ms | 0.2687ms | 3.7219 KOps/s | 3.7183 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6914ms | 1.5146ms | 660.2350 Ops/s | 665.7389 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.7854ms | 2.4160ms | 413.9148 Ops/s | 391.1416 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3336ms | 3.1437ms | 318.0957 Ops/s | 317.4713 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.8531ms | 33.6292ms | 29.7361 Ops/s | 29.7201 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 67.7709ms | 65.9578ms | 15.1612 Ops/s | 15.1176 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 41.9326ms | 41.3000ms | 24.2131 Ops/s | 24.1549 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 81.3549ms | 80.5838ms | 12.4094 Ops/s | 12.3323 Ops/s |
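
For reading the table above: the Ops column is consistent with the reciprocal of the Mean column (for example, a mean of 81.5414μs corresponds to roughly 12.26 KOps/s). The Change column is empty in this extract, so the sketch below shows one plausible way to compare Ops against Ops on Repo HEAD. The relative-change formula and both function names are assumptions, not something the benchmark report states.

```python
def ops_from_mean(mean_seconds):
    """Operations per second implied by a mean duration in seconds."""
    return 1.0 / mean_seconds


def relative_change(ops, ops_head):
    """Relative difference of this PR's Ops versus Ops on Repo HEAD.

    Assumed formula; a positive value means the PR is faster.
    """
    return (ops - ops_head) / ops_head


# Values taken from the test_tensor_to_bytestream_speed[pickle] row.
mean_s = 81.5414e-6
ops = ops_from_mean(mean_s)
change = relative_change(12.2637e3, 12.5764e3)
print(f"{ops / 1e3:.2f} KOps/s, change {change:+.1%}")
```

Differences of a few percent, as in this row, are common run-to-run variation on shared CI machines, so only the larger gaps in the table are likely to be meaningful.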

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 81.4490μs | 80.0188μs | 12.4971 KOps/s | 12.4697 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1392ms | 0.1389ms | 7.2019 KOps/s | 7.1946 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1187s | 0.1183s | 8.4515 Ops/s | 8.9782 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5633μs | 2.5593μs | 390.7250 KOps/s | 379.4445 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 39.0442μs | 37.9473μs | 26.3524 KOps/s | 26.6239 KOps/s | |
| test_simple | 0.7956s | 0.7951s | 1.2578 Ops/s | 1.2230 Ops/s | |
| test_transformed | 1.5428s | 1.4495s | 0.6899 Ops/s | 0.6851 Ops/s | |
| test_serial | 2.4179s | 2.3229s | 0.4305 Ops/s | 0.4294 Ops/s | |
| test_parallel | 1.9898s | 1.8565s | 0.5387 Ops/s | 0.5483 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.4685ms | 44.9970μs | 22.2237 KOps/s | 22.3329 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 0.4420ms | 25.5265μs | 39.1749 KOps/s | 40.0200 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 69.5710μs | 25.5112μs | 39.1984 KOps/s | 40.2275 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 41.3610μs | 13.8650μs | 72.1238 KOps/s | 72.6020 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.4711ms | 47.6677μs | 20.9785 KOps/s | 21.0194 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 58.7410μs | 27.7805μs | 35.9964 KOps/s | 36.3321 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 0.4526ms | 27.8667μs | 35.8851 KOps/s | 36.2459 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 52.2910μs | 16.6355μs | 60.1122 KOps/s | 59.9976 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.4700ms | 50.6728μs | 19.7345 KOps/s | 19.6070 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 78.0920μs | 30.5602μs | 32.7223 KOps/s | 32.9291 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 0.4534ms | 27.6485μs | 36.1683 KOps/s | 36.4781 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 0.4455ms | 16.7202μs | 59.8079 KOps/s | 59.9162 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.4819ms | 52.7450μs | 18.9592 KOps/s | 18.8435 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 68.8410μs | 33.5701μs | 29.7884 KOps/s | 30.2972 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 0.4493ms | 30.5343μs | 32.7500 KOps/s | 33.2864 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 0.4357ms | 19.0800μs | 52.4108 KOps/s | 51.0936 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 80.4510μs | 50.5811μs | 19.7702 KOps/s | 19.7201 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 0.4555ms | 30.5293μs | 32.7554 KOps/s | 32.4485 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.3433ms | 32.1120μs | 31.1410 KOps/s | 31.9702 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 0.4544ms | 18.3230μs | 54.5763 KOps/s | 55.3552 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 85.8120μs | 53.3977μs | 18.7274 KOps/s | 19.2241 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.4480ms | 32.8054μs | 30.4828 KOps/s | 30.0844 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 0.4477ms | 34.5519μs | 28.9420 KOps/s | 29.3108 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 47.8310μs | 20.5947μs | 48.5562 KOps/s | 47.3744 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 91.8520μs | 55.5599μs | 17.9986 KOps/s | 17.8513 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 0.4552ms | 35.7916μs | 27.9395 KOps/s | 27.7980 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 0.4472ms | 33.6663μs | 29.7033 KOps/s | 29.6369 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 0.4363ms | 20.9901μs | 47.6415 KOps/s | 47.8900 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 98.6110μs | 58.0429μs | 17.2286 KOps/s | 17.3977 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 0.4488ms | 38.4546μs | 26.0047 KOps/s | 25.6806 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.4490ms | 35.8633μs | 27.8837 KOps/s | 27.2797 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 0.4328ms | 23.1503μs | 43.1960 KOps/s | 42.4992 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8740s | 0.7749s | 1.2905 Ops/s | 1.2845 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7306s | 0.6339s | 1.5775 Ops/s | 1.5666 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7722s | 1.6970s | 0.5893 Ops/s | 0.5876 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5457s | 1.4648s | 0.6827 Ops/s | 0.6792 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0230s | 1.9451s | 0.5141 Ops/s | 0.5120 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.8011s | 1.7213s | 0.5809 Ops/s | 0.5792 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7386s | 4.6424s | 0.2154 Ops/s | 0.2124 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5849s | 4.4523s | 0.2246 Ops/s | 0.2214 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0215s | 1.9205s | 0.5207 Ops/s | 0.5227 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7224s | 1.6641s | 0.6009 Ops/s | 0.6014 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 21.0885ms | 20.7477ms | 48.1982 Ops/s | 47.7899 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1564s | 4.0477ms | 247.0565 Ops/s | 285.3646 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1086ms | 82.6738μs | 12.0957 KOps/s | 12.0341 KOps/s | |
| test_values[td1_return_estimate-False-False] | 49.5077ms | 49.0265ms | 20.3971 Ops/s | 20.2622 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3310ms | 1.0878ms | 919.3035 Ops/s | 916.2214 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 80.6227ms | 80.2832ms | 12.4559 Ops/s | 12.3955 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3377ms | 1.0873ms | 919.7033 Ops/s | 920.4723 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 21.2615ms | 20.9485ms | 47.7361 Ops/s | 44.4422 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0357ms | 0.7566ms | 1.3216 KOps/s | 1.3160 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7347ms | 0.6819ms | 1.4664 KOps/s | 1.4663 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5314ms | 1.4929ms | 669.8357 Ops/s | 668.7844 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7453ms | 0.6974ms | 1.4340 KOps/s | 1.4355 KOps/s | |
| test_dqn_speed[False-None] | 1.7585ms | 1.5313ms | 653.0347 Ops/s | 648.0240 Ops/s | |
| test_dqn_speed[False-backward] | 2.4809ms | 2.1934ms | 455.9087 Ops/s | 454.0925 Ops/s | |
| test_dqn_speed[True-None] | 0.6419ms | 0.5658ms | 1.7673 KOps/s | 1.7314 KOps/s | |
| test_dqn_speed[True-backward] | 1.2755ms | 1.2221ms | 818.2879 Ops/s | 881.6167 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7262ms | 0.6046ms | 1.6539 KOps/s | 1.5748 KOps/s | |
| test_ddpg_speed[False-None] | 3.3084ms | 2.9112ms | 343.5053 Ops/s | 342.3773 Ops/s | |
| test_ddpg_speed[False-backward] | 4.7428ms | 4.3359ms | 230.6327 Ops/s | 238.1091 Ops/s | |
| test_ddpg_speed[True-None] | 1.4163ms | 1.3286ms | 752.6538 Ops/s | 744.5261 Ops/s | |
| test_ddpg_speed[True-backward] | 2.6179ms | 2.5402ms | 393.6739 Ops/s | 411.7870 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.6049ms | 1.3896ms | 719.6507 Ops/s | 729.7472 Ops/s | |
| test_sac_speed[False-None] | 9.5821ms | 8.6046ms | 116.2168 Ops/s | 120.0762 Ops/s | |
| test_sac_speed[False-backward] | 12.2406ms | 11.7604ms | 85.0314 Ops/s | 87.9814 Ops/s | |
| test_sac_speed[True-None] | 1.9051ms | 1.8303ms | 546.3476 Ops/s | 536.7360 Ops/s | |
| test_sac_speed[True-backward] | 3.7011ms | 3.6128ms | 276.7935 Ops/s | 271.7592 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 18.9903ms | 10.8648ms | 92.0406 Ops/s | 83.1571 Ops/s | |
| test_redq_deprec_speed[False-None] | 9.8103ms | 9.3432ms | 107.0298 Ops/s | 105.7311 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.1944ms | 12.7347ms | 78.5259 Ops/s | 77.3996 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6218ms | 2.5547ms | 391.4349 Ops/s | 389.5513 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.3586ms | 4.3136ms | 231.8257 Ops/s | 233.2146 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 17.0566ms | 10.0244ms | 99.7568 Ops/s | 101.2905 Ops/s | |
| test_td3_speed[False-None] | 8.4786ms | 8.2344ms | 121.4418 Ops/s | 121.4005 Ops/s | |
| test_td3_speed[False-backward] | 11.6273ms | 10.9023ms | 91.7242 Ops/s | 93.6924 Ops/s | |
| test_td3_speed[True-None] | 1.6796ms | 1.6543ms | 604.4866 Ops/s | 569.0958 Ops/s | |
| test_td3_speed[True-backward] | 3.7177ms | 3.2669ms | 306.0965 Ops/s | 312.5064 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 71.3106ms | 24.9176ms | 40.1322 Ops/s | 40.4350 Ops/s | |
| test_cql_speed[False-None] | 17.7372ms | 17.2723ms | 57.8960 Ops/s | 57.5102 Ops/s | |
| test_cql_speed[False-backward] | 23.5113ms | 22.9249ms | 43.6207 Ops/s | 43.3351 Ops/s | |
| test_cql_speed[True-None] | 3.4228ms | 3.2735ms | 305.4848 Ops/s | 300.6772 Ops/s | |
| test_cql_speed[True-backward] | 5.8227ms | 5.3899ms | 185.5336 Ops/s | 175.8178 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 19.0159ms | 11.8752ms | 84.2093 Ops/s | 83.1387 Ops/s | |
| test_a2c_speed[False-None] | 3.9549ms | 3.2510ms | 307.6003 Ops/s | 305.1354 Ops/s | |
| test_a2c_speed[False-backward] | 6.6507ms | 6.1829ms | 161.7374 Ops/s | 154.9932 Ops/s | |
| test_a2c_speed[True-None] | 1.4207ms | 1.3257ms | 754.3050 Ops/s | 745.2296 Ops/s | |
| test_a2c_speed[True-backward] | 3.0370ms | 2.9866ms | 334.8335 Ops/s | 332.0253 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.2053ms | 1.0044ms | 995.6228 Ops/s | 995.7738 Ops/s | |
| test_ppo_speed[False-None] | 3.9745ms | 3.8597ms | 259.0855 Ops/s | 257.9643 Ops/s | |
| test_ppo_speed[False-backward] | 7.5392ms | 7.0455ms | 141.9348 Ops/s | 143.1049 Ops/s | |
| test_ppo_speed[True-None] | 1.6190ms | 1.4440ms | 692.5252 Ops/s | 697.0942 Ops/s | |
| test_ppo_speed[True-backward] | 3.1479ms | 3.0916ms | 323.4615 Ops/s | 314.8874 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1586ms | 1.0573ms | 945.7739 Ops/s | 930.6005 Ops/s | |
| test_reinforce_speed[False-None] | 2.4947ms | 2.3146ms | 432.0494 Ops/s | 420.5286 Ops/s | |
| test_reinforce_speed[False-backward] | 3.8896ms | 3.3923ms | 294.7870 Ops/s | 284.4410 Ops/s | |
| test_reinforce_speed[True-None] | 1.4155ms | 1.2994ms | 769.5593 Ops/s | 762.9844 Ops/s | |
| test_reinforce_speed[True-backward] | 2.9802ms | 2.9248ms | 341.9004 Ops/s | 323.1620 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 17.6479ms | 9.5677ms | 104.5184 Ops/s | 106.0571 Ops/s | |
| test_iql_speed[False-None] | 10.2711ms | 9.4707ms | 105.5884 Ops/s | 105.5303 Ops/s | |
| test_iql_speed[False-backward] | 13.7412ms | 13.1621ms | 75.9760 Ops/s | 75.1504 Ops/s | |
| test_iql_speed[True-None] | 2.5161ms | 2.1973ms | 455.1091 Ops/s | 450.1519 Ops/s | |
| test_iql_speed[True-backward] | 5.2463ms | 4.7707ms | 209.6120 Ops/s | 199.2707 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 17.8835ms | 10.6186ms | 94.1747 Ops/s | 95.5738 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1182ms | 5.9419ms | 168.2955 Ops/s | 166.0902 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9524ms | 0.3871ms | 2.5836 KOps/s | 2.6191 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5895ms | 0.3456ms | 2.8938 KOps/s | 2.7407 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9531ms | 5.6669ms | 176.4641 Ops/s | 170.0741 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8825ms | 0.3553ms | 2.8147 KOps/s | 3.3028 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5103ms | 0.2654ms | 3.7684 KOps/s | 3.5235 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5612ms | 1.3046ms | 766.4943 Ops/s | 769.4766 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4179ms | 1.2135ms | 824.0596 Ops/s | 817.3065 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1774ms | 5.9280ms | 168.6909 Ops/s | 166.3451 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.6888ms | 0.4401ms | 2.2723 KOps/s | 2.2932 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7353ms | 0.4815ms | 2.0770 KOps/s | 1.9537 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9320ms | 5.8723ms | 170.2912 Ops/s | 169.4486 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0779ms | 0.3592ms | 2.7841 KOps/s | 2.7848 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6771ms | 0.3898ms | 2.5657 KOps/s | 3.7184 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0529ms | 5.8450ms | 171.0850 Ops/s | 171.3912 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9427ms | 0.3057ms | 3.2715 KOps/s | 3.0676 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4490ms | 0.2834ms | 3.5288 KOps/s | 3.1785 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.0330ms | 5.9277ms | 168.6995 Ops/s | 167.5567 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.3054ms | 0.5361ms | 1.8653 KOps/s | 1.9689 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6921ms | 0.5004ms | 1.9984 KOps/s | 1.7867 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.5932s | 16.8224ms | 59.4446 Ops/s | 49.2580 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 8.3831ms | 1.9815ms | 504.6799 Ops/s | 495.8515 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 10.5857ms | 1.3159ms | 759.9632 Ops/s | 733.9749 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 8.1701ms | 5.1387ms | 194.6033 Ops/s | 194.7096 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 9.4740ms | 1.9933ms | 501.6723 Ops/s | 508.3236 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 9.2782ms | 1.3244ms | 755.0719 Ops/s | 1.0455 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 7.0094ms | 5.3388ms | 187.3096 Ops/s | 186.6065 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.0857ms | 2.0078ms | 498.0470 Ops/s | 481.8271 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.0521ms | 1.1874ms | 842.1986 Ops/s | 843.8022 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 38.2437ms | 36.0929ms | 27.7063 Ops/s | 27.3395 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.6715ms | 18.2702ms | 54.7339 Ops/s | 53.2321 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.5165ms | 37.4408ms | 26.7089 Ops/s | 26.4473 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.1564ms | 18.6944ms | 53.4920 Ops/s | 52.8858 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.0666ms | 39.1829ms | 25.5213 Ops/s | 25.1231 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.7409ms | 20.2620ms | 49.3535 Ops/s | 48.4858 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8776ms | 0.2257ms | 4.4303 KOps/s | 4.4138 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7511ms | 1.4175ms | 705.4513 Ops/s | 700.6929 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7141ms | 2.3445ms | 426.5299 Ops/s | 431.1534 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1524ms | 2.9554ms | 338.3616 Ops/s | 337.3370 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2499ms | 0.1674ms | 5.9732 KOps/s | 6.0800 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3898ms | 0.2313ms | 4.3238 KOps/s | 4.5662 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.2411ms | 1.8881ms | 529.6465 Ops/s | 557.1878 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5318ms | 1.3815ms | 723.8481 Ops/s | 711.7561 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2089ms | 1.1574ms | 863.9934 Ops/s | 863.0531 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 7.6864ms | 3.6767ms | 271.9827 Ops/s | 271.7038 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.1709ms | 5.9502ms | 168.0625 Ops/s | 170.2686 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 15.1455ms | 7.2091ms | 138.7133 Ops/s | 139.1262 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4308ms | 0.2725ms | 3.6697 KOps/s | 3.6617 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6644ms | 1.5185ms | 658.5238 Ops/s | 650.7790 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.9710ms | 2.4775ms | 403.6405 Ops/s | 410.8720 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.5717ms | 3.1591ms | 316.5484 Ops/s | 315.9580 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 35.7852ms | 35.1291ms | 28.4664 Ops/s | 28.4315 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 70.0662ms | 69.1257ms | 14.4664 Ops/s | 14.5834 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 44.2394ms | 43.1711ms | 23.1636 Ops/s | 23.4587 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 86.0971ms | 84.3683ms | 11.8528 Ops/s | 11.8358 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 59.1841ms | 57.4382ms | 17.4100 Ops/s | 17.4483 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1179s | 0.1149s | 8.7058 Ops/s | 8.7816 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 64.4858ms | 62.8472ms | 15.9116 Ops/s | 15.9826 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1271s | 0.1254s | 7.9774 Ops/s | 8.0232 Ops/s |
Closing: prof instrumentation not needed in the final stack.
Stack from ghstack (oldest at bottom):
Add optional, near-zero-cost profiling instrumentation using the prof library (conditionally imported, no hard dependency). When prof is not installed or not initialised, every _prof_ctx() call returns contextlib.nullcontext(), so the instrumented code paths pay only the cost of a no-op context manager.
Instrumented phases:
- penv.send_commands, penv.wait_for_workers, penv.read_outputs, penv.sync_w2m
- worker.cuda_sync, worker.signal_done
- collector.to_device, collector.stack_results
- worker.share_memory
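To illustrate what wrapping named phases yields, here is a rough sketch that uses a stand-in recorder rather than the prof library. The `phase()` helper, `phase_totals`, and the `step()` function are hypothetical and exist only for this example; only the phase names come from the list above.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Stand-in recorder, NOT the prof library: wall time accumulated per phase.
phase_totals = defaultdict(float)


@contextmanager
def phase(name):
    """Time the enclosed block and add the elapsed seconds to phase_totals."""
    start = time.perf_counter()
    try:
        yield
    finally:
        phase_totals[name] += time.perf_counter() - start


def step(commands):
    """Hypothetical parallel-env step, split into three of the named phases."""
    with phase("penv.send_commands"):
        sent = list(commands)
    with phase("penv.wait_for_workers"):
        time.sleep(0.001)  # placeholder for blocking on worker processes
    with phase("penv.read_outputs"):
        outputs = [command.upper() for command in sent]
    return outputs


outputs = step(["reset", "step"])
for name, seconds in phase_totals.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Keying the totals by phase name is what makes it possible to see whether time goes to sending commands, waiting on workers, or reading outputs.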
Workers receive prof_shm_name from the parent process and call
prof.prepare() to join the distributed profiling session.
Co-authored-by: Cursor [email protected]