[CI] Separate GPU and CPU tests with pytest markers #3404
Merged
vmoens merged 6 commits into gh/vmoens/207/base on Jan 29, 2026
vmoens added a commit that referenced this pull request on Jan 28, 2026

Add pytest.mark.gpu to tests that require CUDA, and update run_all.sh to filter tests based on whether running on GPU or CPU machines.

Changes:
- Register 'gpu' marker in pytest.ini and conftest.py
- Add pytest.mark.gpu to ~30 tests that explicitly require CUDA
- Update run_all.sh to use GPU_MARKER_FILTER:
  - GPU jobs (CU_VERSION != cpu): run only pytest.mark.gpu tests
  - CPU jobs (CU_VERSION = cpu): run all tests except pytest.mark.gpu

This significantly reduces GPU machine usage by running only GPU-requiring tests on expensive GPU runners (~30 tests instead of ~2000+). Tests that can run on either device will run on CPU machines only.

The optimization can be disabled by setting TORCHRL_GPU_FILTER=0.

ghstack-source-id: ca9778d
Pull-Request: #3404
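The marker registration and the job-level filter described in this commit message can be sketched in Python as follows. This is an illustration under stated assumptions, not the code from the PR: the actual filter lives in run_all.sh, the helper name `gpu_marker_filter` is hypothetical, and treating an unset `CU_VERSION` as a CPU job and an unset `TORCHRL_GPU_FILTER` as "enabled" are guesses about the defaults. Only `config.addinivalue_line` and the `-m` marker expression are standard pytest features.

```python
import os


def pytest_configure(config):
    # conftest.py hook: register the `gpu` marker so pytest does not warn
    # about (or, with --strict-markers, reject) an unknown mark.
    config.addinivalue_line("markers", "gpu: test requires a CUDA device")


def gpu_marker_filter(env=None):
    """Return extra pytest arguments for a CI job (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: the filter stays on unless TORCHRL_GPU_FILTER is exactly "0",
    # and disabling it means no marker expression is passed at all.
    if env.get("TORCHRL_GPU_FILTER", "1") == "0":
        return []
    # Assumption: an unset CU_VERSION is treated like a CPU job.
    if env.get("CU_VERSION", "cpu") == "cpu":
        # CPU jobs run everything except tests marked with pytest.mark.gpu.
        return ["-m", "not gpu"]
    # GPU jobs run only tests marked with pytest.mark.gpu.
    return ["-m", "gpu"]
```

With this shape, a CPU job would end up invoking something like `pytest -m "not gpu"` and a GPU job `pytest -m gpu`, while tests opt in by being decorated with `@pytest.mark.gpu`.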
vmoens added a commit that referenced this pull request on Jan 28, 2026 (same commit message; ghstack-source-id: 9235913)
vmoens added a commit that referenced this pull request on Jan 28, 2026 (same commit message; ghstack-source-id: 69148f2)
vmoens added a commit that referenced this pull request on Jan 29, 2026 (same commit message; ghstack-source-id: cf14d1e)

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 79.7458μs | 78.7644μs | 12.6961 KOps/s | 12.1608 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1362ms | 0.1354ms | 7.3865 KOps/s | 7.1133 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1073s | 0.1070s | 9.3453 Ops/s | 9.1166 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5347μs | 2.5238μs | 396.2214 KOps/s | 385.2515 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.5882μs | 37.2559μs | 26.8414 KOps/s | 26.3834 KOps/s | |
| test_simple | 0.6586s | 0.5711s | 1.7510 Ops/s | 1.7686 Ops/s | |
| test_transformed | 1.2262s | 1.1319s | 0.8835 Ops/s | 0.8800 Ops/s | |
| test_serial | 1.6694s | 1.6632s | 0.6012 Ops/s | 0.6115 Ops/s | |
| test_parallel | 1.2430s | 1.1564s | 0.8647 Ops/s | 0.8955 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3181ms | 43.9394μs | 22.7586 KOps/s | 23.4200 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 64.4130μs | 23.6698μs | 42.2479 KOps/s | 40.8300 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 77.1840μs | 24.0562μs | 41.5694 KOps/s | 40.3185 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 50.2230μs | 13.1823μs | 75.8591 KOps/s | 73.8273 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 90.5840μs | 46.8965μs | 21.3236 KOps/s | 21.3943 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 89.6540μs | 26.6049μs | 37.5871 KOps/s | 36.9902 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 69.3730μs | 26.9723μs | 37.0751 KOps/s | 36.0685 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 91.3940μs | 15.8294μs | 63.1734 KOps/s | 62.5141 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 95.3840μs | 49.3414μs | 20.2669 KOps/s | 20.1023 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 62.3430μs | 29.5902μs | 33.7950 KOps/s | 33.6851 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 67.2730μs | 27.2613μs | 36.6821 KOps/s | 36.7805 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 38.1810μs | 15.9333μs | 62.7618 KOps/s | 62.7226 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 81.7140μs | 51.0812μs | 19.5767 KOps/s | 19.5334 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 0.1040ms | 31.2477μs | 32.0023 KOps/s | 31.4143 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 58.8730μs | 29.1867μs | 34.2622 KOps/s | 33.5304 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 43.1620μs | 18.4371μs | 54.2386 KOps/s | 53.2319 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 98.0240μs | 48.6721μs | 20.5457 KOps/s | 20.2126 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 60.0530μs | 29.6028μs | 33.7806 KOps/s | 33.0017 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 59.1930μs | 30.2142μs | 33.0970 KOps/s | 32.0660 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 51.3730μs | 17.6511μs | 56.6537 KOps/s | 55.2539 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 2.7691ms | 52.0368μs | 19.2172 KOps/s | 18.9454 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 70.0830μs | 32.2494μs | 31.0083 KOps/s | 30.9747 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 67.8030μs | 33.1833μs | 30.1356 KOps/s | 30.2052 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 44.3720μs | 20.1629μs | 49.5960 KOps/s | 48.1949 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 84.1730μs | 53.2782μs | 18.7694 KOps/s | 18.2036 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 72.8740μs | 34.6547μs | 28.8561 KOps/s | 28.2991 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 63.7530μs | 32.8793μs | 30.4143 KOps/s | 29.5497 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 51.8220μs | 20.2601μs | 49.3582 KOps/s | 48.6020 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 90.5840μs | 55.8016μs | 17.9206 KOps/s | 17.5884 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 66.7630μs | 37.1242μs | 26.9366 KOps/s | 26.4261 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 68.5730μs | 34.6067μs | 28.8961 KOps/s | 28.1961 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 51.9930μs | 22.4750μs | 44.4939 KOps/s | 43.6647 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8414s | 0.7511s | 1.3314 Ops/s | 1.3242 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7061s | 0.6147s | 1.6269 Ops/s | 1.6094 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7153s | 1.6386s | 0.6103 Ops/s | 0.6064 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4942s | 1.4136s | 0.7074 Ops/s | 0.6840 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9498s | 1.8777s | 0.5326 Ops/s | 0.5320 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7321s | 1.6644s | 0.6008 Ops/s | 0.6023 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6654s | 4.5895s | 0.2179 Ops/s | 0.2185 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.6075s | 4.4718s | 0.2236 Ops/s | 0.2244 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.1526s | 1.9594s | 0.5104 Ops/s | 0.5113 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7688s | 1.6660s | 0.6002 Ops/s | 0.6161 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.3463ms | 10.0425ms | 99.5765 Ops/s | 96.6993 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 19.9151ms | 17.9202ms | 55.8031 Ops/s | 87.0201 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2111ms | 0.1300ms | 7.6926 KOps/s | 7.3921 KOps/s | |
| test_values[td1_return_estimate-False-False] | 29.0190ms | 27.7722ms | 36.0072 Ops/s | 35.1554 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.6609ms | 17.9382ms | 55.7471 Ops/s | 87.1853 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 42.7938ms | 40.9298ms | 24.4321 Ops/s | 23.7836 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 19.1589ms | 17.9539ms | 55.6981 Ops/s | 87.1528 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.1294ms | 8.8981ms | 112.3829 Ops/s | 108.6886 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7867ms | 1.5293ms | 653.9132 Ops/s | 648.4828 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5539ms | 0.4315ms | 2.3177 KOps/s | 2.3599 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.0872ms | 34.6872ms | 28.8291 Ops/s | 34.3806 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.1539ms | 1.7425ms | 573.9039 Ops/s | 570.3284 Ops/s | |
| test_dqn_speed[False-None] | 1.7797ms | 1.3836ms | 722.7352 Ops/s | 720.6519 Ops/s | |
| test_dqn_speed[False-backward] | 1.9768ms | 1.9113ms | 523.2015 Ops/s | 527.2238 Ops/s | |
| test_dqn_speed[True-None] | 0.7230ms | 0.5515ms | 1.8131 KOps/s | 1.8831 KOps/s | |
| test_dqn_speed[True-backward] | 1.0171ms | 0.9857ms | 1.0145 KOps/s | 993.4563 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9276ms | 0.5186ms | 1.9282 KOps/s | 1.8924 KOps/s | |
| test_ddpg_speed[False-None] | 0.1994s | 3.4279ms | 291.7196 Ops/s | 352.7294 Ops/s | |
| test_ddpg_speed[False-backward] | 4.1190ms | 4.0230ms | 248.5734 Ops/s | 245.3491 Ops/s | |
| test_ddpg_speed[True-None] | 1.8642ms | 1.3840ms | 722.5543 Ops/s | 710.8613 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4051ms | 2.3536ms | 424.8863 Ops/s | 377.5588 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.8593ms | 1.3748ms | 727.3951 Ops/s | 713.6841 Ops/s | |
| test_sac_speed[False-None] | 8.4930ms | 7.9338ms | 126.0431 Ops/s | 126.1598 Ops/s | |
| test_sac_speed[False-backward] | 11.7698ms | 11.1666ms | 89.5525 Ops/s | 89.1815 Ops/s | |
| test_sac_speed[True-None] | 2.5603ms | 2.1414ms | 466.9815 Ops/s | 446.5895 Ops/s | |
| test_sac_speed[True-backward] | 4.0994ms | 3.9799ms | 251.2617 Ops/s | 245.1600 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.2613ms | 2.1195ms | 471.8017 Ops/s | 473.4617 Ops/s | |
| test_redq_speed[False-None] | 10.7342ms | 10.2855ms | 97.2240 Ops/s | 97.2453 Ops/s | |
| test_redq_speed[False-backward] | 18.6299ms | 17.8758ms | 55.9417 Ops/s | 56.5696 Ops/s | |
| test_redq_speed[True-None] | 4.9486ms | 4.4841ms | 223.0100 Ops/s | 225.9144 Ops/s | |
| test_redq_speed[True-backward] | 9.9930ms | 9.8203ms | 101.8296 Ops/s | 101.5085 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.9551ms | 4.3776ms | 228.4359 Ops/s | 224.5306 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.5050ms | 10.9223ms | 91.5556 Ops/s | 93.3891 Ops/s | |
| test_redq_deprec_speed[False-backward] | 15.9715ms | 15.6964ms | 63.7088 Ops/s | 65.2134 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.2118ms | 3.6960ms | 270.5601 Ops/s | 269.8903 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.8395ms | 7.6472ms | 130.7675 Ops/s | 110.1815 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.1677ms | 3.6131ms | 276.7696 Ops/s | 276.5828 Ops/s | |
| test_td3_speed[False-None] | 8.1824ms | 7.9941ms | 125.0922 Ops/s | 124.6787 Ops/s | |
| test_td3_speed[False-backward] | 11.3045ms | 10.8514ms | 92.1538 Ops/s | 91.5590 Ops/s | |
| test_td3_speed[True-None] | 1.8796ms | 1.8333ms | 545.4560 Ops/s | 544.5575 Ops/s | |
| test_td3_speed[True-backward] | 3.7719ms | 3.6315ms | 275.3696 Ops/s | 270.3886 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8512ms | 1.7914ms | 558.2139 Ops/s | 549.7468 Ops/s | |
| test_cql_speed[False-None] | 30.5857ms | 26.3159ms | 37.9998 Ops/s | 37.8281 Ops/s | |
| test_cql_speed[False-backward] | 39.1317ms | 35.9281ms | 27.8334 Ops/s | 28.2406 Ops/s | |
| test_cql_speed[True-None] | 13.1357ms | 12.5117ms | 79.9251 Ops/s | 79.2654 Ops/s | |
| test_cql_speed[True-backward] | 18.8385ms | 18.3638ms | 54.4551 Ops/s | 53.7966 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 12.6274ms | 12.4015ms | 80.6355 Ops/s | 79.8477 Ops/s | |
| test_a2c_speed[False-None] | 5.9510ms | 5.4176ms | 184.5853 Ops/s | 185.8306 Ops/s | |
| test_a2c_speed[False-backward] | 12.1481ms | 11.8700ms | 84.2463 Ops/s | 84.3664 Ops/s | |
| test_a2c_speed[True-None] | 4.1581ms | 3.7296ms | 268.1248 Ops/s | 272.9750 Ops/s | |
| test_a2c_speed[True-backward] | 8.7930ms | 8.6284ms | 115.8969 Ops/s | 115.2585 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 3.8018ms | 3.6715ms | 272.3684 Ops/s | 271.9262 Ops/s | |
| test_ppo_speed[False-None] | 6.0184ms | 5.8590ms | 170.6771 Ops/s | 171.0868 Ops/s | |
| test_ppo_speed[False-backward] | 12.7847ms | 12.4689ms | 80.1995 Ops/s | 79.6875 Ops/s | |
| test_ppo_speed[True-None] | 4.1483ms | 3.6218ms | 276.1077 Ops/s | 274.8823 Ops/s | |
| test_ppo_speed[True-backward] | 8.8566ms | 8.4329ms | 118.5837 Ops/s | 117.8953 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 3.6891ms | 3.5903ms | 278.5315 Ops/s | 279.8719 Ops/s | |
| test_reinforce_speed[False-None] | 4.7130ms | 4.5568ms | 219.4542 Ops/s | 221.8469 Ops/s | |
| test_reinforce_speed[False-backward] | 7.6957ms | 7.4227ms | 134.7212 Ops/s | 136.2955 Ops/s | |
| test_reinforce_speed[True-None] | 3.3148ms | 2.8536ms | 350.4300 Ops/s | 351.1335 Ops/s | |
| test_reinforce_speed[True-backward] | 8.3960ms | 7.8699ms | 127.0657 Ops/s | 123.9436 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.3867ms | 2.8666ms | 348.8429 Ops/s | 339.3555 Ops/s | |
| test_iql_speed[False-None] | 26.0278ms | 20.1633ms | 49.5952 Ops/s | 50.9086 Ops/s | |
| test_iql_speed[False-backward] | 31.8075ms | 30.5269ms | 32.7580 Ops/s | 32.9171 Ops/s | |
| test_iql_speed[True-None] | 9.1210ms | 8.4998ms | 117.6504 Ops/s | 113.3626 Ops/s | |
| test_iql_speed[True-backward] | 17.0691ms | 16.7441ms | 59.7224 Ops/s | 59.4958 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 8.9173ms | 8.5259ms | 117.2903 Ops/s | 116.7007 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1753ms | 5.7254ms | 174.6603 Ops/s | 172.7807 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.0638ms | 0.3664ms | 2.7290 KOps/s | 3.4273 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6527ms | 0.3504ms | 2.8536 KOps/s | 3.0078 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1209ms | 5.5518ms | 180.1206 Ops/s | 177.6370 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0768ms | 0.3512ms | 2.8471 KOps/s | 3.5876 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6030ms | 0.3325ms | 3.0075 KOps/s | 3.5191 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.8540ms | 1.4035ms | 712.5088 Ops/s | 783.5141 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5718ms | 1.3201ms | 757.5459 Ops/s | 843.4427 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 11.5870ms | 5.9087ms | 169.2423 Ops/s | 173.8227 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0786ms | 0.4446ms | 2.2491 KOps/s | 2.1542 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6866ms | 0.4696ms | 2.1296 KOps/s | 2.1518 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0120ms | 5.5958ms | 178.7059 Ops/s | 176.7325 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.8290ms | 0.2795ms | 3.5779 KOps/s | 3.2332 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7194ms | 0.2997ms | 3.3369 KOps/s | 3.3565 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.8631ms | 5.5513ms | 180.1375 Ops/s | 177.3186 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.6974ms | 0.3104ms | 3.2212 KOps/s | 3.4338 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6107ms | 0.3780ms | 2.6454 KOps/s | 3.2952 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.9668ms | 5.7574ms | 173.6882 Ops/s | 171.8121 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8422ms | 0.4566ms | 2.1903 KOps/s | 2.1681 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6335ms | 0.4438ms | 2.2533 KOps/s | 2.2980 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.4907ms | 4.9947ms | 200.2121 Ops/s | 197.6926 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 13.0781ms | 1.9263ms | 519.1269 Ops/s | 522.6645 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.0615ms | 0.8715ms | 1.1474 KOps/s | 1.1454 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 11.1087ms | 5.0680ms | 197.3166 Ops/s | 200.7695 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 6.6574ms | 1.8840ms | 530.7949 Ops/s | 559.6953 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 11.8139ms | 1.2668ms | 789.4028 Ops/s | 1.0622 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5495s | 16.0993ms | 62.1146 Ops/s | 57.2830 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.2288ms | 1.8630ms | 536.7811 Ops/s | 493.1920 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.8559ms | 1.0063ms | 993.7112 Ops/s | 953.8081 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 37.9772ms | 35.6010ms | 28.0891 Ops/s | 27.9380 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.3285ms | 18.5851ms | 53.8066 Ops/s | 55.6752 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.4316ms | 36.9538ms | 27.0608 Ops/s | 26.7861 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.1200ms | 18.4398ms | 54.2306 Ops/s | 53.0771 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 39.9897ms | 38.6119ms | 25.8988 Ops/s | 25.6690 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.8360ms | 20.2602ms | 49.3579 Ops/s | 50.5338 Ops/s |
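
The Change column in the table above is empty, and the Ops and Ops on Repo HEAD cells mix `Ops/s` and `KOps/s`, so rows are hard to compare by eye. A minimal sketch of one way to compare them follows. It assumes that "change" means the relative difference of this PR's throughput against repo HEAD, which the page does not confirm, and the helper names are hypothetical.

```python
# Units that appear in the table above, expressed in operations per second.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell):
    """Convert a cell such as '12.6961 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops, ops_head):
    """Fractional change of the PR's throughput relative to repo HEAD.

    Assumption: this is what a filled-in Change column would report.
    """
    current, baseline = parse_ops(ops), parse_ops(ops_head)
    return (current - baseline) / baseline


# Row copied from the table:
# test_values[vec_generalized_advantage_estimate-True-True]
change = relative_change("55.8031 Ops/s", "87.0201 Ops/s")
print(f"{change:+.1%}")
```

Normalizing units first matters for rows such as `test_dqn_speed[True-backward]`, where the PR value is reported in `KOps/s` and the HEAD value in `Ops/s`.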

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 82.8024μs | 81.8418μs | 12.2187 KOps/s | 11.7695 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1426ms | 0.1415ms | 7.0691 KOps/s | 7.0060 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1410s | 0.1407s | 7.1083 Ops/s | 7.4691 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.9965μs | 2.9887μs | 334.5893 KOps/s | 360.6524 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.6287μs | 37.6219μs | 26.5803 KOps/s | 26.0331 KOps/s | |
| test_simple | 0.9236s | 0.8323s | 1.2015 Ops/s | 1.2044 Ops/s | |
| test_transformed | 1.5700s | 1.4806s | 0.6754 Ops/s | 0.6819 Ops/s | |
| test_serial | 2.4604s | 2.3816s | 0.4199 Ops/s | 0.4273 Ops/s | |
| test_parallel | 2.0451s | 1.9911s | 0.5022 Ops/s | 0.5010 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.5136ms | 45.7529μs | 21.8565 KOps/s | 22.2012 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 59.0010μs | 25.5918μs | 39.0750 KOps/s | 39.8280 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 58.3010μs | 25.4601μs | 39.2771 KOps/s | 38.8573 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 42.9210μs | 14.1835μs | 70.5046 KOps/s | 71.8991 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.1126ms | 48.3943μs | 20.6636 KOps/s | 20.5590 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 59.0410μs | 28.5428μs | 35.0351 KOps/s | 35.6319 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 63.5810μs | 28.7795μs | 34.7470 KOps/s | 34.6821 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 49.0900μs | 16.8504μs | 59.3456 KOps/s | 59.5810 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 92.9920μs | 51.6284μs | 19.3692 KOps/s | 19.3127 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 87.2810μs | 30.9973μs | 32.2608 KOps/s | 31.8517 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 90.9110μs | 28.0665μs | 35.6296 KOps/s | 34.7931 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 45.2310μs | 16.9053μs | 59.1531 KOps/s | 59.8684 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 96.2810μs | 53.4106μs | 18.7229 KOps/s | 18.6257 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 80.5220μs | 34.1048μs | 29.3214 KOps/s | 29.6568 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 58.3910μs | 31.2116μs | 32.0394 KOps/s | 32.3637 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 52.4210μs | 19.4361μs | 51.4505 KOps/s | 51.2961 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 94.9610μs | 52.1881μs | 19.1614 KOps/s | 19.4082 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 59.7710μs | 31.2726μs | 31.9769 KOps/s | 32.3232 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 63.4110μs | 31.7935μs | 31.4529 KOps/s | 31.2443 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 50.5810μs | 18.7040μs | 53.4644 KOps/s | 54.4821 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 2.6159ms | 54.4950μs | 18.3503 KOps/s | 18.5782 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 70.0110μs | 33.6268μs | 29.7382 KOps/s | 29.4582 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 62.8910μs | 34.9338μs | 28.6255 KOps/s | 28.5643 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 51.3710μs | 21.6372μs | 46.2166 KOps/s | 47.7558 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 86.7420μs | 57.2007μs | 17.4823 KOps/s | 17.5052 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 70.5110μs | 36.7504μs | 27.2106 KOps/s | 27.0980 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 98.1120μs | 34.5790μs | 28.9193 KOps/s | 28.3049 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 53.5110μs | 21.4399μs | 46.6420 KOps/s | 47.1360 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 98.9510μs | 58.7076μs | 17.0336 KOps/s | 16.8323 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 77.0010μs | 39.6038μs | 25.2501 KOps/s | 25.2296 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 72.6810μs | 36.7955μs | 27.1772 KOps/s | 26.6994 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 52.9010μs | 24.0841μs | 41.5211 KOps/s | 42.0901 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7638s | 0.7601s | 1.3155 Ops/s | 1.2855 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7325s | 0.6411s | 1.5599 Ops/s | 1.5549 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7799s | 1.7023s | 0.5874 Ops/s | 0.5891 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5457s | 1.4713s | 0.6797 Ops/s | 0.6810 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0450s | 1.9649s | 0.5089 Ops/s | 0.5131 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.8117s | 1.7333s | 0.5769 Ops/s | 0.5802 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.9341s | 4.7332s | 0.2113 Ops/s | 0.2104 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.6519s | 4.5278s | 0.2209 Ops/s | 0.2214 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0785s | 1.9819s | 0.5046 Ops/s | 0.5046 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.8526s | 1.7105s | 0.5846 Ops/s | 0.5912 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 22.0257ms | 21.5869ms | 46.3244 Ops/s | 47.8013 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1436s | 3.8205ms | 261.7492 Ops/s | 267.6292 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1114ms | 86.1417μs | 11.6088 KOps/s | 11.8577 KOps/s | |
| test_values[td1_return_estimate-False-False] | 51.8995ms | 51.3880ms | 19.4598 Ops/s | 19.6203 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3694ms | 1.1160ms | 896.0600 Ops/s | 908.4098 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 84.4318ms | 83.9200ms | 11.9161 Ops/s | 11.9588 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3264ms | 1.1099ms | 900.9758 Ops/s | 912.9208 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.2324ms | 21.7815ms | 45.9105 Ops/s | 47.0632 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0747ms | 0.7844ms | 1.2748 KOps/s | 1.2934 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7579ms | 0.7015ms | 1.4254 KOps/s | 1.4036 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5917ms | 1.5170ms | 659.2168 Ops/s | 666.4424 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7702ms | 0.7191ms | 1.3906 KOps/s | 1.4302 KOps/s | |
| test_dqn_speed[False-None] | 2.0225ms | 1.5663ms | 638.4276 Ops/s | 640.7204 Ops/s | |
| test_dqn_speed[False-backward] | 2.3123ms | 2.2255ms | 449.3409 Ops/s | 455.7208 Ops/s | |
| test_dqn_speed[True-None] | 1.0585ms | 0.5732ms | 1.7447 KOps/s | 1.8226 KOps/s | |
| test_dqn_speed[True-backward] | 1.1369ms | 1.0881ms | 918.9994 Ops/s | 889.2959 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6700ms | 0.5739ms | 1.7423 KOps/s | 1.7000 KOps/s | |
| test_ddpg_speed[False-None] | 3.6484ms | 2.9865ms | 334.8368 Ops/s | 336.8814 Ops/s | |
| test_ddpg_speed[False-backward] | 4.5944ms | 4.2813ms | 233.5739 Ops/s | 226.8627 Ops/s | |
| test_ddpg_speed[True-None] | 1.3979ms | 1.3143ms | 760.8437 Ops/s | 767.7349 Ops/s | |
| test_ddpg_speed[True-backward] | 2.5336ms | 2.4154ms | 414.0159 Ops/s | 421.1570 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4541ms | 1.3522ms | 739.5223 Ops/s | 748.1429 Ops/s | |
| test_sac_speed[False-None] | 8.9843ms | 8.5365ms | 117.1446 Ops/s | 117.3656 Ops/s | |
| test_sac_speed[False-backward] | 12.0419ms | 11.5430ms | 86.6327 Ops/s | 84.9420 Ops/s | |
| test_sac_speed[True-None] | 2.1611ms | 1.8211ms | 549.1283 Ops/s | 554.4245 Ops/s | |
| test_sac_speed[True-backward] | 3.5245ms | 3.4581ms | 289.1764 Ops/s | 279.1938 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 18.9077ms | 10.7406ms | 93.1050 Ops/s | 93.8383 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.2597ms | 9.6004ms | 104.1624 Ops/s | 105.4119 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.2375ms | 12.7300ms | 78.5545 Ops/s | 77.6617 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6734ms | 2.5703ms | 389.0549 Ops/s | 391.2554 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.3364ms | 4.1966ms | 238.2904 Ops/s | 243.0146 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 15.6841ms | 9.6290ms | 103.8532 Ops/s | 105.5690 Ops/s | |
| test_td3_speed[False-None] | 8.8346ms | 8.4656ms | 118.1248 Ops/s | 117.6089 Ops/s | |
| test_td3_speed[False-backward] | 11.6279ms | 10.9119ms | 91.6432 Ops/s | 92.5118 Ops/s | |
| test_td3_speed[True-None] | 1.7150ms | 1.6442ms | 608.2041 Ops/s | 621.8021 Ops/s | |
| test_td3_speed[True-backward] | 3.1702ms | 3.1079ms | 321.7616 Ops/s | 323.0305 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 80.4918ms | 24.0528ms | 41.5752 Ops/s | 43.1723 Ops/s | |
| test_cql_speed[False-None] | 18.6702ms | 17.9424ms | 55.7338 Ops/s | 57.0931 Ops/s | |
| test_cql_speed[False-backward] | 23.7047ms | 23.2389ms | 43.0313 Ops/s | 43.8002 Ops/s | |
| test_cql_speed[True-None] | 3.3306ms | 3.2525ms | 307.4538 Ops/s | 310.8470 Ops/s | |
| test_cql_speed[True-backward] | 5.9367ms | 5.5183ms | 181.2167 Ops/s | 181.4506 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 18.8211ms | 11.7649ms | 84.9989 Ops/s | 86.0599 Ops/s | |
| test_a2c_speed[False-None] | 4.2909ms | 3.3603ms | 297.5937 Ops/s | 304.8047 Ops/s | |
| test_a2c_speed[False-backward] | 7.0399ms | 6.6219ms | 151.0141 Ops/s | 154.4998 Ops/s | |
| test_a2c_speed[True-None] | 1.8325ms | 1.3497ms | 740.9166 Ops/s | 749.7705 Ops/s | |
| test_a2c_speed[True-backward] | 3.2184ms | 3.1363ms | 318.8507 Ops/s | 323.9000 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.4067ms | 0.9887ms | 1.0114 KOps/s | 1.0227 KOps/s | |
| test_ppo_speed[False-None] | 4.3692ms | 3.9944ms | 250.3475 Ops/s | 256.7391 Ops/s | |
| test_ppo_speed[False-backward] | 7.8314ms | 7.3941ms | 135.2429 Ops/s | 136.1851 Ops/s | |
| test_ppo_speed[True-None] | 1.6284ms | 1.4263ms | 701.1060 Ops/s | 709.1185 Ops/s | |
| test_ppo_speed[True-backward] | 3.3075ms | 3.2608ms | 306.6707 Ops/s | 327.0132 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1090ms | 1.0335ms | 967.5838 Ops/s | 933.3878 Ops/s | |
| test_reinforce_speed[False-None] | 2.4412ms | 2.3482ms | 425.8623 Ops/s | 431.0415 Ops/s | |
| test_reinforce_speed[False-backward] | 3.9558ms | 3.5141ms | 284.5644 Ops/s | 289.8118 Ops/s | |
| test_reinforce_speed[True-None] | 1.3578ms | 1.2664ms | 789.6590 Ops/s | 797.3933 Ops/s | |
| test_reinforce_speed[True-backward] | 3.1271ms | 3.0794ms | 324.7340 Ops/s | 325.8129 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 0.4676s | 10.1178ms | 98.8353 Ops/s | 97.4161 Ops/s | |
| test_iql_speed[False-None] | 10.1575ms | 9.7047ms | 103.0431 Ops/s | 103.5774 Ops/s | |
| test_iql_speed[False-backward] | 14.2866ms | 13.8114ms | 72.4040 Ops/s | 72.9256 Ops/s | |
| test_iql_speed[True-None] | 2.2988ms | 2.1961ms | 455.3485 Ops/s | 453.8979 Ops/s | |
| test_iql_speed[True-backward] | 5.1795ms | 4.8793ms | 204.9489 Ops/s | 206.8884 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 17.1818ms | 10.1425ms | 98.5951 Ops/s | 75.7660 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.4583ms | 5.9738ms | 167.3971 Ops/s | 166.0390 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8896ms | 0.2923ms | 3.4214 KOps/s | 3.4521 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5598ms | 0.2966ms | 3.3719 KOps/s | 3.6583 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1964ms | 5.8204ms | 171.8105 Ops/s | 170.2316 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0512ms | 0.3174ms | 3.1509 KOps/s | 3.5038 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6433ms | 0.3107ms | 3.2187 KOps/s | 3.6221 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.8302ms | 1.4887ms | 671.7466 Ops/s | 768.3889 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6017ms | 1.3586ms | 736.0666 Ops/s | 786.1774 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.0885ms | 5.9956ms | 166.7881 Ops/s | 165.2669 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2822ms | 0.5536ms | 1.8062 KOps/s | 2.1515 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7359ms | 0.5066ms | 1.9739 KOps/s | 2.1935 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0547ms | 5.8318ms | 171.4725 Ops/s | 168.9109 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8589ms | 0.3772ms | 2.6511 KOps/s | 2.8591 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6226ms | 0.3644ms | 2.7440 KOps/s | 2.6841 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0420ms | 5.7708ms | 173.2865 Ops/s | 170.1449 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.5771ms | 0.2883ms | 3.4685 KOps/s | 3.4860 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5768ms | 0.3482ms | 2.8718 KOps/s | 3.7223 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.2021ms | 5.9836ms | 167.1239 Ops/s | 166.1120 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.7601ms | 0.5106ms | 1.9586 KOps/s | 2.1704 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7682ms | 0.5023ms | 1.9909 KOps/s | 2.2034 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.6103s | 17.3642ms | 57.5897 Ops/s | 50.6984 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 32.2410ms | 2.4181ms | 413.5497 Ops/s | 491.3184 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 10.1338ms | 1.3107ms | 762.9572 Ops/s | 1.0406 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 7.5669ms | 5.2939ms | 188.8971 Ops/s | 191.4335 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 4.0374ms | 1.8089ms | 552.8273 Ops/s | 498.8845 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 8.9821ms | 1.3044ms | 766.6576 Ops/s | 751.7478 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5634s | 16.6828ms | 59.9419 Ops/s | 183.3766 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 11.6269ms | 2.1753ms | 459.6972 Ops/s | 61.1396 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.2347ms | 1.0768ms | 928.6998 Ops/s | 840.4383 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 39.7468ms | 36.9444ms | 27.0677 Ops/s | 26.7994 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.3431ms | 18.6929ms | 53.4962 Ops/s | 53.5792 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 41.0633ms | 37.8010ms | 26.4543 Ops/s | 25.4597 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.7049ms | 18.7708ms | 53.2742 Ops/s | 52.5604 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.0404ms | 39.5088ms | 25.3108 Ops/s | 24.9039 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.3045ms | 20.1714ms | 49.5751 Ops/s | 48.8105 Ops/s |
Stack from ghstack (oldest at bottom):
Add @pytest.mark.gpu to tests that require CUDA, and update run_all.sh
to filter tests based on whether the job runs on a GPU or a CPU machine.
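A minimal sketch of what registering and applying the marker might look like. The marker name `gpu` comes from this PR; the test name, the marker description string, and the skip logic are hypothetical and not taken from the TorchRL test suite.

```python
import pytest


def pytest_configure(config):
    # Would live in conftest.py: register the marker so pytest does not
    # warn about an unknown mark. The description text is illustrative.
    config.addinivalue_line("markers", "gpu: test requires a CUDA device")


@pytest.mark.gpu
def test_needs_cuda():
    # Hypothetical test body; skip cleanly when torch or CUDA is unavailable.
    torch = pytest.importorskip("torch")
    if not torch.cuda.is_available():
        pytest.skip("CUDA not available")
    x = torch.ones(2, device="cuda")
    assert x.device.type == "cuda"
```

With the marker in place, `pytest -m gpu` selects only marked tests and `pytest -m "not gpu"` selects everything else.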
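Changes:
- Register the 'gpu' marker in pytest.ini and conftest.py
- Add @pytest.mark.gpu to ~30 tests that explicitly require CUDA
- Update run_all.sh to use GPU_MARKER_FILTER:
  - GPU jobs (CU_VERSION != cpu): run only @pytest.mark.gpu tests
  - CPU jobs (CU_VERSION = cpu): run all tests except @pytest.mark.gpu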
This significantly reduces GPU machine usage by running only GPU-requiring
tests on expensive GPU runners (~30 tests instead of ~2000+). Tests that
can run on either device will run on CPU machines only.
The optimization can be disabled by setting TORCHRL_GPU_FILTER=0.
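The selection rule described above can be sketched in Python for illustration. The real logic lives in run_all.sh as shell; the function name `gpu_marker_filter` is hypothetical, and treating an unset CU_VERSION as `cpu` and an unset TORCHRL_GPU_FILTER as enabled are assumptions, not behavior confirmed by the PR.

```python
import os


def gpu_marker_filter(env=None):
    """Return extra pytest arguments for the current CI job (illustrative)."""
    env = os.environ if env is None else env
    # TORCHRL_GPU_FILTER=0 disables the optimization: run every test.
    if env.get("TORCHRL_GPU_FILTER", "1") == "0":
        return []
    # Assumption: an unset CU_VERSION is treated as a CPU job.
    if env.get("CU_VERSION", "cpu") == "cpu":
        # CPU jobs run everything except GPU-marked tests.
        return ["-m", "not gpu"]
    # GPU jobs run only the GPU-marked tests.
    return ["-m", "gpu"]
```

For example, `gpu_marker_filter({"CU_VERSION": "cu128"})` yields the arguments for a GPU job, which could then be appended to a pytest command line; the `cu128` value is only a placeholder for any non-`cpu` version string.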