
[BugFix] Fix doc CI by preventing _repr_html_ dispatch in parallel envs #3441

Merged
vmoens merged 12 commits into main from fix-doc
Feb 4, 2026


@vmoens (Collaborator) commented Feb 4, 2026

Summary

  • Fixes the doc CI failure caused by Sphinx-Gallery trying to call _repr_html_ on _dispatch_caller_parallel objects
  • Added a guard to raise AttributeError for attributes starting with _, preventing special methods from being incorrectly dispatched to workers

Problem

The torchrl_envs.py tutorial has code that accesses parallel_env.foo which returns a _dispatch_caller_parallel object. When Sphinx-Gallery tries to display this result, it calls _repr_html_ to get an HTML representation. The __getattr__ method was blindly chaining this call, sending ('foo', '_repr_html_') to the worker processes. Since foo is a string, trying to get _repr_html_ from a string fails.
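
The failure mode and the guard can be illustrated with a toy class. This is a minimal sketch, not the actual `_dispatch_caller_parallel` implementation: `ChainingCaller` and `GuardedCaller` are hypothetical names, and no worker processes are involved. It only shows how a catch-all `__getattr__` makes `hasattr(obj, "_repr_html_")` succeed, and how refusing underscore-prefixed names avoids that.

```python
class ChainingCaller:
    """Toy stand-in for a dispatch caller: it only records the attribute chain."""

    def __init__(self, path):
        self.path = tuple(path)

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so every unknown
        # name (including display hooks) is blindly appended to the chain.
        return ChainingCaller(self.path + (name,))


class GuardedCaller(ChainingCaller):
    """Same idea, but underscore-prefixed names are never chained."""

    def __getattr__(self, name):
        if name.startswith("_"):
            # Tell the caller that this attribute does not exist.
            raise AttributeError(name)
        return GuardedCaller(self.path + (name,))


unguarded = ChainingCaller(["foo"])
guarded = GuardedCaller(["foo"])

print(hasattr(unguarded, "_repr_html_"))  # True: the hook looks available
print(unguarded._repr_html_.path)         # ('foo', '_repr_html_')
print(hasattr(guarded, "_repr_html_"))    # False: the hook is reported missing
print(guarded.bar.path)                   # ('foo', 'bar'): public chaining still works
```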

Test plan

  • Verified the fix prevents _repr_html_ from being dispatched
  • Verified chained attribute access still works
  • ParallelEnv tests pass

Made with Cursor

The _dispatch_caller_parallel class was chaining all attribute access,
including special methods like _repr_html_ that Sphinx-Gallery calls
for display. This caused failures when the underlying attribute was
a simple type (e.g., string) that doesn't have _repr_html_.

Added a guard to raise AttributeError for attributes starting with '_',
which signals to display systems that HTML representation is not supported.

Co-authored-by: Cursor <cursoragent@cursor.com>
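
To see why raising AttributeError is enough to signal "no HTML representation", here is a sketch of how a display system might probe an object. The `render` function is a hypothetical simplification and is not Sphinx-Gallery's actual code; `PlainResult` and `HtmlResult` are illustrative classes.

```python
def render(obj):
    """Hypothetical display hook: prefer HTML, fall back to repr()."""
    # getattr with a default swallows AttributeError, so an object that
    # raises it for "_repr_html_" is treated as having no HTML form.
    html_method = getattr(obj, "_repr_html_", None)
    if callable(html_method):
        return ("text/html", html_method())
    return ("text/plain", repr(obj))


class PlainResult:
    """Catch-all attribute access, guarded against underscore names."""

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return f"dispatched:{name}"


class HtmlResult:
    """An object that genuinely supports HTML display."""

    def _repr_html_(self):
        return "<b>ok</b>"


print(render(PlainResult())[0])  # text/plain
print(render(HtmlResult()))      # ('text/html', '<b>ok</b>')
```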

pytorch-bot bot commented Feb 4, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3441

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 12 Pending

As of commit 0c619f1 with merge base 7f24887:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Feb 4, 2026
github-actions bot added the BugFix label Feb 4, 2026
- Remove conditional that only ran tutorials on labeled PRs
- Tutorials now run on all PRs to catch issues earlier
- Remove toctree references to excluded tutorials (torchrl_demo, llm_browser)
- Fix title level inconsistencies in llms.rst

Co-authored-by: Cursor <cursoragent@cursor.com>
github-actions bot added the Documentation and CI labels Feb 4, 2026
- Remove share_memory_() call that caused multiprocessing issues
- Use spawn consistently instead of fork
- Modernize intro and remove outdated aafig tree structure
- Add torchrl_demo to GPU tutorials list
- Re-enable torchrl_demo in sphinx-gallery (remove from ignore_pattern)
- Add back to docs toctree

Co-authored-by: Cursor <cursoragent@cursor.com>
vmoens and others added 7 commits February 4, 2026 14:47
Co-authored-by: Cursor <cursoragent@cursor.com>
Streamlined the tutorial from ~730 lines to ~410 lines with:

- Quick start example at the top showing immediate value
- Progressive structure: TensorDict → Envs → Modules → Collection → Training
- Modern patterns: QValueActor, SyncDataCollector, LazyTensorStorage
- Complete DQN training loop example
- Removed redundant examples (multiple rollout implementations)
- Removed low-level internals (sum tree access)
- Updated "What's Next" with links to SOTA implementations and advanced features
- Cleaner code with consistent style

Co-authored-by: Cursor <cursoragent@cursor.com>
- Remove hardcoded mp_start_method="spawn"
- Let ParallelEnv use platform defaults (fork on Linux, spawn on Windows/macOS)
- Add note explaining the tradeoffs
- Remove forced spawn in sphinx-gallery preamble

Co-authored-by: Cursor <cursoragent@cursor.com>
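
The platform default mentioned in the commit above can be inspected with the standard library alone. The sketch below uses only `multiprocessing` and does not touch ParallelEnv or its `mp_start_method` argument. Note that the default depends on both the platform and the Python version, so the printed value may differ from the commit message on newer interpreters.

```python
import multiprocessing as mp

# Start methods supported on this platform ("spawn" is available everywhere).
available = mp.get_all_start_methods()

# The method the interpreter will use when none is requested explicitly.
default_method = mp.get_start_method()

# An explicit context pins the start method without changing the global default.
spawn_ctx = mp.get_context("spawn")

print("available:", available)
print("default:", default_method)
print("explicit context:", spawn_ctx.get_start_method())
```

Using an explicit context, as in the last step, is one way to compare start methods side by side without calling `set_start_method` globally.
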
Expand the tutorial with more narrative text explaining:
- Why TensorDict is useful (not just what it does)
- How specs describe environment inputs/outputs
- The composability benefit of TensorDictModule
- What each component does in the training loop
- Context for probabilistic policies and distributions

The tutorial now reads more like a guided walkthrough rather
than a reference of code examples.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Use mp_start_method="fork" for ParallelEnv (works when running script directly)
- Fix PrioritizedReplayBuffer.sample() to use return_info=True for indices
- Fix DQNLoss: pass QValueActor not raw network, use "categorical" action_space
- Add terminated key to dummy batch for loss example
- Use correct log prob key (action_log_prob)

Tested locally - tutorial runs successfully.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions bot (Contributor) commented Feb 4, 2026

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}12$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 81.4605μs 80.7819μs 12.3790 KOps/s 12.4523 KOps/s $\color{#d91a1a}-0.59\%$
test_tensor_to_bytestream_speed[torch.save] 0.1418ms 0.1408ms 7.1043 KOps/s 7.1544 KOps/s $\color{#d91a1a}-0.70\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1143s 0.1138s 8.7908 Ops/s 8.6439 Ops/s $\color{#35bf28}+1.70\%$
test_tensor_to_bytestream_speed[numpy] 2.6069μs 2.6037μs 384.0749 KOps/s 373.8384 KOps/s $\color{#35bf28}+2.74\%$
test_tensor_to_bytestream_speed[safetensors] 39.5183μs 38.4149μs 26.0316 KOps/s 25.8811 KOps/s $\color{#35bf28}+0.58\%$
test_simple 0.5448s 0.5444s 1.8370 Ops/s 1.7232 Ops/s $\textbf{\color{#35bf28}+6.60\%}$
test_transformed 1.1309s 1.1297s 0.8852 Ops/s 0.8527 Ops/s $\color{#35bf28}+3.82\%$
test_serial 1.6773s 1.6742s 0.5973 Ops/s 0.5908 Ops/s $\color{#35bf28}+1.11\%$
test_parallel 1.2095s 1.1241s 0.8896 Ops/s 0.8902 Ops/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[True-True-True-True-True] 0.1347ms 43.9052μs 22.7763 KOps/s 22.6816 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[True-True-True-True-False] 56.3710μs 25.2548μs 39.5964 KOps/s 39.9244 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[True-True-True-False-True] 60.8410μs 24.8420μs 40.2544 KOps/s 39.2565 KOps/s $\color{#35bf28}+2.54\%$
test_step_mdp_speed[True-True-True-False-False] 72.3510μs 13.7982μs 72.4730 KOps/s 72.3578 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[True-True-False-True-True] 78.9710μs 47.9356μs 20.8613 KOps/s 20.9235 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-True-False-True-False] 57.0010μs 28.3171μs 35.3144 KOps/s 36.3113 KOps/s $\color{#d91a1a}-2.75\%$
test_step_mdp_speed[True-True-False-False-True] 77.8010μs 27.2885μs 36.6455 KOps/s 36.1037 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[True-True-False-False-False] 43.1010μs 16.8988μs 59.1758 KOps/s 61.2254 KOps/s $\color{#d91a1a}-3.35\%$
test_step_mdp_speed[True-False-True-True-True] 0.1123ms 50.5672μs 19.7757 KOps/s 19.9719 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-False-True-True-False] 63.8210μs 30.8829μs 32.3803 KOps/s 32.7702 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[True-False-True-False-True] 63.0810μs 28.1055μs 35.5802 KOps/s 36.0157 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[True-False-True-False-False] 44.7410μs 16.5776μs 60.3224 KOps/s 60.3446 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-False-False-True-True] 84.1310μs 52.8010μs 18.9390 KOps/s 19.0324 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[True-False-False-True-False] 68.1310μs 33.4905μs 29.8592 KOps/s 30.1051 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[True-False-False-False-True] 87.6810μs 29.8747μs 33.4732 KOps/s 33.2286 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[True-False-False-False-False] 50.5800μs 19.1158μs 52.3127 KOps/s 52.3390 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[False-True-True-True-True] 79.6810μs 50.8087μs 19.6816 KOps/s 19.7846 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[False-True-True-True-False] 75.3010μs 30.8741μs 32.3896 KOps/s 32.4169 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[False-True-True-False-True] 2.4186ms 31.7980μs 31.4485 KOps/s 31.2352 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-True-True-False-False] 78.2410μs 18.3836μs 54.3964 KOps/s 54.6574 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-True-False-True-True] 0.1160ms 53.0351μs 18.8554 KOps/s 18.7142 KOps/s $\color{#35bf28}+0.75\%$
test_step_mdp_speed[False-True-False-True-False] 68.6110μs 33.5864μs 29.7739 KOps/s 30.2317 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[False-True-False-False-True] 60.1610μs 33.8542μs 29.5384 KOps/s 29.2942 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-True-False-False-False] 46.8000μs 21.1491μs 47.2833 KOps/s 47.3165 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[False-False-True-True-True] 96.2210μs 56.1333μs 17.8147 KOps/s 17.6564 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[False-False-True-True-False] 74.9210μs 36.5267μs 27.3773 KOps/s 27.2372 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-False-True-False-True] 67.9610μs 34.0503μs 29.3683 KOps/s 28.9298 KOps/s $\color{#35bf28}+1.52\%$
test_step_mdp_speed[False-False-True-False-False] 62.2910μs 20.8063μs 48.0623 KOps/s 46.9432 KOps/s $\color{#35bf28}+2.38\%$
test_step_mdp_speed[False-False-False-True-True] 95.4210μs 58.5450μs 17.0809 KOps/s 16.9933 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-False-False-True-False] 76.5110μs 39.0070μs 25.6364 KOps/s 25.6394 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[False-False-False-False-True] 95.6810μs 36.0264μs 27.7575 KOps/s 27.3015 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[False-False-False-False-False] 58.9210μs 23.6950μs 42.2030 KOps/s 42.3842 KOps/s $\color{#d91a1a}-0.43\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7449s 0.7438s 1.3445 Ops/s 1.2944 Ops/s $\color{#35bf28}+3.87\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7339s 0.6359s 1.5725 Ops/s 1.5617 Ops/s $\color{#35bf28}+0.69\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7614s 1.6813s 0.5948 Ops/s 0.5955 Ops/s $\color{#d91a1a}-0.11\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5344s 1.4531s 0.6882 Ops/s 0.6884 Ops/s $\color{#d91a1a}-0.04\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9998s 1.9202s 0.5208 Ops/s 0.5195 Ops/s $\color{#35bf28}+0.24\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7873s 1.7059s 0.5862 Ops/s 0.5880 Ops/s $\color{#d91a1a}-0.30\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7648s 4.6523s 0.2149 Ops/s 0.2174 Ops/s $\color{#d91a1a}-1.11\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5124s 4.4287s 0.2258 Ops/s 0.2254 Ops/s $\color{#35bf28}+0.17\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0373s 1.9665s 0.5085 Ops/s 0.5068 Ops/s $\color{#35bf28}+0.35\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7565s 1.6762s 0.5966 Ops/s 0.5929 Ops/s $\color{#35bf28}+0.62\%$
test_values[generalized_advantage_estimate-True-True] 10.2662ms 10.1505ms 98.5175 Ops/s 98.4795 Ops/s $\color{#35bf28}+0.04\%$
test_values[vec_generalized_advantage_estimate-True-True] 15.5799ms 11.3456ms 88.1399 Ops/s 56.2097 Ops/s $\textbf{\color{#35bf28}+56.81\%}$
test_values[td0_return_estimate-False-False] 0.2327ms 0.1272ms 7.8624 KOps/s 7.7692 KOps/s $\color{#35bf28}+1.20\%$
test_values[td1_return_estimate-False-False] 26.9065ms 26.5133ms 37.7170 Ops/s 37.1850 Ops/s $\color{#35bf28}+1.43\%$
test_values[vec_td1_return_estimate-False-False] 12.4213ms 11.3096ms 88.4203 Ops/s 56.1894 Ops/s $\textbf{\color{#35bf28}+57.36\%}$
test_values[td_lambda_return_estimate-True-False] 40.0262ms 39.1629ms 25.5344 Ops/s 24.9921 Ops/s $\color{#35bf28}+2.17\%$
test_values[vec_td_lambda_return_estimate-True-False] 12.8460ms 11.4158ms 87.5975 Ops/s 55.6816 Ops/s $\textbf{\color{#35bf28}+57.32\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.1939ms 8.9943ms 111.1816 Ops/s 110.2782 Ops/s $\color{#35bf28}+0.82\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.7782ms 1.5612ms 640.5126 Ops/s 660.4498 Ops/s $\color{#d91a1a}-3.02\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4580ms 0.4060ms 2.4629 KOps/s 2.4623 KOps/s $\color{#35bf28}+0.03\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 30.9260ms 26.7028ms 37.4493 Ops/s 30.6776 Ops/s $\textbf{\color{#35bf28}+22.07\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.9884ms 1.7132ms 583.7114 Ops/s 582.1654 Ops/s $\color{#35bf28}+0.27\%$
test_dqn_speed[False-None] 1.5157ms 1.3824ms 723.3638 Ops/s 719.4603 Ops/s $\color{#35bf28}+0.54\%$
test_dqn_speed[False-backward] 1.9929ms 1.8982ms 526.8268 Ops/s 524.0775 Ops/s $\color{#35bf28}+0.52\%$
test_dqn_speed[True-None] 0.6457ms 0.5542ms 1.8044 KOps/s 1.8126 KOps/s $\color{#d91a1a}-0.45\%$
test_dqn_speed[True-backward] 1.0708ms 1.0238ms 976.7569 Ops/s 887.2004 Ops/s $\textbf{\color{#35bf28}+10.09\%}$
test_dqn_speed[reduce-overhead-None] 0.6651ms 0.5555ms 1.8001 KOps/s 1.7940 KOps/s $\color{#35bf28}+0.34\%$
test_ddpg_speed[False-None] 3.0805ms 2.8315ms 353.1642 Ops/s 346.7698 Ops/s $\color{#35bf28}+1.84\%$
test_ddpg_speed[False-backward] 4.2327ms 4.0843ms 244.8425 Ops/s 243.8456 Ops/s $\color{#35bf28}+0.41\%$
test_ddpg_speed[True-None] 1.5650ms 1.4291ms 699.7310 Ops/s 686.2672 Ops/s $\color{#35bf28}+1.96\%$
test_ddpg_speed[True-backward] 2.5178ms 2.4369ms 410.3582 Ops/s 340.1566 Ops/s $\textbf{\color{#35bf28}+20.64\%}$
test_ddpg_speed[reduce-overhead-None] 1.5399ms 1.4235ms 702.4849 Ops/s 690.9655 Ops/s $\color{#35bf28}+1.67\%$
test_sac_speed[False-None] 8.5841ms 8.0335ms 124.4789 Ops/s 125.0595 Ops/s $\color{#d91a1a}-0.46\%$
test_sac_speed[False-backward] 11.7138ms 11.3341ms 88.2291 Ops/s 89.1989 Ops/s $\color{#d91a1a}-1.09\%$
test_sac_speed[True-None] 2.4364ms 2.2294ms 448.5436 Ops/s 456.1894 Ops/s $\color{#d91a1a}-1.68\%$
test_sac_speed[True-backward] 4.3373ms 4.1742ms 239.5646 Ops/s 239.9652 Ops/s $\color{#d91a1a}-0.17\%$
test_sac_speed[reduce-overhead-None] 2.3430ms 2.2157ms 451.3183 Ops/s 445.9617 Ops/s $\color{#35bf28}+1.20\%$
test_redq_speed[False-None] 10.9465ms 10.4584ms 95.6172 Ops/s 96.0130 Ops/s $\color{#d91a1a}-0.41\%$
test_redq_speed[False-backward] 18.9679ms 18.2118ms 54.9094 Ops/s 55.9576 Ops/s $\color{#d91a1a}-1.87\%$
test_redq_speed[True-None] 4.8919ms 4.6894ms 213.2476 Ops/s 210.0917 Ops/s $\color{#35bf28}+1.50\%$
test_redq_speed[True-backward] 10.6411ms 10.2830ms 97.2474 Ops/s 101.0000 Ops/s $\color{#d91a1a}-3.72\%$
test_redq_speed[reduce-overhead-None] 5.2620ms 4.7307ms 211.3871 Ops/s 216.7303 Ops/s $\color{#d91a1a}-2.47\%$
test_redq_deprec_speed[False-None] 11.6825ms 11.2347ms 89.0100 Ops/s 88.4499 Ops/s $\color{#35bf28}+0.63\%$
test_redq_deprec_speed[False-backward] 16.8729ms 16.3004ms 61.3482 Ops/s 62.2296 Ops/s $\color{#d91a1a}-1.42\%$
test_redq_deprec_speed[True-None] 4.0967ms 3.8427ms 260.2331 Ops/s 258.2978 Ops/s $\color{#35bf28}+0.75\%$
test_redq_deprec_speed[True-backward] 8.2756ms 7.9630ms 125.5804 Ops/s 123.9594 Ops/s $\color{#35bf28}+1.31\%$
test_redq_deprec_speed[reduce-overhead-None] 4.1356ms 3.7675ms 265.4304 Ops/s 260.1315 Ops/s $\color{#35bf28}+2.04\%$
test_td3_speed[False-None] 8.1666ms 8.0431ms 124.3303 Ops/s 122.5881 Ops/s $\color{#35bf28}+1.42\%$
test_td3_speed[False-backward] 11.4487ms 10.9419ms 91.3918 Ops/s 91.2142 Ops/s $\color{#35bf28}+0.19\%$
test_td3_speed[True-None] 1.9790ms 1.9371ms 516.2327 Ops/s 522.3419 Ops/s $\color{#d91a1a}-1.17\%$
test_td3_speed[True-backward] 4.0077ms 3.7894ms 263.8926 Ops/s 248.3862 Ops/s $\textbf{\color{#35bf28}+6.24\%}$
test_td3_speed[reduce-overhead-None] 1.9049ms 1.8654ms 536.0771 Ops/s 540.5805 Ops/s $\color{#d91a1a}-0.83\%$
test_cql_speed[False-None] 27.3867ms 26.4116ms 37.8622 Ops/s 37.9608 Ops/s $\color{#d91a1a}-0.26\%$
test_cql_speed[False-backward] 38.7398ms 36.0073ms 27.7722 Ops/s 27.1316 Ops/s $\color{#35bf28}+2.36\%$
test_cql_speed[True-None] 13.6923ms 12.8155ms 78.0303 Ops/s 77.4299 Ops/s $\color{#35bf28}+0.78\%$
test_cql_speed[True-backward] 19.0022ms 18.4386ms 54.2340 Ops/s 54.0301 Ops/s $\color{#35bf28}+0.38\%$
test_cql_speed[reduce-overhead-None] 13.3124ms 12.8280ms 77.9543 Ops/s 77.3894 Ops/s $\color{#35bf28}+0.73\%$
test_a2c_speed[False-None] 5.8382ms 5.3693ms 186.2436 Ops/s 183.3367 Ops/s $\color{#35bf28}+1.59\%$
test_a2c_speed[False-backward] 12.5214ms 12.0078ms 83.2790 Ops/s 84.4766 Ops/s $\color{#d91a1a}-1.42\%$
test_a2c_speed[True-None] 4.0375ms 3.8066ms 262.6992 Ops/s 252.1848 Ops/s $\color{#35bf28}+4.17\%$
test_a2c_speed[True-backward] 8.9721ms 8.7176ms 114.7100 Ops/s 113.9517 Ops/s $\color{#35bf28}+0.67\%$
test_a2c_speed[reduce-overhead-None] 3.9167ms 3.7604ms 265.9285 Ops/s 263.3229 Ops/s $\color{#35bf28}+0.99\%$
test_ppo_speed[False-None] 6.2418ms 5.8170ms 171.9093 Ops/s 165.9772 Ops/s $\color{#35bf28}+3.57\%$
test_ppo_speed[False-backward] 13.0150ms 12.5153ms 79.9023 Ops/s 79.0214 Ops/s $\color{#35bf28}+1.11\%$
test_ppo_speed[True-None] 4.5649ms 3.7067ms 269.7835 Ops/s 271.2031 Ops/s $\color{#d91a1a}-0.52\%$
test_ppo_speed[True-backward] 9.0229ms 8.5450ms 117.0281 Ops/s 115.2725 Ops/s $\color{#35bf28}+1.52\%$
test_ppo_speed[reduce-overhead-None] 4.1376ms 3.6907ms 270.9549 Ops/s 269.1531 Ops/s $\color{#35bf28}+0.67\%$
test_reinforce_speed[False-None] 4.9508ms 4.5503ms 219.7658 Ops/s 192.5267 Ops/s $\textbf{\color{#35bf28}+14.15\%}$
test_reinforce_speed[False-backward] 7.8571ms 7.4343ms 134.5110 Ops/s 133.8061 Ops/s $\color{#35bf28}+0.53\%$
test_reinforce_speed[True-None] 4.2442ms 3.0142ms 331.7653 Ops/s 333.6241 Ops/s $\color{#d91a1a}-0.56\%$
test_reinforce_speed[True-backward] 8.1735ms 7.8185ms 127.9021 Ops/s 118.0998 Ops/s $\textbf{\color{#35bf28}+8.30\%}$
test_reinforce_speed[reduce-overhead-None] 3.4430ms 2.9180ms 342.7002 Ops/s 329.6436 Ops/s $\color{#35bf28}+3.96\%$
test_iql_speed[False-None] 20.3059ms 19.5734ms 51.0896 Ops/s 48.8412 Ops/s $\color{#35bf28}+4.60\%$
test_iql_speed[False-backward] 36.3111ms 30.6957ms 32.5779 Ops/s 31.9777 Ops/s $\color{#35bf28}+1.88\%$
test_iql_speed[True-None] 9.1055ms 8.7299ms 114.5487 Ops/s 112.0978 Ops/s $\color{#35bf28}+2.19\%$
test_iql_speed[True-backward] 17.5184ms 16.9235ms 59.0895 Ops/s 58.5697 Ops/s $\color{#35bf28}+0.89\%$
test_iql_speed[reduce-overhead-None] 9.8862ms 8.8095ms 113.5132 Ops/s 109.6964 Ops/s $\color{#35bf28}+3.48\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4223ms 6.0139ms 166.2816 Ops/s 163.7053 Ops/s $\color{#35bf28}+1.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.2283ms 0.3827ms 2.6127 KOps/s 3.4589 KOps/s $\textbf{\color{#d91a1a}-24.46\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8581ms 0.3632ms 2.7530 KOps/s 3.7150 KOps/s $\textbf{\color{#d91a1a}-25.90\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4837ms 5.9082ms 169.2568 Ops/s 170.1988 Ops/s $\color{#d91a1a}-0.55\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0023ms 0.3481ms 2.8726 KOps/s 2.8845 KOps/s $\color{#d91a1a}-0.41\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8171ms 0.2935ms 3.4069 KOps/s 3.0089 KOps/s $\textbf{\color{#35bf28}+13.23\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7724ms 1.2572ms 795.4120 Ops/s 775.5270 Ops/s $\color{#35bf28}+2.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3855ms 1.1801ms 847.3664 Ops/s 826.1254 Ops/s $\color{#35bf28}+2.57\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 12.7896ms 6.1990ms 161.3163 Ops/s 165.0645 Ops/s $\color{#d91a1a}-2.27\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0511ms 0.4807ms 2.0804 KOps/s 2.0562 KOps/s $\color{#35bf28}+1.18\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8791ms 0.4858ms 2.0583 KOps/s 2.1609 KOps/s $\color{#d91a1a}-4.75\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2971ms 5.8569ms 170.7376 Ops/s 169.3510 Ops/s $\color{#35bf28}+0.82\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9755ms 0.3758ms 2.6612 KOps/s 2.4839 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5365ms 0.3395ms 2.9459 KOps/s 3.1163 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4072ms 5.8585ms 170.6921 Ops/s 170.3813 Ops/s $\color{#35bf28}+0.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6168ms 0.2960ms 3.3783 KOps/s 2.5800 KOps/s $\textbf{\color{#35bf28}+30.94\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7587ms 0.3179ms 3.1459 KOps/s 2.7546 KOps/s $\textbf{\color{#35bf28}+14.21\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6828ms 6.0421ms 165.5064 Ops/s 165.6795 Ops/s $\color{#d91a1a}-0.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3815ms 0.5294ms 1.8889 KOps/s 2.0120 KOps/s $\textbf{\color{#d91a1a}-6.12\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9573ms 0.5128ms 1.9500 KOps/s 2.1487 KOps/s $\textbf{\color{#d91a1a}-9.25\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.5469ms 5.0918ms 196.3948 Ops/s 50.9153 Ops/s $\textbf{\color{#35bf28}+285.73\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.0611ms 1.9858ms 503.5838 Ops/s 544.9197 Ops/s $\textbf{\color{#d91a1a}-7.59\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.9987ms 0.8770ms 1.1402 KOps/s 1.1114 KOps/s $\color{#35bf28}+2.59\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.0207ms 5.0930ms 196.3484 Ops/s 194.7310 Ops/s $\color{#35bf28}+0.83\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.8164ms 1.9379ms 516.0244 Ops/s 560.8635 Ops/s $\textbf{\color{#d91a1a}-7.99\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.3788ms 1.2310ms 812.3262 Ops/s 811.4213 Ops/s $\color{#35bf28}+0.11\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5593s 16.3719ms 61.0803 Ops/s 190.6662 Ops/s $\textbf{\color{#d91a1a}-67.96\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 12.7061ms 2.1199ms 471.7282 Ops/s 517.0980 Ops/s $\textbf{\color{#d91a1a}-8.77\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.3230ms 1.0295ms 971.3569 Ops/s 900.8337 Ops/s $\textbf{\color{#35bf28}+7.83\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.7403ms 35.7111ms 28.0025 Ops/s 27.6703 Ops/s $\color{#35bf28}+1.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.1012ms 17.6437ms 56.6774 Ops/s 56.1417 Ops/s $\color{#35bf28}+0.95\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.3594ms 36.9240ms 27.0827 Ops/s 26.7727 Ops/s $\color{#35bf28}+1.16\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.3517ms 17.8763ms 55.9400 Ops/s 53.9390 Ops/s $\color{#35bf28}+3.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.3892ms 38.7641ms 25.7971 Ops/s 25.7017 Ops/s $\color{#35bf28}+0.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.4949ms 19.2616ms 51.9168 Ops/s 51.2259 Ops/s $\color{#35bf28}+1.35\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8935ms 0.2217ms 4.5115 KOps/s 4.6344 KOps/s $\color{#d91a1a}-2.65\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7220ms 1.4458ms 691.6619 Ops/s 698.1261 Ops/s $\color{#d91a1a}-0.93\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.8488ms 2.5161ms 397.4387 Ops/s 420.4499 Ops/s $\textbf{\color{#d91a1a}-5.47\%}$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.3981ms 3.0072ms 332.5323 Ops/s 332.0536 Ops/s $\color{#35bf28}+0.14\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2126ms 0.1353ms 7.3892 KOps/s 7.5459 KOps/s $\color{#d91a1a}-2.08\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3398ms 0.1863ms 5.3685 KOps/s 5.4618 KOps/s $\color{#d91a1a}-1.71\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.1055ms 1.8748ms 533.3780 Ops/s 536.7391 Ops/s $\color{#d91a1a}-0.63\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.6097ms 1.3966ms 716.0125 Ops/s 731.0381 Ops/s $\color{#d91a1a}-2.06\%$
test_collector_stack_then_write[50-img_shape0-small] 1.3594ms 1.1327ms 882.8220 Ops/s 894.5132 Ops/s $\color{#d91a1a}-1.31\%$
test_collector_stack_then_write[100-img_shape1-atari] 4.0484ms 3.6765ms 271.9972 Ops/s 279.5860 Ops/s $\color{#d91a1a}-2.71\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.2043ms 5.7967ms 172.5127 Ops/s 171.5271 Ops/s $\color{#35bf28}+0.57\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.9931ms 7.2978ms 137.0278 Ops/s 140.0691 Ops/s $\color{#d91a1a}-2.17\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.7206ms 0.2732ms 3.6599 KOps/s 3.5948 KOps/s $\color{#35bf28}+1.81\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 2.1203ms 1.5706ms 636.7162 Ops/s 638.9973 Ops/s $\color{#d91a1a}-0.36\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 3.0385ms 2.6513ms 377.1731 Ops/s 397.8374 Ops/s $\textbf{\color{#d91a1a}-5.19\%}$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.6740ms 3.2678ms 306.0167 Ops/s 310.4659 Ops/s $\color{#d91a1a}-1.43\%$
test_collector_without_rb[100-img_shape0-atari] 35.0569ms 33.8254ms 29.5636 Ops/s 29.7640 Ops/s $\color{#d91a1a}-0.67\%$
test_collector_without_rb[200-img_shape1-large_batch] 66.6236ms 66.0993ms 15.1288 Ops/s 15.0112 Ops/s $\color{#35bf28}+0.78\%$
test_collector_with_rb[100-img_shape0-atari] 38.1898ms 37.7028ms 26.5232 Ops/s 26.1599 Ops/s $\color{#35bf28}+1.39\%$
test_collector_with_rb[200-img_shape1-large_batch] 0.6761s 0.1213s 8.2415 Ops/s 13.3093 Ops/s $\textbf{\color{#d91a1a}-38.08\%}$
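
The Change column appears consistent with the relative difference between the two throughput columns, (Ops / Ops on Repo HEAD - 1) × 100. The bot's own formula is not stated in this thread, so the sketch below is an assumption checked against three rows of the table above; `percent_change` is a hypothetical helper name.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr / ops_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) copied from the CPU benchmark table.
rows = {
    "test_tensor_to_bytestream_speed[pickle]": (12.3790, 12.4523),
    "test_simple": (1.8370, 1.7232),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (196.3948, 50.9153),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {percent_change(ops_pr, ops_head):+.2f}%")
```

The printed values can be compared with the reported -0.59%, +6.60%, and +285.73% for these rows.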

github-actions bot (Contributor) commented Feb 4, 2026

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}10$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 78.9785μs 77.6050μs 12.8858 KOps/s 12.2459 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_tensor_to_bytestream_speed[torch.save] 0.1363ms 0.1360ms 7.3523 KOps/s 6.9827 KOps/s $\textbf{\color{#35bf28}+5.29\%}$
test_tensor_to_bytestream_speed[untyped_storage] 0.1006s 0.1002s 9.9839 Ops/s 9.9100 Ops/s $\color{#35bf28}+0.75\%$
test_tensor_to_bytestream_speed[numpy] 2.4367μs 2.4050μs 415.7943 KOps/s 417.9456 KOps/s $\color{#d91a1a}-0.51\%$
test_tensor_to_bytestream_speed[safetensors] 35.5730μs 35.3782μs 28.2660 KOps/s 26.7964 KOps/s $\textbf{\color{#35bf28}+5.48\%}$
test_simple 0.7852s 0.7812s 1.2800 Ops/s 1.2418 Ops/s $\color{#35bf28}+3.08\%$
test_transformed 1.5223s 1.4299s 0.6993 Ops/s 0.7010 Ops/s $\color{#d91a1a}-0.23\%$
test_serial 2.3754s 2.2827s 0.4381 Ops/s 0.4313 Ops/s $\color{#35bf28}+1.57\%$
test_parallel 2.0390s 1.9187s 0.5212 Ops/s 0.5161 Ops/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-True-True-True-True] 0.2615ms 43.2203μs 23.1373 KOps/s 23.2463 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[True-True-True-True-False] 56.0510μs 24.2647μs 41.2121 KOps/s 40.4589 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[True-True-True-False-True] 51.9610μs 24.1376μs 41.4292 KOps/s 41.7357 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[True-True-True-False-False] 58.7810μs 13.3430μs 74.9457 KOps/s 74.3626 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[True-True-False-True-True] 87.7610μs 45.9091μs 21.7822 KOps/s 21.5507 KOps/s $\color{#35bf28}+1.07\%$
test_step_mdp_speed[True-True-False-True-False] 57.5600μs 26.6938μs 37.4619 KOps/s 37.1717 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[True-True-False-False-True] 59.8310μs 26.1640μs 38.2205 KOps/s 36.8620 KOps/s $\color{#35bf28}+3.69\%$
test_step_mdp_speed[True-True-False-False-False] 62.3310μs 16.1048μs 62.0932 KOps/s 62.0549 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-False-True-True-True] 0.1244ms 47.3869μs 21.1029 KOps/s 20.3470 KOps/s $\color{#35bf28}+3.72\%$
test_step_mdp_speed[True-False-True-True-False] 60.8110μs 29.5554μs 33.8347 KOps/s 33.2621 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-False-True-False-True] 64.6400μs 27.0892μs 36.9151 KOps/s 36.9340 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[True-False-True-False-False] 50.1310μs 16.5541μs 60.4079 KOps/s 62.2070 KOps/s $\color{#d91a1a}-2.89\%$
test_step_mdp_speed[True-False-False-True-True] 87.0110μs 52.3651μs 19.0967 KOps/s 19.3766 KOps/s $\color{#d91a1a}-1.44\%$
test_step_mdp_speed[True-False-False-True-False] 64.1810μs 32.3839μs 30.8795 KOps/s 30.7813 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[True-False-False-False-True] 63.8700μs 29.3640μs 34.0553 KOps/s 33.7287 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-False-False-False-False] 53.6000μs 18.9105μs 52.8805 KOps/s 52.6016 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-True-True-True-True] 84.5910μs 49.5676μs 20.1745 KOps/s 20.0682 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-True-True-True-False] 60.0710μs 30.0728μs 33.2527 KOps/s 32.8652 KOps/s $\color{#35bf28}+1.18\%$
test_step_mdp_speed[False-True-True-False-True] 2.3424ms 31.1278μs 32.1256 KOps/s 31.9394 KOps/s $\color{#35bf28}+0.58\%$
test_step_mdp_speed[False-True-True-False-False] 48.1500μs 18.0226μs 55.4858 KOps/s 55.7276 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-True-False-True-True] 78.5010μs 51.2895μs 19.4972 KOps/s 19.3788 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[False-True-False-True-False] 64.3210μs 32.1467μs 31.1074 KOps/s 30.8809 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[False-True-False-False-True] 83.7810μs 33.2758μs 30.0519 KOps/s 29.8665 KOps/s $\color{#35bf28}+0.62\%$
test_step_mdp_speed[False-True-False-False-False] 47.5910μs 20.2147μs 49.4691 KOps/s 49.0666 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[False-False-True-True-True] 84.2500μs 54.2785μs 18.4235 KOps/s 18.3078 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[False-False-True-True-False] 64.9910μs 34.7166μs 28.8047 KOps/s 28.2376 KOps/s $\color{#35bf28}+2.01\%$
test_step_mdp_speed[False-False-True-False-True] 75.2700μs 33.2764μs 30.0513 KOps/s 29.9764 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[False-False-True-False-False] 54.0810μs 20.2968μs 49.2689 KOps/s 48.5275 KOps/s $\color{#35bf28}+1.53\%$
test_step_mdp_speed[False-False-False-True-True] 0.1185ms 56.3421μs 17.7487 KOps/s 17.9329 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[False-False-False-True-False] 74.2400μs 37.9133μs 26.3759 KOps/s 26.9597 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[False-False-False-False-True] 70.5210μs 34.8823μs 28.6679 KOps/s 28.3852 KOps/s $\color{#35bf28}+1.00\%$
test_step_mdp_speed[False-False-False-False-False] 52.0900μs 22.8084μs 43.8436 KOps/s 43.6607 KOps/s $\color{#35bf28}+0.42\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8616s 0.7601s 1.3156 Ops/s 1.3257 Ops/s $\color{#d91a1a}-0.76\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7222s 0.6247s 1.6007 Ops/s 1.6093 Ops/s $\color{#d91a1a}-0.53\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7382s 1.6598s 0.6025 Ops/s 0.6077 Ops/s $\color{#d91a1a}-0.86\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5103s 1.4337s 0.6975 Ops/s 0.7004 Ops/s $\color{#d91a1a}-0.42\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9862s 1.9048s 0.5250 Ops/s 0.5236 Ops/s $\color{#35bf28}+0.27\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7634s 1.6836s 0.5940 Ops/s 0.5900 Ops/s $\color{#35bf28}+0.67\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7097s 4.5973s 0.2175 Ops/s 0.2165 Ops/s $\color{#35bf28}+0.48\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5038s 4.4340s 0.2255 Ops/s 0.2240 Ops/s $\color{#35bf28}+0.68\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0033s 1.9249s 0.5195 Ops/s 0.5034 Ops/s $\color{#35bf28}+3.21\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7244s 1.6410s 0.6094 Ops/s 0.6099 Ops/s $\color{#d91a1a}-0.08\%$
test_values[generalized_advantage_estimate-True-True] 20.2762ms 19.7552ms 50.6196 Ops/s 49.9755 Ops/s $\color{#35bf28}+1.29\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1520s 3.9511ms 253.0937 Ops/s 277.5080 Ops/s $\textbf{\color{#d91a1a}-8.80\%}$
test_values[td0_return_estimate-False-False] 0.1061ms 81.3193μs 12.2972 KOps/s 12.2059 KOps/s $\color{#35bf28}+0.75\%$
test_values[td1_return_estimate-False-False] 47.4541ms 46.9452ms 21.3014 Ops/s 20.3530 Ops/s $\color{#35bf28}+4.66\%$
test_values[vec_td1_return_estimate-False-False] 1.2881ms 1.0758ms 929.5045 Ops/s 919.6676 Ops/s $\color{#35bf28}+1.07\%$
test_values[td_lambda_return_estimate-True-False] 80.6743ms 77.5041ms 12.9025 Ops/s 12.3676 Ops/s $\color{#35bf28}+4.33\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2609ms 1.0716ms 933.1717 Ops/s 924.8861 Ops/s $\color{#35bf28}+0.90\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.4590ms 20.0197ms 49.9509 Ops/s 49.0952 Ops/s $\color{#35bf28}+1.74\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0047ms 0.7417ms 1.3483 KOps/s 1.3341 KOps/s $\color{#35bf28}+1.07\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7185ms 0.6652ms 1.5034 KOps/s 1.4525 KOps/s $\color{#35bf28}+3.51\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5104ms 1.4781ms 676.5369 Ops/s 671.9742 Ops/s $\color{#35bf28}+0.68\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7389ms 0.6863ms 1.4570 KOps/s 1.4129 KOps/s $\color{#35bf28}+3.12\%$
test_dqn_speed[False-None] 1.5859ms 1.5072ms 663.4906 Ops/s 661.8800 Ops/s $\color{#35bf28}+0.24\%$
test_dqn_speed[False-backward] 2.2331ms 2.1571ms 463.5946 Ops/s 463.2328 Ops/s $\color{#35bf28}+0.08\%$
test_dqn_speed[True-None] 0.6077ms 0.5439ms 1.8386 KOps/s 1.7770 KOps/s $\color{#35bf28}+3.46\%$
test_dqn_speed[True-backward] 1.2276ms 1.1861ms 843.0867 Ops/s 841.3307 Ops/s $\color{#35bf28}+0.21\%$
test_dqn_speed[reduce-overhead-None] 0.6607ms 0.5677ms 1.7616 KOps/s 1.6099 KOps/s $\textbf{\color{#35bf28}+9.42\%}$
test_ddpg_speed[False-None] 3.2305ms 2.8489ms 351.0087 Ops/s 349.3012 Ops/s $\color{#35bf28}+0.49\%$
test_ddpg_speed[False-backward] 4.7223ms 4.2514ms 235.2154 Ops/s 237.4973 Ops/s $\color{#d91a1a}-0.96\%$
test_ddpg_speed[True-None] 1.4033ms 1.2802ms 781.1481 Ops/s 771.7045 Ops/s $\color{#35bf28}+1.22\%$
test_ddpg_speed[True-backward] 2.5258ms 2.4579ms 406.8515 Ops/s 428.1778 Ops/s $\color{#d91a1a}-4.98\%$
test_ddpg_speed[reduce-overhead-None] 1.3760ms 1.3042ms 766.7668 Ops/s 753.6600 Ops/s $\color{#35bf28}+1.74\%$
test_sac_speed[False-None] 8.8083ms 8.2438ms 121.3027 Ops/s 116.4663 Ops/s $\color{#35bf28}+4.15\%$
test_sac_speed[False-backward] 12.0310ms 11.4824ms 87.0896 Ops/s 87.6546 Ops/s $\color{#d91a1a}-0.64\%$
test_sac_speed[True-None] 1.8722ms 1.7663ms 566.1591 Ops/s 558.1934 Ops/s $\color{#35bf28}+1.43\%$
test_sac_speed[True-backward] 3.6661ms 3.5496ms 281.7194 Ops/s 293.0903 Ops/s $\color{#d91a1a}-3.88\%$
test_sac_speed[reduce-overhead-None] 18.9839ms 10.7907ms 92.6722 Ops/s 93.0885 Ops/s $\color{#d91a1a}-0.45\%$
test_redq_deprec_speed[False-None] 10.1286ms 9.3709ms 106.7128 Ops/s 106.9764 Ops/s $\color{#d91a1a}-0.25\%$
test_redq_deprec_speed[False-backward] 13.0364ms 12.5515ms 79.6718 Ops/s 80.8386 Ops/s $\color{#d91a1a}-1.44\%$
test_redq_deprec_speed[True-None] 2.5686ms 2.4663ms 405.4661 Ops/s 401.8642 Ops/s $\color{#35bf28}+0.90\%$
test_redq_deprec_speed[True-backward] 4.5960ms 4.2270ms 236.5723 Ops/s 242.9208 Ops/s $\color{#d91a1a}-2.61\%$
test_redq_deprec_speed[reduce-overhead-None] 16.4353ms 9.9802ms 100.1987 Ops/s 103.2683 Ops/s $\color{#d91a1a}-2.97\%$
test_td3_speed[False-None] 8.5907ms 8.1622ms 122.5164 Ops/s 115.6573 Ops/s $\textbf{\color{#35bf28}+5.93\%}$
test_td3_speed[False-backward] 11.1841ms 10.7072ms 93.3950 Ops/s 93.5975 Ops/s $\color{#d91a1a}-0.22\%$
test_td3_speed[True-None] 1.6327ms 1.5988ms 625.4799 Ops/s 626.0650 Ops/s $\color{#d91a1a}-0.09\%$
test_td3_speed[True-backward] 3.4246ms 3.2022ms 312.2834 Ops/s 322.6894 Ops/s $\color{#d91a1a}-3.22\%$
test_td3_speed[reduce-overhead-None] 65.2400ms 24.5901ms 40.6667 Ops/s 41.4467 Ops/s $\color{#d91a1a}-1.88\%$
test_cql_speed[False-None] 17.2842ms 17.0489ms 58.6550 Ops/s 58.1853 Ops/s $\color{#35bf28}+0.81\%$
test_cql_speed[False-backward] 23.1063ms 22.6500ms 44.1501 Ops/s 44.4264 Ops/s $\color{#d91a1a}-0.62\%$
test_cql_speed[True-None] 3.4541ms 3.2436ms 308.2995 Ops/s 312.5921 Ops/s $\color{#d91a1a}-1.37\%$
test_cql_speed[True-backward] 5.4539ms 5.3557ms 186.7165 Ops/s 188.7740 Ops/s $\color{#d91a1a}-1.09\%$
test_cql_speed[reduce-overhead-None] 18.9566ms 11.8418ms 84.4466 Ops/s 84.5094 Ops/s $\color{#d91a1a}-0.07\%$
test_a2c_speed[False-None] 4.1600ms 3.2076ms 311.7636 Ops/s 304.6415 Ops/s $\color{#35bf28}+2.34\%$
test_a2c_speed[False-backward] 6.7767ms 6.3579ms 157.2854 Ops/s 163.0205 Ops/s $\color{#d91a1a}-3.52\%$
test_a2c_speed[True-None] 1.3722ms 1.2977ms 770.5945 Ops/s 753.9855 Ops/s $\color{#35bf28}+2.20\%$
test_a2c_speed[True-backward] 3.1276ms 3.0493ms 327.9394 Ops/s 323.9886 Ops/s $\color{#35bf28}+1.22\%$
test_a2c_speed[reduce-overhead-None] 1.0590ms 0.9727ms 1.0280 KOps/s 1.0190 KOps/s $\color{#35bf28}+0.89\%$
test_ppo_speed[False-None] 3.9512ms 3.8104ms 262.4380 Ops/s 259.2319 Ops/s $\color{#35bf28}+1.24\%$
test_ppo_speed[False-backward] 7.5004ms 7.1282ms 140.2887 Ops/s 139.1331 Ops/s $\color{#35bf28}+0.83\%$
test_ppo_speed[True-None] 1.4832ms 1.3945ms 717.1004 Ops/s 708.1553 Ops/s $\color{#35bf28}+1.26\%$
test_ppo_speed[True-backward] 3.2318ms 3.1893ms 313.5477 Ops/s 327.3915 Ops/s $\color{#d91a1a}-4.23\%$
test_ppo_speed[reduce-overhead-None] 1.0952ms 1.0304ms 970.4633 Ops/s 932.5897 Ops/s $\color{#35bf28}+4.06\%$
test_reinforce_speed[False-None] 2.3210ms 2.2551ms 443.4313 Ops/s 434.2028 Ops/s $\color{#35bf28}+2.13\%$
test_reinforce_speed[False-backward] 3.4283ms 3.3880ms 295.1603 Ops/s 294.3604 Ops/s $\color{#35bf28}+0.27\%$
test_reinforce_speed[True-None] 1.3888ms 1.2584ms 794.6790 Ops/s 799.3350 Ops/s $\color{#d91a1a}-0.58\%$
test_reinforce_speed[True-backward] 3.1966ms 2.9791ms 335.6665 Ops/s 320.9346 Ops/s $\color{#35bf28}+4.59\%$
test_reinforce_speed[reduce-overhead-None] 17.5231ms 9.3123ms 107.3847 Ops/s 107.7515 Ops/s $\color{#d91a1a}-0.34\%$
test_iql_speed[False-None] 9.9649ms 9.3493ms 106.9597 Ops/s 106.2953 Ops/s $\color{#35bf28}+0.63\%$
test_iql_speed[False-backward] 13.7928ms 13.3053ms 75.1581 Ops/s 74.0191 Ops/s $\color{#35bf28}+1.54\%$
test_iql_speed[True-None] 2.2216ms 2.1246ms 470.6847 Ops/s 463.3204 Ops/s $\color{#35bf28}+1.59\%$
test_iql_speed[True-backward] 5.0022ms 4.7612ms 210.0326 Ops/s 211.1377 Ops/s $\color{#d91a1a}-0.52\%$
test_iql_speed[reduce-overhead-None] 17.8946ms 10.5346ms 94.9253 Ops/s 96.0105 Ops/s $\color{#d91a1a}-1.13\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8594ms 5.7287ms 174.5610 Ops/s 170.3884 Ops/s $\color{#35bf28}+2.45\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8453ms 0.2769ms 3.6111 KOps/s 2.6384 KOps/s $\textbf{\color{#35bf28}+36.87\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6342ms 0.2605ms 3.8382 KOps/s 2.7524 KOps/s $\textbf{\color{#35bf28}+39.45\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.8846ms 5.6536ms 176.8780 Ops/s 175.2952 Ops/s $\color{#35bf28}+0.90\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0691ms 0.2711ms 3.6885 KOps/s 3.6969 KOps/s $\color{#d91a1a}-0.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4705ms 0.2537ms 3.9415 KOps/s 3.9467 KOps/s $\color{#d91a1a}-0.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5369ms 1.2216ms 818.6307 Ops/s 826.6388 Ops/s $\color{#d91a1a}-0.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3615ms 1.1358ms 880.4267 Ops/s 763.3247 Ops/s $\textbf{\color{#35bf28}+15.34\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1343ms 5.8003ms 172.4048 Ops/s 172.7446 Ops/s $\color{#d91a1a}-0.20\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9416ms 0.4735ms 2.1118 KOps/s 1.8624 KOps/s $\textbf{\color{#35bf28}+13.39\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7068ms 0.4448ms 2.2480 KOps/s 1.9141 KOps/s $\textbf{\color{#35bf28}+17.44\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8327ms 5.6639ms 176.5565 Ops/s 177.2342 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8110ms 0.2999ms 3.3342 KOps/s 2.5785 KOps/s $\textbf{\color{#35bf28}+29.31\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5233ms 0.3029ms 3.3016 KOps/s 3.8806 KOps/s $\textbf{\color{#d91a1a}-14.92\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.8735ms 5.6102ms 178.2483 Ops/s 177.3135 Ops/s $\color{#35bf28}+0.53\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8357ms 0.3302ms 3.0284 KOps/s 3.6541 KOps/s $\textbf{\color{#d91a1a}-17.12\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4331ms 0.2785ms 3.5907 KOps/s 3.9220 KOps/s $\textbf{\color{#d91a1a}-8.45\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.8877ms 5.7783ms 173.0598 Ops/s 169.7945 Ops/s $\color{#35bf28}+1.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0060ms 0.5057ms 1.9774 KOps/s 2.2535 KOps/s $\textbf{\color{#d91a1a}-12.25\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6512ms 0.4409ms 2.2680 KOps/s 2.2750 KOps/s $\color{#d91a1a}-0.31\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.6268s 17.4779ms 57.2152 Ops/s 196.1238 Ops/s $\textbf{\color{#d91a1a}-70.83\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 12.7153ms 1.9890ms 502.7687 Ops/s 542.3331 Ops/s $\textbf{\color{#d91a1a}-7.30\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.2288ms 0.8882ms 1.1259 KOps/s 1.1016 KOps/s $\color{#35bf28}+2.20\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 11.1572ms 5.1120ms 195.6194 Ops/s 197.3482 Ops/s $\color{#d91a1a}-0.88\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.7260ms 1.9380ms 515.9868 Ops/s 490.9780 Ops/s $\textbf{\color{#35bf28}+5.09\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 12.0176ms 1.3225ms 756.1436 Ops/s 879.6296 Ops/s $\textbf{\color{#d91a1a}-14.04\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5636s 16.4397ms 60.8285 Ops/s 184.9449 Ops/s $\textbf{\color{#d91a1a}-67.11\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.0864ms 1.9099ms 523.5799 Ops/s 494.8979 Ops/s $\textbf{\color{#35bf28}+5.80\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.8880ms 1.0717ms 933.1002 Ops/s 669.5781 Ops/s $\textbf{\color{#35bf28}+39.36\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.9656ms 35.3235ms 28.3098 Ops/s 27.8972 Ops/s $\color{#35bf28}+1.48\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9312ms 18.1424ms 55.1196 Ops/s 56.3438 Ops/s $\color{#d91a1a}-2.17\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 39.8667ms 36.4002ms 27.4724 Ops/s 27.0843 Ops/s $\color{#35bf28}+1.43\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.0690ms 18.0216ms 55.4891 Ops/s 55.1077 Ops/s $\color{#35bf28}+0.69\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 39.5425ms 38.0947ms 26.2504 Ops/s 25.1310 Ops/s $\color{#35bf28}+4.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.4490ms 19.5071ms 51.2634 Ops/s 49.0054 Ops/s $\color{#35bf28}+4.61\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8743ms 0.2195ms 4.5566 KOps/s 4.4465 KOps/s $\color{#35bf28}+2.47\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.5559ms 1.3945ms 717.0788 Ops/s 699.0593 Ops/s $\color{#35bf28}+2.58\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.3833ms 2.2569ms 443.0864 Ops/s 431.6221 Ops/s $\color{#35bf28}+2.66\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0011ms 2.8735ms 348.0121 Ops/s 338.7304 Ops/s $\color{#35bf28}+2.74\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2717ms 0.1474ms 6.7848 KOps/s 6.6309 KOps/s $\color{#35bf28}+2.32\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3457ms 0.2017ms 4.9585 KOps/s 4.7254 KOps/s $\color{#35bf28}+4.93\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.8887ms 1.8131ms 551.5523 Ops/s 549.4327 Ops/s $\color{#35bf28}+0.39\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5110ms 1.3313ms 751.1569 Ops/s 771.3332 Ops/s $\color{#d91a1a}-2.62\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2696ms 1.1196ms 893.1537 Ops/s 906.5334 Ops/s $\color{#d91a1a}-1.48\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.6963ms 3.5248ms 283.7013 Ops/s 278.9101 Ops/s $\color{#35bf28}+1.72\%$
test_collector_stack_then_write[100-img_shape2-large_img] 5.9180ms 5.7050ms 175.2848 Ops/s 172.6446 Ops/s $\color{#35bf28}+1.53\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.2878ms 7.0973ms 140.8993 Ops/s 144.7511 Ops/s $\color{#d91a1a}-2.66\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4249ms 0.2777ms 3.6015 KOps/s 3.7300 KOps/s $\color{#d91a1a}-3.45\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6692ms 1.5230ms 656.5954 Ops/s 654.2516 Ops/s $\color{#35bf28}+0.36\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.5912ms 2.4074ms 415.3786 Ops/s 413.3534 Ops/s $\color{#35bf28}+0.49\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4747ms 3.1203ms 320.4784 Ops/s 319.8720 Ops/s $\color{#35bf28}+0.19\%$
test_collector_without_rb[100-img_shape0-atari] 34.6965ms 33.8350ms 29.5552 Ops/s 30.0691 Ops/s $\color{#d91a1a}-1.71\%$
test_collector_without_rb[200-img_shape1-large_batch] 67.4672ms 66.1723ms 15.1121 Ops/s 15.3239 Ops/s $\color{#d91a1a}-1.38\%$
test_collector_with_rb[100-img_shape0-atari] 39.4974ms 38.2970ms 26.1117 Ops/s 26.5460 Ops/s $\color{#d91a1a}-1.64\%$
test_collector_with_rb[200-img_shape1-large_batch] 77.0890ms 75.7225ms 13.2061 Ops/s 13.5528 Ops/s $\color{#d91a1a}-2.56\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 57.9786ms 56.5373ms 17.6874 Ops/s 17.9353 Ops/s $\color{#d91a1a}-1.38\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1147s 0.1121s 8.9225 Ops/s 9.0039 Ops/s $\color{#d91a1a}-0.90\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 0.7685s 97.8697ms 10.2177 Ops/s 17.1955 Ops/s $\textbf{\color{#d91a1a}-40.58\%}$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1200s 0.1173s 8.5281 Ops/s 8.7069 Ops/s $\color{#d91a1a}-2.05\%$

vmoens and others added 2 commits February 4, 2026 16:38
Simplify the multiprocessing start method logic to match torchrl_demo.py:
- Only set fork if no start method is already configured
- Remove confusing is_sphinx detection logic

This should reduce flakiness in CI caused by inconsistent multiprocessing
start method handling.

Co-authored-by: Cursor <cursoragent@cursor.com>

Standardize all tutorials to use the same multiprocessing pattern:
- Only set fork if no start method is already configured
- Remove confusing is_sphinx detection logic for multiprocessing
- Use get_start_method() for mp_context instead of is_sphinx conditional
- Consistent behavior when sphinx-gallery runs tutorials in parallel

This should fix flaky CI failures caused by tutorials trying to set
different multiprocessing start methods when run in parallel.

Updated tutorials:
- coding_ddpg.py
- coding_dqn.py
- coding_ppo.py
- dqn_with_rnn.py
- multi_task.py
- pretrained_models.py
- torchrl_envs.py

Co-authored-by: Cursor <cursoragent@cursor.com>
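
A minimal sketch of the pattern the two commit messages above describe: set `fork` only when no start method has been configured yet, then read the active method back with `get_start_method()` rather than branching on a Sphinx check. The helper name `ensure_start_method` is hypothetical and not taken from the tutorials; the `multiprocessing` calls are standard library. How the tutorials pass the resulting method on to their environments is not shown in this PR excerpt, so the sketch stops at building a plain `multiprocessing` context.

```python
import multiprocessing as mp


def ensure_start_method(preferred="fork"):
    """Hypothetical helper: configure `preferred` only if nothing is set yet.

    Returns the name of the start method that is active afterwards.
    """
    # allow_none=True reports None when no method has been fixed so far,
    # instead of silently locking in the platform default.
    if mp.get_start_method(allow_none=True) is None:
        # "fork" is not offered on every platform (e.g. Windows), so check first.
        if preferred in mp.get_all_start_methods():
            mp.set_start_method(preferred)
    # Whatever was configured earlier (by us or by another script) wins.
    return mp.get_start_method()


if __name__ == "__main__":
    method = ensure_start_method()
    # Build a context from the active method instead of hard-coding one.
    ctx = mp.get_context(method)
    print(f"active start method: {method}")
```

Because the helper never calls `set_start_method` once a method is in place, running it repeatedly in one process does not raise the `RuntimeError` that a second unconditional `set_start_method` call would.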
@vmoens vmoens merged commit 881e5ee into main Feb 4, 2026
117 of 118 checks passed

Labels

- BugFix
- CI: Has to do with CI setup (e.g. wheels & builds, tests...)
- CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
- Documentation: Improvements or additions to documentation
- tutorials/
