[BugFix] ParallelEnv: fix shutdown hang with shared-memory done flags #3464

Merged: vmoens merged 1 commit into main from fix/parallel-env-shutdown-hang on Feb 8, 2026
vmoens (Collaborator) commented Feb 7, 2026

Summary

  • After the SHM done-flags optimization (#3457, "[Perf] ParallelEnv: replace mp.Event with shared-memory done flags"), _shutdown_workers used _wait_for_workers, which relies on connection_wait to detect pipe closure. However, the "close" handler closes the pipe without sending data, and connection_wait does not reliably detect socketpair closure on all platforms (notably macOS with forked workers), so env.close() could hang indefinitely.
  • The fix reverts _shutdown_workers to mp.Event-based waiting and ensures that the shared-memory worker signals mp_event in both the "close" and "load_state_dict" handlers, alongside _signal_done().
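
The event-based handshake described above can be illustrated with a minimal, standalone sketch. This is not the TorchRL implementation: `worker_loop`, `shutdown_workers`, and `demo` are hypothetical names, the timeout is an added assumption, and threads stand in for worker processes so that the example runs anywhere. The point it shows is that the worker sets its event before closing the pipe, so the parent waits on an explicit signal instead of inferring shutdown from pipe closure.

```python
import multiprocessing as mp
import threading


def worker_loop(child_pipe, mp_event):
    """Hypothetical worker loop that only understands the "close" command."""
    while True:
        cmd = child_pipe.recv()
        if cmd == "close":
            # Acknowledge explicitly before closing the pipe, so the parent
            # does not have to infer shutdown from pipe closure alone.
            mp_event.set()
            child_pipe.close()
            return


def shutdown_workers(parent_pipes, events, timeout=5.0):
    """Send "close" to every worker, then wait on its event with a timeout."""
    for pipe in parent_pipes:
        pipe.send("close")
    # Event.wait returns True if the event was set before the timeout expired.
    acknowledged = [event.wait(timeout) for event in events]
    for pipe in parent_pipes:
        pipe.close()
    return all(acknowledged)


def demo(num_workers=3):
    """Run the handshake with threads standing in for worker processes."""
    parent_pipes, events, threads = [], [], []
    for _ in range(num_workers):
        parent_pipe, child_pipe = mp.Pipe()
        event = mp.Event()
        thread = threading.Thread(
            target=worker_loop, args=(child_pipe, event), daemon=True
        )
        thread.start()
        parent_pipes.append(parent_pipe)
        events.append(event)
        threads.append(thread)
    ok = shutdown_workers(parent_pipes, events)
    for thread in threads:
        thread.join(timeout=5.0)
    return ok
```

In this sketch, `shutdown_workers` returns `False` instead of blocking forever if a worker never acknowledges the command, which is one way to make a shutdown hang visible in tests.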

Test plan

  • test/test_envs.py::TestNonTensorEnv::test_parallel no longer hangs (the hang reproduced reliably before the fix)
  • Existing parallel env tests pass

Made with Cursor

After the SHM done-flags optimization, _shutdown_workers was changed
to use _wait_for_workers which relies on connection_wait to detect
pipe closure. However, the "close" handler closes the pipe without
sending data, and connection_wait does not reliably detect socketpair
closure on all platforms (notably macOS with forked workers).

Fix by reverting _shutdown_workers to mp.Event-based waiting, and
ensuring the shared-memory worker signals mp_event in both the
"close" and "load_state_dict" handlers alongside _signal_done().

Co-authored-by: Cursor <[email protected]>

pytorch-bot bot commented Feb 7, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3464

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Pending

As of commit 741562e with merge base 73b853b:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Feb 7, 2026
github-actions bot (Contributor) commented Feb 7, 2026

⚠️ PR Title Label Error

Unknown or invalid prefix [Fix].

Current title: [Fix] ParallelEnv: fix shutdown hang with shared-memory done flags

Supported Prefixes (case-sensitive)

Your PR title must start with exactly one of these prefixes:

| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |

Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).

github-actions bot (Contributor) commented Feb 7, 2026

⚠️ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: 21. Worsened: 8.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 82.0226μs 80.6266μs 12.4029 KOps/s 12.5826 KOps/s $\color{#d91a1a}-1.43\%$
test_tensor_to_bytestream_speed[torch.save] 0.1394ms 0.1390ms 7.1955 KOps/s 7.1700 KOps/s $\color{#35bf28}+0.36\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1124s 0.1121s 8.9188 Ops/s 8.3761 Ops/s $\textbf{\color{#35bf28}+6.48\%}$
test_tensor_to_bytestream_speed[numpy] 2.5756μs 2.5638μs 390.0477 KOps/s 372.2061 KOps/s $\color{#35bf28}+4.79\%$
test_tensor_to_bytestream_speed[safetensors] 36.2313μs 35.9388μs 27.8251 KOps/s 27.7539 KOps/s $\color{#35bf28}+0.26\%$
test_simple 0.5489s 0.5478s 1.8256 Ops/s 1.7259 Ops/s $\textbf{\color{#35bf28}+5.78\%}$
test_transformed 1.2481s 1.1542s 0.8664 Ops/s 0.8562 Ops/s $\color{#35bf28}+1.19\%$
test_serial 1.7354s 1.6961s 0.5896 Ops/s 0.5825 Ops/s $\color{#35bf28}+1.22\%$
test_parallel 1.1436s 1.0528s 0.9499 Ops/s 0.9428 Ops/s $\color{#35bf28}+0.75\%$
test_step_mdp_speed[True-True-True-True-True] 0.2515ms 44.1133μs 22.6689 KOps/s 23.1151 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[True-True-True-True-False] 58.5510μs 25.1201μs 39.8087 KOps/s 40.2668 KOps/s $\color{#d91a1a}-1.14\%$
test_step_mdp_speed[True-True-True-False-True] 59.5610μs 24.7204μs 40.4524 KOps/s 40.3075 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-True-True-False-False] 43.6110μs 13.7760μs 72.5898 KOps/s 73.0118 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-False-True-True] 0.1060ms 45.8786μs 21.7967 KOps/s 21.2595 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[True-True-False-True-False] 99.1710μs 27.5710μs 36.2700 KOps/s 36.4761 KOps/s $\color{#d91a1a}-0.56\%$
test_step_mdp_speed[True-True-False-False-True] 61.8600μs 27.4813μs 36.3884 KOps/s 36.2759 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-True-False-False-False] 98.2220μs 16.7291μs 59.7762 KOps/s 60.8879 KOps/s $\color{#d91a1a}-1.83\%$
test_step_mdp_speed[True-False-True-True-True] 0.1108ms 50.1160μs 19.9537 KOps/s 20.1681 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[True-False-True-True-False] 60.9310μs 30.9654μs 32.2941 KOps/s 32.7524 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-False-True-False-True] 67.0310μs 27.2909μs 36.6423 KOps/s 35.6723 KOps/s $\color{#35bf28}+2.72\%$
test_step_mdp_speed[True-False-True-False-False] 47.4800μs 16.7099μs 59.8447 KOps/s 60.3851 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[True-False-False-True-True] 0.1476ms 50.8768μs 19.6553 KOps/s 19.0157 KOps/s $\color{#35bf28}+3.36\%$
test_step_mdp_speed[True-False-False-True-False] 79.9810μs 33.0914μs 30.2193 KOps/s 30.2339 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[True-False-False-False-True] 65.6510μs 30.0865μs 33.2375 KOps/s 33.1485 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-False-False-False-False] 48.3300μs 19.2061μs 52.0667 KOps/s 51.9612 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[False-True-True-True-True] 97.0110μs 49.2469μs 20.3059 KOps/s 19.6761 KOps/s $\color{#35bf28}+3.20\%$
test_step_mdp_speed[False-True-True-True-False] 65.0210μs 30.2465μs 33.0617 KOps/s 32.9647 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-True-True-False-True] 2.3910ms 32.0749μs 31.1770 KOps/s 31.5390 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[False-True-True-False-False] 60.1210μs 18.2871μs 54.6832 KOps/s 55.1914 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-True-False-True-True] 92.2720μs 52.7575μs 18.9546 KOps/s 19.3514 KOps/s $\color{#d91a1a}-2.05\%$
test_step_mdp_speed[False-True-False-True-False] 74.6410μs 33.2607μs 30.0655 KOps/s 30.5554 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-True-False-False-True] 0.1127ms 33.8193μs 29.5689 KOps/s 29.5903 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[False-True-False-False-False] 57.5510μs 20.9200μs 47.8012 KOps/s 48.5808 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-False-True-True-True] 0.1170ms 55.8506μs 17.9049 KOps/s 18.1787 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[False-False-True-True-False] 86.8910μs 36.5880μs 27.3314 KOps/s 27.9362 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[False-False-True-False-True] 86.8910μs 33.9604μs 29.4461 KOps/s 29.4299 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-False-True-False-False] 86.0610μs 21.0506μs 47.5046 KOps/s 47.9659 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[False-False-False-True-True] 0.1073ms 57.2649μs 17.4627 KOps/s 17.5533 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[False-False-False-True-False] 82.6020μs 38.6358μs 25.8827 KOps/s 26.1991 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[False-False-False-False-True] 87.6410μs 36.5442μs 27.3642 KOps/s 27.7156 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[False-False-False-False-False] 64.5910μs 23.4454μs 42.6523 KOps/s 43.2797 KOps/s $\color{#d91a1a}-1.45\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8616s 0.7650s 1.3072 Ops/s 1.2969 Ops/s $\color{#35bf28}+0.80\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7230s 0.6278s 1.5928 Ops/s 1.5705 Ops/s $\color{#35bf28}+1.42\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7489s 1.6678s 0.5996 Ops/s 0.5929 Ops/s $\color{#35bf28}+1.12\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5260s 1.4476s 0.6908 Ops/s 0.6847 Ops/s $\color{#35bf28}+0.89\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9988s 1.9193s 0.5210 Ops/s 0.5149 Ops/s $\color{#35bf28}+1.18\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7764s 1.6975s 0.5891 Ops/s 0.5812 Ops/s $\color{#35bf28}+1.37\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7839s 4.6851s 0.2134 Ops/s 0.2146 Ops/s $\color{#d91a1a}-0.52\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5966s 4.5284s 0.2208 Ops/s 0.2204 Ops/s $\color{#35bf28}+0.21\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9613s 1.8943s 0.5279 Ops/s 0.5209 Ops/s $\color{#35bf28}+1.34\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7494s 1.6446s 0.6080 Ops/s 0.6089 Ops/s $\color{#d91a1a}-0.14\%$
test_values[generalized_advantage_estimate-True-True] 10.9791ms 10.5338ms 94.9326 Ops/s 89.9087 Ops/s $\textbf{\color{#35bf28}+5.59\%}$
test_values[vec_generalized_advantage_estimate-True-True] 19.8694ms 17.9533ms 55.7001 Ops/s 55.5185 Ops/s $\color{#35bf28}+0.33\%$
test_values[td0_return_estimate-False-False] 0.2097ms 0.1315ms 7.6033 KOps/s 7.7038 KOps/s $\color{#d91a1a}-1.30\%$
test_values[td1_return_estimate-False-False] 30.9170ms 28.9288ms 34.5676 Ops/s 32.6163 Ops/s $\textbf{\color{#35bf28}+5.98\%}$
test_values[vec_td1_return_estimate-False-False] 19.2665ms 18.0992ms 55.2511 Ops/s 54.7974 Ops/s $\color{#35bf28}+0.83\%$
test_values[td_lambda_return_estimate-True-False] 45.2819ms 42.6885ms 23.4255 Ops/s 22.0003 Ops/s $\textbf{\color{#35bf28}+6.48\%}$
test_values[vec_td_lambda_return_estimate-True-False] 19.1562ms 18.0235ms 55.4831 Ops/s 55.2647 Ops/s $\color{#35bf28}+0.40\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.4608ms 9.3318ms 107.1609 Ops/s 99.9786 Ops/s $\textbf{\color{#35bf28}+7.18\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8436ms 1.5680ms 637.7726 Ops/s 636.2136 Ops/s $\color{#35bf28}+0.25\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5540ms 0.4334ms 2.3071 KOps/s 2.2279 KOps/s $\color{#35bf28}+3.55\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.2415ms 34.6569ms 28.8542 Ops/s 28.4196 Ops/s $\color{#35bf28}+1.53\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.8778ms 1.7531ms 570.4174 Ops/s 567.8963 Ops/s $\color{#35bf28}+0.44\%$
test_dqn_speed[False-None] 1.5241ms 1.4118ms 708.3340 Ops/s 695.5812 Ops/s $\color{#35bf28}+1.83\%$
test_dqn_speed[False-backward] 2.0485ms 1.9433ms 514.5971 Ops/s 510.5470 Ops/s $\color{#35bf28}+0.79\%$
test_dqn_speed[True-None] 0.6650ms 0.5621ms 1.7791 KOps/s 1.7207 KOps/s $\color{#35bf28}+3.39\%$
test_dqn_speed[True-backward] 1.1015ms 1.0346ms 966.5474 Ops/s 874.8548 Ops/s $\textbf{\color{#35bf28}+10.48\%}$
test_dqn_speed[reduce-overhead-None] 0.7106ms 0.5532ms 1.8078 KOps/s 1.7839 KOps/s $\color{#35bf28}+1.34\%$
test_ddpg_speed[False-None] 3.1787ms 2.8993ms 344.9052 Ops/s 345.7039 Ops/s $\color{#d91a1a}-0.23\%$
test_ddpg_speed[False-backward] 4.2668ms 4.1679ms 239.9280 Ops/s 240.5488 Ops/s $\color{#d91a1a}-0.26\%$
test_ddpg_speed[True-None] 1.5978ms 1.4487ms 690.2876 Ops/s 689.6902 Ops/s $\color{#35bf28}+0.09\%$
test_ddpg_speed[True-backward] 2.5758ms 2.4685ms 405.0962 Ops/s 353.2529 Ops/s $\textbf{\color{#35bf28}+14.68\%}$
test_ddpg_speed[reduce-overhead-None] 1.5903ms 1.4405ms 694.1899 Ops/s 665.3126 Ops/s $\color{#35bf28}+4.34\%$
test_sac_speed[False-None] 8.7629ms 8.1989ms 121.9672 Ops/s 121.7378 Ops/s $\color{#35bf28}+0.19\%$
test_sac_speed[False-backward] 12.2319ms 11.6457ms 85.8684 Ops/s 86.4500 Ops/s $\color{#d91a1a}-0.67\%$
test_sac_speed[True-None] 2.4185ms 2.2099ms 452.5157 Ops/s 447.5352 Ops/s $\color{#35bf28}+1.11\%$
test_sac_speed[True-backward] 4.2329ms 4.1433ms 241.3556 Ops/s 204.9417 Ops/s $\textbf{\color{#35bf28}+17.77\%}$
test_sac_speed[reduce-overhead-None] 2.3214ms 2.2073ms 453.0444 Ops/s 438.3236 Ops/s $\color{#35bf28}+3.36\%$
test_redq_speed[False-None] 15.7286ms 10.9651ms 91.1982 Ops/s 87.5744 Ops/s $\color{#35bf28}+4.14\%$
test_redq_speed[False-backward] 19.0748ms 17.9895ms 55.5879 Ops/s 54.3318 Ops/s $\color{#35bf28}+2.31\%$
test_redq_speed[True-None] 4.7408ms 4.5467ms 219.9414 Ops/s 214.3928 Ops/s $\color{#35bf28}+2.59\%$
test_redq_speed[True-backward] 10.2965ms 10.0685ms 99.3201 Ops/s 97.5472 Ops/s $\color{#35bf28}+1.82\%$
test_redq_speed[reduce-overhead-None] 4.7913ms 4.5051ms 221.9691 Ops/s 218.5061 Ops/s $\color{#35bf28}+1.58\%$
test_redq_deprec_speed[False-None] 11.9158ms 11.3622ms 88.0115 Ops/s 87.7071 Ops/s $\color{#35bf28}+0.35\%$
test_redq_deprec_speed[False-backward] 16.6904ms 16.2880ms 61.3949 Ops/s 60.6024 Ops/s $\color{#35bf28}+1.31\%$
test_redq_deprec_speed[True-None] 4.2441ms 3.8273ms 261.2820 Ops/s 259.9169 Ops/s $\color{#35bf28}+0.53\%$
test_redq_deprec_speed[True-backward] 8.2297ms 7.9538ms 125.7267 Ops/s 119.5643 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.9755ms 3.6915ms 270.8956 Ops/s 253.2144 Ops/s $\textbf{\color{#35bf28}+6.98\%}$
test_td3_speed[False-None] 8.4493ms 8.1926ms 122.0617 Ops/s 122.3539 Ops/s $\color{#d91a1a}-0.24\%$
test_td3_speed[False-backward] 11.6297ms 11.2125ms 89.1863 Ops/s 89.8117 Ops/s $\color{#d91a1a}-0.70\%$
test_td3_speed[True-None] 1.9486ms 1.8931ms 528.2406 Ops/s 526.9732 Ops/s $\color{#35bf28}+0.24\%$
test_td3_speed[True-backward] 3.9258ms 3.7480ms 266.8124 Ops/s 238.3071 Ops/s $\textbf{\color{#35bf28}+11.96\%}$
test_td3_speed[reduce-overhead-None] 1.8802ms 1.8525ms 539.8122 Ops/s 530.2465 Ops/s $\color{#35bf28}+1.80\%$
test_cql_speed[False-None] 30.1306ms 26.9623ms 37.0888 Ops/s 38.0795 Ops/s $\color{#d91a1a}-2.60\%$
test_cql_speed[False-backward] 38.9717ms 36.0393ms 27.7475 Ops/s 27.3540 Ops/s $\color{#35bf28}+1.44\%$
test_cql_speed[True-None] 13.0591ms 12.8078ms 78.0774 Ops/s 77.2552 Ops/s $\color{#35bf28}+1.06\%$
test_cql_speed[True-backward] 19.6760ms 18.6924ms 53.4977 Ops/s 54.8573 Ops/s $\color{#d91a1a}-2.48\%$
test_cql_speed[reduce-overhead-None] 15.7546ms 12.9292ms 77.3444 Ops/s 77.9409 Ops/s $\color{#d91a1a}-0.77\%$
test_a2c_speed[False-None] 5.6722ms 5.4985ms 181.8672 Ops/s 177.8685 Ops/s $\color{#35bf28}+2.25\%$
test_a2c_speed[False-backward] 12.5358ms 12.2353ms 81.7305 Ops/s 80.8859 Ops/s $\color{#35bf28}+1.04\%$
test_a2c_speed[True-None] 3.9826ms 3.7642ms 265.6618 Ops/s 250.4485 Ops/s $\textbf{\color{#35bf28}+6.07\%}$
test_a2c_speed[True-backward] 9.5314ms 8.9341ms 111.9310 Ops/s 111.7385 Ops/s $\color{#35bf28}+0.17\%$
test_a2c_speed[reduce-overhead-None] 3.9382ms 3.7851ms 264.1906 Ops/s 264.3525 Ops/s $\color{#d91a1a}-0.06\%$
test_ppo_speed[False-None] 6.6627ms 6.0269ms 165.9218 Ops/s 165.3119 Ops/s $\color{#35bf28}+0.37\%$
test_ppo_speed[False-backward] 13.1750ms 12.9284ms 77.3492 Ops/s 77.7190 Ops/s $\color{#d91a1a}-0.48\%$
test_ppo_speed[True-None] 4.1003ms 3.7115ms 269.4294 Ops/s 256.7617 Ops/s $\color{#35bf28}+4.93\%$
test_ppo_speed[True-backward] 8.9292ms 8.7388ms 114.4317 Ops/s 101.9981 Ops/s $\textbf{\color{#35bf28}+12.19\%}$
test_ppo_speed[reduce-overhead-None] 3.8290ms 3.6966ms 270.5202 Ops/s 269.9580 Ops/s $\color{#35bf28}+0.21\%$
test_reinforce_speed[False-None] 4.9695ms 4.6785ms 213.7439 Ops/s 215.0863 Ops/s $\color{#d91a1a}-0.62\%$
test_reinforce_speed[False-backward] 7.9268ms 7.6294ms 131.0713 Ops/s 132.7341 Ops/s $\color{#d91a1a}-1.25\%$
test_reinforce_speed[True-None] 3.1122ms 2.9573ms 338.1471 Ops/s 335.2752 Ops/s $\color{#35bf28}+0.86\%$
test_reinforce_speed[True-backward] 8.2247ms 7.9739ms 125.4097 Ops/s 115.3286 Ops/s $\textbf{\color{#35bf28}+8.74\%}$
test_reinforce_speed[reduce-overhead-None] 3.0723ms 2.9667ms 337.0720 Ops/s 325.3985 Ops/s $\color{#35bf28}+3.59\%$
test_iql_speed[False-None] 27.0052ms 20.7905ms 48.0989 Ops/s 47.2792 Ops/s $\color{#35bf28}+1.73\%$
test_iql_speed[False-backward] 37.8586ms 31.5088ms 31.7372 Ops/s 31.7134 Ops/s $\color{#35bf28}+0.08\%$
test_iql_speed[True-None] 9.2332ms 8.8118ms 113.4839 Ops/s 108.9465 Ops/s $\color{#35bf28}+4.16\%$
test_iql_speed[True-backward] 17.7468ms 17.2817ms 57.8648 Ops/s 57.5011 Ops/s $\color{#35bf28}+0.63\%$
test_iql_speed[reduce-overhead-None] 10.3884ms 8.9879ms 111.2603 Ops/s 112.4171 Ops/s $\color{#d91a1a}-1.03\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1636ms 6.0091ms 166.4131 Ops/s 165.7785 Ops/s $\color{#35bf28}+0.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8248ms 0.2963ms 3.3753 KOps/s 3.1376 KOps/s $\textbf{\color{#35bf28}+7.57\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6840ms 0.3215ms 3.1101 KOps/s 3.3431 KOps/s $\textbf{\color{#d91a1a}-6.97\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0019ms 5.7625ms 173.5347 Ops/s 172.5229 Ops/s $\color{#35bf28}+0.59\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.2021ms 0.3512ms 2.8477 KOps/s 2.7910 KOps/s $\color{#35bf28}+2.03\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5988ms 0.3444ms 2.9032 KOps/s 3.7161 KOps/s $\textbf{\color{#d91a1a}-21.87\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6182ms 1.4260ms 701.2757 Ops/s 763.6916 Ops/s $\textbf{\color{#d91a1a}-8.17\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6066ms 1.3397ms 746.4293 Ops/s 814.3281 Ops/s $\textbf{\color{#d91a1a}-8.34\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.3412ms 6.1092ms 163.6868 Ops/s 168.6841 Ops/s $\color{#d91a1a}-2.96\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2040ms 0.4854ms 2.0600 KOps/s 2.1440 KOps/s $\color{#d91a1a}-3.92\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7760ms 0.4760ms 2.1007 KOps/s 2.3106 KOps/s $\textbf{\color{#d91a1a}-9.09\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8834ms 5.7524ms 173.8397 Ops/s 172.9597 Ops/s $\color{#35bf28}+0.51\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6246ms 0.3931ms 2.5436 KOps/s 2.8651 KOps/s $\textbf{\color{#d91a1a}-11.22\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6238ms 0.3835ms 2.6073 KOps/s 3.0549 KOps/s $\textbf{\color{#d91a1a}-14.65\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9610ms 5.6930ms 175.6558 Ops/s 174.0152 Ops/s $\color{#35bf28}+0.94\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1382ms 0.3558ms 2.8107 KOps/s 2.7538 KOps/s $\color{#35bf28}+2.07\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5507ms 0.3202ms 3.1233 KOps/s 3.1120 KOps/s $\color{#35bf28}+0.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0232ms 5.9226ms 168.8447 Ops/s 169.3701 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0211ms 0.4846ms 2.0635 KOps/s 1.9995 KOps/s $\color{#35bf28}+3.20\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7790ms 0.4656ms 2.1479 KOps/s 2.2071 KOps/s $\color{#d91a1a}-2.68\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.3882ms 4.9801ms 200.7992 Ops/s 56.5031 Ops/s $\textbf{\color{#35bf28}+255.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.5805ms 2.1995ms 454.6412 Ops/s 474.0532 Ops/s $\color{#d91a1a}-4.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0999ms 0.8992ms 1.1121 KOps/s 1.1020 KOps/s $\color{#35bf28}+0.92\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5381s 15.7710ms 63.4074 Ops/s 196.0703 Ops/s $\textbf{\color{#d91a1a}-67.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.0045ms 1.9434ms 514.5553 Ops/s 482.9743 Ops/s $\textbf{\color{#35bf28}+6.54\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.6868ms 1.2084ms 827.5069 Ops/s 842.3026 Ops/s $\color{#d91a1a}-1.76\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.7258ms 5.2593ms 190.1397 Ops/s 59.7655 Ops/s $\textbf{\color{#35bf28}+218.14\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.4663ms 1.9517ms 512.3718 Ops/s 478.9001 Ops/s $\textbf{\color{#35bf28}+6.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.2314ms 1.0075ms 992.5737 Ops/s 930.0427 Ops/s $\textbf{\color{#35bf28}+6.72\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 41.5408ms 36.7893ms 27.1818 Ops/s 27.0542 Ops/s $\color{#35bf28}+0.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.9371ms 19.1964ms 52.0930 Ops/s 51.3538 Ops/s $\color{#35bf28}+1.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 41.7631ms 38.1035ms 26.2443 Ops/s 26.1970 Ops/s $\color{#35bf28}+0.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 21.5058ms 19.1755ms 52.1499 Ops/s 51.7297 Ops/s $\color{#35bf28}+0.81\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 42.8130ms 39.9585ms 25.0260 Ops/s 25.2472 Ops/s $\color{#d91a1a}-0.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.5813ms 20.6920ms 48.3278 Ops/s 48.9077 Ops/s $\color{#d91a1a}-1.19\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8617ms 0.2268ms 4.4092 KOps/s 4.4446 KOps/s $\color{#d91a1a}-0.80\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.8014ms 1.4334ms 697.6276 Ops/s 709.0296 Ops/s $\color{#d91a1a}-1.61\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.9392ms 2.3660ms 422.6523 Ops/s 426.5003 Ops/s $\color{#d91a1a}-0.90\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0858ms 2.9544ms 338.4836 Ops/s 341.0353 Ops/s $\color{#d91a1a}-0.75\%$
test_storage_write_contiguous[50-img_shape0-small] 0.5032ms 0.1353ms 7.3935 KOps/s 7.3600 KOps/s $\color{#35bf28}+0.45\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3416ms 0.1818ms 5.4990 KOps/s 5.2735 KOps/s $\color{#35bf28}+4.28\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9791ms 1.8029ms 554.6664 Ops/s 565.0495 Ops/s $\color{#d91a1a}-1.84\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.7799ms 1.3559ms 737.4994 Ops/s 760.9802 Ops/s $\color{#d91a1a}-3.09\%$
test_collector_stack_then_write[50-img_shape0-small] 1.3671ms 1.1148ms 896.9918 Ops/s 898.1647 Ops/s $\color{#d91a1a}-0.13\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.6375ms 3.5169ms 284.3437 Ops/s 283.2367 Ops/s $\color{#35bf28}+0.39\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.1738ms 5.7809ms 172.9834 Ops/s 176.4991 Ops/s $\color{#d91a1a}-1.99\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.5562ms 7.2412ms 138.0982 Ops/s 143.8447 Ops/s $\color{#d91a1a}-3.99\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4270ms 0.2772ms 3.6075 KOps/s 3.5151 KOps/s $\color{#35bf28}+2.63\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7137ms 1.5465ms 646.6023 Ops/s 641.4753 Ops/s $\color{#35bf28}+0.80\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8696ms 2.4603ms 406.4589 Ops/s 402.5203 Ops/s $\color{#35bf28}+0.98\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3759ms 3.1805ms 314.4144 Ops/s 320.9981 Ops/s $\color{#d91a1a}-2.05\%$
test_collector_without_rb[100-img_shape0-atari] 34.9888ms 34.6069ms 28.8959 Ops/s 28.9492 Ops/s $\color{#d91a1a}-0.18\%$
test_collector_without_rb[200-img_shape1-large_batch] 68.3726ms 67.8845ms 14.7309 Ops/s 14.6883 Ops/s $\color{#35bf28}+0.29\%$
test_collector_with_rb[100-img_shape0-atari] 39.9256ms 39.2542ms 25.4749 Ops/s 25.7756 Ops/s $\color{#d91a1a}-1.17\%$
test_collector_with_rb[200-img_shape1-large_batch] 78.0240ms 77.2223ms 12.9496 Ops/s 13.1930 Ops/s $\color{#d91a1a}-1.84\%$

vmoens changed the title from "[Fix] ParallelEnv: fix shutdown hang with shared-memory done flags" to "[BugFix] ParallelEnv: fix shutdown hang with shared-memory done flags" on Feb 8, 2026
github-actions bot added the BugFix label on Feb 8, 2026
vmoens merged commit 5e827b6 into main on Feb 8, 2026
136 of 141 checks passed
Labels

BugFix, CLA Signed
