Resets recurrent state after episode termination in RSL-RL play.py (#3838)

bikcrum · web-flow · commit e69b6d9bbb59 · 2025-11-03T10:36:44.000+01:00
# Description This PR fixes an issue in recurrent policy evaluation where the recurrent state was not being reset after an episode termination. The missing reset caused residual memory to persist between episodes. The fix ensures that `reset()` is now called during evaluation in `play.py` for policy networks, including recurrent. Fixes #3837  ## Type of change - Bug fix (non-breaking change which fixes an issue) ## Screenshots N/A ## Checklist - [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html) - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [ ] I have made corresponding changes to the documentation where necessary - [x] My changes generate no new warnings - [x] I have added tests verifying that recurrent states are correctly reset during evaluation - [x] I have updated the changelog and corresponding version in the extension’s `config/extension.toml` file - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
diff --git a/scripts/reinforcement_learning/rsl_rl/play.py b/scripts/reinforcement_learning/rsl_rl/play.py
@@ -185,7 +185,9 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen
             # agent stepping
             actions = policy(obs)
             # env stepping
-            obs, _, _, _ = env.step(actions)
+            obs, _, dones, _ = env.step(actions)
+            # reset recurrent states for episodes that have terminated
+            policy_nn.reset(dones)
         if args_cli.video:
             timestep += 1
             # Exit the play loop after recording one video
diff --git a/source/isaaclab/config/extension.toml b/source/isaaclab/config/extension.toml
@@ -1,7 +1,7 @@
 [package]
 
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.47.5"
+version = "0.47.6"
 
 # Description
 title = "Isaac Lab framework for Robot Learning"
diff --git a/source/isaaclab/docs/CHANGELOG.rst b/source/isaaclab/docs/CHANGELOG.rst
@@ -1,6 +1,16 @@
 Changelog
 ---------
 
+
+0.47.6 (2025-11-01)
+~~~~~~~~~~~~~~~~~~~~
+
+Fixed
+^^^^^
+
+* Fixed an issue in recurrent policy evaluation in RSL-RL framework where the recurrent state was not reset after an episode termination.
+
+
 0.47.5 (2025-10-30)
 ~~~~~~~~~~~~~~~~~~~