[Question] Is there any way to save the current state of my RL env and restore from it #2134

HilbertXu · 2025-03-22T15:13:09Z

Question

Hi,
I'm working with a robot manipulation RL env and I'm wondering whether it is possible to save the current state of the whole environment and later restore from the saved state.
For example, I have to lift a cube to a sequence of target poses. Once the robot succeeds in lifting the cube to the first target, I save the current state. If the robot then fails to move the cube to the next target, I want to reset the env from the saved state rather than from the very beginning.
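To make the per-environment bookkeeping concrete, here is a minimal sketch of such a checkpoint. It assumes a simplified flat state of `name -> tensor` pairs whose first dimension is the environment index; `update_checkpoint` and the field names are hypothetical and not part of any library API.

```python
import torch


def update_checkpoint(
    checkpoint: dict[str, torch.Tensor],
    current: dict[str, torch.Tensor],
    env_ids: torch.Tensor,
) -> None:
    """Copy the rows of `env_ids` from `current` into `checkpoint`, in place."""
    for key, value in current.items():
        checkpoint[key][env_ids] = value[env_ids]


# Two hypothetical state fields for 3 environments.
checkpoint = {"cube_pose": torch.zeros(3, 7), "joint_pos": torch.zeros(3, 2)}
current = {"cube_pose": torch.ones(3, 7), "joint_pos": torch.ones(3, 2)}

# Only environment 1 reached its target, so only its row is refreshed.
update_checkpoint(checkpoint, current, torch.tensor([1]))
```

Environments that have not reached a target keep their older checkpoint rows, so a later failure in one environment can be rolled back without touching the others.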

I found some related APIs, but I'm not sure whether they are the correct ones. The only ones I found are on InteractiveScene and ManagerBasedRLEnv:
InteractiveScene.get_state, ManagerBasedEnv.reset_to.

I tried to implement this in my direct RL env. However, I noticed that 'InteractiveScene.get_state()' returns the state of all envs, while 'InteractiveScene.reset_to()' takes an additional 'env_ids' parameter to control which envs to reset. For 'reset_to' to work correctly, 'env_ids' should also be applied to the saved state, but currently it is not. Please check the following reset_to function code.

def reset_to(
        self,
        state: dict[str, dict[str, dict[str, torch.Tensor]]],
        env_ids: Sequence[int] | None = None,
        is_relative: bool = False,
    ):
        """Resets the scene entities to the given state.

        Args:
            state: The state to reset the scene entities to.
            env_ids: The indices of the environments to reset.
                Defaults to None (all instances).
            is_relative: If set to True, the state is considered relative to the environment origins.
        """
        if env_ids is None:
            env_ids = slice(None)
        # articulations
        for asset_name, articulation in self._articulations.items():
            asset_state = state["articulation"][asset_name]
            # root state
            root_pose = asset_state["root_pose"].clone()
            if is_relative:
                root_pose[:, :3] += self.env_origins[env_ids]
            root_velocity = asset_state["root_velocity"].clone()
            articulation.write_root_pose_to_sim(root_pose, env_ids=env_ids)
            articulation.write_root_velocity_to_sim(root_velocity, env_ids=env_ids)
            # joint state
            joint_position = asset_state["joint_position"].clone()
            joint_velocity = asset_state["joint_velocity"].clone()
            articulation.write_joint_state_to_sim(joint_position, joint_velocity, env_ids=env_ids)
            articulation.set_joint_position_target(joint_position, env_ids=env_ids)
            articulation.set_joint_velocity_target(joint_velocity, env_ids=env_ids)
        # deformable objects
        for asset_name, deformable_object in self._deformable_objects.items():
            asset_state = state["deformable_object"][asset_name]
            nodal_position = asset_state["nodal_position"].clone()
            if is_relative:
                nodal_position[:, :3] += self.env_origins[env_ids]
            nodal_velocity = asset_state["nodal_velocity"].clone()
            deformable_object.write_nodal_pos_to_sim(nodal_position, env_ids=env_ids)
            deformable_object.write_nodal_velocity_to_sim(nodal_velocity, env_ids=env_ids)
        # rigid objects
        for asset_name, rigid_object in self._rigid_objects.items():
            asset_state = state["rigid_object"][asset_name]
            root_pose = asset_state["root_pose"].clone()
            if is_relative:
                root_pose[:, :3] += self.env_origins[env_ids]
            root_velocity = asset_state["root_velocity"].clone()
            rigid_object.write_root_pose_to_sim(root_pose, env_ids=env_ids)
            rigid_object.write_root_velocity_to_sim(root_velocity, env_ids=env_ids)
        self.write_data_to_sim()

In articulation.write_root_pose_to_sim(root_pose, env_ids=env_ids), root_pose comes from the saved state of all envs. Now look at how 'articulation.write_root_pose_to_sim' works:

def write_root_pose_to_sim(self, root_pose: torch.Tensor, env_ids: Sequence[int] | None = None):
        """Set the root pose over selected environment indices into the simulation.

        The root pose comprises of the cartesian position and quaternion orientation in (w, x, y, z).

        Args:
            root_pose: Root poses in simulation frame. Shape is (len(env_ids), 7).
            env_ids: Environment indices. If None, then all indices are used.
        """
        # resolve all indices
        physx_env_ids = env_ids
        if env_ids is None:
            env_ids = slice(None)
            physx_env_ids = self._ALL_INDICES
        # note: we need to do this here since tensors are not set into simulation until step.
        # set into internal buffers
        self._data.root_state_w[env_ids, :7] = root_pose.clone()
        # convert root quaternion from wxyz to xyzw
        root_poses_xyzw = self._data.root_state_w[:, :7].clone()
        root_poses_xyzw[:, 3:] = math_utils.convert_quat(root_poses_xyzw[:, 3:], to="xyzw")
        # Need to invalidate the buffer to trigger the update with the new root pose.
        self._data._body_state_w.timestamp = -1.0
        self._data._body_link_state_w.timestamp = -1.0
        self._data._body_com_state_w.timestamp = -1.0
        # set into simulation
        self.root_physx_view.set_root_transforms(root_poses_xyzw, indices=physx_env_ids)

In the line self._data.root_state_w[env_ids, :7] = root_pose.clone(), if env_ids is not None, the code tries to assign the root_pose of all envs to only the selected envs, which directly raises a shape mismatch error. The same thing happens in the 'set_joint_position_target', 'set_joint_velocity_target', and 'write_root_velocity_to_sim' functions.
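The mismatch can be reproduced with plain tensors, outside the simulator. This is a minimal sketch: `root_state_w` and `saved_root_pose` are stand-ins for the internal buffer and the saved state, and the sizes (4 envs, 13 state columns) are illustrative assumptions.

```python
import torch

num_envs = 4
# Stand-in for the asset's internal buffer (13 columns are illustrative).
root_state_w = torch.zeros(num_envs, 13)
# Saved root poses for ALL environments, as get_state() returns them.
saved_root_pose = torch.rand(num_envs, 7)
# Only a subset of environments should be restored.
env_ids = torch.tensor([1, 3])

# Mirrors the quoted assignment: 4 rows are written into a 2-row selection.
try:
    root_state_w[env_ids, :7] = saved_root_pose.clone()
    raised = False
except RuntimeError as err:
    raised = True
    print(f"assignment failed: {err}")

# Indexing the saved tensor with the same env_ids makes the shapes agree.
root_state_w[env_ids, :7] = saved_root_pose[env_ids]
```

The second assignment succeeds because both sides have len(env_ids) rows, which matches the shape that the docstring of write_root_pose_to_sim asks for.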

I think this might be a bug that needs to be fixed.
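Until this is addressed, one possible workaround is to slice the saved state down to env_ids before passing it to reset_to. The sketch below is untested against the real simulator. `select_env_state` is a hypothetical helper, and it assumes that every leaf tensor in the state dict has the environment index as its first dimension, as the shapes in the quoted code suggest.

```python
from collections.abc import Sequence

import torch

SceneState = dict[str, dict[str, dict[str, torch.Tensor]]]


def select_env_state(state: SceneState, env_ids: Sequence[int]) -> SceneState:
    """Return a copy of `state` that keeps only the rows listed in `env_ids`."""
    sliced: SceneState = {}
    for asset_type, assets in state.items():
        sliced[asset_type] = {}
        for asset_name, fields in assets.items():
            sliced[asset_type][asset_name] = {
                # Index tensors must live on the same device as the data.
                key: tensor[torch.as_tensor(env_ids, dtype=torch.long, device=tensor.device)]
                for key, tensor in fields.items()
            }
    return sliced
```

With this helper, the restore call would pass select_env_state(saved_state, env_ids) as the state together with the same env_ids, so each tensor reaching the write functions has len(env_ids) rows. Indexing with an index tensor returns a copy, so the saved state itself is not modified.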

Cheers,
Yucheng

RandomOakForest · 2025-03-23T18:35:44Z

Maintainer

Thank you for posting this. I will move it to our Discussions section for the team to follow up.
