Adds new curriculum mdp that allows modification on any environment parameters (#2777)
# Description
This PR adds two curriculum MDP terms that can change any parameter of an
env instance: `modify_term_cfg` and `modify_env_param`.
`modify_env_param` is the more general version: it can override any value
that belongs to the env, but requires the user to know the full path to that value.
`modify_term_cfg` only works with manager terms, but is more user-friendly
because it simplifies path specification. For example, instead
of writing "observation_manager.cfg.policy.joint_pos.noise", you
write "observations.policy.joint_pos.noise", consistent with the Hydra
override style.
Besides the path to the value, `modify_fn` and `modify_params` are also required;
they tell the term how to modify the value.
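The general idea of addressing a value by its dotted path and handing it to a modify function can be sketched in plain Python. This is only an illustration, not the code added by this PR: `apply_modification`, `scale`, the stand-in `NO_CHANGE` sentinel, and the `SimpleNamespace` stub env are all hypothetical names made up for the example.

```python
from types import SimpleNamespace

# Hypothetical sentinel standing in for "leave the value untouched".
NO_CHANGE = object()


def apply_modification(root, address, modify_fn, modify_params):
    """Walk a dotted path from `root` and replace the leaf with modify_fn's result."""
    *parents, leaf = address.split(".")
    container = root
    for name in parents:
        container = getattr(container, name)
    current = getattr(container, leaf)
    # Same argument order as the demos in this description: (env, env_id, data, **params).
    new_val = modify_fn(root, None, current, **modify_params)
    if new_val is not NO_CHANGE:
        setattr(container, leaf, new_val)
    return getattr(container, leaf)


def scale(env, env_id, data, factor):
    """Hypothetical modify function: multiply the current value by `factor`."""
    return data * factor


# Stub object tree that mimics the full path quoted above.
env = SimpleNamespace(
    observation_manager=SimpleNamespace(
        cfg=SimpleNamespace(policy=SimpleNamespace(joint_pos=SimpleNamespace(noise=0.5)))
    )
)
apply_modification(env, "observation_manager.cfg.policy.joint_pos.noise", scale, {"factor": 2.0})
```

After the call, the stubbed `noise` value is doubled in place; a modify function that returns the sentinel leaves it as it was.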
Demo 1: difficulty-adaptive modification for all native Python data types
```
import torch
from isaaclab.envs import ManagerBasedRLEnv

# iv -> initial value, fv -> final value
def initial_final_interpolate_fn(env: ManagerBasedRLEnv, env_id, data, iv, fv, get_fraction):
    iv_, fv_ = torch.tensor(iv, device=env.device), torch.tensor(fv, device=env.device)
    # get_fraction is a Python expression string, evaluated here with `env` in scope
    fraction = eval(get_fraction)
    # linear interpolation: fraction = 0 gives iv, fraction = 1 gives fv
    new_val = fraction * (fv_ - iv_) + iv_
    # cast the result back to the type of the value being replaced
    if isinstance(data, float):
        return new_val.item()
    elif isinstance(data, int):
        return int(new_val.item())
    elif isinstance(data, (tuple, list)):
        raw = new_val.tolist()
        # assume iv is sequence of all ints or all floats:
        is_int = isinstance(iv[0], int)
        casted = [int(x) if is_int else float(x) for x in raw]
        return tuple(casted) if isinstance(data, tuple) else casted
    else:
        raise TypeError(f"Does not support the type {type(data)}")
```
(float)
```
joint_pos_unoise_min_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "observations.policy.joint_pos.noise.n_min",
        "modify_fn": initial_final_interpolate_fn,
        # single quotes inside the expression string keep the outer string literal valid
        "modify_params": {"iv": 0.0, "fv": -0.1, "get_fraction": "env.command_manager.get_command('difficulty')"},
    },
)
```
(tuple or list)
```
command_object_pose_xrange_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose.ranges.pos_x",
        "modify_fn": initial_final_interpolate_fn,
        # single quotes inside the expression string keep the outer string literal valid
        "modify_params": {"iv": (-0.5, -0.5), "fv": (-0.75, -0.25), "get_fraction": "env.command_manager.get_command('difficulty')"},
    },
)
```
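To make the interpolation concrete, here is a torch-free sketch of the same blend, `fraction * (fv - iv) + iv`, applied to the range from the tuple example. The helper name `interpolate` is hypothetical and not part of this PR, and the fraction is passed in directly rather than read from a command.

```python
def interpolate(iv, fv, fraction):
    """Linear blend: fraction = 0 gives iv, fraction = 1 gives fv."""
    if isinstance(iv, (tuple, list)):
        blended = [fraction * (b - a) + a for a, b in zip(iv, fv)]
        # Preserve the container type of the initial value.
        return tuple(blended) if isinstance(iv, tuple) else blended
    return fraction * (fv - iv) + iv


start = interpolate((-0.5, -0.5), (-0.75, -0.25), 0.0)   # (-0.5, -0.5)
half = interpolate((-0.5, -0.5), (-0.75, -0.25), 0.5)    # (-0.625, -0.375)
end = interpolate((-0.5, -0.5), (-0.75, -0.25), 1.0)     # (-0.75, -0.25)
```

At half difficulty the x-range has widened halfway from its initial bounds toward its final bounds.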
Demo 3: overriding an entire term based on the env step counter rather than adaptively
```
def value_override(env: ManagerBasedRLEnv, env_id, data, new_val, num_steps):
    # switch to the new value once the global step counter passes the threshold
    if env.common_step_counter > num_steps:
        return new_val
    return mdp.modify_term_cfg.NO_CHANGE

object_pos_curriculum = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose",
        "modify_fn": value_override,
        # keys must match the parameter names of value_override (num_steps, not num_step)
        "modify_params": {"new_val": <new_observation_term>, "num_steps": 120000},
    },
)
```
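The step-gated pattern can be tried without a simulator. In the sketch below, the stub env and the `NO_CHANGE` object are hypothetical stand-ins, not Isaac Lab objects. It also assumes `modify_params` reaches the function as keyword arguments, in which case a key that does not match a parameter name raises a `TypeError`.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the NO_CHANGE sentinel used in the demos.
NO_CHANGE = object()


def value_override(env, env_id, data, new_val, num_steps):
    """Return new_val once the step counter exceeds num_steps, else signal no change."""
    if env.common_step_counter > num_steps:
        return new_val
    return NO_CHANGE


# Stub env that only carries the counter the function reads.
env = SimpleNamespace(common_step_counter=0)
before = value_override(env, None, "old", new_val="new", num_steps=120000)

env.common_step_counter = 120001
after = value_override(env, None, "old", new_val="new", num_steps=120000)

# A misspelled key fails when unpacked as keyword arguments.
try:
    value_override(env, None, "old", **{"new_val": "new", "num_step": 120000})
    mismatch_raised = False
except TypeError:
    mismatch_raised = True
```

Before the threshold the function returns the sentinel, so the stored value stays as is; afterwards it returns the replacement.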
Demo 4: overriding a Tensor field within some arbitrary class not visible
from term_cfg
(note that 'address' is not as convenient as with mdp.modify_term_cfg)
```
def resample_bucket_range(env: ManagerBasedRLEnv, env_id, data, static_friction_range, dynamic_friction_range, restitution_range, num_steps):
    if env.common_step_counter > num_steps:
        # one (min, max) row per material property
        range_list = [static_friction_range, dynamic_friction_range, restitution_range]
        ranges = torch.tensor(range_list, device="cpu")
        # resample every bucket: one row per bucket, three properties per row
        new_buckets = math_utils.sample_uniform(ranges[:, 0], ranges[:, 1], (len(data), 3), device="cpu")
        return new_buckets
    return mdp.modify_env_param.NO_CHANGE

object_physics_material_curriculum = CurrTerm(
    func=mdp.modify_env_param,
    params={
        "address": "event_manager.cfg.object_physics_material.func.material_buckets",
        "modify_fn": resample_bucket_range,
        # keys must match the parameter names of resample_bucket_range (num_steps, not num_step)
        "modify_params": {"static_friction_range": [0.5, 1.0], "dynamic_friction_range": [0.3, 1.0], "restitution_range": [0.0, 0.5], "num_steps": 120000},
    },
)
```
## Type of change
- New feature (non-breaking change which adds functionality)
## Checklist
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
---------
Signed-off-by: ooctipus <[email protected]>
Signed-off-by: Kelly Guo <[email protected]>
Co-authored-by: Kelly Guo <[email protected]>