Releases: leggedrobotics/rsl_rl
v5.0.0
Overview
This release introduces two new classes, Batch and Distribution. The Batch class prevents tensors from being swapped by accident through incorrect positional argument ordering. The Distribution class makes it easy to add new distributions without modifying the models directly. Furthermore, the library now has a small set of documentation that can be found here, as well as a test suite. Lastly, a new NaN check makes it easier to detect and debug NaN values coming from the environment. With this release, the main structural changes to the library are complete, and the library is expected to be more stable going forward.
Until it is merged, Isaac Lab users can refer to this PR, which automatically converts old configurations to the new structure.
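The positional-ordering problem that the Batch class addresses can be illustrated with a minimal sketch. The Batch dataclass, its field names, and the mini_batches generator below are hypothetical stand-ins chosen for illustration; they are not the rsl_rl API, and lists replace tensors to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Batch:
    """Hypothetical container; the actual rsl_rl Batch may expose different fields."""

    observations: List[float]
    actions: List[float]
    returns: List[float]


def mini_batches(obs, actions, returns, size: int) -> Iterator[Batch]:
    """Yield named batches instead of bare tuples.

    With a tuple, `obs, returns, actions = next(gen)` would silently swap
    two tensors. With named fields, the consumer cannot mix them up.
    """
    for start in range(0, len(obs), size):
        end = start + size
        yield Batch(
            observations=obs[start:end],
            actions=actions[start:end],
            returns=returns[start:end],
        )


batches = list(
    mini_batches(
        obs=[0.1, 0.2, 0.3, 0.4],
        actions=[1.0, 2.0, 3.0, 4.0],
        returns=[10.0, 20.0, 30.0, 40.0],
        size=2,
    )
)
first = batches[0]
```

Consumers then read first.actions or first.returns by name, so reordering the fields in the generator cannot change what the training code receives.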
Full Changelog: v4.0.1...v5.0.0
Added
- Adds a batch class to avoid positional arguments in generators by @ClemensSchwarke in #172
- Adds a distribution class for easier adaptability by @ClemensSchwarke in #173
- Adds a small documentation by @ClemensSchwarke in #183
- Adds tests to the library by @ClemensSchwarke in #184
- Adds NAN check to avoid ambiguous std >= 0.0 error by @ClemensSchwarke in #185
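The idea behind an early NaN check can be sketched as follows. A NaN in an observation typically propagates through the network into the distribution parameters, where it surfaces as a constraint error on the standard deviation far from its actual source. The function names and the dictionary-of-groups layout below are assumptions for illustration, not the check implemented in #185.

```python
import math
from typing import Dict, List, Sequence


def find_nan_entries(observations: Dict[str, Sequence[float]]) -> Dict[str, List[int]]:
    """Return, per observation group, the indices that hold NaN values."""
    bad: Dict[str, List[int]] = {}
    for name, values in observations.items():
        indices = [i for i, v in enumerate(values) if math.isnan(v)]
        if indices:
            bad[name] = indices
    return bad


def check_observations(observations: Dict[str, Sequence[float]]) -> None:
    """Fail early with a message that names the offending group and indices."""
    bad = find_nan_entries(observations)
    if bad:
        raise ValueError(f"NaN values received from the environment: {bad}")


report = find_nan_entries({"policy": [0.0, float("nan"), 2.0], "critic": [1.0]})
```

Raising at the point where the observation enters the library points directly at the environment rather than at the policy's output distribution.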
Fixed
- Fixes code state logging by @DreaverZhao in #178
- Fixes multi gpu unbounded growth due to logging by @VineetTambe in #179
- Fixes missing raise and wrong RND broadcast index by @jashshah999 in #180
- Fixes mutable default arguments by @jashshah999 in #181
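For readers unfamiliar with the mutable-default-argument pitfall fixed in #181, here is a generic Python illustration. The function names are made up for this example and do not appear in rsl_rl.

```python
from typing import List, Optional


def add_dim_shared(dim: int, dims: List[int] = []) -> List[int]:
    """Buggy: the default list is created once and shared across all calls."""
    dims.append(dim)
    return dims


def add_dim(dim: int, dims: Optional[List[int]] = None) -> List[int]:
    """Fixed: a fresh list is created on every call when none is passed."""
    if dims is None:
        dims = []
    dims.append(dim)
    return dims


shared_first = list(add_dim_shared(256))
shared_second = list(add_dim_shared(128))  # still contains 256 from the first call
fresh_first = add_dim(256)
fresh_second = add_dim(128)  # independent of the first call
```

The same reasoning applies to default dictionaries and other mutable objects in function signatures and configuration classes.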
Breaking Changes
The configuration setup has changed. Instead of requiring noise parameters (stochastic, init_noise_std, noise_std_type, state_dependent_std), models now require a DistributionCfg. The following example demonstrates the necessary changes using the configuration style of Isaac Lab:
Instead of defining:
actor = RslRlMLPModelCfg(
    hidden_dims=[512, 256, 128],
    activation="elu",
    obs_normalization=False,
    stochastic=True,
    init_noise_std=1.0,
)
define:
actor = RslRlMLPModelCfg(
    hidden_dims=[512, 256, 128],
    activation="elu",
    obs_normalization=False,
    distribution_cfg=RslRlMLPModelCfg.GaussianDistributionCfg(init_std=1.0),
)
More details can be found in the new documentation.
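The design idea of keeping the distribution separate from the model can be sketched in plain Python. Everything below (the Distribution protocol, the Gaussian class, the Model class, and the placeholder forward pass) is a hypothetical illustration under simplifying assumptions, not the rsl_rl implementation.

```python
import math
import random
from typing import List, Optional, Protocol


class Distribution(Protocol):
    """Hypothetical interface that a model relies on for stochastic outputs."""

    def sample(self, mean: List[float]) -> List[float]: ...

    def log_prob(self, mean: List[float], action: List[float]) -> float: ...


class Gaussian:
    """Diagonal Gaussian with a single, state-independent standard deviation."""

    def __init__(self, init_std: float = 1.0, seed: Optional[int] = None):
        self.std = init_std
        self._rng = random.Random(seed)

    def sample(self, mean: List[float]) -> List[float]:
        return [self._rng.gauss(m, self.std) for m in mean]

    def log_prob(self, mean: List[float], action: List[float]) -> float:
        const = math.log(self.std) + 0.5 * math.log(2.0 * math.pi)
        return sum(-((a - m) ** 2) / (2.0 * self.std**2) - const for m, a in zip(mean, action))


class Model:
    """Hypothetical model that delegates all sampling to its distribution."""

    def __init__(self, distribution: Optional[Distribution] = None):
        self.distribution = distribution

    def forward(self, obs: List[float]) -> List[float]:
        return [2.0 * o for o in obs]  # placeholder for a real network

    def act(self, obs: List[float]) -> List[float]:
        mean = self.forward(obs)
        if self.distribution is None:  # deterministic model, e.g. a critic
            return mean
        return self.distribution.sample(mean)


deterministic = Model().act([1.0, 2.0])
stochastic = Model(Gaussian(init_std=1.0, seed=0)).act([1.0, 2.0])
```

In this arrangement, supporting another distribution only requires a new class with the same two methods; the Model class stays untouched.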
New Contributors
- @DreaverZhao made their first contribution in #178
- @VineetTambe made their first contribution in #179
- @jashshah999 made their first contribution in #180
v4.0.1
Overview
Full Changelog: v4.0.0...v4.0.1
Fixed
- Fixes CNNModel init ordering bugs by @kevinzakka in #175
v4.0.0
Overview
This release introduces a new library structure and many new features, described below. These changes require a new configuration setup, detailed in the breaking changes section below. Until it is merged, Isaac Lab users can refer to this PR, which automatically converts old configurations to the new structure.
- The ActorCritic and StudentTeacher classes were split, allowing all models to be instances of the same class, e.g., MLPModel. This way, a new architecture has to be implemented only once and can be reused for any purpose. This especially streamlines distillation, as new architectures no longer have to be implemented twice, once for the actor and once for the teacher. Additionally, all code duplication that existed in the ActorCritic and StudentTeacher classes was removed.
- Some directories were renamed for a more consistent structure:
  - RND and Symmetry were moved to a new "extensions" folder
  - The "modules" folder was renamed to "models" (following the new naming of models)
  - The "networks" folder was renamed to "modules" (as, e.g., normalization is not a network)
- Some functions were moved from the Runner to the Algorithm, as they are algorithm-specific. These include train_mode(), eval_mode(), save(), load(), and construct_algorithm().
- Video logging for W&B was added, so that videos recorded in, e.g., Isaac Lab, are uploaded automatically.
- Policy export functions were added, so that a single function call to the runner returns either a JIT or an ONNX model of the policy via export_policy_to_jit() and export_policy_to_onnx().
- An option for sharing the CNN encoders between actor and critic was added.
- A load_cfg argument was added to the load() function, allowing the user to specify which models/states to load.
Full Changelog: v3.3.0...v4.0.0
Added
- Separates actor and critic by @ClemensSchwarke in #159
- Moves functions from Runners to Algorithms by @ClemensSchwarke in #164
- Adds video logging for W&B by @ClemensSchwarke in #167
- Adds model exports to ONNX and JIT files by @ClemensSchwarke in #170
- Adds option to share CNN encoders between actor and critic models by @ClemensSchwarke in #171
Fixed
- Fixes multi-gpu initialization with W&B logging by @ClemensSchwarke in #166
Breaking Changes
The configuration setup has changed. Instead of requiring a policy configuration, separate actor and critic / student and teacher configurations are needed. The following example demonstrates the necessary changes using the configuration style of Isaac Lab:
Instead of defining:
policy = RslRlPpoActorCriticCfg(
    init_noise_std=1.0,
    actor_obs_normalization=False,
    critic_obs_normalization=False,
    actor_hidden_dims=[512, 256, 128],
    critic_hidden_dims=[512, 256, 128],
    activation="elu",
)
define:
actor = RslRlMLPModelCfg(
    hidden_dims=[512, 256, 128],
    activation="elu",
    obs_normalization=False,
    stochastic=True,
    init_noise_std=1.0,
)
critic = RslRlMLPModelCfg(
    hidden_dims=[512, 256, 128],
    activation="elu",
    obs_normalization=False,
    stochastic=False,
)
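The field mapping implied by the example above can be written down as a small helper. This is a sketch that operates on plain dictionaries rather than the Isaac Lab configuration classes; split_policy_cfg is a hypothetical name, and only the fields shown in the example are handled.

```python
from typing import Any, Dict, Tuple


def split_policy_cfg(policy: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a v3-style policy config dict into actor and critic dicts.

    Follows the mapping in the example above: the actor is stochastic and
    inherits init_noise_std, while the critic is deterministic.
    """
    actor = {
        "hidden_dims": list(policy["actor_hidden_dims"]),
        "activation": policy["activation"],
        "obs_normalization": policy["actor_obs_normalization"],
        "stochastic": True,
        "init_noise_std": policy["init_noise_std"],
    }
    critic = {
        "hidden_dims": list(policy["critic_hidden_dims"]),
        "activation": policy["activation"],
        "obs_normalization": policy["critic_obs_normalization"],
        "stochastic": False,
    }
    return actor, critic


old_policy = {
    "init_noise_std": 1.0,
    "actor_obs_normalization": False,
    "critic_obs_normalization": False,
    "actor_hidden_dims": [512, 256, 128],
    "critic_hidden_dims": [512, 256, 128],
    "activation": "elu",
}
actor_cfg, critic_cfg = split_policy_cfg(old_policy)
```

Configurations with fields beyond those in the example would need additional handling that this sketch does not attempt.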
v3.3.0
Overview
This release enables passing custom classes without having to modify the library. For example, you can now define a custom actor-critic class in a different repository and directly pass it to RSL-RL.
Full Changelog: v3.2.0...v3.3.0
Added
- Adds direct class passing by @ClemensSchwarke in #145
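One common way to support both built-in names and user-defined classes is to accept either a string or a class object and resolve it in one place. The sketch below illustrates that general pattern only; resolve_class, BUILTIN_CLASSES, and both class names are hypothetical, and the mechanism used in #145 may differ.

```python
from typing import Dict, Union


class DefaultActorCritic:
    """Stand-in for a class shipped with the library (hypothetical)."""


class MyActorCritic:
    """Stand-in for a class defined in a user's own repository (hypothetical)."""


# Hypothetical registry of classes that can be referenced by name.
BUILTIN_CLASSES: Dict[str, type] = {"DefaultActorCritic": DefaultActorCritic}


def resolve_class(spec: Union[str, type]) -> type:
    """Return the class for a spec that is either a class or a registered name."""
    if isinstance(spec, type):
        return spec  # class passed directly, no library modification needed
    try:
        return BUILTIN_CLASSES[spec]
    except KeyError:
        raise ValueError(f"Unknown class name: {spec!r}") from None


by_name = resolve_class("DefaultActorCritic")
by_class = resolve_class(MyActorCritic)
```

With this pattern, existing string-based configurations keep working while external classes can be used without registering them in the library.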
v3.2.0
Overview
This release adds a new actor-critic class with CNN encoders for visual observations. It additionally refactors the rollout storage and separates the logging functionality from the runner for better code clarity.
Full Changelog: v3.1.3...v3.2.0
Added
- Adds perceptive actor-critic class by @pascal-roth in #114
- Restructure rollout storage for clarity by @ClemensSchwarke in #137
- Separates the logging functionality from the runner by @ClemensSchwarke in #140
- Include run_name in training logs for better run traceability by @ShaoshuSu in #101
Fixed
- Remove unnecessary teacher eval call by @ClemensSchwarke in #139
- Add device for TensorDict in split_and_pad_trajectories by @ClemensSchwarke in #138
New Contributors
- @ShaoshuSu made their first contribution in #101
v3.1.3
Overview
Full Changelog: v3.1.2...v3.1.3
Added
- Adds mjlab to environment repositories list by @louislelay in #128
- Adds onnxscript 0.5.4 as a dependency by @Kukanani in #127
Fixed
- Fixes wrong observation dimension in recurrent teacher by @ClemensSchwarke in #136
New Contributors
- @louislelay made their first contribution in #128
- @Kukanani made their first contribution in #127
v3.1.2
Overview
This release cleans up the codebase by:
- Switching to Ruff for linting and formatting
- Adding type hints
- Formatting comments and docstrings
- Minor changes for code readability
Full Changelog: v3.1.1...v3.1.2
Added
- Formatting fixes by @ClemensSchwarke in #120
Fixed
- Fix the symmetry configuration checks by @ClemensSchwarke in #126
v3.1.1
Overview
Full Changelog: v3.1.0...v3.1.1
Fixed
- Fix incorrect teacher obs normalizer input size by @tianzong-cheng in #116
- Make act_inference return policy mean (without std dev) at deployment time by @iakinola23 in #118
New Contributors
- @tianzong-cheng made their first contribution in #116
v3.1.0
Overview
Full Changelog: v3.0.1...v3.1.0
Added
- Adds state-dependent standard deviation for the PPO actor by @iakinola23 in #112
Fixed
- Allows torch device type to be a string in VecEnv by @kevinzakka in #109
New Contributors
- @iakinola23 made their first contribution in #112
v3.0.1
Overview
Full Changelog: v3.0.0...v3.0.1
Fixed
- Removes hardcoded policy obs group for symmetry by @ClemensSchwarke in #111