
Commit 875b1da

Formatting fixes (#120)
Main changes:

- Switched to Ruff for linting
- Added type hints
- Formatted comments and docstrings

Validated on the following tasks:

- Isaac-Velocity-Flat-Anymal-D-v0
- Isaac-Velocity-Flat-Anymal-D-Recurrent-v0
- Isaac-Velocity-Flat-Anymal-D-Distillation-v0
- Isaac-Velocity-Flat-Anymal-D-Distillation-Recurrent-v0
1 parent 8363520 commit 875b1da

38 files changed (+1508 −1144 lines)

.flake8

Lines changed: 0 additions & 22 deletions
This file was deleted.

.pre-commit-config.yaml

Lines changed: 5 additions & 24 deletions
```diff
@@ -1,40 +1,21 @@
 repos:
-  - repo: https://github.com/python/black
-    rev: 23.10.1
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.14.0
     hooks:
-      - id: black
-        args: ["--line-length", "120", "--preview"]
-  - repo: https://github.com/pycqa/flake8
-    rev: 6.1.0
-    hooks:
-      - id: flake8
-        additional_dependencies: [flake8-simplify, flake8-return]
+      - id: ruff-check
+      - id: ruff-format
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.5.0
     hooks:
-      - id: trailing-whitespace
       - id: check-symlinks
       - id: destroyed-symlinks
       - id: check-yaml
+      - id: check-toml
       - id: check-merge-conflict
       - id: check-case-conflict
       - id: check-executables-have-shebangs
-      - id: check-toml
-      - id: end-of-file-fixer
       - id: check-shebang-scripts-are-executable
       - id: detect-private-key
-      - id: debug-statements
-  - repo: https://github.com/pycqa/isort
-    rev: 5.12.0
-    hooks:
-      - id: isort
-        name: isort (python)
-        args: ["--profile", "black", "--filter-files"]
-  - repo: https://github.com/asottile/pyupgrade
-    rev: v3.15.0
-    hooks:
-      - id: pyupgrade
-        args: ["--py37-plus"]
   - repo: https://github.com/codespell-project/codespell
     rev: v2.2.6
     hooks:
```

CONTRIBUTORS.md

Lines changed: 3 additions & 1 deletion
```diff
@@ -17,12 +17,14 @@ Please keep the lists sorted alphabetically.
 
 ---
 
-* Mayank Mittal
 * Clemens Schwarke
+* Mayank Mittal
 
 ## Authors
 
+* Clemens Schwarke
 * David Hoeller
+* Mayank Mittal
 * Nikita Rudin
 
 ## Contributors
```

README.md

Lines changed: 5 additions & 9 deletions
```diff
@@ -1,15 +1,14 @@
-# RSL RL
+# RSL-RL
 
-A fast and simple implementation of RL algorithms, designed to run fully on GPU.
-This code is an evolution of `rl-pytorch` provided with NVIDIA's Isaac Gym.
+A fast and simple implementation of learning algorithms for robotics. For an overview of the library please have a look at https://arxiv.org/pdf/2509.10771.
 
 Environment repositories using the framework:
 
 * **`Isaac Lab`** (built on top of NVIDIA Isaac Sim): https://github.com/isaac-sim/IsaacLab
-* **`Legged-Gym`** (built on top of NVIDIA Isaac Gym): https://leggedrobotics.github.io/legged_gym/
+* **`Legged Gym`** (built on top of NVIDIA Isaac Gym): https://leggedrobotics.github.io/legged_gym/
 * **`MuJoCo Playground`** (built on top of MuJoCo MJX and Warp): https://github.com/google-deepmind/mujoco_playground/
 
-The main branch supports **PPO** and **Student-Teacher Distillation** with additional features from our research. These include:
+The library currently supports **PPO** and **Student-Teacher Distillation** with additional features from our research. These include:
 
 * [Random Network Distillation (RND)](https://proceedings.mlr.press/v229/schwarke23a.html) - Encourages exploration by adding
   a curiosity driven intrinsic reward.
@@ -22,8 +21,6 @@ information.
 **Affiliation**: Robotic Systems Lab, ETH Zurich & NVIDIA <br/>
 **Contact**: [email protected]
 
-> **Note:** The `algorithms` branch supports additional algorithms (SAC, DDPG, DSAC, and more). However, it isn't currently actively maintained.
-
 
 ## Setup
 
@@ -57,8 +54,7 @@ For documentation, we adopt the [Google Style Guide](https://sphinxcontrib-napol
 We use the following tools for maintaining code quality:
 
 - [pre-commit](https://pre-commit.com/): Runs a list of formatters and linters over the codebase.
-- [black](https://black.readthedocs.io/en/stable/): The uncompromising code formatter.
-- [flake8](https://flake8.pycqa.org/en/latest/): A wrapper around PyFlakes, pycodestyle, and McCabe complexity checker.
+- [ruff](https://github.com/astral-sh/ruff): An extremely fast Python linter and code formatter, written in Rust.
 
 Please check [here](https://pre-commit.com/#install) for instructions to set these up. To run over the entire repository, please execute the following command in the terminal:
 
```
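The README describes PPO training but the diff shown here stops before the entry point. As a rough orientation, here is a hedged sketch of how training is typically launched with this library. It is not part of the commit: `MyTaskEnv` and `my_tasks` are hypothetical placeholders for an environment implementing rsl_rl's `VecEnv` interface, the config is assumed to be shaped like `config/example_config.yaml` below, and the exact `OnPolicyRunner` signature should be checked against the source.

```python
# Minimal sketch (not part of this commit) of launching PPO training.
# Assumptions: MyTaskEnv implements rsl_rl's VecEnv interface, and the
# YAML config has the same layout as config/example_config.yaml.
import yaml

from rsl_rl.runners import OnPolicyRunner

from my_tasks import MyTaskEnv  # hypothetical environment module

with open("config/example_config.yaml") as f:
    train_cfg = yaml.safe_load(f)["runner"]  # runner section holds policy/algorithm cfg

env = MyTaskEnv(num_envs=4096, device="cuda:0")
runner = OnPolicyRunner(env, train_cfg, log_dir="logs/walking_experiment", device="cuda:0")
runner.learn(num_learning_iterations=train_cfg["max_iterations"])
```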

config/example_config.yaml

Lines changed: 30 additions & 29 deletions
```diff
@@ -1,21 +1,21 @@
 runner:
   class_name: OnPolicyRunner
-  # -- general
-  num_steps_per_env: 24 # number of steps per environment per iteration
-  max_iterations: 1500 # number of policy updates
+  # General
+  num_steps_per_env: 24 # Number of steps per environment per iteration
+  max_iterations: 1500 # Number of policy updates
   seed: 1
-  # -- observations
-  obs_groups: {"policy": ["policy"], "critic": ["policy", "privileged"]} # maps observation groups to types. See `vec_env.py` for more information
-  # -- logging parameters
-  save_interval: 50 # check for potential saves every `save_interval` iterations
+  # Observations
+  obs_groups: {"policy": ["policy"], "critic": ["policy", "privileged"]} # Maps observation groups to sets. See `vec_env.py` for more information
+  # Logging parameters
+  save_interval: 50 # Check for potential saves every `save_interval` iterations
   experiment_name: walking_experiment
   run_name: ""
-  # -- logging writer
+  # Logging writer
   logger: tensorboard # tensorboard, neptune, wandb
   neptune_project: legged_gym
   wandb_project: legged_gym
 
-  # -- policy
+  # Policy
   policy:
     class_name: ActorCritic
     activation: elu
@@ -25,45 +25,46 @@ runner:
     critic_hidden_dims: [256, 256, 256]
     init_noise_std: 1.0
     noise_std_type: "scalar" # 'scalar' or 'log'
+    state_dependent_std: false
 
-  # -- algorithm
+  # Algorithm
   algorithm:
     class_name: PPO
-    # -- training
+    # Training
     learning_rate: 0.001
     num_learning_epochs: 5
     num_mini_batches: 4 # mini batch size = num_envs * num_steps / num_mini_batches
     schedule: adaptive # adaptive, fixed
-    # -- value function
+    # Value function
     value_loss_coef: 1.0
     clip_param: 0.2
     use_clipped_value_loss: true
-    # -- surrogate loss
+    # Surrogate loss
    desired_kl: 0.01
     entropy_coef: 0.01
     gamma: 0.99
     lam: 0.95
     max_grad_norm: 1.0
-    # -- miscellaneous
+    # Miscellaneous
     normalize_advantage_per_mini_batch: false
 
-    # -- random network distillation
+    # Random network distillation
     rnd_cfg:
-      weight: 0.0 # initial weight of the RND reward
-      weight_schedule: null # note: this is a dictionary with a required key called "mode". Please check the RND module for more information
-      reward_normalization: false # whether to normalize RND reward
-      # -- learning parameters
-      learning_rate: 0.001 # learning rate for RND
-      # -- network parameters
-      num_outputs: 1 # number of outputs of RND network. Note: if -1, then the network will use dimensions of the observation
-      predictor_hidden_dims: [-1] # hidden dimensions of predictor network
-      target_hidden_dims: [-1] # hidden dimensions of target network
+      weight: 0.0 # Initial weight of the RND reward
+      weight_schedule: null # This is a dictionary with a required key called "mode". Please check the RND module for more information
+      reward_normalization: false # Whether to normalize RND reward
+      # Learning parameters
+      learning_rate: 0.001 # Learning rate for RND
+      # Network parameters
+      num_outputs: 1 # Number of outputs of RND network. Note: if -1, then the network will use dimensions of the observation
+      predictor_hidden_dims: [-1] # Hidden dimensions of predictor network
+      target_hidden_dims: [-1] # Hidden dimensions of target network
 
-    # -- symmetry augmentation
+    # Symmetry augmentation
     symmetry_cfg:
-      use_data_augmentation: true # this adds symmetric trajectories to the batch
-      use_mirror_loss: false # this adds symmetry loss term to the loss function
-      data_augmentation_func: null # string containing the module and function name to import
+      use_data_augmentation: true # This adds symmetric trajectories to the batch
+      use_mirror_loss: false # This adds symmetry loss term to the loss function
+      data_augmentation_func: null # String containing the module and function name to import
       # Example: "legged_gym.envs.locomotion.anymal_c.symmetry:get_symmetric_states"
       #
       # .. code-block:: python
@@ -73,4 +74,4 @@ runner:
       #     obs: Optional[torch.Tensor] = None, actions: Optional[torch.Tensor] = None, cfg: "BaseEnvCfg" = None, obs_type: str = "policy"
       # ) -> Tuple[torch.Tensor, torch.Tensor]:
       #
-      mirror_loss_coeff: 0.0 #coefficient for symmetry loss term. If 0, no symmetry loss is used
+      mirror_loss_coeff: 0.0 # Coefficient for symmetry loss term. If 0, no symmetry loss is used
```
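The symmetry block above imports a `data_augmentation_func` by module path but only documents its signature. Below is a hedged sketch of what such a function can look like, matching the signature from the config comments; the mirror index lists and the batch-doubling behavior are invented placeholders, since the real left/right mappings depend on the robot.

```python
# Hypothetical sketch of a symmetry augmentation function with the
# signature documented in example_config.yaml. The mirror index lists
# below are placeholders; a real robot needs its own joint mappings.
from typing import Optional, Tuple

import torch

_OBS_MIRROR_IDX = [1, 0, 3, 2]  # placeholder: swap left/right observation dims
_ACT_MIRROR_IDX = [1, 0]        # placeholder: swap left/right actuators


def get_symmetric_states(
    obs: Optional[torch.Tensor] = None,
    actions: Optional[torch.Tensor] = None,
    cfg: "BaseEnvCfg" = None,
    obs_type: str = "policy",
) -> Tuple[torch.Tensor, torch.Tensor]:
    # Return the original batch concatenated with its mirrored copy, so the
    # caller sees twice as many (symmetric) samples.
    if obs is not None:
        obs = torch.cat([obs, obs[..., _OBS_MIRROR_IDX]], dim=0)
    if actions is not None:
        actions = torch.cat([actions, actions[..., _ACT_MIRROR_IDX]], dim=0)
    return obs, actions
```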

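The `obs_groups` entry in the config also deserves a quick illustration. The snippet below is a toy reconstruction of the idea, not the actual `vec_env.py` code: the environment returns one tensor per observation set, and each consumer listed in `obs_groups` sees the concatenation of its sets, so the critic can receive privileged signals the actor never observes.

```python
# Toy illustration (assumption, not library code) of the obs_groups mapping:
#   {"policy": ["policy"], "critic": ["policy", "privileged"]}
# The env returns one tensor per observation set; each consumer then gets
# the concatenation of the sets listed for it.
import torch

num_envs = 8
obs = {
    "policy": torch.randn(num_envs, 48),      # proprioceptive observations
    "privileged": torch.randn(num_envs, 12),  # simulator-only signals
}
obs_groups = {"policy": ["policy"], "critic": ["policy", "privileged"]}

actor_input = torch.cat([obs[g] for g in obs_groups["policy"]], dim=-1)   # shape (8, 48)
critic_input = torch.cat([obs[g] for g in obs_groups["critic"]], dim=-1)  # shape (8, 60)
```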
licenses/dependencies/black-license.txt

Lines changed: 0 additions & 21 deletions
This file was deleted.

licenses/dependencies/flake8-license.txt

Lines changed: 0 additions & 22 deletions
This file was deleted.

licenses/dependencies/isort-license.txt

Lines changed: 0 additions & 21 deletions
This file was deleted.
File renamed without changes.

licenses/dependencies/pyupgrade-license.txt

Lines changed: 0 additions & 19 deletions
This file was deleted.
