Commit 208b6fd

Add support for gymnasium v1.0 (#475)

* Add support for gymnasium v1.0
* Update versions
* Fix requirements
* Ignore mypy for gym 0.29
* Add explicit shimmy dep
* Patch obs space and update trained agents
* Comment out auto-fix obs space
* Fix vecnormalize stats
1 parent b1288ed commit 208b6fd

21 files changed: +91 -40 lines changed

.github/workflows/ci.yml

Lines changed: 13 additions & 8 deletions
@@ -20,7 +20,12 @@ jobs:
     strategy:
       matrix:
         python-version: ["3.8", "3.9", "3.10", "3.11"]
-
+        include:
+          # Default version
+          - gymnasium-version: "1.0.0"
+          # Add a new config to test gym<1.0
+          - python-version: "3.10"
+            gymnasium-version: "0.29.1"
     steps:
       - uses: actions/checkout@v3
         with:
@@ -32,22 +37,22 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-
           # Use uv for faster downloads
           pip install uv
-          # Install Atari Roms
-          uv pip install --system autorom
-          wget https://gist.githubusercontent.com/jjshoots/61b22aefce4456920ba99f2c36906eda/raw/00046ac3403768bfe45857610a3d333b8e35e026/Roms.tar.gz.b64
-          base64 Roms.tar.gz.b64 --decode &> Roms.tar.gz
-          AutoROM --accept-license --source-file Roms.tar.gz
-
+          # cpu version of pytorch
           # See https://github.com/astral-sh/uv/issues/1497
           uv pip install --system torch==2.4.1+cpu --index https://download.pytorch.org/whl/cpu
           # Install full requirements (for additional envs and test tools)
           uv pip install --system -r requirements.txt
           # Use headless version
           uv pip install --system opencv-python-headless
           uv pip install --system -e .[plots,tests]
+
+      - name: Install specific version of gym
+        run: |
+          uv pip install --system gymnasium==${{ matrix.gymnasium-version }}
+        # Only run for python 3.10, downgrade gym to 0.29.1
+
       - name: Lint with ruff
         run: |
           make lint
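
To make the effect of the new `include` entries easier to reason about, here is a minimal Python sketch of how a GitHub Actions matrix is expanded. It is a simplified model based on the documented `include` rules, not the runner's actual implementation, and `expand_matrix` is a hypothetical helper name introduced only for this illustration.

```python
from itertools import product


def expand_matrix(matrix, include):
    """Simplified model of `strategy.matrix` plus `include` expansion (assumption)."""
    # Base combinations: the Cartesian product of the original matrix axes.
    keys = list(matrix)
    originals = [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]
    jobs = [dict(combo) for combo in originals]

    for entry in include:
        matched = False
        for original, job in zip(originals, jobs):
            # An entry applies only if it does not overwrite an original matrix value.
            if all(original.get(key, value) == value for key, value in entry.items()):
                # Keys added by earlier include entries may be overwritten here.
                job.update(entry)
                matched = True
        if not matched:
            # An entry that fits no original combination becomes a new job.
            jobs.append(dict(entry))
    return jobs


jobs = expand_matrix(
    {"python-version": ["3.8", "3.9", "3.10", "3.11"]},
    [
        {"gymnasium-version": "1.0.0"},
        {"python-version": "3.10", "gymnasium-version": "0.29.1"},
    ],
)
for job in jobs:
    print(job)
```

Under this model the workflow produces four jobs: Python 3.8, 3.9 and 3.11 run with gymnasium 1.0.0, while the existing Python 3.10 job is switched to gymnasium 0.29.1 rather than gaining a fifth job. The Actions run page shows the job list the runner really generates, so check it there.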

.github/workflows/trained_agents.yml

Lines changed: 14 additions & 7 deletions
@@ -21,7 +21,12 @@ jobs:
     strategy:
       matrix:
         python-version: ["3.8", "3.9", "3.10", "3.11"]
-
+        include:
+          # Default version
+          - gymnasium-version: "1.0.0"
+          # Add a new config to test gym<1.0
+          - python-version: "3.10"
+            gymnasium-version: "0.29.1"
     steps:
       - uses: actions/checkout@v3
         with:
@@ -36,19 +41,21 @@ jobs:
 
           # Use uv for faster downloads
           pip install uv
-          # Install Atari Roms
-          uv pip install --system autorom
-          wget https://gist.githubusercontent.com/jjshoots/61b22aefce4456920ba99f2c36906eda/raw/00046ac3403768bfe45857610a3d333b8e35e026/Roms.tar.gz.b64
-          base64 Roms.tar.gz.b64 --decode &> Roms.tar.gz
-          AutoROM --accept-license --source-file Roms.tar.gz
-
+          # cpu version of pytorch
           # See https://github.com/astral-sh/uv/issues/1497
           uv pip install --system torch==2.4.1+cpu --index https://download.pytorch.org/whl/cpu
           # Install full requirements (for additional envs and test tools)
+          # Install full requirements (for additional envs and test tools)
           uv pip install --system -r requirements.txt
           # Use headless version
           uv pip install --system opencv-python-headless
           uv pip install --system -e .[plots,tests]
+
+      - name: Install specific version of gym
+        run: |
+          uv pip install --system gymnasium==${{ matrix.gymnasium-version }}
+        # Only run for python 3.10, downgrade gym to 0.29.1
+
       - name: Check trained agents
         run: |
           make check-trained-agents

CHANGELOG.md

Lines changed: 5 additions & 2 deletions
@@ -1,13 +1,16 @@
-## Release 2.4.0a10 (WIP)
+## Release 2.4.0a11 (WIP)
 
-**New algorithm: CrossQ, and better defaults for SAC/TQC on Swimmer-v4 env**
+**New algorithm: CrossQ, Gymnasium v1.0 support, and better defaults for SAC/TQC on Swimmer-v4 env**
 
 ### Breaking Changes
 - Updated defaults hyperparameters for TQC/SAC for Swimmer-v4 (decrease gamma for more consistent results) (@JacobHA) [W&B report](https://wandb.ai/openrlbenchmark/sbx/reports/SAC-MuJoCo-Swimmer-v4--Vmlldzo3NzM5OTk2)
 - Upgraded to SB3 >= 2.4.0
+- Renamed `LunarLander-v2` to `LunarLander-v3` in hyperparameters
 
 ### New Features
 - Added `CrossQ` hyperparameters for SB3-contrib (@danielpalen)
+- Added Gymnasium v1.0 support
+- `--custom-objects` in `enjoy.py` now also patches obs space (when bounds are changed) to solve "Observation spaces do not match" errors
 
 ### Bug fixes
 - Replaced deprecated `huggingface_hub.Repository` when pushing to Hugging Face Hub by the recommended `HfApi` (see https://huggingface.co/docs/huggingface_hub/concepts/git_vs_http) (@cochaviz)
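
The `LunarLander-v2` to `LunarLander-v3` rename listed under Breaking Changes means that scripts written against the old IDs may need a small mapping step. The sketch below is one possible way to do that. `RENAMED_ENVS` and `resolve_env_id` are hypothetical names that are not part of this commit, and the code assumes the `-v3` IDs should be used only when Gymnasium 1.0 or newer is installed.

```python
# Hypothetical rename table, based on the renames visible in this commit.
RENAMED_ENVS = {
    "LunarLander-v2": "LunarLander-v3",
    "LunarLanderContinuous-v2": "LunarLanderContinuous-v3",
}


def resolve_env_id(env_id: str, gymnasium_version: str) -> str:
    """Return the env ID to use for the given Gymnasium version string."""
    # Only the major version matters for this sketch (assumption).
    major = int(gymnasium_version.split(".")[0])
    if major >= 1:
        return RENAMED_ENVS.get(env_id, env_id)
    return env_id


print(resolve_env_id("LunarLanderContinuous-v2", "1.0.0"))
print(resolve_env_id("LunarLanderContinuous-v2", "0.29.1"))
```

With the version strings used in the CI matrix, the first call maps to the `-v3` ID and the second leaves the `-v2` ID untouched. IDs that are not in the table, such as `Pendulum-v1`, pass through unchanged.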

hyperparams/a2c.yml

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ Pendulum-v1:
   policy_kwargs: "dict(log_std_init=-2, ortho_init=False)"
 
 # Tuned
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   normalize: true
   n_envs: 4
   n_timesteps: !!float 5e6

hyperparams/ars.yml

Lines changed: 1 addition & 2 deletions
@@ -26,7 +26,7 @@ LunarLander-v2:
   n_timesteps: !!float 2e6
 
 # Tuned
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   <<: *pendulum-params
   n_timesteps: !!float 2e6
 
@@ -215,4 +215,3 @@ A1Jumping-v0:
   # alive_bonus_offset: -1
   normalize: "dict(norm_obs=True, norm_reward=False)"
   # policy_kwargs: "dict(net_arch=[16])"
-

hyperparams/crossq.yml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ Pendulum-v1:
   policy_kwargs: "dict(net_arch=[256, 256])"
 
 
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   n_timesteps: !!float 2e5
   policy: 'MlpPolicy'
   buffer_size: 1000000

hyperparams/ddpg.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Pendulum-v1:
   learning_rate: !!float 1e-3
   policy_kwargs: "dict(net_arch=[400, 300])"
 
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   n_timesteps: !!float 3e5
   policy: 'MlpPolicy'
   gamma: 0.98

hyperparams/ppo.yml

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ LunarLander-v2:
   n_epochs: 4
   ent_coef: 0.01
 
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   n_envs: 16
   n_timesteps: !!float 1e6
   policy: 'MlpPolicy'

hyperparams/sac.yml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Pendulum-v1:
   learning_rate: !!float 1e-3
 
 
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   n_timesteps: !!float 5e5
   policy: 'MlpPolicy'
   batch_size: 256

hyperparams/td3.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Pendulum-v1:
   learning_rate: !!float 1e-3
   policy_kwargs: "dict(net_arch=[400, 300])"
 
-LunarLanderContinuous-v2:
+LunarLanderContinuous-v3:
   n_timesteps: !!float 3e5
   policy: 'MlpPolicy'
   gamma: 0.98
