Commit b37ddb6

Merge pull request #4 from YannBerthelot/first_release
First release
2 parents cfcfabe + be0eb86 commit b37ddb6

26 files changed: +1048 −488 lines

.github/workflows/publish.yml

Lines changed: 37 additions & 0 deletions
```yaml
name: Build and Publish

on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.12"

      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          version: 1.7.1
          virtualenvs-create: true
          virtualenvs-in-project: true

      - name: Build package
        run: poetry build

      - name: Publish to PyPI
        env:
          POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_TOKEN }}
        run: poetry publish

      - name: Upload artifacts
        uses: actions/upload-artifact@v3
        with:
          name: dist
          path: dist/
```

.github/workflows/python-app.yml

Lines changed: 59 additions & 0 deletions
```yaml
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python application

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.12
        # id is required so the cache key below can read the resolved Python version
        id: setup-python
        uses: actions/setup-python@v4
        with:
          python-version: "3.12"

      - name: Install system dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y python3-pygame

      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          version: 1.7.1
          virtualenvs-create: true
          virtualenvs-in-project: true

      - name: Load cached venv
        id: cached-poetry-dependencies
        uses: actions/cache@v3
        with:
          path: .venv
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}

      - name: Install dependencies
        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
        run: poetry install --no-interaction --no-root

      - name: Install project
        run: poetry install --no-interaction

      - name: Check formatting with Black
        run: |
          poetry run black --check .

      - name: Test with pytest
        run: |
          poetry run pytest tests/ -v
```
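The cache key above combines the runner OS, the resolved Python version, and a hash of `poetry.lock`, so the venv is rebuilt whenever dependencies change. The idea can be sketched in plain Python (a hypothetical helper for illustration — `actions/cache` and `hashFiles` use their own hashing, not this):

```python
import hashlib
import platform


def venv_cache_key(python_version: str, lock_contents: bytes) -> str:
    # Same shape as the workflow key: venv-<OS>-<python-version>-<hash of poetry.lock>.
    # Any change to the lock file contents changes the hash, invalidating the cache.
    lock_hash = hashlib.sha256(lock_contents).hexdigest()[:16]
    return f"venv-{platform.system()}-{python_version}-{lock_hash}"


key = venv_cache_key("3.12", b'[[package]]\nname = "plane-env"\n')
print(key)
```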

.gitignore

Lines changed: 3 additions & 1 deletion
```diff
@@ -175,4 +175,6 @@ todo.md
 training_videos*
 *.csv
 logs*
-*.png
+*.png
+test_readme.py
+*.zip
```

Makefile

Lines changed: 38 additions & 0 deletions
```makefile
SHELL=/bin/bash
LINT_PATHS=src/ tests/

test:
	poetry run pytest --tb=short --disable-warnings

mypy:
	mypy ${LINT_PATHS}

coverage:
	poetry run coverage run -m pytest tests
	poetry run coverage report -m --fail-under 80

missing-annotations:
	mypy --disallow-untyped-calls --disallow-untyped-defs --ignore-missing-imports src

type: mypy

lint:
	# stop the build if there are Python syntax errors or undefined names
	# see https://www.flake8rules.com/
	poetry run ruff check ${LINT_PATHS} --select=E9,F63,F7,F82 --output-format=full
	# exit-zero treats all errors as warnings.
	poetry run ruff check ${LINT_PATHS} --exit-zero --output-format=concise

format:
	# Sort imports
	poetry run ruff check --select I $(LINT_PATHS) --fix
	# Reformat using black
	poetry run black $(LINT_PATHS)

check-codestyle:
	# Sort imports
	ruff check --select I ${LINT_PATHS}
	# Reformat using black
	black --check ${LINT_PATHS}

commit-checks: format type lint
```
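The strict `lint` pass above selects only build-breaking rule classes (`E9` syntax errors, `F63`/`F7` misuse, `F82` undefined names). A tiny sketch of the bug class the `F82` selection is meant to stop — a name used before it is ever defined, which only fails at call time without the linter:

```python
def area(radius):
    # BUG on purpose: 'pi' is never defined or imported,
    # which ruff reports statically as an undefined name (F821).
    return pi * radius ** 2


try:
    area(1.0)
except NameError as err:
    print(f"caught at runtime: {err}")
```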

README.md

Lines changed: 103 additions & 68 deletions
````diff
@@ -2,28 +2,27 @@
 
 ![Demo of Plane environment](demo.gif)
 
-**Plane** is a lightweight yet realistic **reinforcement learning environment** simulating a 2D side view of an Airbus A320-like aircraft.
-It’s designed for **fast, end-to-end training on GPU with JAX** while staying **physics-based** and **realistic enough** to capture the core challenges of aircraft control.
+**Plane** is a lightweight yet realistic **reinforcement learning environment** simulating a 2D side view of an Airbus A320-like aircraft.
+It’s designed for **fast, end-to-end training on GPU with JAX** while staying **physics-based** and **realistic enough** to capture the core challenges of aircraft control.
 
-Plane allows you to benchmark RL agents on **delays, perturbations, irrecoverable states, partial observability, and competing objectives** — challenges that are often ignored in standard toy environments.
+Plane allows you to benchmark RL agents on **delays, irrecoverable states, partial observability, and competing objectives** — challenges that are often ignored in standard toy environments.
 
 ---
 
 ## ✨ Features
 
-- 🏎 **Fast & parallelizable** thanks to JAX — scale to thousands of parallel environments on GPU/TPU.
-- 📐 **Physics-based**: Dynamics are derived from airplane modeling equations (not arcade physics).
-- 🧪 **Reliable**: Covered by unit tests to ensure stability and reproducibility.
-- 🎯 **Challenging**: Captures real-world aviation control problems (momentum, delays, irrecoverable states).
-- 🔄 **Compatible with both worlds**:
-  - [Gymnasium](https://gymnasium.farama.org/) (with Stable-Baselines3)
-  - [Gymnax](https://github.com/RobertTLange/gymnax) (with sbx / JAX-native RL libraries)
+* 🏎 **Fast & parallelizable** thanks to JAX — scale to thousands of parallel environments on GPU/TPU.
+* 📐 **Physics-based**: Dynamics are derived from airplane modeling equations (not arcade physics).
+* 🧪 **Reliable**: Covered by unit tests to ensure stability and reproducibility.
+* 🎯 **Challenging**: Captures real-world aviation control problems (momentum, delays, irrecoverable states).
+* 🔄 **Compatible with multiple interfaces**: Designed to work with JAX-based environments.
+* 🌟 **Upcoming features**: Environmental perturbations (e.g., wind) will be available in future releases.
 
 ---
 
 ## 📊 Stable Altitude vs. Power & Pitch
 
-Below is an example of how stable altitude changes with engine power and pitch:
+Below is an example of how stable altitude changes with engine power and pitch:
 
 ![Stable altitude graph](altitude_vs_power_and_stick.png)
 
@@ -47,93 +46,129 @@ poetry add plane-env
 
 ## 🎮 Usage
 
-Plane supports **both Gymnasium and Gymnax interfaces**.
-Here are some examples to get you started:
+Here’s a minimal example of running an episode and saving a video:
 
-### Gymnasium (with Stable-Baselines3)
+```python
+from plane_env.env_jax import Airplane2D, EnvParams
+
+# Create env
+env = Airplane2D()
+seed = 42
+env_params = EnvParams(max_steps_in_episode=1_000)
+
+# Simple constant policy with 80% power and 0° stick input.
+action = (0.8, 0.0)
+
+# Save the video
+env.save_video(lambda o: action, seed, folder="videos", episode_index=0, params=env_params, format="gif")
+```
+
+Of course, you can also use it directly to train an agent with your favorite RL library (here: stable-baselines3):
 
 ```python
-import gymnasium as gym
-import plane_env
+from plane_env.env_gymnasium import Airplane2D, EnvParams
+from stable_baselines3 import SAC
+
+# Create env
+env = Airplane2D()
+# Model training (adapted from https://stable-baselines3.readthedocs.io/en/master/modules/sac.html)
 
-# Create environment
-env = gym.make("Plane-v0", render_mode="rgb_array")
 
-# Rollout a random policy and save a video
-from stable_baselines3.common.vec_env import VecVideoRecorder, DummyVecEnv
+model = SAC("MlpPolicy", env, verbose=1)
+model.learn(total_timesteps=10_000, log_interval=4)
+model.save("sac_plane")
 
-vec_env = DummyVecEnv([lambda: env])
-video_env = VecVideoRecorder(vec_env, "videos/", record_video_trigger=lambda x: x == 0, video_length=200)
-obs = video_env.reset()
+del model  # remove to demonstrate saving and loading
 
-for _ in range(200):
-    action = video_env.action_space.sample()
-    obs, reward, done, info = video_env.step(action)
-    if done:
-        video_env.reset()
+model = SAC.load("sac_plane")
 
-video_env.close()
+obs, info = env.reset()
+while True:
+    action, _states = model.predict(obs, deterministic=True)
+    obs, reward, terminated, truncated, info = env.step(action)
+    if terminated or truncated:
+        break
 ```
 
-### Gymnax (with sbx)
 
-```python
-import jax
-import jax.numpy as jnp
-import gymnax
-import plane_env
-
-# Create environment
-env, params = plane_env.make("PlaneJax-v0")
-
-# Vectorize & run multiple parallel environments
-key = jax.random.PRNGKey(0)
-reset_fn = jax.vmap(env.reset, in_axes=(0, None))
-step_fn = jax.vmap(env.step, in_axes=(0, 0, None, None))
-
-keys = jax.random.split(key, 16)
-obs, state = reset_fn(keys, params)
-
-def rollout_step(carry, _):
-    obs, state, key = carry
-    action = jax.random.randint(key, (), 0, env.action_space().n)
-    key, subkey = jax.random.split(key)
-    obs, state, reward, done, info = step_fn(obs, state, action, subkey, params)
-    return (obs, state, key), (obs, reward, done)
-
-(_, _, _), (obs_hist, reward_hist, done_hist) = jax.lax.scan(rollout_step, (obs, state, key), None, length=200)
-```
+---
+
+## 🛩️ Environment Overview (Reinforcement Learning Perspective)
+
+**State (`EnvState`)**: 13-dimensional vector representing aircraft dynamics:
+
+| Variable          | Description                             |
+| ----------------- | --------------------------------------- |
+| `x`               | Horizontal position (m)                 |
+| `x_dot`           | Horizontal speed (m/s)                  |
+| `z`               | Altitude (m)                            |
+| `z_dot`           | Vertical speed (m/s)                    |
+| `theta`           | Pitch angle (rad)                       |
+| `theta_dot`       | Pitch angular velocity (rad/s)          |
+| `alpha`           | Angle of attack (rad)                   |
+| `gamma`           | Flight path angle (rad)                 |
+| `m`               | Aircraft mass (kg)                      |
+| `power`           | Normalized engine thrust (0–1)          |
+| `stick`           | Control stick input for pitch (–1 to 1) |
+| `fuel`            | Remaining fuel (kg)                     |
+| `t`               | Current timestep                        |
+| `target_altitude` | Desired target altitude (m)             |
+
+The state also provides **derived properties** like air density, Mach number, and speed of sound.
+
+The agent currently observes the full state minus **x** and **t** (as they should be irrelevant for control), as well as **fuel**, which is currently unused.
+
+**Action Space**: Continuous 2D vector `[power_requested, stick_requested]` controlling engine thrust and pitch.
+
+**Reward Function**:
+
+* Encourages maintaining **target altitude**.
+* Terminal altitude violations (`z < min_alt` or `z > max_alt`) incur `-max_steps_in_episode`.
+* Otherwise, reward is the squared normalized closeness to the target altitude:
+
+$`r_t = \left( \frac{\text{max\_alt} - | \text{target\_altitude} - z_t |}{\text{max\_alt} - \text{min\_alt}} \right)^2`$
+
+**Episode Termination**:
+
+* **Altitude limits exceeded** → terminated
+* **Maximum episode length reached** → truncated
+
+**Time step**: `delta_t = 0.5 s`, `max_steps_in_episode = 1,000`.
 
 ---
 
 ## 🧩 Challenges Modeled
 
 Plane is designed to test RL agents under **realistic aviation challenges**:
 
-- **Delay**: Engine power changes take time to fully apply.
-- 🌪 **Perturbations**: Random wind gusts alter dynamics.
-- 👀 **Partial observability**: Some forces and wind speeds cannot be directly measured.
-- 🏁 **Competing objectives**: Reach target altitude fast while minimizing fuel and overshoot.
-- 🌀 **Momentum effects**: Control inputs show delayed impact due to physical inertia.
-- ⚠️ **Irrecoverable states**: Certain trajectories inevitably lead to failure (crash).
+* **Delay**: Engine power changes take time to fully apply.
+* 👀 **Partial observability**: Some forces cannot be directly measured.
+* 🏁 **Competing objectives**: Reach target altitude fast while minimizing fuel and overshoot.
+* 🌀 **Momentum effects**: Control inputs show delayed impact due to physical inertia.
+* ⚠️ **Irrecoverable states**: Certain trajectories inevitably lead to failure (crash).
+
+> Environmental perturbations (wind, turbulence) are coming in a future release.
 
 ---
 
 ## 📦 Roadmap
 
-- [ ] Add pitch as a controllable action.
-- [ ] Expand challenges (sensor noise, turbulence models).
-- [ ] Provide ready-to-use benchmark results for popular RL baselines.
+* [ ] Add perturbations (wind with varying speeds and directions) to model the non-stationarity of the dynamics.
+* [ ] Add an easier interface to create partially-observable versions of the environment.
+* [ ] Provide ready-to-use benchmark results for popular RL baselines.
+* [ ] Add fuel consumption.
 
 ---
 
 ## 🤝 Contributing
 
-Contributions are welcome!
-Please open an issue or PR if you have suggestions, bug reports, or new features.
+Contributions are welcome!
+Please open an issue or PR if you have suggestions, bug reports, or new features.
 
 ---
 
 ## 📜 License
 
-MIT License – feel free to use it in your own research and projects.
+MIT License – feel free to use it in your own research and projects.
````
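The README's reward rule can be sketched directly (a hedged sketch: the `min_alt`/`max_alt` defaults below are illustrative assumptions, not the package's actual bounds):

```python
def reward(z, target_altitude, min_alt=0.0, max_alt=20_000.0,
           max_steps_in_episode=1_000):
    # Terminal altitude violation: large negative penalty of -max_steps_in_episode.
    if z < min_alt or z > max_alt:
        return -float(max_steps_in_episode)
    # Otherwise: squared normalized closeness to the target altitude,
    # i.e. ((max_alt - |target - z|) / (max_alt - min_alt)) ** 2.
    return ((max_alt - abs(target_altitude - z)) / (max_alt - min_alt)) ** 2
```

The reward is 1 exactly at the target altitude and decays quadratically as the aircraft drifts away, so small tracking errors are penalized gently while large ones dominate.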

conftest.py

Lines changed: 3 additions & 0 deletions
```python
from pytest_readme import setup

setup()
```

demo.gif

Binary file, 2.18 MB

0 commit comments
