
![Demo of Plane environment](demo.gif)

**Plane** is a lightweight yet realistic **reinforcement learning environment** simulating a 2D side view of an Airbus A320-like aircraft.
It’s designed for **fast, end-to-end training on GPU with JAX** while staying **physics-based** and **realistic enough** to capture the core challenges of aircraft control.

Plane allows you to benchmark RL agents on **delays, irrecoverable states, partial observability, and competing objectives**, challenges that are often ignored in standard toy environments.

---

## ✨ Features

* 🏎 **Fast & parallelizable** thanks to JAX: scale to thousands of parallel environments on GPU/TPU.
* 📐 **Physics-based**: dynamics are derived from airplane modeling equations (not arcade physics).
* 🧪 **Reliable**: covered by unit tests to ensure stability and reproducibility.
* 🎯 **Challenging**: captures real-world aviation control problems (momentum, delays, irrecoverable states).
* 🔄 **Compatible with multiple interfaces**: designed to work with JAX-based environments.
* 🌟 **Upcoming features**: environmental perturbations (e.g., wind) will be available in future releases.

---

## 📊 Stable Altitude vs. Power & Pitch

Below is an example of how stable altitude changes with engine power and pitch:

![Stable altitude graph](altitude_vs_power_and_stick.png)

```bash
poetry add plane-env
```

## 🎮 Usage

Here’s a minimal example of running an episode and saving a video:

```python
from plane_env.env_jax import Airplane2D, EnvParams

# Create env
env = Airplane2D()
seed = 42
env_params = EnvParams(max_steps_in_episode=1_000)

# Simple constant policy with 80% power and 0° stick input.
action = (0.8, 0.0)

# Save the video
env.save_video(lambda o: action, seed, folder="videos", episode_index=0, params=env_params, format="gif")
```

Of course, you can also use it directly to train an agent with your favorite RL library (here: Stable-Baselines3):

```python
from plane_env.env_gymnasium import Airplane2D, EnvParams
from stable_baselines3 import SAC

# Create env
env = Airplane2D()

# Model training (adapted from https://stable-baselines3.readthedocs.io/en/master/modules/sac.html)
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000, log_interval=4)
model.save("sac_plane")

del model  # remove to demonstrate saving and loading

model = SAC.load("sac_plane")

obs, info = env.reset()
while True:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
```
---

## 🛩️ Environment Overview (Reinforcement Learning Perspective)

**State (`EnvState`)**: the aircraft dynamics are described by the following variables:

| Variable          | Description                             |
| ----------------- | --------------------------------------- |
| `x`               | Horizontal position (m)                 |
| `x_dot`           | Horizontal speed (m/s)                  |
| `z`               | Altitude (m)                            |
| `z_dot`           | Vertical speed (m/s)                    |
| `theta`           | Pitch angle (rad)                       |
| `theta_dot`       | Pitch angular velocity (rad/s)          |
| `alpha`           | Angle of attack (rad)                   |
| `gamma`           | Flight path angle (rad)                 |
| `m`               | Aircraft mass (kg)                      |
| `power`           | Normalized engine thrust (0–1)          |
| `stick`           | Control stick input for pitch (–1 to 1) |
| `fuel`            | Remaining fuel (kg)                     |
| `t`               | Current timestep                        |
| `target_altitude` | Desired target altitude (m)             |

The state also provides **derived properties** like air density, Mach number, and speed of sound.

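As a rough sketch of how such derived quantities relate to altitude, the standard ISA troposphere model gives temperature, density, speed of sound, and Mach number as simple functions of `z`. The helpers below are illustrative only; Plane's internal formulas may differ:

```python
import math

# Standard-atmosphere constants (sea level), valid in the troposphere (below ~11 km).
T0, RHO0 = 288.15, 1.225          # temperature (K), density (kg/m^3)
LAPSE, G = 0.0065, 9.80665        # lapse rate (K/m), gravity (m/s^2)
R, GAMMA = 287.05, 1.4            # gas constant for air (J/(kg·K)), heat capacity ratio

def temperature(z):
    """Air temperature (K) at altitude z (m), linear lapse in the troposphere."""
    return T0 - LAPSE * z

def isa_density(z):
    """Air density (kg/m^3) from the barometric formula."""
    return RHO0 * (temperature(z) / T0) ** (G / (LAPSE * R) - 1.0)

def speed_of_sound(z):
    """Speed of sound (m/s): a = sqrt(gamma * R * T)."""
    return math.sqrt(GAMMA * R * temperature(z))

def mach(true_airspeed, z):
    """Mach number = true airspeed / local speed of sound."""
    return true_airspeed / speed_of_sound(z)
```

At sea level this yields a density of 1.225 kg/m³ and a speed of sound of roughly 340 m/s, both decreasing with altitude.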
The agent currently observes the full state except **x** and **t** (which should be irrelevant for control) and **fuel** (which is currently unused).

**Action Space**: Continuous 2D vector `[power_requested, stick_requested]` controlling engine thrust and pitch.
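A raw policy output can be squashed into this action box before stepping the environment. `clip_action` below is a hypothetical helper for illustration (Plane may already clip internally):

```python
def clip_action(power_requested, stick_requested):
    """Clamp a raw action into the valid box: power in [0, 1], stick in [-1, 1].

    Hypothetical helper for illustration; not part of the plane_env package.
    """
    power = min(max(power_requested, 0.0), 1.0)
    stick = min(max(stick_requested, -1.0), 1.0)
    return power, stick
```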

**Reward Function**:

* Encourages maintaining **target altitude**.
* Terminal altitude violations (`z < min_alt` or `z > max_alt`) incur `-max_steps_in_episode`.
* Otherwise, the reward is the squared normalized closeness to the target altitude:

$`r_t = \left( \frac{\text{max\_alt} - |\text{target\_altitude} - z_t|}{\text{max\_alt} - \text{min\_alt}} \right)^2`$

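The reward logic can be sketched in plain Python; the altitude bounds and episode length used here are illustrative placeholders, not Plane's actual defaults:

```python
def reward(z, target_altitude, min_alt=0.0, max_alt=12_000.0, max_steps_in_episode=1_000):
    """Sketch of the per-step reward. Bounds are illustrative, not Plane's defaults."""
    if z < min_alt or z > max_alt:
        # Terminal altitude violation: large negative penalty.
        return -max_steps_in_episode
    # Squared normalized closeness to the target altitude.
    return ((max_alt - abs(target_altitude - z)) / (max_alt - min_alt)) ** 2
```

With `min_alt = 0`, the reward peaks at 1 exactly at the target altitude and decays quadratically as the aircraft drifts away from it.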
**Episode Termination**:

* **Altitude limits exceeded** → terminated
* **Maximum episode length reached** → truncated

**Time step**: `delta_t = 0.5 s`, `max_steps_in_episode = 1,000`.

---

## 🧩 Challenges Modeled

Plane is designed to test RL agents under **realistic aviation challenges**:

* ⏳ **Delay**: engine power changes take time to fully apply.
* 👀 **Partial observability**: some forces cannot be directly measured.
* 🏁 **Competing objectives**: reach the target altitude fast while minimizing fuel and overshoot.
* 🌀 **Momentum effects**: control inputs show delayed impact due to physical inertia.
* ⚠️ **Irrecoverable states**: certain trajectories inevitably lead to failure (crash).

> Environmental perturbations (wind, turbulence) are coming in a future release.
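The delay challenge, for example, behaves like a first-order lag: effective thrust chases the requested setting instead of jumping to it. The sketch below is a generic illustration with a hypothetical engine time constant `tau`, not Plane's actual actuator model:

```python
def step_power(power, power_requested, delta_t=0.5, tau=5.0):
    """First-order lag: effective power closes a fraction of the gap each step.

    tau (s) is a hypothetical engine time constant, not Plane's actual value.
    """
    return power + (power_requested - power) * (delta_t / tau)

# The agent commands 80% power from a cold start; thrust builds up gradually.
power = 0.0
history = []
for _ in range(20):  # 20 steps of 0.5 s = 10 s of simulated time
    power = step_power(power, 0.8)
    history.append(power)
```

Even after 10 s of simulated time, the effective power is still noticeably below the commanded 80%, which is why a purely reactive controller tends to overshoot.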

---

## 📦 Roadmap

* [ ] Add perturbations (wind with varying speeds and directions) to model the non-stationarity of the dynamics.
* [ ] Add an easier interface for creating partially observable versions of the environment.
* [ ] Provide ready-to-use benchmark results for popular RL baselines.
* [ ] Add fuel consumption.

---

## 🤝 Contributing

Contributions are welcome!
Please open an issue or PR if you have suggestions, bug reports, or feature requests.

---

## 📜 License

MIT License – feel free to use it in your own research and projects.