Gymnasium-compatible environment for reinforcement learning with Crazyflie drones in real-world settings using Vicon motion capture.
Developed by the CARES Lab at the University of Auckland.
Note: This environment is designed to work with the CARES Gymnasium Environments framework for running RL training tasks.
For installation, run the setup.sh script: `bash setup.sh`

Prerequisites:
- setup.sh downloaded (can be found in the same repository as this README)
- Docker (if running multiple training runs at once)
- For physical drone testing:
  - Crazyflie drone with flow deck attached
  - Access to a Vicon motion capture system
Note that if you only want to run training, a local installation may not be necessary; refer instead to the Docker instructions in 'Running the Simulator' or 'Running the Simulator Standalone', depending on your use case.
- CARES Gymnasium Environments - Framework for running RL tasks
- CARES Reinforcement Learning - RL algorithms library
- Bitcraze Crazyflie - Open-source micro quadcopter platform
If using a real Crazyflie, please refer to the documentation in the docs/ folder for detailed hardware setup instructions:
- Vicon System Setup - Connect and configure the motion capture system
- Hardware Setup Guide - Drone positioning, battery management, and troubleshooting
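As background for what the Vicon interface involves, the sketch below shows receiving and unpacking a position packet over UDP on the loopback interface. The three-double packet layout is an assumption for illustration only; the actual wire format used by the Vicon system and `vicon_connection_class.py` differs.

```python
import socket
import struct

# Hypothetical packet layout: three little-endian doubles (x, y, z) in metres.
# This is NOT the real Vicon format, just an illustration of UDP frame parsing.
PACKET_FMT = "<3d"

def parse_position(datagram: bytes):
    """Unpack an (x, y, z) position tuple from a raw UDP datagram."""
    return struct.unpack(PACKET_FMT, datagram[: struct.calcsize(PACKET_FMT)])

# Loopback demo: send a fake "mocap" frame to ourselves and parse it.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))  # let the OS pick a free port
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(struct.pack(PACKET_FMT, 0.5, -0.2, 1.0), rx.getsockname())

datagram, _ = rx.recvfrom(1024)
x, y, z = parse_position(datagram)
print(x, y, z)  # 0.5 -0.2 1.0
tx.close()
rx.close()
```

In the real setup the receive socket would be bound to the port the Vicon bridge streams to, and parsing would follow the configured Vicon output format.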
drone_gym/
├── drone_gym/ # Main package directory
│ ├── __init__.py # Package initialization
│ ├── drone_setup.py # Base DroneSetup class for drone setup and control
│ ├── drone.py # Additional functions with PID control and Crazyflie integration
│ ├── drone_sim.py # Additional functions needed for running in simulation
│ ├── drone_environment.py # Base DroneEnvironment class for RL tasks
│ ├── task_factory.py # Selects appropriate task
│ ├── tasks/ # Task examples
│ │ ├── move_to_2d_position.py # Example: Move to specific 2D position
│ │ ├── move_to_3d_position.py # Example: Move to specific 3D position
│ │ ├── move_to_random_2d_position.py # Example: Move to random 2D positions
│ │ ├── move_to_random_3d_position.py # Example: Move to random 3D positions
│ │ └── move_circle # Example: Move to a target moving in a circle
│ │ ├── tests/ # Test scripts for connection, movement, and simulation
│ │ ├── connection_test.py # Testing connection with CrazySim
│ │ ├── functionality_tests.py # Testing utilities
│ │ ├── sim_functionality_tests.py # Testing utilities in simulation
│ │ └── move_to_position.py # Testing movement of drone to position
│ └── utils/ # Utility modules
│   ├── vicon_connection_class.py # Vicon motion capture interface
│   └── test_grab_frame.py # Frame capture testing
├── Dockerfile # Setting up Docker
├── requirements.txt # Python dependencies
├── setup.py # Package installation configuration
├── setup.sh # Manage installation
└── README.md # This file
- `drone_setup.py`: Base class implementing low-level drone setup and control
- `drone.py`: Child class of `drone_setup.py` with PID velocity/position controllers, Vicon integration, and safety boundary checking
- `drone_sim.py`: Child class of `drone_setup.py` with functions for running the drone in simulation
- `drone_environment.py`: Abstract base class for creating custom RL environments following the Gymnasium API
- `utils/vicon_connection_class.py`: Handles UDP communication with the Vicon motion capture system for precise position tracking
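The PID controllers in drone.py are not reproduced here, but the general shape of such a control loop can be sketched as follows. The gains, timestep, and the toy one-dimensional "drone" are illustrative assumptions, not values taken from drone_gym:

```python
class PID:
    """Minimal PID controller: output = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Drive a toy 1-D "drone" toward a 1.0 m altitude setpoint, treating the
# controller output as a velocity command (illustrative gains only).
pid = PID(kp=1.5, ki=0.1, kd=0.05, dt=0.1)
z = 0.0
for _ in range(200):
    z += pid.update(1.0, z) * pid.dt
print(round(z, 2))  # settles near the 1.0 m setpoint
```

In drone_gym the equivalent controllers run against Vicon position feedback rather than a simulated plant.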
Refer to the instructions under Hardware Setup. When running RL tasks, include the `--use_simulator 0` flag.
To run the simulation, go to CrazySim/crazyflie-firmware, then run:

```bash
bash tools/crazyflie-simulation/simulator_files/gazebo/launch/sitl_singleagent.sh -m crazyflie -x 0 -y 0; exec bash
```

Refer to Running RL Tasks for how to execute training runs. The simulator must be shut down and restarted after each training run.
Alternatively, you can use Docker to run the simulation. In the drone_gym directory, run:

```bash
xhost +local:  # one-time, to allow the host to display the Gazebo GUI
docker compose up
```

In a separate terminal (still in the drone_gym directory), run:

```bash
docker compose exec cares bash
```

Note that you can run `docker compose up` with the `-d` flag to reuse the same terminal. In that case, use `docker compose down` to shut down the simulator.
If you'd prefer not to use docker compose (e.g. when actively modifying drone_gym files), you can run the simulator and the CARES RL training environment separately.

To run the simulator (remember to run `xhost +local:` first to enable the GUI):

```bash
docker run --rm -p 19850:19850/udp --gpus all --name crazysim -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /usr/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:ro oculux314/cares:CrazySim
```

If you're in an environment that doesn't support graphics (e.g. when ssh'd into a remote desktop), you can run the simulator in headless mode:

```bash
docker run --rm -p 19850:19850/udp --gpus all --name crazysim oculux314/cares:CrazySim-headless
```

You can run multiple simulator/training instances in parallel by changing the host port given to the `-p` flag.

To run the CARES RL training gym:

```bash
docker run -it --gpus all --net host oculux314/cares:drone
```

This image contains the cares_reinforcement_learning, gymnasium_envrionments, and drone_gym repositories in the /app folder. Logs are saved to /app/cares_rl_logs. If you need to modify the image, you can edit the Dockerfile and rebuild with `docker build -t oculux314/cares:drone .`. The base image oculux314/cares:base and instructions for running it can be found at https://github.com/UoA-CARES/gymnasium_envrionments.
This environment is designed to be used with the CARES Gymnasium Environments framework. To run RL training tasks:
```bash
# Navigate to the gymnasium_envrionments directory
cd gymnasium_envrionments/scripts

# Run a training task (example)
run.py train cli drone --task move_random_2d SAC
```

Refer to the gymnasium_envrionments documentation for detailed instructions on running tasks and configuring training parameters.
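Under the hood, the framework drives the environment through the standard Gymnasium reset/step loop. The stub below sketches that interaction with a stand-in class and a random agent; it is not the actual DroneEnvironment API, only the shape of the loop:

```python
import numpy as np

class StubDroneEnv:
    """Stand-in for a DroneEnvironment subclass (illustration only)."""

    def reset(self):
        self.pos = np.zeros(3)
        return self._state(), {}

    def step(self, action):
        self.pos += np.clip(action, -0.1, 0.1)        # bounded velocity step
        reward = -np.linalg.norm(self.pos - 1.0)       # closer to (1,1,1) is better
        terminated = reward > -0.05                    # "reached target" condition
        return self._state(), reward, terminated, False, {}

    def _state(self):
        return self.pos.copy()

env = StubDroneEnv()
state, _ = env.reset()
for _ in range(200):                                   # random-agent rollout
    action = np.random.uniform(-0.1, 0.1, size=3)
    state, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        break
```

In real training, run.py replaces the random action with one sampled from the RL policy (e.g. SAC) and logs each transition.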
Position-based PID control:

```python
from drone_gym.drone import Drone
import time

# Initialize the drone
drone = Drone()

# Take off
drone.take_off()
drone.is_flying_event.wait(timeout=15)

# Set a target position (x, y, z in meters)
drone.start_position_control()
drone.set_target_position(0.5, 0.5, 1.0)

# Wait for the position to be reached
time.sleep(10)

# Land
drone.stop_position_control()
drone.land()
drone.is_landed_event.wait(timeout=15)
drone.stop()
```

Extend the DroneEnvironment base class to create custom tasks in the tasks folder:
```python
from drone_gym.drone_environment import DroneEnvironment
import numpy as np

class MyCustomTask(DroneEnvironment):
    def _reset_task_state(self):
        # Initialize task-specific state
        self.target = np.array([1.0, 1.0, 1.0])

    def _get_state(self):
        # Return the observation for the RL agent
        pos = self.drone.get_position()
        return np.array([*pos, *self.target])

    def _calculate_reward(self, current_state):
        # Define the reward function
        distance = np.linalg.norm(current_state['position'] - self.target)
        return -distance

    # Implement other abstract methods...
```

See move_to_2d_position.py and move_to_random_2d_position.py for complete examples.
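The negative-distance reward used above is maximal (zero) at the target and falls off linearly with distance. A quick numeric check, using the same illustrative target:

```python
import numpy as np

target = np.array([1.0, 1.0, 1.0])

def reward(position):
    # Negative Euclidean distance to the target, as in the task sketch above
    return -np.linalg.norm(position - target)

print(reward(np.array([1.0, 1.0, 1.0])))  # -0.0 (maximal reward, at the target)
print(reward(np.array([0.0, 1.0, 1.0])))  # -1.0 (one metre away)
```

This shaping gives the agent a dense gradient toward the target at every step, rather than a sparse success/failure signal.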
Move to Position Task
- Video demonstration of the model evaluation: Watch on YouTube
- Implementation example: gymnasium_environments/drone_gym