Roadmap

This is the roadmap for my project, which replicates autonomous drone racing papers. The first iteration focuses on papers [1] and [2].

Roadmap tasks:

Update project and dev-container to Ubuntu 22.04 and modern tools. This point refers to changes in dependencies and project structure, not changes to code.
- Toss out dependencies related to Stable Baselines 2 (replace with Stable Baselines 3)
- Remove the old OpenAI gym download (replace with modern "gymnasium")
- Toss out tensorflow dependencies completely (though a local tensorboard server should be used for visualizing live training metrics)
- Toss out anything related to ROS 1, including anything catkin-related (will maybe be replaced with a ROS 2 Humble bridge in the future, not now though)
- Leave 'flightros/' as reference, but remove all usage of it (to be replaced with ROS 2 bridge if needed)
- Add Rerun.io for visualization
- Set up vcpkg for third-party dependencies, like zmq, instead of building from source or using other mechanisms (initial vcpkg.json already exists)
- Switch to uv instead of pip as python3 package manager
- Set up the black formatter for any new python code written (flightmare folder shall be exempted from any formatting/linting, to not disturb old code)
- Update Python packages and other dependencies to the newest stable set. Python packages should get a requirements.txt, or whatever uv uses. "Modern" and "newest stable" are ambiguous, so some experimentation may be required to find a set of versions where each one is either the latest stable release or the latest release that works with the other dependencies. Think: "As new as possible without breaking anything".
Set up unified track handling
- Common TrackHandler C++ class that can read and write tracks from/to file. File format: a YAML format simplified from the one used in the TOGT-Planner git repository.
- Convert downloaded benchmark tracks (.yaml) in assets/racetracks/ to this simpler YAML format. Convert all gates to square gates. The side length should be whatever is most common among the gates of type RectanglePrisma that are already square in the current .yaml track files. The converted files should also be placed in the assets/racetracks/simplified folder.
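The side-length rule above could be sketched roughly as follows. This is only an illustration in Python: the dictionary keys (`type`, `width`, `height`, `position`, `yaw`) are hypothetical stand-ins, not the actual TOGT-Planner or simplified schema, and the real conversion may well live in the C++ TrackHandler instead.

```python
from collections import Counter


def most_common_square_side(gates, tol=1e-6):
    """Return the most frequent side length among square RectanglePrisma gates.

    `gates` is a list of dicts with hypothetical keys "type", "width", "height".
    """
    sides = [
        round(g["width"], 3)  # rounding groups near-identical lengths together
        for g in gates
        if g.get("type") == "RectanglePrisma"
        and abs(g["width"] - g["height"]) <= tol
    ]
    if not sides:
        raise ValueError("no square RectanglePrisma gates found")
    return Counter(sides).most_common(1)[0][0]


def to_square_gates(gates, side):
    """Keep each gate's pose and replace its shape with one shared side length."""
    return [
        {"position": g["position"], "yaw": g["yaw"], "side": side}
        for g in gates
    ]
```

Computing the side length once over all track files, rather than per track, would give every simplified track the same gate size; which of the two is intended is left open here.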
Fix project CMakeLists.txt files
- Top level file including all subdirectories (currently only common/).
- Make top level file reference flightmare/flightlib/, such that building the top-level project also builds the flightlib shared library (if changes have occurred in that code)
- CMakeLists.txt in common/ to build TrackHandler
Build whole project using common set of dependencies, including flightlib
- Update Eigen version for flightlib to use whatever is the newest compatible version
- Debug errors that arise
- Specify the newest stable/newest compatible version of each dependency in vcpkg, so that building this project in the future will install specifically the versions compatible with this project
Set up building with clang instead of gcc so that clang-tidy will work properly.
Add a new RacingEnv class similar to QuadrotorEnv that will also include logic regarding the racing tracks consisting of square gates
- All gates in track stored in RacingEnv instance
- Ability to fetch drone state observed as described in [2] (Linear velocity and rotation matrix of drone).
- Gates are observed as described in [2] (gate corners).
- Integer parameter decides how many future gates are included in observation space
- Compute Gate Progress reward as described in [1] and [2].
- Modify RacingEnv action space to use delayed CTBR (collective thrust + body rates)
  - Use CTBR command mode (collective_thrust + omega) instead of single rotor thrusts mode in RacingEnv::step()
  - Add an input delay defined by a parameter (input delay translated to an integer number of steps based on simulation dt from configuration file). Implement with a Command buffer. Initialize buffer with "neutral" commands that simply thrust "upwards" with 1g. Clear buffer on reset, filling with neutral commands.
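A minimal sketch of the command buffer described above, written in Python for brevity even though RacingEnv itself is C++. The class and method names are hypothetical, and the sketch assumes the collective thrust is mass-normalized (m/s^2), so that a "neutral" command is 1 g of thrust with zero body rates.

```python
from collections import deque

GRAVITY = 9.81  # m/s^2, assumed value for a 1 g neutral thrust


class DelayedCommandBuffer:
    """FIFO buffer that delays CTBR commands by a fixed number of steps."""

    def __init__(self, delay_s, sim_dt):
        # Translate the delay in seconds into a whole number of simulation steps.
        self.delay_steps = max(0, round(delay_s / sim_dt))
        # Neutral command: (collective thrust, body rates), hovering at 1 g.
        self.neutral = (GRAVITY, (0.0, 0.0, 0.0))
        self.reset()

    def reset(self):
        # Discard pending commands and refill the buffer with neutral ones.
        self._queue = deque([self.neutral] * self.delay_steps)

    def push(self, command):
        # Enqueue the newest command and return the one that is now due.
        self._queue.append(command)
        return self._queue.popleft()
```

With a delay of zero steps the buffer is empty, so `push()` returns the command it was just given and the environment behaves as if no delay were configured.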
Add racing logic to RacingEnv
- Add mechanism to RacingEnv that allows detection of when the drone passes through a gate successfully.
- Add collision detection to RacingEnv
  - Build sphere representation of drone based on parameters (arm length) on initialization
  - Perform AABB-Sphere collision checking between drone and the observed gates
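The AABB-sphere test itself is small. Here is a sketch in Python under one assumption: the sphere center has already been transformed into the frame in which the gate cuboid is axis-aligned (for a rotated gate, that would be the gate's local frame). The function name is hypothetical.

```python
def sphere_intersects_aabb(center, radius, box_min, box_max):
    """Return True if a sphere touches or overlaps an axis-aligned box."""
    dist_sq = 0.0
    for c, lo, hi in zip(center, box_min, box_max):
        # Clamping the center to the box gives the closest point on that axis.
        closest = min(max(c, lo), hi)
        dist_sq += (c - closest) ** 2
    # Compare squared distances to avoid a square root.
    return dist_sq <= radius * radius
```

A drone represented by several spheres collides with a gate if any of its spheres intersects any of the gate's cuboids.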
Add RacingEnv to pybind11 wrapper (pybind_wrapper.cpp)
- Expose RacingEnv as RacingEnv
- Expose VecEnv<RacingEnv> as VectorizedRacingEnv
Set up an interactive test simulation
- Make a simple Python .ipynb notebook that uses shared library flightlib through pybind wrapper. Can start one RacingEnv instance and teleport the drone around.
  - Implement an interface to RacingEnv that simply "teleports" the drone a small distance in the direction given by the keyboard input --- completely ignoring the physics (gravity, rotor thrust, etc.). Separate method (env.teleportTo() used instead of step(), which remains pure for the RL stuff). teleportTo() still needs to perform all collision/gate pass checks and such, for debugging.
- Add ability to load track from yaml using pybind11 wrapper for TrackHandler C++ class
- Set up Rerun as main visualization
  - Use assets/glTF/uzh_gate.glb for visualizing the gate
  - Use assets/glTF/drone_red.glb for visualizing the drone
  - Add ability to display drone's collision sphere representation (spheres colored in green)
  - Add ability to show each gate as its 4 collision cuboids (in blue)
  - Add ability to show gate observations as 3D lines from drone to next gate's corners
  - Add ability to show collisions by rendering colliding drone collision spheres and gate cuboids in red
  - Add ability to show gate pass detection by making gates passed in correct order yellow (specifically their blue collision cuboids turn yellow, not the 3D gate assets)
- After each teleport, collisions and gate passes are checked and visualizations updated. Magenta line segment rendered from previous state to current to show movement history
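One way the gate-pass check after each teleport could work is to test the segment from the previous to the current position against the gate's plane. The Python sketch below assumes each square gate is described by a center, a unit normal pointing in the direction of travel, and two in-plane unit axes; these parameters and the function names are hypothetical, not taken from the existing code.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def passed_gate(prev_pos, curr_pos, center, normal, right, up, side):
    """Return True if the move prev_pos -> curr_pos crosses the gate opening
    from the back side to the front side."""
    d_prev = dot([p - c for p, c in zip(prev_pos, center)], normal)
    d_curr = dot([p - c for p, c in zip(curr_pos, center)], normal)
    # The segment must start behind the gate plane and end on or past it.
    if not (d_prev < 0.0 <= d_curr):
        return False
    # Interpolate to the point where the segment meets the plane.
    t = d_prev / (d_prev - d_curr)
    hit = [p + t * (q - p) for p, q in zip(prev_pos, curr_pos)]
    rel = [h - c for h, c in zip(hit, center)]
    half = side / 2.0
    # The crossing point must lie inside the square opening.
    return abs(dot(rel, right)) <= half and abs(dot(rel, up)) <= half
```

Because the test works on a segment rather than a single position, it does not miss a pass when one teleport (or one fast simulation step) jumps over the gate plane.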
Set up brand new RL training script
- Set up SB3 PPO policy as similar as possible to the one described in [2]. Use VecNormalize to approximate "input normalization using z-scoring" mentioned in [1].
- Set up tensorboard locally for visualizing metrics. All the usual metrics + gradients. Regular validation roll-outs with roll-out metrics such as: success rate (passing all gates), collision rate, average velocity, maximum velocity, maximum acceleration (measured in g).
  - Implement with self.logger calls in a callback
- Make a viz/visualization.py module for calling Rerun's python API for visualizing the racing environment. Base it on the .ipynb used for tests.
  - Use InstancePoses3D for visualizing the gate Asset3D --- likely more efficient since they all are identical
  - (Optional) Add visualization of gate observations during roll-out visualizations
- Add an option to sample one roll-out every viz_every_n_batches batches (viz_every_n_batches ~ 50-100) and visualize it in Rerun during training, to track progress qualitatively.
  - Record a randomly selected roll-out (one from a batch) and send it to rerun after the episode completes via an efficient send_columns() call. Timestamps based on simulation time. Magenta trail behind drone shows history (line segments connecting current and previous position). User can replay/scrub timeline to see behavior until next roll-out is sampled for visualization. Rollout visualizations can be stored as .rrd to look at later.
- Make sure that episode step/time limit is set and handled correctly
- Add sampling of drone start poses at the centerpoint between gates as it is described in [2] in order to speed up training. Part of curriculum setup. This should probably live in RacingEnv. Progression described in [2] where starting points that lead to success are sampled from again (initial state buffer).
- Read env config file only once, not once per env instance
- Add track setting from train.py (set same track for all envs, used if track randomization off)
- One single training configuration .yaml file used for all hyperparameters. Also specifies path to the .yaml used to read environment parameters from.
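The start-pose sampling with an initial state buffer could look roughly like the sketch below. It is one possible reading of the item above, not the procedure from [2]: the class name, the bounded buffer size, and the choice to keep seed states permanently are all assumptions, and only positions (not full poses) are handled.

```python
import random
from collections import deque


def gate_midpoints(gate_positions):
    """Center points between each pair of consecutive gates."""
    return [
        tuple((a + b) / 2.0 for a, b in zip(p, q))
        for p, q in zip(gate_positions, gate_positions[1:])
    ]


class InitialStateBuffer:
    """Samples start states from seed states plus states that led to success."""

    def __init__(self, seed_states, capacity=1000, rng=None):
        self.seeds = list(seed_states)
        # Oldest successful states are dropped once capacity is reached.
        self.successes = deque(maxlen=capacity)
        self.rng = rng or random.Random()

    def sample(self):
        return self.rng.choice(self.seeds + list(self.successes))

    def report(self, state, success):
        # Only start states whose episode succeeded are sampled from again.
        if success:
            self.successes.append(state)
```

In the roadmap this logic is expected to live in RacingEnv; the Python form is only meant to pin down the intended behavior before writing the C++ version.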
Apply noise to thrust mapping coefficients. From [2]: "we [...] randomize the thrust mapping coefficients to simulate unmodeled battery behaviour, such as high voltage drops when flying at very high speeds, and the drag coefficients to simulate unknown aerodynamic effects". Add env parameters for tuning this noise. Aerodynamic drag isn't modeled in flightlib, and I will not implement a model for it for this project.
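As an illustration of what the tunable noise could look like, the sketch below scales each thrust mapping coefficient by an independent factor drawn uniformly from [1 - rel_noise, 1 + rel_noise]. The uniform multiplicative model and the parameter name `rel_noise` are assumptions; the quoted passage from [2] does not specify the distribution.

```python
import random


def randomize_thrust_map(coeffs, rel_noise, rng=random):
    """Return a copy of the coefficients with relative uniform noise applied.

    rel_noise=0.1 perturbs each coefficient by up to +/-10 percent.
    """
    if rel_noise < 0.0:
        raise ValueError("rel_noise must be non-negative")
    return [c * (1.0 + rng.uniform(-rel_noise, rel_noise)) for c in coeffs]
```

Drawing a fresh set of coefficients on each environment reset would give every episode slightly different thrust behavior, with `rel_noise` as the single environment parameter to tune.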
Debug training loop until it works as expected
- Verify that logged metrics and visualized roll-outs look reasonable
Train a decent performing policy
Add evaluation script
Remodel repo structure and clean up build system
- Make sure one single CMake build command is enough to build the whole project, including flightgym python module
Build devcontainer and run tests on other device to mitigate "works on my machine" issues
(Optional) Add tanh squashing by modifying the SB3 PPO implementation, to match the paper [2].
(Optional) Add random track generation as described in [1]. This can live in RacingEnv. Complexity of track generation modified by parameters (number of gates, and maximum "magnitude" of pose difference to previous gate/start pose)

References

[1]: Autonomous Drone Racing with Deep Reinforcement Learning (2021)

[2]: Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning (2023)
