This repository contains materials for the ROSCon 2025 workshop on ROS 2 Deliberation Technologies.
> **Note:** This repository was moved here from https://github.com/ros-wg-delib/roscon25-workshop.
This repo uses Pixi and RoboStack along with ROS 2 Kilted.
First, install the system dependencies (the command below assumes a Debian-based Linux distribution).

```bash
sudo apt install build-essential curl
```

Then, install Pixi.
```bash
curl -fsSL https://pixi.sh/install.sh | sh
```

Clone the repository, including submodules.
```bash
git clone --recursive https://github.com/ros-wg-delib/rl_deliberation.git
```

Build the environment.
```bash
pixi run build
```

To verify your installation, the following command should launch a PyRoboSim window.
```bash
pixi run start_world --env GreenhousePlain
```

To explore the setup, you can also drop into a shell in the Pixi environment.
```bash
pixi shell
```

Several environments are available. For example, to run the Greenhouse environment:
```bash
pixi run start_world --env GreenhousePlain
```

All the following commands assume that the environment is running. For training, you can also run the environment in headless mode.
```bash
pixi run start_world --env GreenhousePlain --headless
```

But first, let's explore the environment with a random agent.
Assuming the environment is running, execute the evaluation script in another terminal:
```bash
pixi run eval --config greenhouse_env_config.yaml --model pyrobosim_ros_gym/policies/GreenhousePlain_DQN_random.pt --num-episodes 1 --realtime
```

In your terminal, you will see multiple sections in the following format:
```
..........
obs=array([1. , 0.99194384, 0. , 2.7288349, 0. , 3.3768525, 1.], dtype=float32)
action=array(0)
Maximum steps (10) exceeded. Truncated episode.
reward=0.0
custom_metrics={'watered_plant_fraction': 0.0, 'battery_level': 100.0}
terminated=False
truncated=False
..........
```
This is one step of the environment and the agent's interaction with it.
- `obs` is the observation from the environment. It is an array containing, for each of the 3 closest plant objects, a class label (0 or 1) and the distance to that object. At the end, it also contains the robot's battery level and whether its current location has been watered.
- `action` is the action taken by the agent. In this simple example, it can choose between 0 = move on and 1 = water plant.
- `reward` is the reward received after taking the action. It is `0.0` in this case because the agent did not water any plant.
- `custom_metrics` provides additional information about the episode:
  - `watered_plant_fraction` indicates the fraction of plants (between 0 and 1) watered so far in the episode.
  - `battery_level` indicates the robot's current battery level. (This will not decrease for this environment type, but it will later.)
- `terminated` indicates whether the episode reached a terminal state (e.g., the task was completed or failed).
- `truncated` indicates whether the episode ended due to a time limit.
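The fields above follow the standard Gymnasium `reset`/`step` interaction pattern. As a rough sketch of what one evaluation episode looks like, here is a minimal loop against a hypothetical stub environment (`StubGreenhouseEnv` is illustrative only, not part of this repository; the real environment lives in `pyrobosim_ros_gym`):

```python
import random

class StubGreenhouseEnv:
    """Illustrative stand-in that mimics the Gymnasium-style reset/step API
    implied by the eval output above. Not the real PyRoboSim environment."""

    MAX_STEPS = 10  # matches the "Maximum steps (10) exceeded" message

    def reset(self):
        self.steps = 0
        return [0.0] * 7, {}  # 7-element observation, as in the printed example

    def step(self, action):
        self.steps += 1
        obs = [0.0] * 7
        reward = 0.0  # this stub's random agent never earns a reward
        terminated = False  # no terminal state is reached
        truncated = self.steps >= self.MAX_STEPS  # episode cut off by the step limit
        info = {"watered_plant_fraction": 0.0, "battery_level": 100.0}
        return obs, reward, terminated, truncated, info

env = StubGreenhouseEnv()
obs, _ = env.reset()
total_reward = 0.0
while True:
    action = random.choice([0, 1])  # 0 = move on, 1 = water plant
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        break
print(f"Total reward: {total_reward}")  # prints "Total reward: 0.0"
```

The same loop shape applies to any trained policy: only the `action = ...` line changes from random sampling to a model query.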
In the PyRoboSim window, you should also see the robot moving around at every step.
At the end of the episode, and after all episodes are completed, you will see some more statistics printed in the terminal.
```
..........
<<< Episode 1 finished with success=False.
Total reward: 0.0
Mean watered_plant_fraction: 0.0
Mean battery_level: 100.0
====================
Summary:
Reward over 1 episodes:
  Mean: 0.0
  Min: 0.0
  Max: 0.0
Custom metric 'watered_plant_fraction' over 1 episodes:
  Mean: 0.0
  Min: 0.0
  Max: 0.0
Custom metric 'battery_level' over 1 episodes:
  Mean: 100.0
  Min: 100.0
  Max: 100.0
```
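The summary section is plain mean/min/max aggregation over the per-episode values. A minimal sketch of how such statistics could be computed (the eval script's actual internals are not shown in this README):

```python
def summarize(values):
    """Aggregate per-episode values into mean/min/max summary statistics."""
    return {
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }

# Per-episode results, as in the single-episode run above
rewards = [0.0]
battery_levels = [100.0]

print(f"Reward over {len(rewards)} episodes: {summarize(rewards)}")
print(f"Custom metric 'battery_level': {summarize(battery_levels)}")
```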
While the environment is running (in headless mode if you prefer), you can train a model.
For example, with PPO:
```bash
pixi run train --env GreenhousePlain --config greenhouse_env_config.yaml --algorithm PPO --log
```

Or with DQN, which additionally requires the `--discrete-actions` flag.
```bash
pixi run train --env GreenhousePlain --config greenhouse_env_config.yaml --algorithm DQN --discrete-actions --log
```

At the end of training, the model name and path will be printed in the terminal:
```
New best mean reward!
100% ━━━━━━━━━━━━━━━━━━━━━━━━━ 100/100 [ 0:00:35 < 0:00:00 , 2 it/s ]
Saved model to GreenhousePlain_PPO_<timestamp>.pt
```
Remember this path, as you will need it later.
To monitor training progress, you can launch TensorBoard.

```bash
pixi run tensorboard
```

It should contain one entry named after your recent training run (e.g. `GreenhousePlain_PPO_<timestamp>`).
To run an evaluation, execute the following command.
```bash
pixi run eval --config greenhouse_env_config.yaml --model GreenhousePlain_PPO_<timestamp>.pt --num-episodes 3 --realtime
```

Or, to run more episodes as quickly as possible, launch your world with `--headless` and then execute:
```bash
pixi run eval --config greenhouse_env_config.yaml --model GreenhousePlain_PPO_<timestamp>.pt --num-episodes 20
```

You can also see your trained policy in action as a ROS node.
```bash
pixi run policy_node --config greenhouse_env_config.yaml --model GreenhousePlain_PPO_<timestamp>.pt
```

Then, in a separate terminal, you can send a goal.
```bash
pixi shell
ros2 action send_goal /execute_policy rl_interfaces/ExecutePolicy {}
```

Of course, you can also use this same action interface in your own user code!
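For user code, a minimal `rclpy` action client could look like the sketch below. This is an illustration, not code from the repository: the import path `rl_interfaces.action.ExecutePolicy` is assumed from the action type in the CLI example, and the client must run in an environment where ROS 2 and `rl_interfaces` are available (e.g. inside `pixi shell` with the policy node running).

```python
# Hypothetical rclpy action client for /execute_policy (illustration only).
import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node

from rl_interfaces.action import ExecutePolicy  # assumed module path


class PolicyClient(Node):
    def __init__(self):
        super().__init__("policy_client")
        # Same action name and type as the CLI example above
        self._client = ActionClient(self, ExecutePolicy, "/execute_policy")

    def send_goal(self):
        self._client.wait_for_server()
        goal = ExecutePolicy.Goal()  # empty goal, mirroring the `{}` in the CLI call
        return self._client.send_goal_async(goal)


def main():
    rclpy.init()
    node = PolicyClient()
    future = node.send_goal()
    rclpy.spin_until_future_complete(node, future)
    accepted = future.result().accepted
    node.get_logger().info("Goal accepted" if accepted else "Goal rejected")
    node.destroy_node()
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```

`send_goal_async` returns a future that resolves to a goal handle; a real client would also await the result future from that handle to learn the policy execution outcome.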