Where Did I Leave My Glasses? Open-Vocabulary Semantic Exploration in Real-World Semi-Static Environments
Benjamin Bogenberger1, Oliver Harrison1, Orrin Dahanaggamaarachchi1, Lukas Brunke1,2,3, Jingxing Qian2,3, Siqi Zhou1,4, Angela P. Schoellig1,2,3
1Technical University of Munich, 2University of Toronto, 3Vector Institute, 4Simon Fraser University
Figure: Overview of the perception pipeline, comprising the blocks of Sec. IV-A (green), Sec. IV-B (red), and Sec. IV-C (orange). Input data are posed RGB-D frames.
Abstract: Robots deployed in real-world environments, such as homes, must not only navigate safely but also understand their surroundings and adapt to changes in the environment. To perform tasks efficiently, they must build and maintain a semantic map that accurately reflects the current state of the environment. Existing research on semantic exploration largely focuses on static scenes without persistent object-level instance tracking. In this work, we propose an open-vocabulary semantic exploration system for semi-static environments. Our system maintains a consistent map by building a probabilistic model of object instance stationarity, systematically tracking semi-static changes, and actively exploring areas that have not been visited for an extended period. In addition to active map maintenance, our approach leverages the map's semantic richness with large language model (LLM)-based reasoning for open-vocabulary object-goal navigation. This enables the robot to search more efficiently by prioritizing contextually relevant areas. We compare our approach against state-of-the-art baselines using publicly available object navigation and mapping datasets, and we further demonstrate real-world transferability in three real-world environments. Our approach outperforms the compared baselines in both success rate and search efficiency for object-navigation tasks and handles changes more reliably when mapping semi-static environments. In real-world experiments, our system detects 95% of map changes on average, improving efficiency by more than 29% compared to random and patrol strategies.
Install

- Requires pixi.
- In the repository root directory, run the following (installs and activates the pixi environment and builds the `perceive_semantix_lib` package):

  ```shell
  pixi shell
  ```
- You can choose whether to process data stored in the "raw" format or to work with input/output streams from ROS2.
- To get started on adapting this library to your own application, check the "raw" data interface; it is essentially a wrapper around:

  ```python
  input = InputDataStamped(
      time_sec=time_sec,
      data=InputData(
          camera_intrinsics=camera_intrinsics,
          color=color_img,
          depth=depth_img,
          pose=camera_pose,
      ),
  )
  scene.step(input)
  ```
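To make the data flow concrete, here is a minimal, self-contained sketch of that loop with stand-in dataclasses and a toy `Scene`. The names mirror the snippet above, but these are illustrative stand-ins, not the library's actual definitions.

```python
from dataclasses import dataclass
from typing import Any

# Stand-in types; the real ones are provided by perceive_semantix_lib.
@dataclass
class InputData:
    camera_intrinsics: Any  # 3x3 camera intrinsics
    color: Any              # HxWx3 RGB image
    depth: Any              # HxW depth image
    pose: Any               # camera pose (e.g., a 4x4 world-from-camera transform)

@dataclass
class InputDataStamped:
    time_sec: float
    data: InputData

class Scene:
    """Toy stand-in that only counts processed frames."""
    def __init__(self) -> None:
        self.num_frames = 0

    def step(self, inp: InputDataStamped) -> None:
        self.num_frames += 1  # the real scene updates the semantic map here

# Feed a stream of posed RGB-D frames into the scene, one step per frame.
scene = Scene()
for time_sec in (0.0, 0.1, 0.2):
    frame = InputDataStamped(
        time_sec=time_sec,
        data=InputData(camera_intrinsics=None, color=None, depth=None, pose=None),
    )
    scene.step(frame)
```

Any data source that can be turned into this per-frame structure (a dataset on disk, a ROS topic, a live camera driver) can drive the library the same way.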
- Unzip the example data:

  ```shell
  unzip $PIXI_PROJECT_ROOT/example_data/input_streams/raw/ball_reidentification_experiment.zip -d $PIXI_PROJECT_ROOT/example_data/input_streams/raw/
  ```

- Run the example (this creates some cache folders, including downloaded model weights if not already present, and a logging directory):

  ```shell
  python $PIXI_PROJECT_ROOT/interfaces/disk_io/main.py $PIXI_PROJECT_ROOT/example_data/input_streams/raw/ball_reidentification_experiment -v
  ```
- Download the example ROS bag from https://drive.google.com/file/d/1UydbDrrtkGNGaZbzJEqIdlpPAD8VTFEv/view?usp=drive_link and unzip it.
- Activate the pixi environment:

  ```shell
  pixi shell
  ```

- Build the package:

  ```shell
  colcon build --cmake-args -DPython_EXECUTABLE=$(which python)
  ```

- Source the package:

  ```shell
  source install/setup.bash
  ```

- Run the node (this creates some cache folders, including downloaded model weights if not already present, and a logging directory):

  ```shell
  ros2 run perceive_semantix_ros2 perceive_semantix_node --ros-args -p image_rotations_clockwise:=-1 -p store_output:=False -p initial_scene_path:=$PIXI_PROJECT_ROOT/example_data/premapped_scenes/scene_office_legacy.pkl
  ```

  Explanation of arguments:
  - `image_rotations_clockwise:=-1`: accounts for the mounting orientation of the camera; the object recognition networks work best with normally oriented images.
  - `store_output:=False`: do not store the mapping output.
  - `initial_scene_path:=...`: path to a previously generated map to use for initialization.
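For intuition, `image_rotations_clockwise:=-1` means each incoming image is rotated by minus one clockwise quarter-turn (i.e., once counter-clockwise) before recognition. The following is a hypothetical minimal sketch of that convention on a list-of-rows image, not the node's actual implementation:

```python
def rotate_clockwise(img, k):
    """Rotate a 2D image (list of rows) by k quarter-turns clockwise.

    Negative k rotates counter-clockwise, matching image_rotations_clockwise:=-1.
    """
    for _ in range(k % 4):
        # One clockwise quarter-turn: reverse the row order, then transpose.
        img = [list(row) for row in zip(*img[::-1])]
    return img

frame = [[1, 2],
         [3, 4]]
upright = rotate_clockwise(frame, -1)  # one counter-clockwise quarter-turn
```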
- Play the ROS bag:

  ```shell
  ros2 bag play <path_to_your_unzipped_rosbag>
  ```
The code is separated into interfaces (`./interfaces`, e.g., the ROS interface) and the core library (`./perceive_semantix_lib`). An outline of the core library is given in its README.
- The project uses the Ruff Python linter and code formatter, and uses typeguard together with jaxtyping (for arrays) for runtime type-checking. Both are installed and enabled in the dev environment:

  ```shell
  pixi shell -e dev
  ```

- Inside the dev environment, run:

  ```shell
  ruff check
  ruff format
  ```

- Ruff extensions are also available for code editors, e.g., Ruff for VS Code.
If you find this work useful, please consider citing our paper:
```bibtex
@ARTICLE{semi-static-semantic-exploration,
  author={Bogenberger, Benjamin and Harrison, Oliver and Dahanaggamaarachchi, Orrin and Brunke, Lukas and Qian, Jingxing and Zhou, Siqi and Schoellig, Angela P.},
  journal={IEEE Robotics and Automation Letters},
  title={Where Did I Leave My Glasses? Open-Vocabulary Semantic Exploration in Real-World Semi-Static Environments},
  year={2026},
  doi={10.1109/LRA.2026.3656790}
}
```