nasa-jpl/nebula2-wildos

WildOS: Open-Vocabulary Object Search in the Wild

Hardik Shah1,2, Erica Tevere1, Deegan Atha1, Marcel Kaufmann1,
Shehryar Khattak1, Manthan Patel2, Marco Hutter2, Jonas Frey2,3,4, Patrick Spieler1

1Jet Propulsion Laboratory (JPL), NASA    2Robotics Systems Lab, ETH Zurich

3Stanford University    4University of California, Berkeley


WildOS Teaser

πŸ“„ Abstract

Autonomous navigation in complex, unstructured outdoor environments requires robots to operate over long ranges without prior maps and with limited depth sensing. In such settings, relying solely on geometric frontiers for exploration is often insufficient; the ability to reason semantically about where to go and what is safe to traverse is crucial for robust, efficient exploration.

This work presents WildOS, a unified system for long-range, open-vocabulary object search that combines safe geometric exploration with semantic visual reasoning. WildOS builds a sparse navigation graph to maintain spatial memory, while utilizing a foundation-model-based vision module, ExploRFM, to score frontier nodes of the graph. ExploRFM simultaneously predicts traversability, visual frontiers, and object similarity in image space, enabling real-time, onboard semantic navigation tasks. The resulting vision-scored graph enables the robot to explore semantically meaningful directions while ensuring geometric safety.

Furthermore, we introduce a particle-filter-based method for coarse localization of the open-vocabulary target query that estimates candidate goal positions beyond the robot's immediate depth horizon, enabling effective planning toward distant goals. Extensive closed-loop field experiments across diverse off-road and urban terrains demonstrate that WildOS enables robust navigation, significantly outperforming purely geometric and purely vision-based baselines in both efficiency and autonomy.
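To make the coarse-localization idea above concrete, here is a minimal bearing-only particle filter that estimates a distant goal position from noisy bearing observations taken at several robot poses. This is an illustrative toy sketch, not the released triangulation3d implementation; all function names and parameters are hypothetical.

```python
# Toy sketch of particle-filter goal localization (NOT the triangulation3d code):
# 2D goal hypotheses are reweighted by how well each explains a bearing
# observation, then resampled. Names and parameters are hypothetical.
import numpy as np

def update_particles(particles, weights, robot_xy, bearing, sigma=0.1):
    """Reweight 2D goal hypotheses by agreement with one bearing observation."""
    dxy = particles - robot_xy                    # vectors robot -> hypothesis
    predicted = np.arctan2(dxy[:, 1], dxy[:, 0])  # bearing each particle implies
    err = (predicted - bearing + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    weights = weights * np.exp(-0.5 * (err / sigma) ** 2)
    return weights / weights.sum()

def resample(particles, weights, rng):
    """Multinomial resampling back to uniform weights."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

rng = np.random.default_rng(0)
true_goal = np.array([40.0, 25.0])
particles = rng.uniform([0, 0], [60, 60], size=(500, 2))  # coarse uniform prior
weights = np.full(500, 1.0 / 500)

# Observe the goal's (noisy) bearing from three well-separated robot poses.
for pose in [np.array([0.0, 0.0]), np.array([10.0, -5.0]), np.array([-5.0, 10.0])]:
    d = true_goal - pose
    bearing = np.arctan2(d[1], d[0]) + rng.normal(0, 0.05)
    weights = update_particles(particles, weights, pose, bearing)
    particles, weights = resample(particles, weights, rng)

estimate = particles.mean(axis=0)
print(estimate)  # should land near the true goal (40, 25)
```

Because bearings constrain only direction, a single observation leaves the goal anywhere along a ray; the hypotheses collapse toward a point only once observations from well-separated poses intersect, which is why the method can place candidate goals beyond the depth horizon.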


πŸ“ Repository Structure

wildos/
β”œβ”€β”€ nvidia_radio/          # Modified RADIO backbone with NACLIP + SigLIP2 alignment
β”œβ”€β”€ explorfm/              # ExploRFM model (inference): frontiers, traversability, object similarity
β”œβ”€β”€ explorfm_trainer/      # Training pipeline for ExploRFM heads (Lightning + Hydra)
β”œβ”€β”€ visual_navigation/     # ROS 2 navigation: WildOS, baselines (LRN, ImgFrontierNav)
β”œβ”€β”€ triangulation3d/       # Particle-filter-based 3D object triangulation
β”œβ”€β”€ graphnav_planner/      # Graph-based path planner (C++)
β”œβ”€β”€ graphnav_msgs/         # ROS 2 message definitions for navigation graph
β”œβ”€β”€ object_search_msgs/    # ROS 2 message definitions for object search
β”œβ”€β”€ gps_visualization/     # GPS path visualization (ROS 2 C++)
└── ckpts/                 # Model checkpoints

Each package has its own README with additional details. See the Component Overview section below.


βš™οΈ Installation

Prerequisites

  • ROS 2 Jazzy (tested)
  • Python >= 3.10
  • uv — Python package manager
  • CUDA-capable GPU (ExploRFM was trained on an NVIDIA GeForce RTX 4090 and deployed on an NVIDIA Jetson AGX Orin)

Install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

1. Create a Virtual Environment

uv venv wildos_venv
source wildos_venv/bin/activate

2. Install Python Dependencies

uv pip install -r requirements.txt

3. Install Local Packages

uv pip install -e ./nvidia_radio
uv pip install -e ./explorfm

4. Install HuggingFace CLI (for downloading checkpoints)

uv tool install "huggingface_hub[cli]"

5. Build ROS 2 Packages

# From your colcon workspace (with this repo cloned/symlinked into src/)
colcon build --packages-select graphnav_msgs object_search_msgs gps_visualization graphnav_planner triangulation3d visual_navigation
source install/setup.bash

Note: WildOS was deployed inside a Docker container during field experiments. The dependencies above can be replicated in a virtual environment for development.


πŸ’Ύ Checkpoints

Pre-trained head checkpoints are included in ckpts/:

| Checkpoint | Description |
| --- | --- |
| ckpts/frontier_head.ckpt | Visual frontier prediction head |
| ckpts/trav_head.ckpt | Traversability prediction head |

Download Backbone & Adaptor Weights

  1. C-RADIOv3-B backbone β€” download to ckpts/:

    # Download from: https://huggingface.co/nvidia/C-RADIOv3-B/blob/main/c-radio_v3-b_half.pth.tar
    wget -P ckpts/ https://huggingface.co/nvidia/C-RADIOv3-B/resolve/main/c-radio_v3-b_half.pth.tar
  2. SigLIP2 adaptor β€” download to ckpts/siglip2/:

    huggingface-cli download google/siglip2-so400m-patch16-naflex --cache-dir ckpts/siglip2

Path configuration: All nodes in visual_navigation expect the ckpts/ folder at ~/ckpts (i.e., Path.home() / "ckpts").
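Before launching, it can be worth checking that this layout is in place. The helper below is a hypothetical convenience, not part of the repo; the expected file names are taken from the Checkpoints section above.

```python
# Sanity-check the checkpoint layout the visual_navigation nodes expect
# (~/ckpts, per the note above). This helper is illustrative and not part
# of the repository; file names come from this README's Checkpoints section.
from pathlib import Path

def missing_checkpoints(ckpt_dir: Path) -> list[str]:
    """Return the expected checkpoint files absent from ckpt_dir."""
    expected = [
        "frontier_head.ckpt",
        "trav_head.ckpt",
        "c-radio_v3-b_half.pth.tar",
    ]
    return [name for name in expected if not (ckpt_dir / name).is_file()]

if __name__ == "__main__":
    ckpt_dir = Path.home() / "ckpts"
    missing = missing_checkpoints(ckpt_dir)
    if missing:
        print(f"Missing from {ckpt_dir}: {missing}")
    else:
        print(f"All expected checkpoints found in {ckpt_dir}")
```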

Verify Installation

python explorfm/explorfm_model.py

Expected output:

[INFO] Loading SigLIP2 model and processor for version: google/siglip2-so400m-patch16-naflex
[INFO] Using checkpoint path: ckpts/siglip2
Loaded traversability head from ckpts/trav_head.ckpt
Loaded frontier head from ckpts/frontier_head.ckpt
Traversability shape: torch.Size([1, 1, 720, 1280])
Frontiers shape: torch.Size([1, 1, 720, 1280])
Adaptor features shape: torch.Size([1, 1152, 22, 40])

πŸš€ Quick Start: Deployment

Launch WildOS (Full Pipeline)

# Launch WildOS with open-vocabulary object search
ros2 launch visual_navigation wildos_launch.py ns:=spot1 do_object_search:=true

# Launch the graph planner
ros2 launch graphnav_planner graphnav_planner.launch.yml ns:=spot1

Launch Baselines

# Image Frontier Navigation baseline
ros2 launch visual_navigation imgfrontier_nav_launch.py ns:=spot1 do_object_search:=true

# LRN baseline
ros2 launch visual_navigation lrn_launch.py ns:=spot1 do_object_search:=false

Standalone Tools

# Standalone ExploRFM triangulation (for testing, with teleoperation)
ros2 launch visual_navigation explorfm_triangulation_launch.py robot_namespace:=spot1

# Visualize ExploRFM outputs (debugging)
ros2 run visual_navigation viz_net

All experiment videos are available on YouTube.

Required External Components

The following packages must be running alongside WildOS:

  • Elevation Mapping CuPy — GPU-based local 2.5D mapping
  • DLIO — LiDAR-inertial odometry
  • Nav2 — local planning and control
  • Graph Construction — code will be released in a future update

🧩 Component Overview

| Package | Description | Details |
| --- | --- | --- |
| nvidia_radio/ | Modified RADIO backbone with NACLIP + SigLIP2 language alignment | README |
| explorfm/ | ExploRFM model — predicts traversability, visual frontiers, and object similarity | README |
| explorfm_trainer/ | Lightning + Hydra training pipeline for ExploRFM heads | README |
| visual_navigation/ | ROS 2 navigation: WildOS pipeline, baselines (LRN, ImgFrontierNav), scoring, triangulation | README |
| triangulation3d/ | Particle-filter-based 3D object triangulation | README |
| graphnav_planner/ | C++ graph-based path planner | — |
| graphnav_msgs/ | ROS 2 message definitions for navigation graph | — |
| object_search_msgs/ | ROS 2 message definitions for object search | — |
| gps_visualization/ | GPS path visualization (ROS 2 C++) | — |

πŸ“ Citation

If you find this work useful, please cite:

@misc{shah2026wildosopenvocabularyobjectsearch,
      title={WildOS: Open-Vocabulary Object Search in the Wild}, 
      author={Hardik Shah and Erica Tevere and Deegan Atha and Marcel Kaufmann and Shehryar Khattak and Manthan Patel and Marco Hutter and Jonas Frey and Patrick Spieler},
      year={2026},
      eprint={2602.19308},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2602.19308}, 
}

πŸ™ Acknowledgements

We thank the authors of the open-source works that WildOS builds on for making their code available. We also thank the authors of LRN for sharing their code, which was helpful in setting up the baseline.


πŸ“œ License

This project is released under the Apache 2.0 License.
