
WoMAP: World Models For Embodied
Open-Vocabulary Object Localization

Tenny Yin*, Zhiting Mei, Tao Sun, Lihan Zha, Emily Zhou+, Jeremy Bao+, Miyu Yamane+, Ola Shorinwa*, Anirudha Majumdar

*Equal Contribution. +Authors contributed equally.

(Figure: WoMAP overview)

We introduce World Models for Active Perception (WoMAP): a recipe for training open-vocabulary object localization policies that: (i) uses a Gaussian Splatting-based real-to-sim-to-real pipeline for scalable data generation without the need for expert demonstrations, (ii) distills dense reward signals from open-vocabulary object detectors, and (iii) leverages a latent world model for dynamics and rewards prediction to ground high-level action proposals at inference time.
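To make point (iii) concrete, here is an illustrative sketch (the function and variable names are assumptions, not the repository's API) of how a latent world model can ground action proposals at inference time: each candidate action sequence is rolled out in latent space, the predicted rewards are accumulated, and the best-scoring proposal is selected.

```python
# Illustrative sketch of inference-time grounding with a latent world model.
# `dynamics` predicts the next latent state; `reward` scores a latent state.
# These are stand-ins, not the repository's actual modules.

def rollout_reward(dynamics, reward, z0, actions):
    """Accumulate predicted reward along an imagined latent rollout."""
    z, total = z0, 0.0
    for a in actions:
        z = dynamics(z, a)   # imagined next latent state
        total += reward(z)   # predicted localization reward
    return total

def ground_proposals(dynamics, reward, z0, proposals):
    """Select the action proposal with the highest imagined return."""
    return max(
        proposals,
        key=lambda acts: rollout_reward(dynamics, reward, z0, acts),
    )
```

In this toy form, high-level action proposals (e.g., from a planner or VLM) are never executed blindly; the world model's reward predictions arbitrate between them first.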

Getting Started

  1. Installation
  2. Training the World Model
  3. Running Experiments

Installation

  1. Clone this repo:
git clone https://github.com/irom-princeton/womap.git
  2. Install womap as a Python package:
python -m pip install -e .

Training the World Model

Train (with GSplat):

DINO, CLIP, or ViT Encoder with Frozen Weights (with Dynamics and Rewards Predictors)
python main.py --fname configs/gsplat/cfg_target_seq2.yaml \
--projname <project name, e.g., 0211-test-encoder> \
--expname <experiment name, e.g., test> \
--encoder <dino, clip, vit> --frozen
DINO, CLIP, or ViT Encoder with Frozen Weights (without Rewards Predictor)
python main.py --fname configs/gsplat/cfg_target_seq2.yaml \
--projname <project name, e.g., 0211-test-encoder> \
--expname <experiment name, e.g., test> \
--encoder <dino, clip, vit> --frozen \
--ablate_rewards
DINO, CLIP, or ViT Encoder with Frozen Weights (without Dynamics Predictor)
python main.py --fname configs/gsplat/cfg_target_seq2.yaml \
--projname <project name, e.g., 0211-test-encoder> \
--expname <experiment name, e.g., test> \
--encoder <dino, clip, vit> --frozen \
--ablate_dynamics
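The three invocations above differ only in their flags. As a hedged sketch (the repository's actual argument handling in main.py may differ), the flag set could be parsed with argparse like this:

```python
# Hypothetical sketch of how main.py's command-line flags could be parsed.
# The option names mirror the commands above; everything else is an assumption.
import argparse

def build_parser():
    p = argparse.ArgumentParser(description="Train the WoMAP world model")
    p.add_argument("--fname", required=True, help="path to the YAML config")
    p.add_argument("--projname", required=True, help="project name")
    p.add_argument("--expname", required=True, help="experiment name")
    p.add_argument("--encoder", choices=["dino", "clip", "vit"], default="dino",
                   help="visual encoder backbone")
    p.add_argument("--frozen", action="store_true",
                   help="freeze the encoder's weights")
    p.add_argument("--ablate_rewards", action="store_true",
                   help="train without the rewards predictor")
    p.add_argument("--ablate_dynamics", action="store_true",
                   help="train without the dynamics predictor")
    return p
```

The boolean `store_true` flags mean that omitting `--ablate_rewards` and `--ablate_dynamics` trains the full model with both predictors.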

You can find bash scripts for ablating the dynamics and rewards predictors and for training the entire model in:

bash_scripts/ablation_dynamics.bash
bash_scripts/ablation_rewards.bash
bash_scripts/train_model.bash

respectively, which you can run in the terminal, e.g., via:

bash bash_scripts/ablation_dynamics.bash

To finetune the encoder's weights, remove the --frozen flag.
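In PyTorch terms, freezing typically means disabling gradients on the encoder so only the dynamics and rewards heads are updated. The sketch below is illustrative (the module names are stand-ins, not the repository's code):

```python
# Illustrative PyTorch-style sketch of what --frozen implies: the encoder's
# parameters stop receiving gradients, while other modules remain trainable.
import torch.nn as nn

def freeze(module: nn.Module) -> nn.Module:
    """Disable gradient updates for every parameter in `module`."""
    for p in module.parameters():
        p.requires_grad = False
    return module

# Tiny stand-in for a DINO/CLIP/ViT encoder.
encoder = freeze(nn.Linear(8, 4))
trainable = [p for p in encoder.parameters() if p.requires_grad]
```

Only parameters with `requires_grad=True` would be handed to the optimizer, so a frozen encoder contributes features but no gradient updates.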

Templates for running jobs via SLURM are available in the slurm_scripts directory. Please update the following fields in the shell scripts:

#SBATCH --mail-user=<princeton Net ID>@princeton.edu
#

# load modules or conda environments here
source ~/.bashrc

# activate virtual environment
micromamba activate <path to micromamba environment>
# or, for conda:
# conda activate <path to conda environment>

# run
cd <path to the project folder>
bash bash_scripts/ablation_rewards.bash

For example, to run an ablation experiment on the rewards predictor, run the following command:

sbatch slurm_scripts/slurm_ablation_rewards.sh

To run an ablation experiment on the dynamics predictor, run the following command:

sbatch slurm_scripts/slurm_ablation_dynamics.sh

To train the dynamics and rewards predictors, run the following command:

sbatch slurm_scripts/slurm_train_model.sh

The output files can be found in slurm_outputs.

Experiments

Generate experiment configurations

Generate experiment config files under experiment_configs/, specifying the model and the experiments to run.
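The actual config schema is not documented here, but a generator for such files might look like the following sketch, which writes one YAML file per (model, experiment) pair; all field names are assumptions:

```python
# Hypothetical generator for experiment config files. The real schema used
# under experiment_configs/ may differ; the fields below are placeholders.
import itertools
import pathlib

def generate_configs(models, experiments, out_dir):
    """Write one minimal YAML config per (model, experiment) combination."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for model, exp in itertools.product(models, experiments):
        path = out / f"{model}_{exp}.yaml"
        path.write_text(f"model: {model}\nexperiment: {exp}\n")
        paths.append(path)
    return paths
```

Enumerating the full model-by-experiment grid up front keeps the later submission step trivially parallel: each file corresponds to exactly one job.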

Run experiments

Specify relevant arguments in the bash file. The script will submit one job per experiment configuration (model + experiment):

bash bash_scripts/submit_experiments.bash 
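The "one job per configuration" pattern can be sketched as a dry-run shell loop (this mirrors the described behavior of submit_experiments.bash but is not its actual contents; `echo sbatch` stands in for a real submission):

```shell
# Hypothetical dry-run sketch: submit one job per experiment config file.
# Stand-in config files are created in a temp directory so this is self-contained.
set -eu
cfg_dir=$(mktemp -d)
touch "$cfg_dir/dino_target.yaml" "$cfg_dir/clip_target.yaml"   # stand-in configs
for cfg in "$cfg_dir"/*.yaml; do
    echo sbatch --export=CONFIG="$cfg" run_experiment.sh         # dry run only
done
```

Dropping the `echo` would turn the dry run into real submissions, one SLURM job per model + experiment pair.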

Visualize Results

Use the interactive script extract_result.py to visualize and summarize the results.
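The interface of extract_result.py is not documented here, but in spirit such a script aggregates per-episode result files into summary metrics. A minimal sketch, assuming JSON files with a boolean `success` field (a made-up format for illustration):

```python
# Hypothetical result aggregation in the spirit of extract_result.py.
# The real script's input format and options may differ.
import json
import pathlib

def success_rate(results_dir):
    """Average the boolean `success` field across per-episode JSON files."""
    files = sorted(pathlib.Path(results_dir).glob("*.json"))
    if not files:
        return 0.0
    outcomes = [json.loads(f.read_text())["success"] for f in files]
    return sum(outcomes) / len(outcomes)
```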
