Team 11 : JeongWoo Park (20243347), Sojeong Rhee (20243606)
Title : Test time adaptation in Offline RL
This repository is forked from Offbench
The project is based on OPEX, introduced in Park et al., "Is Value Learning Really the Main Bottleneck in Offline RL?" (NeurIPS 2024 Workshop).
Since the paper reports only single-step IQL results and gives no implementation details, we applied multi-step OPEX and normalization with gradient norm settings.
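The idea can be sketched as follows. This is a minimal illustration, not the code in this repository: it assumes that OPEX nudges the policy's action along the action-gradient of the learned Q-function at evaluation time, that "multi-step" means repeating this update several times, and that the gradient is rescaled to unit L2 norm before each step. The names `opex_adjust`, `grad_q`, `beta`, and `num_steps`, the toy critic, and the action bounds of [-1, 1] are all hypothetical.

```python
import math


def opex_adjust(action, grad_q, beta=0.1, num_steps=3, eps=1e-8, low=-1.0, high=1.0):
    """Hypothetical multi-step OPEX update on a single action.

    action: list of floats proposed by the policy.
    grad_q: callable returning dQ/da at the given action (assumed available).
    beta:   step size per update (assumed hyperparameter).
    """
    a = list(action)
    for _ in range(num_steps):
        g = grad_q(a)
        # Rescale the gradient to unit L2 norm so beta alone sets the step length.
        norm = math.sqrt(sum(x * x for x in g))
        # Gradient ascent on Q, then clip to the assumed action bounds.
        a = [min(high, max(low, ai + beta * gi / (norm + eps))) for ai, gi in zip(a, g)]
    return a


# Toy critic for illustration only: Q(a) = -||a - a_star||^2.
a_star = [0.5, -0.2]


def toy_grad_q(a):
    return [-2.0 * (ai - si) for ai, si in zip(a, a_star)]


adjusted = opex_adjust([0.0, 0.0], toy_grad_q, beta=0.1, num_steps=3)
```

With `num_steps=1` and no normalization this would reduce to a single gradient step on the action; the normalization makes the step length independent of the critic's scale, which is one plausible reason to use it.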
To set up the environment, we recommend using Docker. Run:
./docker_run.sh
Inside the Docker container, run:
./run.sh
You can edit run.sh to select specific environments and algorithms.
This codebase can also log to the W&B online visualization platform. To log to W&B, first set your W&B API key environment variable, then add --logging.online when launching the script.
Alternatively, you can simply run wandb login.