🪶iterated Shared Soft Actor-Critic [ICLR 26], a new algorithm improving the sample-efficiency of target-free algorithms (e.g. SimbaV2) to bridge the gap with target-based algorithms🪶

theovincent/iS-SAC

Official implementation of iterated Shared Soft Actor-Critic (iS-SAC) in JAX

User installation

We recommend using Python 3.11.5. From the root of the repository, create a Python virtual environment, activate it, upgrade pip, and install the package with its dependencies in editable mode:

python3 -m venv env
source env/bin/activate
pip install --upgrade pip setuptools wheel
pip install -e ".[dev,gpu]"
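
After installation, a quick sanity check of the environment can save debugging time later. The snippet below is a minimal sketch and not part of the repository: the helper names (`in_virtualenv`, `python_matches`, `jax_devices`) are hypothetical, and the version check assumes that any Python 3.11 patch release is acceptable. It reports whether a virtual environment is active, whether the interpreter matches the recommended minor version, and which devices JAX can see.

```python
import sys

# Assumption: the README recommends Python 3.11.5; only major.minor is compared here.
RECOMMENDED = (3, 11)


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Return True when running inside a venv (prefix differs from the base prefix)."""
    return prefix != base_prefix


def python_matches(version=sys.version_info, recommended=RECOMMENDED) -> bool:
    """Return True when the interpreter's major.minor equals the recommended one."""
    return tuple(version[:2]) == recommended


def jax_devices():
    """Return device names visible to JAX, None if JAX is missing, [] if backend init fails."""
    try:
        import jax
    except ImportError:
        return None
    try:
        return [str(d) for d in jax.devices()]
    except Exception:
        # Backend initialisation failed (e.g. a broken accelerator setup).
        return []


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("recommended Python version:", python_matches())
    print("JAX devices:", jax_devices())
```

If the last line lists only CPU devices on a machine with a GPU, the accelerator-enabled JAX build was probably not installed correctly.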

Running experiments

The script launch_job/dmc/launch.sh trains an iS-SAC (K=9) agent with the SimbaV2 architecture and BatchNorm on a local machine, on the DMC task dog-walk.
