Reference implementation of Distributions-as-Actions Actor-Critic (DA-AC) for continuous and discrete control settings. See Distributions as Actions: A Unified Framework for Diverse Action Spaces, accepted at ICLR 2026.
Note: This repository includes only continuous and discrete control code. Hybrid-control code is available here.
This repository includes DA-AC variants and standard baselines.
- Distributions-as-Actions Actor-Critic (DA-AC)
  - MuJoCo / DeepMind Control: da_ac_continuous_mjc_dmc.py
  - Discretized MuJoCo / DeepMind Control: da_ac_discrete_mjc_dmc.py
  - OpenAI Gym: da_ac_discrete_gym.py
  - MinAtar: da_ac_discrete_minatar.py
- Actor-critic variants with different gradient estimators
  - Reparameterization (RP): rp_ac_continuous_mjc_dmc.py
  - Likelihood-ratio (LR), learned baseline: lr_ac_discrete_mjc_dmc.py
  - Likelihood-ratio (LR), analytical baseline: lr_ac_discrete_gym.py, lr_ac_discrete_minatar.py
  - Straight-through (ST): st_ac_discrete_mjc_dmc.py, st_ac_discrete_gym.py, st_ac_discrete_minatar.py
  - Expected policy gradient: eac_discrete_gym.py, eac_discrete_minatar.py
- Baselines
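To illustrate how two of the gradient estimators listed above relate, here is a minimal, standard-library-only sketch for a softmax policy over three discrete actions. It is not taken from the repository scripts: the function names, logits, and action values are made up for illustration, and the action values are assumed to be known exactly. The expected policy gradient sums over all actions, while the likelihood-ratio (LR) estimator averages sampled terms and should approach the same vector as the sample count grows.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_gradient(logits, q_values):
    """Exact gradient of sum_a pi(a) * q(a) with respect to the logits."""
    pi = softmax(logits)
    v = sum(p * q for p, q in zip(pi, q_values))
    # d/d logit_i of sum_a pi(a) q(a) equals pi(i) * (q(i) - v)
    return [p * (q - v) for p, q in zip(pi, q_values)]


def lr_gradient(logits, q_values, num_samples, rng, baseline=0.0):
    """Monte Carlo likelihood-ratio estimate of the same gradient."""
    pi = softmax(logits)
    n = len(logits)
    grad = [0.0] * n
    for _ in range(num_samples):
        a = rng.choices(range(n), weights=pi)[0]
        adv = q_values[a] - baseline
        for i in range(n):
            # d log pi(a) / d logit_i = 1[i == a] - pi(i)
            grad[i] += adv * ((1.0 if i == a else 0.0) - pi[i])
    return [g / num_samples for g in grad]


# Hypothetical logits and action values, chosen only for this example.
logits = [0.5, -0.2, 0.1]
q_values = [1.0, 0.0, 2.0]

exact = expected_gradient(logits, q_values)
estimate = lr_gradient(logits, q_values, num_samples=20000, rng=random.Random(0))
```

Passing a nonzero `baseline` leaves the LR estimate unbiased and typically changes only its variance, which is the role the learned and analytical baselines play in the LR variants.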
- Python (see requirements/requirements-dm_control.txt for an example; typically >=3.8, <3.11)
- pip
Install only the dependency groups you need:
pip install -r requirements/requirements-dm_control.txt
pip install -r requirements/requirements-mujoco.txt
pip install -r requirements/requirements-gym.txt
pip install -r requirements/requirements-minatar.txt
Example (continuous control):
python da_ac/da_ac_continuous_mjc_dmc.py --env-id Hopper-v4 --seed 1 --total-timesteps 1000000
Most script-level hyperparameters are defined in each script’s Args dataclass (for example, da_ac_continuous_mjc_dmc.py and td3_continuous_mjc_dmc.py).
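As a rough sketch of how an Args dataclass can map onto command-line flags like the ones above: the field names below mirror flags shown in this README, but the defaults, the `parse_args` helper, and the use of `argparse` are assumptions made for illustration. The actual scripts may define more fields and parse them with a different library.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class Args:
    # Field names follow the README's flags; defaults are illustrative guesses.
    env_id: str = "Hopper-v4"
    seed: int = 1
    total_timesteps: int = 1_000_000
    track: bool = False
    wandb_project_name: str = "your_da_ac_project"
    wandb_entity: str = ""


def parse_args(argv=None):
    """Build one CLI flag per dataclass field (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    for f in fields(Args):
        flag = "--" + f.name.replace("_", "-")
        if isinstance(f.default, bool):
            # Boolean fields become on/off switches such as --track.
            parser.add_argument(flag, dest=f.name, action="store_true",
                                default=f.default)
        else:
            # Other fields are converted with the type of their default.
            parser.add_argument(flag, dest=f.name, type=type(f.default),
                                default=f.default)
    return Args(**vars(parser.parse_args(argv)))


args = parse_args(["--env-id", "Walker2d-v4", "--seed", "3", "--track"])
```

With this layout, changing a default in the dataclass changes the behavior of every launch that does not pass the corresponding flag, so editing Args is equivalent to editing the script's default hyperparameters.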
- Log in to W&B:
  wandb login
- Add --track when launching experiments:
  python da_ac/da_ac_continuous_mjc_dmc.py \
    --env-id Hopper-v4 \
    --track \
    --wandb-project-name your_da_ac_project \
    --wandb-entity your_wandb_username
TensorBoard logs are also saved locally under runs/.
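To repeat a tracked run over several seeds, the launch commands can be generated programmatically. The sketch below only builds and prints the commands. `build_command` is a hypothetical helper, the seed values are arbitrary, and it assumes the --seed and --track flags can be combined as shown.

```python
import shlex

SCRIPT = "da_ac/da_ac_continuous_mjc_dmc.py"


def build_command(env_id, seed, project=None):
    """Return the argument list for one run; add W&B flags if a project is given."""
    cmd = ["python", SCRIPT, "--env-id", env_id, "--seed", str(seed)]
    if project is not None:
        cmd += ["--track", "--wandb-project-name", project]
    return cmd


# Three example seeds; adjust to the number of runs you need.
commands = [
    build_command("Hopper-v4", seed, project="your_da_ac_project")
    for seed in (1, 2, 3)
]

for cmd in commands:
    # Print a shell-ready line; pass cmd to subprocess.run to execute instead.
    print(shlex.join(cmd))
```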
- da_ac/: core algorithm scripts and wrappers
- requirements/: dependency files by environment stack
- LICENSE: license terms
If you use this codebase, please cite:
@inproceedings{he2026distributions,
title={Distributions as Actions: A Unified Framework for Diverse Action Spaces},
author={Jiamin He and A. Rupam Mahmood and Martha White},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=4ol71wMPY8}
}
This project is licensed under the terms in LICENSE.