Code for *ALMA: Hierarchical Learning for Composite Multi-Agent Tasks* (Iqbal et al., NeurIPS 2022).

This code is built on the public code release for REFIL, which is in turn built on the PyMARL framework.
If you use this repo in your work, please consider citing the corresponding paper:

```
@inproceedings{iqbal2022alma,
  title={ALMA: Hierarchical Learning for Composite Multi-Agent Tasks},
  author={Shariq Iqbal and Robby Costales and Fei Sha},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022},
  url={https://openreview.net/forum?id=JUXn1vXcrLA}
}
```

Setup instructions:

- Install Docker
- Install NVIDIA Docker if you want to use a GPU (recommended)
- Build the Docker image:
  ```
  cd docker
  ./build.sh
  ```
- Set up StarCraft II. If it is already installed on your machine, just make sure `SC2PATH` is set correctly; otherwise run:
  ```
  ./install_sc2.sh
  ```
- Make sure `SC2PATH` is set to the installation directory (`3rdparty/StarCraftII`)
- Make sure `WANDB_API_KEY` is set if you want to use Weights and Biases
Use the following command to run:

```
./run.sh <GPU> python3.7 src/main.py \
    --config=<alg> --env-config=<env> --scenario=<scen>
```

with the bracketed parameters replaced as follows:

- `<GPU>`: The index of the GPU you would like to run this experiment on
- `<alg>`: The low-level learning algorithm (choices are `qmix_atten` or `refil`)
- `<env>`: The environment
  - `ff`: SaveTheCity environment
  - `sc2multiarmy`: StarCraft environment
- `<scen>`: Specifies the set of tasks in the environment (for StarCraft)
  - `6-8sz_maxsize4_maxarmies3_symmetric`: Stalkers and Zealots, Symmetric
  - `6-8sz_maxsize4_maxarmies3_unitdisadvantage`: Stalkers and Zealots, Disadvantage
  - `6-8MMM_maxsize4_maxarmies3_symmetric`: MMM, Symmetric
  - `6-8MMM_maxsize4_maxarmies3_unitdisadvantage`: MMM, Disadvantage
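As a concrete example (the GPU index and scenario below are illustrative choices, not requirements), the following launches REFIL on the StarCraft Stalkers-and-Zealots symmetric scenario:

```shell
# Illustrative run: GPU 0, refil low-level learner,
# StarCraft multi-army environment, symmetric SZ tasks
./run.sh 0 python3.7 src/main.py \
    --config=refil \
    --env-config=sc2multiarmy \
    --scenario=6-8sz_maxsize4_maxarmies3_symmetric
```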
Method-Specific parameters:

- ALMA: Use `--agent.subtask_cond='mask'` and `--hier_agent.task_allocation='aql'`
- ALMA (No Mask): Use `--agent.subtask_cond='full_obs'` and `--hier_agent.task_allocation='aql'`
- Heuristic Allocation: Use `--agent.subtask_cond='mask'` and `--hier_agent.task_allocation='heuristic'`
  - StarCraft (Dynamic): `--env_args.heuristic_style='attacking-type-unassigned-diff'`
  - StarCraft (Matching): `--env_args.heuristic_style='type-unassigned-diff'`
- COPA: Use `--hier_agent.copa=True`
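For example, to select ALMA as the method, append its two flags to the base command (shown here on the StarCraft environment; the GPU index and scenario are illustrative):

```shell
# ALMA = masked subtask conditioning + AQL task allocation
./run.sh 0 python3.7 src/main.py \
    --config=refil --env-config=sc2multiarmy \
    --scenario=6-8sz_maxsize4_maxarmies3_symmetric \
    --agent.subtask_cond='mask' \
    --hier_agent.task_allocation='aql'
```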
Environment-Specific hyperparameters:

- SaveTheCity
  - Use `--epsilon_anneal_time=2000000` for all methods
  - Use `--hier_agent.action_length=5` for hierarchical methods (allocation-based and COPA)
  - Use `--config=qmix_atten`
- StarCraft
  - Use `--hier_agent.action_length=3` for hierarchical methods (allocation-based and COPA)
  - Use `--config=refil`
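Combining the settings above, a sketch of a full ALMA run on SaveTheCity (GPU index chosen arbitrarily; no `--scenario` flag, since scenarios apply to StarCraft) would be:

```shell
# SaveTheCity uses qmix_atten plus the env-specific hyperparameters;
# ALMA adds masked subtask conditioning and AQL allocation
./run.sh 0 python3.7 src/main.py \
    --config=qmix_atten \
    --env-config=ff \
    --agent.subtask_cond='mask' \
    --hier_agent.task_allocation='aql' \
    --epsilon_anneal_time=2000000 \
    --hier_agent.action_length=5
```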
Miscellaneous parameters:

- Weights and Biases: To use, make a project named "task-allocation" in Weights and Biases and include the following parameters in your runs. Make sure `WANDB_API_KEY` is set.
  - `--use-wandb=True`: Enables W&B logging
  - `--wb-notes`: Notes associated with this experiment
  - `--wb-tags`: Specify a list of tags separated by spaces
  - `--wb-entity`: Specify the W&B user or group name
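For instance, to log a run to W&B (the notes, tags, and entity values below are placeholders; `--wb-tags` is assumed to take space-separated values, per the description above):

```shell
# Requires WANDB_API_KEY to be set and a W&B project named "task-allocation"
./run.sh 0 python3.7 src/main.py \
    --config=refil --env-config=sc2multiarmy \
    --scenario=6-8sz_maxsize4_maxarmies3_symmetric \
    --use-wandb=True \
    --wb-notes="baseline refil run" \
    --wb-tags starcraft baseline \
    --wb-entity=<your-wandb-entity>
```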