Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for Dynamic Controller Assignment in SD-IoV

This is the code for implementing the MADDPG algorithm for dynamic controller assignment in SD-IoV. It is based on the OpenAI MADDPG code: https://github.com/openai/maddpg.git. If you have some deployment problem, please refer to its tips.

Installation

Known dependencies: Python (3.7.3), tensorflow (1.12.0), numpy (1.14.5)

Command-line options

To train, cd into the experiments directory and run train_controller.py:

python3 train_controller.py

Environment options

--scenario: defines which environment "simple_controller".
--num-episodes total number of training episodes (default: 4000)
--Group-traffic number of days (default: "7')
--step-num: number of time step in one day (default: "48")
--l-type: algorithm used for the policies in the environment (default: "maddpg"; options: {"maddpg", "ddpg"})
--Q-type: Queue type (options: {"finite", "inf"})

Core training parameters

--lr: learning rate (default: 1e-2)
--gamma: discount factor (default: 0.95)
--batch-size: batch size (default: 1024)
--num-units: number of units in the MLP (default: 64)

Checkpointing

--exp-name: name of the experiment, used as the file name to save all results (default: test)
--save-dir: directory where intermediate training results and model will be saved (default: "/tmp/policy/")
--save-rate: model is saved every time this number of episodes has been completed (default: 1000)
--load-dir: directory where training state and model are loaded from (default: "/Restore/")

Evaluation

--restore: restores previous training state stored in load-dir (or in save-dir if no load-dir has been provided), and continues training (default: False)
--display: displays to the screen the trained policy stored in load-dir (or in save-dir if no load-dir has been provided), but does not continue training (default: False)
--benchmark: runs benchmarking evaluations on saved policy, saves results to benchmark-dir folder (default: False)
--benchmark-iters: number of iterations to run benchmarking for (default: 100000)
--benchmark-dir: directory where benchmarking data is saved (default: "./benchmark_files/")
--plots-dir: directory where training curves are saved (default: "./learning_curves/")
--data-dir: directory where data is saved
--vehicle-data-dir: directory of vehicle's location data (default: "./DATA/SDNdata_48_2/")

Code structure

./experiments/train_controller.py: contains code for training MADDPG on the IoV
./experiments/DATA/: contains data of vehicles loaction
./experiments/Compare.py: Used to plot figures
./maddpg/trainer/maddpg.py: core code for the MADDPG algorithm
./maddpg/trainer/replay_buffer.py: replay buffer code for MADDPG
./maddpg/common/distributions.py: useful distributions used in maddpg.py
./maddpg/common/tf_util.py: useful tensorflow functions used in maddpg.py
"./multiagent/multiagent/": multiagent enviroment
"./multiagent/scenarios/simple_controller.py": Scenario of controller in IoV

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
experiments		experiments
maddpg		maddpg
multiagent		multiagent
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for Dynamic Controller Assignment in SD-IoV

Installation

Command-line options

Environment options

Core training parameters

Checkpointing

Evaluation

Code structure

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for Dynamic Controller Assignment in SD-IoV

Installation

Command-line options

Environment options

Core training parameters

Checkpointing

Evaluation

Code structure

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages