
Commit c40c0f3

Update readme and poetry
1 parent 2266028 commit c40c0f3

File tree

2 files changed: +147 −133 lines


README.md

Lines changed: 25 additions & 10 deletions
@@ -26,14 +26,29 @@ and information about the goal point a robot learns to navigate to a specified p
 
 **Sources**
 
-| Package/Model | Description | Model Source |
-|:--------------|:-----------------------------------------------------------------------------------------------:|----------------------------------------------------------:|
-| IR-SIM | Light-weight robot simulator | https://github.com/hanruihua/ir-sim |
-| TD3 | Twin Delayed Deep Deterministic Policy Gradient model | https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2 |
-| SAC | Soft Actor-Critic model | https://github.com/denisyarats/pytorch_sac |
-| PPO | Proximal Policy Optimization model | https://github.com/nikhilbarhate99/PPO-PyTorch |
-| DDPG | Deep Deterministic Policy Gradient model | Updated from TD3 |
-| CNNTD3 | TD3 model with 1D CNN encoding of laser state | - |
-| RCPG | Recurrent Convolution Policy Gradient - adding recurrence layers (lstm/gru/rnn) to CNNTD3 model | - |
-
+| Package | Description | Source |
+|:--------|:-----------------------------------------------------------------------------------------------:|------------------------------------:|
+| IR-SIM | Light-weight robot simulator | https://github.com/hanruihua/ir-sim |
+
+**Models**
+
+| Model | Description | Model Source |
+|:----------|:-----------------------------------------------------------------------------------------------:|----------------------------------------------------------:|
+| TD3 | Twin Delayed Deep Deterministic Policy Gradient model | https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2 |
+| SAC | Soft Actor-Critic model | https://github.com/denisyarats/pytorch_sac |
+| PPO | Proximal Policy Optimization model | https://github.com/nikhilbarhate99/PPO-PyTorch |
+| DDPG | Deep Deterministic Policy Gradient model | Updated from TD3 |
+| CNNTD3 | TD3 model with 1D CNN encoding of laser state | - |
+| RCPG | Recurrent Convolution Policy Gradient - adding recurrence layers (lstm/gru/rnn) to CNNTD3 model | - |
+
+**Max Upper Bound Models**
+
+Models that support an additional loss on Q values that exceed the maximal possible Q value in the episode. Q values above this upper bound contribute an extra loss term, which helps control the overestimation of Q values in off-policy actor-critic networks.
+To enable the max upper bound loss, set `use_max_bound = True` when initializing a model.
+
+| Model |
+|:-------|
+| TD3 |
+| DDPG |
+| CNNTD3 |
 
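
As a rough illustration of the upper-bound idea described in the new README text, the sketch below penalizes Q estimates that exceed a discounted-return bound. The `max_upper_bound_loss` helper, its parameters, and the commented `TD3(...)` constructor call are hypothetical; only the `use_max_bound = True` flag name comes from the README.

```python
import torch

def max_upper_bound_loss(q_values: torch.Tensor,
                         max_step_reward: float,
                         discount: float,
                         steps_left: int) -> torch.Tensor:
    """Penalize Q estimates that exceed the largest return still achievable.

    Hypothetical helper: the bound assumes no per-step reward can exceed
    max_step_reward, so the best possible discounted return from here is
    max_step_reward * (1 + discount + ... + discount**(steps_left - 1)).
    """
    max_q = max_step_reward * (1.0 - discount ** steps_left) / (1.0 - discount)
    excess = torch.clamp(q_values - max_q, min=0.0)  # only the overshoot is penalized
    return (excess ** 2).mean()

# Example: with this bound, only clearly overestimated Q values add loss.
q = torch.tensor([10.0, 55.0, 80.0])
penalty = max_upper_bound_loss(q, max_step_reward=1.0, discount=0.99, steps_left=100)

# Enabling the feature as the README describes (constructor signature assumed):
# model = TD3(state_dim=state_dim, action_dim=action_dim, use_max_bound=True)
```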

0 commit comments