
Commit d8b9a75

update documentation
1 parent 1e79f8b commit d8b9a75

File tree

10 files changed: +352 additions, -56 deletions

README.md

Lines changed: 5 additions & 3 deletions

@@ -26,9 +26,11 @@ and information about the goal point a robot learns to navigate to a specified p
 
 **Sources**
 
-| Package | Description | Source |
-|:--------|:-----------------------------------------------------------------------------------------------:|------------------------------------:|
-| IR-SIM | Light-weight robot simulator | https://github.com/hanruihua/ir-sim |
+| Package | Description | Source |
+|:--------|:-------------------------------------------------------------:|------------------------------------:|
+| IR-SIM | Light-weight robot simulator | https://github.com/hanruihua/ir-sim |
+| PythonRobotics | Python code collection of robotics algorithms (Path planning) | https://github.com/AtsushiSakai/PythonRobotics |
 
 **Models**

docs/api/Path Planners/astar.md

Lines changed: 6 additions & 0 deletions

@@ -0,0 +1,6 @@
+# A* Path Planner
+
+::: robot_nav.path_planners.a_star
+    options:
+      show_root_heading: true
+      show_source: true
docs/api/Path Planners/prm.md

Lines changed: 6 additions & 0 deletions

@@ -0,0 +1,6 @@
+# Probabilistic Road Map (PRM) Planner
+
+::: robot_nav.path_planners.probabilistic_road_map
+    options:
+      show_root_heading: true
+      show_source: true

docs/api/Path Planners/rrt.md

Lines changed: 6 additions & 0 deletions

@@ -0,0 +1,6 @@
+# Rapidly-exploring Random Tree (RRT) Planner
+
+::: robot_nav.path_planners.rrt
+    options:
+      show_root_heading: true
+      show_source: true

docs/index.md

Lines changed: 54 additions & 0 deletions

@@ -1,4 +1,58 @@
 # Welcome to DRL-robot-navigation-IR-SIM
 
+**DRL Robot navigation in IR-SIM**
+
 Deep Reinforcement Learning algorithm implementation for simulated robot navigation in IR-SIM. Using 2D laser sensor data
 and information about the goal point, a robot learns to navigate to a specified point in the environment.
+
+![Example](https://github.com/reiniscimurs/DRL-robot-navigation-IR-SIM/blob/master/out.gif)
+
+**Installation**
+
+* Package versioning is managed with poetry \
+  `pip install poetry`
+* Clone the repository \
+  `git clone https://github.com/reiniscimurs/DRL-robot-navigation.git`
+* Navigate to the cloned location and install using poetry \
+  `poetry install`
+
+**Training the model**
+
+* Run the training by executing the train.py file \
+  `poetry run python robot_nav/train.py`
+* To open tensorboard, in a new terminal execute \
+  `tensorboard --logdir runs`
+
+**Sources**
+
+| Package | Description | Source |
+|:--------|:-------------------------------------------------------------:|------------------------------------:|
+| IR-SIM | Light-weight robot simulator | https://github.com/hanruihua/ir-sim |
+| PythonRobotics | Python code collection of robotics algorithms (Path planning) | https://github.com/AtsushiSakai/PythonRobotics |
+
+**Models**
+
+| Model | Description | Model Source |
+|:----------|:-----------------------------------------------------------------------------------------------:|----------------------------------------------------------:|
+| TD3 | Twin Delayed Deep Deterministic Policy Gradient model | https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2 |
+| SAC | Soft Actor-Critic model | https://github.com/denisyarats/pytorch_sac |
+| PPO | Proximal Policy Optimization model | https://github.com/nikhilbarhate99/PPO-PyTorch |
+| DDPG | Deep Deterministic Policy Gradient model | Updated from TD3 |
+| CNNTD3 | TD3 model with 1D CNN encoding of the laser state | - |
+| RCPG | Recurrent Convolution Policy Gradient - adding recurrence layers (LSTM/GRU/RNN) to the CNNTD3 model | - |
+
+**Max Upper Bound Models**
+
+Models that support an additional loss for Q values that exceed the maximal possible Q value in the episode. Q values above this upper bound are used to calculate an extra loss term, which helps to control the overestimation of Q values in off-policy actor-critic networks.
+To enable the max upper bound loss, set `use_max_bound = True` when initializing a model (the idea is sketched after the table below).
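The upper-bound loss itself is not shown in this commit, so the following is only a minimal sketch of the idea under stated assumptions: a known per-step reward cap `max_reward`, discount `discount`, and episode length `max_steps` (all hypothetical names), with the bound taken as the discounted geometric sum of the best achievable reward.

```python
import torch.nn.functional as F

def max_upper_bound_loss(q_values, max_reward, discount, max_steps):
    """Hypothetical sketch: penalize Q estimates that exceed the highest
    discounted return achievable within an episode. Not the repository's
    actual implementation."""
    # Geometric-series bound on the return: r_max * (1 - gamma^T) / (1 - gamma)
    q_max = max_reward * (1.0 - discount**max_steps) / (1.0 - discount)
    # Only the portion of each Q value above the bound contributes to the loss.
    excess = F.relu(q_values - q_max)
    return (excess**2).mean()
```

A term like this would be added to the usual critic loss whenever `use_max_bound = True`, pushing overestimated Q values back under the attainable maximum.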

mkdocs.yml

Lines changed: 6 additions & 2 deletions

@@ -17,11 +17,15 @@ nav:
       - Train: api/Training/train.md
       - Train RNN: api/Training/trainrnn.md
   - Testing:
-      - Test: api/Training/test.md
-      - Test RNN: api/Training/testrnn.md
+      - Test: api/Testing/test.md
+      - Test RNN: api/Testing/testrnn.md
   - Utils:
       - Replay Buffer: api/Utils/replay_buffer.md
       - Utils: api/Utils/utils.md
+  - Path Planners:
+      - A*: api/Path Planners/astar.md
+      - PRM: api/Path Planners/prm.md
+      - RRT: api/Path Planners/rrt.md
 plugins:
   - search
   - mkdocstrings

robot_nav/models/TD3/TD3.py

Lines changed: 6 additions & 6 deletions

@@ -46,7 +46,7 @@ def forward(self, s):
             s (torch.Tensor): Input state tensor.
 
         Returns:
-            torch.Tensor: Action output tensor after Tanh activation.
+            (torch.Tensor): Action output tensor after Tanh activation.
         """
         s = F.leaky_relu(self.layer_1(s))
         s = F.leaky_relu(self.layer_2(s))
@@ -102,7 +102,7 @@ def forward(self, s, a):
             a (torch.Tensor): Input action tensor.
 
         Returns:
-            tuple:
+            (tuple):
                 - q1 (torch.Tensor): Output Q-value from the first critic network.
                 - q2 (torch.Tensor): Output Q-value from the second critic network.
         """
@@ -196,7 +196,7 @@ def get_action(self, obs, add_noise):
             add_noise (bool): Whether to add exploration noise.
 
         Returns:
-            np.ndarray: The chosen action clipped to [-max_action, max_action].
+            (np.ndarray): The chosen action clipped to [-max_action, max_action].
         """
         if add_noise:
             return (
@@ -213,7 +213,7 @@ def act(self, state):
             state (np.ndarray): The current environment state.
 
         Returns:
-            np.ndarray: The deterministic action predicted by the actor.
+            (np.ndarray): The deterministic action predicted by the actor.
         """
         state = torch.Tensor(state).to(self.device)
         return self.actor(state).cpu().data.numpy().flatten()
@@ -239,7 +239,7 @@ def train(
         Train the TD3 agent using batches sampled from the replay buffer.
 
         Args:
-            replay_buffer: The replay buffer to sample experiences from.
+            replay_buffer (ReplayBuffer): The replay buffer to sample experiences from.
             iterations (int): Number of training iterations to perform.
             batch_size (int): Size of each mini-batch.
             discount (float): Discount factor gamma for future rewards.
@@ -417,7 +417,7 @@ def prepare_state(self, latest_scan, distance, cos, sin, collision, goal, action
             action (list or np.ndarray): Last executed action [linear_vel, angular_vel].
 
         Returns:
-            tuple:
+            (tuple):
                 - state (list): Prepared and normalized state vector.
                 - terminal (int): 1 if episode should terminate (goal or collision), else 0.
         """

robot_nav/path_planners/a_star.py

Lines changed: 80 additions & 16 deletions

@@ -2,8 +2,8 @@
 
 A* grid planning
 
-author: Atsushi Sakai(@Atsushi_twi)
-        Nikos Kanargias ([email protected])
+author: Atsushi Sakai(@Atsushi_twi), Nikos Kanargias ([email protected])
+
 
 adapted by: Reinis Cimurs
 
@@ -25,8 +25,9 @@ def __init__(self, env, resolution):
         """
         Initialize A* planner
 
-        env (EnvBase): environment where the planning will take place
-        resolution: grid resolution [m]
+        Args:
+            env (EnvBase): environment where the planning will take place
+            resolution (float): grid resolution [m]
         """
 
         self.resolution = resolution
@@ -42,13 +43,25 @@ def __init__(self, env, resolution):
         self.motion = self.get_motion_model()
 
     class Node:
+        """Node class"""
+
        def __init__(self, x, y, cost, parent_index):
+            """
+            Initialize Node
+
+            Args:
+                x (float): x position of the node
+                y (float): y position of the node
+                cost (float): heuristic cost of the node
+                parent_index (int): Node's parent index
+            """
            self.x = x  # index of grid
            self.y = y  # index of grid
            self.cost = cost
            self.parent_index = parent_index
 
        def __str__(self):
+            """str function for Node class"""
            return (
                str(self.x)
                + ","
@@ -63,15 +76,16 @@ def planning(self, sx, sy, gx, gy, show_animation=True):
         """
         A star path search
 
-        input:
-            s_x: start x position [m]
-            s_y: start y position [m]
-            gx: goal x position [m]
-            gy: goal y position [m]
+        Args:
+            sx (float): start x position [m]
+            sy (float): start y position [m]
+            gx (float): goal x position [m]
+            gy (float): goal y position [m]
+            show_animation (bool): If True, shows the animation of the planning process
 
-        output:
-            rx: x position list of the final path
-            ry: y position list of the final path
+        Returns:
+            rx (list): x position list of the final path
+            ry (list): y position list of the final path
         """
 
         start_node = self.Node(
@@ -158,7 +172,16 @@ def planning(self, sx, sy, gx, gy, show_animation=True):
         return rx, ry
 
     def calc_final_path(self, goal_node, closed_set):
-        # generate final course
+        """Generate the final path
+
+        Args:
+            goal_node (Node): final goal node
+            closed_set (dict): dict of closed nodes
+
+        Returns:
+            rx (list): list of x positions of final path
+            ry (list): list of y positions of final path
+        """
        rx, ry = [self.calc_grid_position(goal_node.x, self.min_x)], [
            self.calc_grid_position(goal_node.y, self.min_y)
        ]
@@ -181,20 +204,51 @@ def calc_grid_position(self, index, min_position):
         """
         calc grid position
 
-        :param index:
-        :param min_position:
-        :return:
+        Args:
+            index (int): index of a node
+            min_position (float): min value of search space
+
+        Returns:
+            pos (float): position of coordinates along the given axis
         """
         pos = index * self.resolution + min_position
         return pos
 
     def calc_xy_index(self, position, min_pos):
+        """
+        calc xy index of node
+
+        Args:
+            position (float): position of a node
+            min_pos (float): min value of search space
+
+        Returns:
+            index (int): index of position along the given axis
+        """
         return round((position - min_pos) / self.resolution)
 
     def calc_grid_index(self, node):
+        """
+        calc grid index of node
+
+        Args:
+            node (Node): node to calculate the index for
+
+        Returns:
+            index (int): grid index of the node
+        """
         return (node.y - self.min_y) * self.x_width + (node.x - self.min_x)
 
     def verify_node(self, node):
+        """
+        Check if node is acceptable: within limits of search space and free of collisions
+
+        Args:
+            node (Node): node to check
+
+        Returns:
+            result (bool): True if node is acceptable, False otherwise
+        """
         px = self.calc_grid_position(node.x, self.min_x)
         py = self.calc_grid_position(node.y, self.min_y)
 
@@ -214,6 +268,16 @@ def verify_node(self, node):
         return True
 
     def check_node(self, x, y):
+        """
+        Check position for a collision
+
+        Args:
+            x (float): x value of the position
+            y (float): y value of the position
+
+        Returns:
+            result (bool): True if there is a collision, False otherwise
+        """
         node_position = [x, y]
         shape = {
             "name": "rectangle",
