reiniscimurs · Yujiajun3 · Sep 16, 2025
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -1,31 +1,31 @@
-name: Run Tests
-
-on:
-  push:
-    branches: [master]
-  pull_request:
-    branches: [master]
-
-jobs:
-  test:
-    runs-on: ubuntu-latest
-
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@v3
-
-      - name: Set up Python
-        uses: actions/setup-python@v4
-        with:
-          python-version: '3.10'
-
-      - name: Install Poetry
-        run: |
-          curl -sSL https://install.python-poetry.org | python3 -
-          echo "$HOME/.local/bin" >> $GITHUB_PATH
-
-      - name: Install dependencies
-        run: poetry install
-
-      - name: Run tests
-        run: poetry run pytest
+name: Run Tests
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+
+      - name: Install Poetry
+        run: |
+          curl -sSL https://install.python-poetry.org | python3 -
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+      - name: Install dependencies
+        run: poetry install
+
+      - name: Run tests
+        run: poetry run pytest
diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,3 @@
-/.venv/
-/runs/
-/site/
+/.venv/
+/runs/
+/site/
diff --git a/.idea/.gitignore b/.idea/.gitignore
diff --git a/.idea/DRL-robot-navigation-IR-SIM.iml b/.idea/DRL-robot-navigation-IR-SIM.iml
diff --git a/.idea/inspectionProfiles/profiles_settings.xml b/.idea/inspectionProfiles/profiles_settings.xml
diff --git a/.idea/misc.xml b/.idea/misc.xml
diff --git a/.idea/modules.xml b/.idea/modules.xml
diff --git a/.idea/vcs.xml b/.idea/vcs.xml
diff --git a/README.md b/README.md
@@ -1,56 +1,56 @@
-**DRL Robot navigation in IR-SIM**
-
-Deep Reinforcement Learning algorithm implementation for simulated robot navigation in IR-SIM. Using 2D laser sensor data
-and information about the goal point a robot learns to navigate to a specified point in the environment.
-
-![Example](https://github.com/reiniscimurs/DRL-robot-navigation-IR-SIM/blob/master/out.gif)
-
-**Installation**
-
-* Package versioning is managed with poetry \
-`pip install poetry`
-* Clone the repository \
-`git clone https://github.com/reiniscimurs/DRL-robot-navigation.git`
-* Navigate to the cloned location and install using poetry \
-`poetry install`
-
-**Training the model**
-
-* Run the training by executing the train.py file \
-`poetry run python robot_nav/train.py`
-
-* To open tensorbord, in a new terminal execute \
-`tensorboard --logdir runs`
-
-
-
-**Sources**
-
-| Package |                          Description                          |                              Source | 
-|:--------|:-------------------------------------------------------------:|------------------------------------:| 
-| IR-SIM  |                 Light-weight robot simulator                  | https://github.com/hanruihua/ir-sim |
-| PythonRobotics  | Python code collection of robotics algorithms (Path planning) | https://github.com/AtsushiSakai/PythonRobotics |
-
-
-**Models**
-
-| Model     |                                           Description                                           |                    Model                           Source | 
-|:----------|:-----------------------------------------------------------------------------------------------:|----------------------------------------------------------:|
-| TD3       |                      Twin Delayed Deep Deterministic Policy Gradient model                      | https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2 | 
-| SAC       |                                     Soft Actor-Critic model                                     |                https://github.com/denisyarats/pytorch_sac | 
-| PPO       |                               Proximal Policy Optimization model                                |            https://github.com/nikhilbarhate99/PPO-PyTorch | 
-| DDPG      |                            Deep Deterministic Policy Gradient model                             |                                          Updated from TD3 | 
-| CNNTD3    |                          TD3 model with 1D CNN encoding of laser state                          |                                                         - |
-| RCPG      | Recurrent Convolution Policy Gradient - adding recurrence layers (lstm/gru/rnn) to CNNTD3 model |                                                         - |
-
-**Max Upper Bound Models**
-
-Models that support the additional loss of Q values exceeding the maximal possible Q value in the episode. Q values that exceed this upper bound are used to calculate a loss for the model. This helps to control the overestimation of Q values in off-policy actor-critic networks.
-To enable max upper bound loss set `use_max_bound = True` when initializing a model.
-
-| Model  |  
-|:-------|
-| TD3    | 
-| DDPG   | 
-| CNNTD3 |
-
+**DRL Robot navigation in IR-SIM**
+
+Deep Reinforcement Learning algorithm implementation for simulated robot navigation in IR-SIM. Using 2D laser sensor data
+and information about the goal point a robot learns to navigate to a specified point in the environment.
+
+![Example](https://github.com/reiniscimurs/DRL-robot-navigation-IR-SIM/blob/master/out.gif)
+
+**Installation**
+
+* Package versioning is managed with poetry \
+`pip install poetry`
+* Clone the repository \
+`git clone https://github.com/reiniscimurs/DRL-robot-navigation.git`
+* Navigate to the cloned location and install using poetry \
+`poetry install`
+
+**Training the model**
+
+* Run the training by executing the train.py file \
+`poetry run python robot_nav/train.py`
+
+* To open tensorbord, in a new terminal execute \
+`tensorboard --logdir runs`
+
+
+
+**Sources**
+
+| Package |                          Description                          |                              Source | 
+|:--------|:-------------------------------------------------------------:|------------------------------------:| 
+| IR-SIM  |                 Light-weight robot simulator                  | https://github.com/hanruihua/ir-sim |
+| PythonRobotics  | Python code collection of robotics algorithms (Path planning) | https://github.com/AtsushiSakai/PythonRobotics |
+
+
+**Models**
+
+| Model     |                                           Description                                           |                    Model                           Source | 
+|:----------|:-----------------------------------------------------------------------------------------------:|----------------------------------------------------------:|
+| TD3       |                      Twin Delayed Deep Deterministic Policy Gradient model                      | https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2 | 
+| SAC       |                                     Soft Actor-Critic model                                     |                https://github.com/denisyarats/pytorch_sac | 
+| PPO       |                               Proximal Policy Optimization model                                |            https://github.com/nikhilbarhate99/PPO-PyTorch | 
+| DDPG      |                            Deep Deterministic Policy Gradient model                             |                                          Updated from TD3 | 
+| CNNTD3    |                          TD3 model with 1D CNN encoding of laser state                          |                                                         - |
+| RCPG      | Recurrent Convolution Policy Gradient - adding recurrence layers (lstm/gru/rnn) to CNNTD3 model |                                                         - |
+
+**Max Upper Bound Models**
+
+Models that support the additional loss of Q values exceeding the maximal possible Q value in the episode. Q values that exceed this upper bound are used to calculate a loss for the model. This helps to control the overestimation of Q values in off-policy actor-critic networks.
+To enable max upper bound loss set `use_max_bound = True` when initializing a model.
+
+| Model  |  
+|:-------|
+| TD3    | 
+| DDPG   | 
+| CNNTD3 |
+
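
Note on the README's "Max Upper Bound Models" section: the idea it describes (penalizing Q values that exceed the maximal possible Q value) can be illustrated with a minimal sketch. This is not the repository's implementation. The helper names `max_bound_penalty` and `critic_loss_with_bound`, the squared-excess form of the penalty, and the `bound_weight` parameter are all assumptions made for illustration; only the `use_max_bound` flag is named in the README.

```python
# Hypothetical sketch of an upper-bound penalty on Q estimates.
# Names and the squared-excess form are assumptions, not the project's API.

def max_bound_penalty(q_values, upper_bound):
    """Mean squared amount by which Q estimates exceed upper_bound."""
    if not q_values:
        return 0.0
    # Only values above the bound contribute; values at or below it add zero.
    excess = [max(q - upper_bound, 0.0) for q in q_values]
    return sum(e * e for e in excess) / len(excess)


def critic_loss_with_bound(q_values, targets, upper_bound, bound_weight=0.25):
    """Mean squared TD error plus a weighted upper-bound penalty."""
    td_loss = sum((q - t) ** 2 for q, t in zip(q_values, targets)) / len(q_values)
    return td_loss + bound_weight * max_bound_penalty(q_values, upper_bound)


if __name__ == "__main__":
    # Two estimates, one of which overshoots an assumed bound of 3.0.
    print(critic_loss_with_bound([2.0, 4.0], [2.0, 3.0], upper_bound=3.0, bound_weight=0.5))
```

In this sketch, estimates at or below the bound contribute nothing, so the extra term acts only on overestimated Q values, which matches the stated purpose of controlling overestimation in off-policy actor-critic networks.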
diff --git a/docs/api/IR-SIM/ir-marl-sim.md b/docs/api/IR-SIM/ir-marl-sim.md
@@ -1,6 +1,6 @@
-# MARL-IR-SIM
-
-::: robot_nav.SIM_ENV.marl_sim
-    options:
-      show_root_heading: true
+# MARL-IR-SIM
+
+::: robot_nav.SIM_ENV.marl_sim
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/IR-SIM/ir-sim.md b/docs/api/IR-SIM/ir-sim.md
@@ -1,6 +1,6 @@
-# IR-SIM
-
-::: robot_nav.SIM_ENV.sim
-    options:
-      show_root_heading: true
+# IR-SIM
+
+::: robot_nav.SIM_ENV.sim
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/Testing/test.md b/docs/api/Testing/test.md
@@ -1,11 +1,11 @@
-# Testing
-
-::: robot_nav.test
-    options:
-      show_root_heading: true
-      show_source: true
-
-::: robot_nav.test_random
-    options:
-      show_root_heading: true
+# Testing
+
+::: robot_nav.test
+    options:
+      show_root_heading: true
+      show_source: true
+
+::: robot_nav.test_random
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/Testing/testrnn.md b/docs/api/Testing/testrnn.md
@@ -1,6 +1,6 @@
-# Testing RNN
-
-::: robot_nav.test_rnn
-    options:
-      show_root_heading: true
+# Testing RNN
+
+::: robot_nav.test_rnn
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/Training/train.md b/docs/api/Training/train.md
@@ -1,6 +1,6 @@
-# Training
-
-::: robot_nav.train
-    options:
-      show_root_heading: true
+# Training
+
+::: robot_nav.train
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/Training/trainrnn.md b/docs/api/Training/trainrnn.md
@@ -1,6 +1,6 @@
-# Training RNN
-
-::: robot_nav.train_rnn
-    options:
-      show_root_heading: true
+# Training RNN
+
+::: robot_nav.train_rnn
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/Utils/replay_buffer.md b/docs/api/Utils/replay_buffer.md
@@ -1,6 +1,6 @@
-# Replay/Rollout Buffer
-
-::: robot_nav.replay_buffer
-    options:
-      show_root_heading: true
+# Replay/Rollout Buffer
+
+::: robot_nav.replay_buffer
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/Utils/utils.md b/docs/api/Utils/utils.md
@@ -1,6 +1,6 @@
-# Utils
-
-::: robot_nav.utils
-    options:
-      show_root_heading: true
+# Utils
+
+::: robot_nav.utils
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/models/DDPG.md b/docs/api/models/DDPG.md
@@ -1,6 +1,6 @@
-# DDPG
-
-::: robot_nav.models.DDPG.DDPG
-    options:
-      show_root_heading: true
+# DDPG
+
+::: robot_nav.models.DDPG.DDPG
+    options:
+      show_root_heading: true
       show_source: true
diff --git a/docs/api/models/HCM.md b/docs/api/models/HCM.md
@@ -1,6 +1,6 @@
-# Hardcoded Model
-
-::: robot_nav.models.HCM.hardcoded_model
-    options:
-      show_root_heading: true
-      show_source: true
+# Hardcoded Model
+
+::: robot_nav.models.HCM.hardcoded_model
+    options:
+      show_root_heading: true
+      show_source: true
diff --git a/docs/api/models/MARL/Attention.md b/docs/api/models/MARL/Attention.md
@@ -1,6 +1,6 @@
-# Hard-Soft Attention
-
-::: robot_nav.models.MARL.hardsoftAttention
-    options:
-      show_root_heading: true
-      show_source: true
+# Hard-Soft Attention
+
+::: robot_nav.models.MARL.hardsoftAttention
+    options:
+      show_root_heading: true
+      show_source: true