Commit ffe76a6

feat: seminar paper Exploring Maximum Entropy IRL (#15)

1 parent 60e2864 commit ffe76a6

File tree

1 file changed: +26 −9 lines changed


README.md

Lines changed: 26 additions & 9 deletions
```diff
@@ -4,25 +4,42 @@
 
 Inverse Reinforcement Learning Algorithm implementation with python.
 
+# Exploring Maximum Entropy Inverse Reinforcement Learning
+
+My seminar paper can be found in [paper](https://github.com/HokageM/IRLwPython/tree/main/paper), which is based on
+IRLwPython version 0.0.1.
+
 # Implemented Algorithms
 
 ## Maximum Entropy IRL:
 
-Implementation of the Maximum Entropy inverse reinforcement learning algorithm from [1] and is based on the implementation
+Implementation of the Maximum Entropy inverse reinforcement learning algorithm from [1], based on the
+implementation
 of [lets-do-irl](https://github.com/reinforcement-learning-kr/lets-do-irl/tree/master/mountaincar/maxent).
 It is an IRL algorithm using Q-Learning with a Maximum Entropy update function.
 
-## Maximum Entropy Deep IRL:
+## Maximum Entropy IRL (MEIRL):
+
+Implementation of the maximum entropy inverse reinforcement learning algorithm from [1], based on the
+implementation
+of [lets-do-irl](https://github.com/reinforcement-learning-kr/lets-do-irl/tree/master/mountaincar/maxent).
+It is an IRL algorithm using Q-learning with a maximum entropy update function for the IRL reward estimation.
+The next action is selected based on the maximum of the Q-values.
+
+## Maximum Entropy Deep IRL (MEDIRL):
 
-An implementation of the Maximum Entropy inverse reinforcement learning algorithm, which uses a neural-network for the
-actor.
-The estimated irl-reward is learned similar as in Maximum Entropy IRL.
-It is an IRL algorithm using Deep Q-Learning with a Maximum Entropy update function.
+An implementation of the maximum entropy inverse reinforcement learning algorithm, which uses a neural network for the
+actor.
+The estimated IRL reward is learned similarly to MEIRL.
+It is an IRL algorithm using deep Q-learning with a maximum entropy update function.
+The next action is selected by an epsilon-greedy policy over the Q-values.
 
-## Maximum Entropy Deep RL:
+## Maximum Entropy Deep RL (MEDRL):
 
-An implementation of the Maximum Entropy reinforcement learning algorithm.
-This algorithm is used to compare the IRL algorithms with an RL algorithm.
+MEDRL is an RL implementation of the MEDIRL algorithm.
+This algorithm gets the real rewards directly from the environment,
+instead of estimating IRL rewards.
+The NN architecture and action selection are the same as in MEDIRL.
 
 # Experiment
```
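The maximum entropy reward update and the Q-value action selection described in the diff above can be sketched as follows. This is a minimal illustration, not IRLwPython's actual API: the function names (`maxent_irl_update`, `epsilon_greedy`), the linear reward model, and the feature vectors are all assumptions made up for the example. The gradient in [1] is the difference between the expert's and the learner's feature expectations.

```python
import numpy as np

def maxent_irl_update(theta, expert_features, learner_features, lr=0.05):
    """One Maximum Entropy IRL gradient step on linear reward weights.

    The gradient is the expert feature expectations minus the learner's
    feature expectations, so the reward rises for states the expert
    visits more often than the current policy does.
    """
    grad = expert_features - learner_features
    return theta + lr * grad

def epsilon_greedy(q_values, epsilon, rng):
    """MEDIRL-style selection: random action with prob. epsilon, else argmax."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

rng = np.random.default_rng(0)
theta = np.zeros(3)                  # linear reward weights over 3 state features
expert = np.array([0.8, 0.1, 0.1])   # made-up expert state-visitation frequencies
learner = np.array([0.3, 0.4, 0.3])  # made-up learner state-visitation frequencies
theta = maxent_irl_update(theta, expert, learner)
print(theta)  # weights move toward states the expert visits more often
```

With epsilon set to 0 the selection reduces to the pure argmax used by MEIRL; MEDIRL additionally explores with probability epsilon.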
