
Commit f9b1d9a

Author: Wilkho (committed)
Added papers for Inverse Reinforcement Learning
1 parent 6e93ba9 commit f9b1d9a

File tree

2 files changed (+30 additions, -0 deletions)


docs/grad-studies-matters.md

Lines changed: 1 addition & 0 deletions
@@ -70,6 +70,7 @@ We prepared a reading list for a few important research topics
 - Southeast Asia
 - South Asia
 - CONUS
+- [Inverse Reinforcement Learning](inverse-reinforcement-learning.md)

 ---

docs/inverse-reinforcement-learning.md

Lines changed: 29 additions & 0 deletions

@@ -0,0 +1,29 @@
+---
+layout: page
+title: Inverse Reinforcement Learning
+description: >
+  Reading list on Inverse Reinforcement Learning
+hide_description: true
+sitemap: false
+permalink: /docs/inverse-reinforcement-learning
+---
+Inverse Reinforcement Learning (IRL) is the machine learning paradigm concerned with inferring the latent reward function of an agent from its observed behavior. Formally, given a Markov Decision Process (MDP) without a specified reward signal and a set of expert demonstrations (state-action trajectories), IRL seeks to recover the underlying reward function that the expert is assumed to be maximizing. This inverts the standard reinforcement learning problem: rather than deriving a policy from a known reward, it derives the reward structure that best explains the observed policy.
+
+| **Title** | **Author / Year** | **Theme** | **Comment** |
+|-----------|-------------------|-----------|-------------|
+| [Maximum Entropy IRL](https://cdn.aaai.org/AAAI/2008/AAAI08-227.pdf) | Ziebart et al. (2008) | Algorithm | Methodological Paper |
+| [Deep Maximum Entropy IRL](https://arxiv.org/pdf/1507.04888) | Wulfmeier et al. (2015) | Algorithm | Methodological Paper |
+| [Adversarial Inverse Reinforcement Learning](https://arxiv.org/pdf/1710.11248) | Fu et al. (2018) | Algorithm | Methodological Paper |
+| [Inverse soft-Q Learning for Imitation (Environment Free)](https://arxiv.org/pdf/2106.12142) | Garg et al. (2022) | Algorithm | Methodological Paper |
+| [Variational IRL (Environment Free)](https://arxiv.org/pdf/1809.06404) | Qureshi et al. (2019) | Algorithm | Methodological Paper |
+| [Multi-Agent Adversarial IRL](https://arxiv.org/pdf/1907.13220) | Yu et al. (2019) | Algorithmic Enhancement | Methodological Paper |
+| [Context-aware IRL](https://www.sciencedirect.com/science/article/pii/S0952197625002799) | Liu et al. (2025) | Modeling human behavior using IRL | Application Paper |
+| [IRL for modeling reservoir operations](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10552338) | Giuliani and Castelletti (2024) | Modeling human behavior using IRL | Application Paper |
+| [Multiple Expert and Non-stationarity in IRL](https://link.springer.com/article/10.1007/s10994-020-05939-8) | Likmeta et al. (2021) | Modeling human behavior using IRL | Application Paper |
+| [Advances and Applications in IRL](https://link.springer.com/article/10.1007/s00521-025-11100-0) | Deshpande et al. (2025) | Algorithms and Application | Literature Review |
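As a rough illustration of the paradigm described in the new page, the following is a minimal sketch of Maximum Entropy IRL (the first entry in the table, Ziebart et al., 2008) on a hypothetical 5-state chain MDP. Everything here — the environment, the one-hot features, the horizon, and the learning rate — is illustrative and not taken from any of the listed papers; it only shows the core loop of matching the learner's expected feature counts to the expert's.

```python
import numpy as np

# Hypothetical 5-state chain MDP: actions 0 (left) / 1 (right), deterministic moves.
# The expert's hidden goal is the right-most state; IRL must recover a reward
# that explains why the demonstrations always head right.
N_S, N_A, GAMMA, HORIZON = 5, 2, 0.9, 10

def step(s, a):
    return max(0, s - 1) if a == 0 else min(N_S - 1, s + 1)

PHI = np.eye(N_S)  # one-hot state features, so reward weights = per-state rewards

def soft_value_iteration(r, n_iter=100):
    """MaxEnt soft Bellman backups; returns a stochastic policy pi(a|s)."""
    v = np.zeros(N_S)
    for _ in range(n_iter):
        q = np.array([[r[s] + GAMMA * v[step(s, a)] for a in range(N_A)]
                      for s in range(N_S)])
        m = q.max(axis=1)                        # numerically stable log-sum-exp
        v = m + np.log(np.exp(q - m[:, None]).sum(axis=1))
    pi = np.exp(q - v[:, None])                  # pi(a|s) proportional to exp(Q - V)
    return pi / pi.sum(axis=1, keepdims=True)

def visitation(pi):
    """Expected state-visitation counts over the horizon, starting in state 0."""
    d = np.zeros((HORIZON, N_S))
    d[0, 0] = 1.0
    for t in range(1, HORIZON):
        for s in range(N_S):
            for a in range(N_A):
                d[t, step(s, a)] += d[t - 1, s] * pi[s, a]
    return d.sum(axis=0)

# Synthetic expert demonstrations: always move right (states 0,1,2,3,4,4,...).
demos = [[min(t, N_S - 1) for t in range(HORIZON)] for _ in range(20)]
mu_expert = sum(PHI[s] for traj in demos for s in traj) / len(demos)

# MaxEnt IRL: gradient ascent matching learner feature counts to the expert's.
w = np.zeros(N_S)
for _ in range(200):
    mu_learner = visitation(soft_value_iteration(PHI @ w)) @ PHI
    w += 0.1 * (mu_expert - mu_learner)  # gradient of the MaxEnt log-likelihood

print(int((PHI @ w).argmax()))  # highest recovered reward sits at the goal state
```

Note the sketch never touches the true reward: it only sees trajectories, yet the learned weights assign the largest reward to the state the expert keeps visiting — exactly the "invert the RL problem" idea from the paragraph above.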
